D Annex Documentation of referential and questionnaires
The referential is the central piece of the {SurveyDesigner}. It is centrally managed to preserve its integrity and documents the indicators, the linked questions, and the geographic referential.
The referential is then translated into a distinct context for each country. This includes adjusting the labels of both questions and modalities. A context can also include ad-hoc questions and indicators.
Questionnaires are generated from the context matching the country and include one or more XlsForm definitions, i.e. the definition of a form that can be used with multiple types of data collection servers.
There can be several distinct referentials, each matching a very different data collection methodology (type of sampling, representativeness of the person interviewed).
- referential_id
- type
- description

For instance:
- household_survey: a set of indicators to be measured on the stock population
- flow_monitoring: a set of indicators specific to populations in transit or on the move, not planning to establish a habitual residence for more than a year
- key_informant: a set of indicators (mostly qualitative) collected from "persons with knowledge"
- beneficiary_monitoring: a set of indicators to be collected on a regular basis from the beneficiaries of a specific programme
For the initial prototype, we will focus on household_survey. The business logic should nevertheless work independently of the type of referential, making the application reusable and adaptable to multiple contexts and potentially to different organizations.
Once loaded, the referential should be searchable in a similar way to the International Household Survey Question Bank (see also UNHCR Question Bank node), but with a "ready to use" solution for the design of survey instruments (i.e. already adapted to the specific implementation context).
Below is a description of each worksheet within the spreadsheet. A link to an existing standard is included whenever possible.
For the questions part, the referential matches the standard xlsform structure, with additional columns:
- referential_id
- type: xlsform Ref
- name: unique id for the variable, defined as a standard global codebook
- label: the master version of the label, defined in English by convention and by default. Labels can be translated and contextualized for each country based on predefined instructions
- hint: xlsform Ref. The master version of the hint label, defined in English
- required: xlsform Ref
- required_message: xlsform Ref, defined in English. Labels can be contextualized for each country based on predefined instructions
- constraint: xlsform Ref
- constraint_message: xlsform Ref, defined in English. Labels can be contextualized for each country based on predefined instructions
- relevant: xlsform Ref
- appearance: xlsform Ref
- calculation: xlsform Ref
- trigger: xlsform Ref
- parameters: xlsform Ref
- repeat_count: xlsform Ref
- default: xlsform Ref
- read_only: xlsform Ref
- choice_filter: xlsform Ref
- media::image: xlsform Ref
- contextualize: boolean, indicates whether contextualizing the label is allowed
- contextualize_instruction: string; when contextualizing the label is allowed, the instruction to follow to guide the contextualization
- block: a way to note which questions should remain grouped together. See also https://xlsform.org/en/#grouping-questions
- block_sequence: integer defining the sequence of this block within the interview flow
- sequence: integer defining the sequence of the variable within the block
- mode: factor with possible values ALL, CAPI, CATI, and CAWI; the last three stand for Computer Assisted Personal Interviewing, Computer Assisted Telephone Interviewing, and Computer Assisted Web Interviewing (cf. an explanation here)
- check: defines which type of High Frequency Check, if any, should be applied using this variable
- accuracy: expected level of accuracy for the indicators; can be used to prioritize the indicators over multiple data collection waves
- chapter: string, defines a high-level research question; used to group variables together when generating the automatic data exploration report with kobocruncher
- subchapter: string, defines a high-level sub research question; used to group variables together when generating the automatic data exploration report with kobocruncher
- labelReport: a short label for the variable, less than 80 characters, used for reporting so that it displays well in a chart; used by kobocruncher
- hintReport: can be used to provide a longer description of the variable, as well as potential green, orange, and red standard threshold values for interpreting it; used by kobocruncher
- keyword: list of associated keywords as defined in RIDL (cf. schema)
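As an illustration of how block_sequence, sequence, and mode could drive the interview flow, here is a minimal sketch in Python. It assumes each question is a plain dictionary keyed by the column names above; the function name order_questions and the sample rows are hypothetical and not part of the {SurveyDesigner} code base.

```python
def order_questions(rows, mode):
    """Keep questions usable in `mode` and sort them into interview order."""
    # A question tagged ALL is assumed to apply to every collection mode.
    usable = [r for r in rows if r["mode"] in ("ALL", mode)]
    # Blocks come first, then the position of the variable inside its block.
    return sorted(usable, key=lambda r: (r["block_sequence"], r["sequence"]))


# Hypothetical sample rows, for illustration only.
rows = [
    {"name": "hh_size", "block_sequence": 2, "sequence": 1, "mode": "ALL"},
    {"name": "consent", "block_sequence": 1, "sequence": 1, "mode": "ALL"},
    {"name": "gps", "block_sequence": 1, "sequence": 2, "mode": "CAPI"},
]
cati_names = [r["name"] for r in order_questions(rows, "CATI")]
capi_names = [r["name"] for r in order_questions(rows, "CAPI")]
```

With these sample rows, the CAPI-only question is dropped from the CATI flow and the remaining questions follow the block order.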
Many response options should also map to established classifications.
- referential_id
- list_name: this should be referenced within type when type starts with select_one or select_multiple
- name: unique id for the modality, defined as a standard global codebook
- label: the master version of the label, defined in English. Labels can be contextualized for each country based on predefined instructions
- order: integer; if null, the variable is not considered ordinal; used by kobocruncher
- labelReport: a short label for the modality, less than 40 characters, used for reporting so that it displays well in a chart; used by kobocruncher
- contextualize: boolean, indicates whether contextualizing the label is allowed
- contextualize_instruction: string; when contextualizing the label is allowed, the instruction to follow to guide the contextualization
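The link between type and list_name can be checked mechanically. The sketch below is one possible way to do it in Python, assuming survey and choices rows are plain dictionaries; missing_choice_lists and the sample data are hypothetical names introduced for illustration.

```python
def missing_choice_lists(survey_rows, choice_rows):
    """Return the list names used by select questions but absent from choices."""
    defined = {c["list_name"] for c in choice_rows}
    missing = set()
    for row in survey_rows:
        parts = row["type"].split()
        # In xlsform, a select type is written "select_one <list_name>".
        is_select = len(parts) > 1 and parts[0] in ("select_one", "select_multiple")
        if is_select and parts[1] not in defined:
            missing.add(parts[1])
    return sorted(missing)


# Hypothetical sample rows, for illustration only.
survey = [
    {"type": "select_one yes_no"},
    {"type": "integer"},
    {"type": "select_multiple needs"},
]
choices = [{"list_name": "yes_no", "name": "yes"}]
missing = missing_choice_lists(survey, choices)
```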
Indicators represent variables that are calculated from the variables directly collected. Indicators can be the final metrics used for analysis or simply auxiliary, meaning used for disaggregation (cf. Question Bank).
- referential_id
- type: should be either select_one or numeric
- name: unique id for the indicator, defined as a standard global codebook
- labelReport: a short label for the indicator, less than 80 characters, used for reporting so that it displays well in a chart; used by kobocruncher
- hintReport: can be used to provide a longer description of the indicator, as well as potential green, orange, and red standard threshold values for interpreting it; used by kobocruncher
- list_name: when the indicator result is discrete, a reference to the choices list holding the labels to use
- repeatvar: when the dataset has multiple frames, indicates in which frame the indicator should be appended
- ind_type: defines whether the indicator is a population, a disaggregation, a final, or only an auxiliary indicator (meaning an intermediate calculation done to build the final indicators)
- sequence: integer used to define an order; important to ensure that auxiliary variables are created first in order to calculate the final indicators
- block: a way to note which indicators should be consistently calculated together; it eases the quick selection of multiple indicators with a single instruction, for instance all indicators linked to the same selected impact or outcome
- chapter: string, defines a high-level research question; used to group variables together when generating the automatic data exploration report with kobocruncher
- subchapter: string, defines a high-level sub research question; used to group variables together when generating the automatic data exploration report with kobocruncher
- calculation: R statement used to create the indicator, based on the standard global codebook and assuming that the data object built from the dataset exported from kobo is a kobocruncher datalist
- unit: string, unit for the indicator
- accuracy: expected level of accuracy for the indicators; can be used to prioritize the indicators over multiple data collection waves
- mode_CAPI: boolean, indicates whether the indicator can be collected with CAPI (Computer Assisted Personal Interviewing); requires an entry in choices[["mode"]] with this specific mode
- mode_CATI: boolean, indicates whether the indicator can be collected with CATI (Computer Assisted Telephone Interviewing); requires an entry in choices[["mode"]] with this specific mode
- mode_CAWI: boolean, indicates whether the indicator can be collected with CAWI (Computer Assisted Web Interviewing); requires an entry in choices[["mode"]] with this specific mode
- metadata: indicates limitations of the indicator; provides in-depth documentation on the indicator concept and methodology
- link: provides a link to any established official documentation on the indicator
- keyword: list of associated keywords as defined in RIDL (cf. schema)
This table ensures that all the variables (aka survey questions) required to calculate the indicator are available:
- referential_id
- name: unique id for the indicator
- name_survey: unique id for the variable as defined in survey
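A check based on this table could look like the following Python sketch. It assumes the mapping rows are dictionaries with the name and name_survey columns described above; missing_variables and the sample identifiers are hypothetical.

```python
def missing_variables(indicator_survey, survey_names):
    """Map each indicator to the required variables absent from the survey."""
    gaps = {}
    for link in indicator_survey:
        if link["name_survey"] not in survey_names:
            gaps.setdefault(link["name"], []).append(link["name_survey"])
    return gaps


# Hypothetical sample mapping, for illustration only.
links = [
    {"name": "ind_school", "name_survey": "age"},
    {"name": "ind_school", "name_survey": "attends_school"},
    {"name": "ind_water", "name_survey": "water_source"},
]
gaps = missing_variables(links, survey_names={"age", "water_source"})
```

An empty result means every indicator has all of its source questions in the survey.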
This table ensures that all the modalities (aka response options) required for the questions used to calculate the indicator are available:
- referential_id
- name: unique id for the indicator
- name_choices: unique id concatenating the list_name and name from choices
This table maps the one-to-many relation between one indicator and all the population groups that the indicator can apply to.
- referential_id
- name: unique id for the indicator
- name_population: factor; for instance, for a household survey this can be either "Refugees (REF)", "Asylum seekers (ASY)", "Internally displaced persons (IDP)", "Other people in need of international protection (OIP)", "Stateless Persons (STA)", "Others of concern to UNHCR (OOC)" or "Host community (HCT)"
This table maps the one-to-many relation between one indicator and all the potentially expected disaggregation variables (aka another indicator or a survey name) that the indicator can apply to.
- referential_id
- name: unique id for the indicator
- name_disaggregation: references either an indicator or directly a variable from survey. Should be a factor; for instance, in a household survey it could be among Age, Gender, Disability, Site
A context reflects the implementation of the referential within a specific country or operation. Contexts are expected to enforce full data integrity with the main referential, meaning that all centrally defined indicators should have their corresponding survey questions and responses available in every context.
- context_id
- region: tags the region associated with the country; used to identify the relevant Regional Survey Support
- country: ISO alpha-3 country code, also defined in the choices table
- geo: additional geographic_id
This table maps the one-to-many relation between one country and all the languages that can be used in that context.
- context_id
- language: should comply with the language referential from the Internet Assigned Numbers Authority (IANA). Once defined here, the same language options should be available for survey and choices
Once the language requirements have been defined for each country, the referential should then be contextualized, in line with Global Recommendations, and through a dialog between the relevant Regional Survey Support and the Operation Survey Focal Point. This can include the addition of ad-hoc context-specific questions.
- context_id
- name
- language: the suffix indicates the language and should comply with the language referential from the Internet Assigned Numbers Authority (IANA). This column can be repeated for as many languages as needed for label, hint, required_message, and constraint_message
- question_type: defines whether this is an ad-hoc context-specific question created during the context creation through a dialog between the regional survey support and the operation survey coordinator
- contextualization_note
- duration: integer, the number of seconds necessary to read the question and the linked answers (if the question is of type select). This can be estimated by a function like interview_duration. This variable is used to determine whether the total interview remains within an acceptable duration (aka 40 to 50 minutes) and, if needed, to suggest splitting the questionnaire across multiple data collection waves
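To make the duration rule concrete, the following Python sketch sums per-question durations and derives a naive minimum number of waves. It is only an assumption-laden illustration: waves_needed is a hypothetical helper (it is not the interview_duration function mentioned above), the 50-minute ceiling is taken from the upper bound quoted in the text, and a real split would also have to keep question blocks together.

```python
import math


def waves_needed(durations_seconds, max_minutes=50):
    """Naive lower bound on the number of waves needed to stay under the ceiling."""
    total = sum(durations_seconds)
    # At least one wave is always needed, even for an empty questionnaire.
    return max(1, math.ceil(total / (max_minutes * 60)))


# Hypothetical per-question durations, in seconds.
one_wave = waves_needed([1500, 1500])       # exactly 50 minutes
two_waves = waves_needed([1500, 1500, 60])  # 51 minutes
```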
Once the language requirements have been defined for each context, the referential should then be contextualized, in line with Global Recommendations, and through a dialog between the relevant Regional Survey Support and the Operation Survey Focal Point.
Note that the geographic referential is managed under choices; it can be filtered with the two dedicated list_name values: country and admin1. Geographic referential names should align with the Common Operational Dataset P-codes.
- context_id
- list_name
- name
- language: the suffix indicates the language and should comply with the language referential from the Internet Assigned Numbers Authority (IANA). This column can be repeated for as many languages as needed for label
- contextualization_note
Questionnaires are objects containing one or more fully compliant xlsform objects, created out of the referential.
All of the xlsform objects should be valid; validation can be done using the dedicated Python package pyxform.
Each xlsform object should at least correspond to a combination of data collection mode and wave, for instance CATI_wave1, CAPI_wave1, CAPI_wave2, etc.
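The naming pattern in the examples above can be generated as follows. This is a small Python sketch; form_names is a hypothetical helper, and it enumerates every mode and wave combination, whereas an actual questionnaire may only use some of them.

```python
from itertools import product


def form_names(modes, n_waves):
    """Build one form name per mode and wave, e.g. CATI_wave1."""
    waves = range(1, n_waves + 1)
    return [f"{mode}_wave{wave}" for mode, wave in product(modes, waves)]


names = form_names(["CATI", "CAPI"], 2)
```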
In order to build the questionnaires from the referential, the survey manager will need to set up a series of filters and include some additional information for contextualization:
- select which context to use, i.e. select one country; this filters the questions and the choice options accordingly, as well as the language translations
- select the target population; this filters which indicators can be calculated
- select a topic and then the linked indicators, indicator blocks, or survey questions (cf. the definition of indicators above); this selects only from non-auxiliary indicators
- select the data collection mode
- indicate how many data collection waves can be organized. This shall be done based on the analysis of questionnaire duration and indicator accuracy
- define the settings (see xlsform Ref) within each sub-questionnaire
- adjust the default block_sequence
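The population and mode filters in the steps above can be sketched in Python as follows. The sketch assumes indicators and the indicator-to-population mapping are lists of dictionaries using the column names documented earlier; select_indicators and the sample rows are hypothetical, and the topic and wave steps are left out for brevity.

```python
def select_indicators(indicators, indicator_population, population, mode):
    """Pick non-auxiliary indicators valid for a population and collection mode."""
    allowed = {
        link["name"]
        for link in indicator_population
        if link["name_population"] == population
    }
    flag = f"mode_{mode}"  # e.g. mode_CAPI
    picked = [
        ind for ind in indicators
        if ind["name"] in allowed
        and ind["ind_type"] != "auxiliary"
        and ind.get(flag, False)
    ]
    return sorted(picked, key=lambda ind: ind["sequence"])


# Hypothetical sample rows, for illustration only.
indicators = [
    {"name": "ind_b", "ind_type": "final", "sequence": 2, "mode_CAPI": True},
    {"name": "ind_a", "ind_type": "final", "sequence": 1, "mode_CAPI": True},
    {"name": "aux_x", "ind_type": "auxiliary", "sequence": 0, "mode_CAPI": True},
    {"name": "ind_c", "ind_type": "final", "sequence": 3, "mode_CAPI": False},
]
population_links = [
    {"name": n, "name_population": "Refugees (REF)"}
    for n in ("ind_a", "ind_b", "aux_x", "ind_c")
]
selected = [
    ind["name"]
    for ind in select_indicators(indicators, population_links, "Refugees (REF)", "CAPI")
]
```

The auxiliary indicators excluded here would still need to be computed first, following their sequence, before the final indicators are calculated.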
Once finalized, a summary is generated to document the Annual Survey Management Cycle. The summary can highlight the main customizations through a final comparison between the xlsform and the referential using xlsform_compare.
A series of xlsform files, each paired with its pretty-printed Word version produced with render_prettyprint, can then be exported in order to be piloted and revised in the operation.