Household survey often results in dataset with over 300 variables to process & explore. In Humanitarian Environment, deadlines to get insights from those dataset are often tight. Manual processing is very lengthy and can be done only for a limited part of the dataset. Often, because of those challenges, a lot of potential insights are not discovered. This package is developped to support this challenge around data crunching. It’s part of the initiative around an Integrated Framework for Household Survey (IFHS): A toolkit to facilitate design, collection & analysis.
KoboloadeR packages aims at separating “input”, “processing” and “output” within the data crunching phase of the data analysis worklfow.
The “output” will be one or multiple Rmd (Rmarkdown) file(s) than will generate word, pdf or html reports and the configuration file includes references to all “input”:
Path to raw data files collected using OpenDataKit, Kobotoolbox or ONA
Path to form (defined using the standard format xlsform) in order to build a data dictionary
Path to the sample weight for each observation (based on cluster or strata…)
Path to the data cleaning log
Path to the indicator calculation sheet
Productivity: Once the configuration file is written, run the script in Rstudio to get the output
Training: No need to write R instruction – limited knowledge of R is required
Iteration: Check the output, adjust the various input files & re-run the script till you get a satisfying report
Reproducibility: all analysis input are de facto documented
KoboLoadeR takes care of the processing component so that the technical team can focus on the interpretation.
The koboloadeR
package allows to:
connect to the KoBo API (v1) for the KoBo Toolbox project.
compute a data dictionnary based on xlsform. It implies ot have a few additionnal column in the xlsform in order to better define how data shoudl be analysid (cf infra). as it based on a standard, this part and the following should work for any xlsform compatible server such as OpenDataKit or ONA
generate automatically of a series of charts & maps based on the data dictionnary
generate automatically of a series of charts & maps based on a formatted data analysis plan
access to a series of dedicated Shiny application to set up your project
support the full range of data science level
Frequency tables & Bar chart for select type questions
Frequency tables & Histogram for numeric questions
Frequency table for text questions
Cross-tab & graph (if 2 categorical: bar chart, if 1 categoric + 1 numeric: boxplot & if 2 numeric: scatterplot)
Chi-squared test & corrplot presentation
Mapping if geographic field are configured (still in development)
and more to come…