Below are recommendations for projects whose data processing and analysis needs involve complex data transformations.

Every CommCare project must eventually interpret the Case and Form data collected by mobile workers. Two key decisions are embedded in the design of a data pipeline: the method of export (e.g. the basic export interface) and the automation of analysis (e.g. VBA macros). Both decisions should be revisited regularly in projects that plan to do large-scale data analysis (roughly 50,000 or more rows exported at a time).

Method of Export

| ID | Export method | Scale | Requirements |
|----|---------------|-------|--------------|
| E1 | Basic export interface | 0 - 50,000 rows | None |
| E2 | Daily Saved Exports* | 0 - 500,000 rows | None |
| E3 | CommCare Data Export Tool writes to an Excel file | 10,000 - 500,000 rows | Excel license; CommCare Data Export Tool set up |
| E4 | CommCare Data Export Tool writes to a database | 50,000 - 1,000,000+ rows | CommCare Data Export Tool set up; database installed and configured; best practice: dedicated server to run CommCare Data Export Tool |

*Note: Daily Saved Exports are pre-compiled data exports. This means that when you go to CommCare HQ, you can download fresh data immediately instead of waiting for a new file to be generated.
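The scale ranges for the export methods overlap, so more than one method can fit a given project. As a rough illustration, the sketch below copies the row ranges from the table and lists which methods cover an expected export size. The function name and data structure are hypothetical and not part of CommCare; row count is only one factor, and the requirements column still applies.

```python
from typing import List

# Row ranges copied from the export-method table (None = no stated upper bound).
# This lookup is an illustrative helper, not a CommCare API.
EXPORT_METHODS = {
    "E1": ("Basic export interface", 0, 50_000),
    "E2": ("Daily Saved Exports", 0, 500_000),
    "E3": ("CommCare Data Export Tool writes to an Excel file", 10_000, 500_000),
    "E4": ("CommCare Data Export Tool writes to a database", 50_000, None),
}


def candidate_export_methods(expected_rows: int) -> List[str]:
    """Return IDs of export methods whose recommended scale covers expected_rows."""
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    matches = []
    for method_id, (_name, low, high) in EXPORT_METHODS.items():
        if expected_rows >= low and (high is None or expected_rows <= high):
            matches.append(method_id)
    return matches


if __name__ == "__main__":
    for rows in (5_000, 75_000, 2_000_000):
        print(rows, candidate_export_methods(rows))
```

When several methods match, the requirements column (licenses, tool setup, a dedicated server) is usually what decides between them.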


Automation of analysis

| ID | Analysis approach | Export method | Scale | Requirements |
|----|-------------------|---------------|-------|--------------|
| A1 | Export into Excel for manual analysis* | E1, E2, E3 | 0 - 200,000 rows | Excel license |
| A2 | Export into Excel and use macros for analysis | E1, E2, E3 | 1,000 - 200,000 rows | Excel license; VBA expertise; helps to have a general sense of computational complexity to avoid performance concerns |
| A3 | Export into a CSV and use either a scripting language (Python, Ruby, Perl, etc.), a stats package (Stata, SPSS, SAS, R, MATLAB, etc.), or business intelligence software (Tableau, Google Fusion Tables) for analysis | E1, E2, E3 | 50,000 - 1,000,000+ rows | Analysis software or programming language installed; programming background |
| A4 | Export into a database and use database queries (SQL, etc.) for analysis | E4 | 50,000 - 1,000,000+ rows | |
| A5 | Export into a database and use a web service to dynamically query the database | E4 | 50,000 - 1,000,000+ rows | Database installed and configured; CommCare Data Export Tool set up; web service installed and running on servers; dedicated software engineer |

*Note: Depending on the complexity of the indicators being calculated, the analysis may go beyond what pivot tables can do, in which case this option is not viable regardless of the number of rows.
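To make approach A3 more concrete, here is a minimal sketch of analyzing a CSV export with a scripting language, using only Python's standard library. The column names (`case_id`, `owner_name`, `closed`) and the sample rows are hypothetical; a real export will have whatever columns were configured for it. The file is read one row at a time, so the whole export is never loaded into memory at once.

```python
import csv
import io
from collections import Counter


def count_rows_by_column(csv_file, column):
    """Stream a CSV export and count rows per distinct value in `column`."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in export")
    counts = Counter()
    for row in reader:  # rows are read one at a time, not all at once
        counts[row[column]] += 1
    return counts


# Hypothetical sample standing in for an exported CSV file.
SAMPLE_EXPORT = """case_id,owner_name,closed
c1,worker_a,False
c2,worker_a,True
c3,worker_b,False
"""

if __name__ == "__main__":
    counts = count_rows_by_column(io.StringIO(SAMPLE_EXPORT), "owner_name")
    for owner, n in counts.most_common():
        print(owner, n)
```

For a real export, the `io.StringIO` sample would be replaced by a file opened with `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.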