Research Topics

Publication Date: 01/16/2019
Type: Data Analysis

Self-Help Groups (SHGs) in Sub-Saharan Africa can be defined as mutual assistance organizations through which individuals undertake collective action to improve their own lives. “Collective action” implies that individuals share their time, labor, money, or other assets with the group. In a recent EPAR data analysis, we use three nationally representative survey tools to examine indicators related to the coverage and prevalence of SHG usage across six Sub-Saharan African countries. EPAR has developed Stata .do files that construct a set of SHG indicators using data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA), Financial Inclusion Insights (FII), and FinScope.

We compiled a set of summary statistics for the final indicators using data from the following survey instruments:

  • Ethiopia:
    • Ethiopia Socioeconomic Survey (ESS), Wave 3 (2015-16)
  • Kenya:
    • Kenya FinScope, Wave 4 (2015)
    • Kenya FII, Wave 4 (2016)
  • Nigeria:
    • Nigeria FII, Wave 4 (2016)
  • Rwanda:
    • Rwanda FII, Wave 4 (2016)
  • Tanzania:
    • Tanzania National Panel Survey (TNPS), Wave 4 (2014-15)
    • Tanzania FinScope, Wave 4 (2017)
    • Tanzania FII, Wave 4 (2016)
  • Uganda:
    • Uganda FinScope, Wave 3 (2013)
    • Uganda FII, Wave 4 (2016)

The raw survey data files are available for download free of charge from the World Bank LSMS-ISA website, the Financial Sector Deepening Trust website, and the Financial Inclusion Insights website. The .do files process the data and create final data sets at the household (LSMS-ISA) and individual (FII, FinScope) levels with labeled variables, which can be used to estimate summary statistics for the indicators.

All the instruments include nationally representative samples. All estimates from the LSMS-ISA are household-level cluster-weighted means, while all estimates from FII and FinScope are calculated as individual-level weighted means. The proportions in the Indicators Spreadsheet are therefore estimates of the true proportion of individuals/households in the national population during the year of the survey. EPAR also created a Tableau visualization of these summary statistics.
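As a rough illustration of what a weighted mean of a yes/no indicator involves, the following Python sketch computes a survey-weighted proportion. It is not EPAR's Stata code: the function name, the respondent values, and the weights are all hypothetical, and the sketch covers only the point estimate (clustering matters for standard errors, which are not computed here).

```python
def weighted_proportion(flags, weights):
    """Return the survey-weighted share of units whose flag is 1."""
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Each unit contributes its weight to the numerator only if its flag is 1.
    return sum(f * w for f, w in zip(flags, weights)) / total_weight


# Hypothetical respondents: 1 = uses mobile money, 0 = does not.
uses_mobile_money = [1, 0, 1, 1, 0]
# Hypothetical sampling weights (population represented by each respondent).
sampling_weights = [1200.0, 800.0, 1500.0, 500.0, 1000.0]

share = weighted_proportion(uses_mobile_money, sampling_weights)
unweighted_share = sum(uses_mobile_money) / len(uses_mobile_money)
print(f"Weighted proportion:   {share:.3f}")
print(f"Unweighted proportion: {unweighted_share:.3f}")
```

The gap between the two printed values shows why unweighted sample shares should not be read as national proportions when respondents represent different numbers of people.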

We have also prepared a document outlining the construction decisions for each indicator across survey instruments and countries. We attempted to follow the same construction approach across instruments, and note any situations where differences in the instruments made this impossible.

The spreadsheet includes estimates of the following indicators created in our code files:


  • Proportion of individuals who have access to a mobile phone
  • Proportion of individuals who have official identification
  • Proportion of individuals who are female
  • Proportion of individuals who use mobile money
  • Proportion of individuals who have a bank account
  • Proportion of individuals who live in a rural area
  • Individual Poverty Status
    • Two Lowest PPI Quintiles
    • Middle PPI Quintile
    • Two Highest PPI Quintiles

Coverage & Prevalence

  • Proportion of individuals who have interacted with an SHG
  • Proportion of individuals who have used an SHG for financial services
  • Proportion of individuals who depend most on SHGs for financial advice
  • Proportion of individuals who have received financial advice from an SHG
  • Proportion of households that have interacted with an SHG
  • Proportion of households in communities with at least one SHG
  • Proportion of households in communities with access to multiple farmer cooperative groups
  • Proportion of households that have used an SHG for financial services

In addition, we produced estimates for 29 indicators related to characteristics of SHG use, including indicators related to frequency of SHG use, characteristics of SHGs, and individual/household trust of SHGs.

EPAR Technical Report #354
Publication Date: 11/29/2018
Type: Research Brief

Precise agricultural statistics are necessary to track productivity and design sound agricultural policies. Yet, in settings where intercropping is prevalent, even crop yield can be challenging to measure. In a systematic survey of the literature on crop yield in low-income settings, we find that scholars specify how they estimate the yield denominator in under 10% of cases. Using household survey data from Tanzania, we consider four alternative methods of allocating land area on plots that contain multiple crops, and explore the implications of this measurement decision for analyses of maize and rice yield. We find that 64% of cultivated plots contain more than one crop, and average yield estimates vary with different methods of calculating area planted. This pattern is more pronounced for maize, which is more likely than rice to share a plot with other crops. The choice among area methods influences which of these two staple crops is found to be more calorie-productive per ha, as well as the extent to which fertilizer is expected to be profitable for maize production. Given that construction decisions can influence the results of analysis, we conclude that the literature would benefit from greater clarity regarding how yield is measured across studies.
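To make the denominator problem concrete, the Python sketch below compares two simple ways of assigning area to one crop on an intercropped plot. The two rules shown (full plot area, and an equal split among crops) are illustrative assumptions and are not necessarily among the four methods considered in the brief; the plot size, crop count, and harvest are hypothetical.

```python
def area_full_plot(plot_area_ha, n_crops):
    """Assumed rule A: credit every crop with the whole plot area."""
    return plot_area_ha


def area_equal_split(plot_area_ha, n_crops):
    """Assumed rule B: divide the plot area equally among its crops."""
    return plot_area_ha / n_crops


def yield_kg_per_ha(harvest_kg, area_ha):
    """Yield is harvest divided by the area attributed to the crop."""
    return harvest_kg / area_ha


# Hypothetical intercropped plot: maize grown alongside one other crop.
plot_area_ha = 2.0
n_crops = 2
maize_harvest_kg = 1200.0

yield_full = yield_kg_per_ha(
    maize_harvest_kg, area_full_plot(plot_area_ha, n_crops)
)
yield_split = yield_kg_per_ha(
    maize_harvest_kg, area_equal_split(plot_area_ha, n_crops)
)

print(f"Maize yield, full-plot denominator:   {yield_full:.0f} kg/ha")
print(f"Maize yield, equal-split denominator: {yield_split:.0f} kg/ha")
```

The same harvest yields different per-hectare figures depending only on the area rule, which is why an unstated denominator makes yield estimates hard to compare across studies.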

EPAR Technical Report #240
Publication Date: 07/28/2016
Type: Data Analysis

There is a wide gap between realized and potential yields for many crops in Sub-Saharan Africa (SSA). Experts identify poor soil quality as a primary constraint to increased agricultural productivity. Therefore, increasing agricultural productivity by improving soil quality is seen as a viable strategy to enhance food security. Yet adoption rates of programs focused on improving soil quality have generally been lower than expected. We explore a seldom considered factor that may limit farmers’ demand for improved soil quality, namely, whether farmers’ self-assessments of their soil quality match soil scientists’ assessments. In this paper, using Tanzania National Panel Survey (TZNPS) data, part of the Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA), we compare farmers’ own assessments of soil quality with scientific measurements of soil quality from the Harmonized World Soil Database (HWSD). We find a considerable “mismatch” and, most notably, that 11.5 percent of survey households that reported having “good” soil quality are measured by scientific standards to have severely constrained nutrient availability. Mismatches between scientific measurements and farmer assessments of soil quality may highlight a potential barrier for programs seeking to encourage farmers to adopt soil quality improvement activities.

EPAR Technical Report #331
Publication Date: 06/20/2016
Type: Data Analysis

Labor is one of the most productive assets for many rural households in developing countries. Despite the importance of labor, and of time use more generally, little research has empirically examined the quality of time-use data in household surveys. Many household surveys rely on respondent recall, the reliability of which may decrease as recall length increases. In addition, respondents often report on time allocation for the entire household, which they may not know or recall as clearly as their own time allocation. Finally, simultaneous activities, such as tending children while preparing dinner, may lead to the systematic underestimation of certain activities, particularly those that tend to be performed by women. This paper examines whether the identity of the survey respondent affects estimates of time allocation within the household. Drawing on the Ugandan LSMS-ISA household survey, we find that individuals responding for themselves report higher levels of time use over the previous week than when responding for other household members. Moreover, male respondents tend to underreport time allocation for females over the age of 15 as compared to female respondents, especially time spent on domestic activities. In addition, an analysis of the effects of two economic shocks (having a baby, and floods or droughts) suggests that the identity of the respondent can affect substantive conclusions about the effects of shocks on household time use.


EPAR Research Brief #242
Publication Date: 01/08/2014
Type: Data Analysis

The purpose of this analysis is to provide a measure of marketable surplus of maize in Tanzania. We proxy marketable surplus with national-level estimates of total maize sold, presumably the surplus for maize-producing and maize-consuming households. We also provide national-level estimates of total maize produced and estimate “average prices” for Tanzania, which allow this quantity to be expressed as an estimate of the value of marketable surplus. The analysis uses the Tanzania National Panel Survey (TNPS), a nationally representative panel survey that is part of the LSMS-ISA, for the years 2008/2009 and 2010/2011. A spreadsheet provides our estimates for different subsets of the sample, using different approaches to data cleaning and weighting. The total number of households in Tanzania was estimated by linear extrapolation from Tanzanian National Bureau of Statistics figures for 2002 and 2012. The weighted proportions of maize-producing and maize-selling households were multiplied by the national estimate of total households. The resulting estimates of total maize-selling and maize-producing households were then multiplied by the average amount sold and the average amount produced, respectively, to obtain national-level estimates of total maize sold and total maize produced in 2009 and 2011.
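The scaling-up steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not EPAR's actual code: the household counts, the selling share, and the average quantity sold are hypothetical placeholders, and `linear_estimate` is a name introduced here. Because 2009 falls between the two reference years, the sketch effectively interpolates along the straight line through the 2002 and 2012 figures.

```python
def linear_estimate(year0, value0, year1, value1, target_year):
    """Estimate a value for target_year on the line through two known points."""
    slope = (value1 - value0) / (year1 - year0)
    return value0 + slope * (target_year - year0)


# Hypothetical household counts for the two reference years.
households_2002 = 7_000_000
households_2012 = 9_000_000

# Step 1: estimate total households in the survey year.
households_2009 = linear_estimate(
    2002, households_2002, 2012, households_2012, 2009
)

# Step 2: scale by the weighted share of maize-selling households
# (hypothetical value standing in for the survey estimate).
share_selling = 0.25
selling_households_2009 = households_2009 * share_selling

# Step 3: multiply by the average quantity sold per selling household
# (hypothetical value, in kilograms).
avg_kg_sold = 400.0
total_maize_sold_kg = selling_households_2009 * avg_kg_sold

print(f"Estimated households in 2009:  {households_2009:,.0f}")
print(f"Estimated selling households:  {selling_households_2009:,.0f}")
print(f"Estimated total maize sold:    {total_maize_sold_kg:,.0f} kg")
```

The same three steps apply to total maize produced, substituting the share of maize-producing households and the average amount produced.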