Where Does Data Come From?

August 17, 2017

by Kelly Dale

I have started to think of a perfect data set as a beautiful cake. When the consumer eats the cake, she rarely thinks about the individual ingredients or the work that went into creating the final product. She doesn’t think about the quality of the flour or where the eggs came from. She doesn’t think about the amount of time that went into perfecting the recipe or the person who tediously iced the cake once it was finished baking. Like this consumer, I did not think much about my data when I first started using Stata. I just ate the cake. I did not think about how each variable was created, the quality of the surveys used to obtain the data, or the process used to determine the sample. Only late in the year did I learn what it took to “ice the cake”, or clean the data before it could be used for analysis.

I have spent the past 8 weeks learning how to bake a beautiful data cake.

Children play soccer in the street.
Playing soccer outside of the SWEDD office.

This summer, I am in Côte d’Ivoire with the World Bank Gender Innovation Lab working on a randomized impact evaluation for the Sahel Women’s Economic Empowerment and Demographic Dividend (SWEDD) project. We are about to launch the baseline assessment in 280 villages in the northern part of the country. We will survey over 5,000 adolescent girls on their aspirations, economic activities and access to finance, reproductive health status and knowledge, relationships and attitudes towards gender-based violence, and education. Creating a survey that will be able to fully and accurately capture the current situation and eventual impact of the program is just one ingredient in the cake. However, this one ingredient took many hours and many people to ensure that our questions are posed in a culturally relevant way, that what we ask cannot be misinterpreted, and that we ask enough questions to get a full picture without taking up too much of the girls’ time. We will test and refine this ingredient before baking the real cake, by piloting our questionnaires in the field and making any needed adjustments.

Another ingredient, and one that I have worked on extensively, is the sampling and targeting of program beneficiaries. This part of the cake is crucial for impact evaluations to ensure that any identified impact at follow-up is due to the program, and not other factors. I was given a long list of villages which contained information such as general location (region), population size, schooling rates, prevalence of early marriage and teen pregnancy, etc.; a separate list of villages and coordinates which only sometimes matched the original list; shapefiles for some of the big cities but not the whole country; and the number of villages to select, based on previous power calculations. With this information, I had to merge data, create new variables, learn how to use shapefiles, learn how to use R, map the villages, track down missing data at various government entities, select a sample based on a number of characteristics, and then redo this sample multiple times after conversations with government officials. After the sample was set, we had to randomize the schools and villages into various treatment and control groups. I got to learn about the importance of selecting the best stratifying variables and about maintaining balance along a number of characteristics. Finalizing this ingredient was tedious (it took weeks) but vital, as this sample is not only for the study, but also determines the locations of an entire component of the SWEDD project.
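For readers who have not seen stratified randomization before, here is a minimal sketch of the idea in Python. It is not the procedure we actually used for SWEDD: the village names, the three regions, the two arms, and the function `stratified_assign` are all hypothetical, and a real design would likely stratify on more than one variable. The point is only that shuffling happens inside each stratum, so every stratum ends up split about evenly across arms.

```python
import random
from collections import Counter, defaultdict


def stratified_assign(villages, arms, seed=2017):
    """Randomly assign villages to arms within each stratum (here, region).

    `villages` is a list of dicts with hypothetical keys "name" and "region".
    Returns a dict mapping village name -> arm.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced

    # Group village names by stratum.
    by_stratum = defaultdict(list)
    for village in villages:
        by_stratum[village["region"]].append(village["name"])

    assignment = {}
    for region in sorted(by_stratum):
        names = sorted(by_stratum[region])  # stable order before shuffling
        rng.shuffle(names)
        # Random starting arm, so any leftover village in an odd-sized
        # stratum does not always land in the first arm.
        offset = rng.randrange(len(arms))
        for i, name in enumerate(names):
            assignment[name] = arms[(i + offset) % len(arms)]
    return assignment


# Made-up example: 16 villages across three invented regions.
regions = ["north"] * 8 + ["northeast"] * 4 + ["northwest"] * 4
villages = [
    {"name": f"village_{i:02d}", "region": region}
    for i, region in enumerate(regions)
]
arms = ["control", "treatment"]

assignment = stratified_assign(villages, arms)

# Simple balance check: count villages per (region, arm) cell.
region_of = {v["name"]: v["region"] for v in villages}
balance = Counter((region_of[name], arm) for name, arm in assignment.items())
for cell in sorted(balance):
    print(cell, balance[cell])
```

Because the shuffle is done separately in each region, the arm counts within a region can differ by at most one, which is the kind of balance that choosing good stratifying variables is meant to protect.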

I have been able to learn about various other ingredients as well, such as contracting a survey firm, training enumerators, writing concept notes, and obtaining IRB approval. I have also come to realize how essential it is to pre-heat the oven while you prepare these ingredients. The oven, in this case, is the government: if you do not have their approval, the survey cannot happen. You simply cannot bake your cake. If you wait to heat the oven until the batter is finished, you will have delays which could jeopardize your timeline and the overall quality of your cake. However, including the government as partners and getting their input and approval along the way will prove incredibly useful once you are ready to start data collection. I have worked closely with our government counterparts, involved them in processes, and often deferred to their superior knowledge of the terrain. I have done presentations on impact evaluations and our RCT to ensure that they understand spillover effects and respect the selection of treatment and control villages. I have learned, though, that you cannot mitigate every hiccup, and ovens can sometimes be slow or even refuse to heat, so patience and persistence are vital.

Woman standing in front of projection screen talking into microphone.
Presenting impact evaluations to representatives from the Ministries of Health, Education, and Gender.

All cakes have a purpose. For us, the purpose is to measure the impact of safe spaces and accompanying measures in Côte d’Ivoire. SWEDD is a regional World Bank project, spanning 6 countries across the Sahel. It aims to reduce gender inequalities and accelerate the demographic transition by addressing both supply and demand constraints to family planning and reproductive and sexual health. In Côte d’Ivoire, the Government is implementing multiple women’s and girls’ empowerment initiatives, including safe spaces for adolescent girls. These safe spaces, which have emerged as an effective and cost-effective way to expand girls’ opportunities and empower them, are the subject of the first rigorous randomized impact evaluation in Côte d’Ivoire: the impact evaluation that I have been working on this summer. The safe spaces will be both school- and community-based and will cover topics such as health, gender equality, self-confidence, respect for oneself and others, interpersonal skills, emotional management, personal responsibility, conflict management, communication, cooperation and teamwork, creative thinking, critical thinking, and problem solving. In a subset of communities, the safe space intervention for adolescent girls will be accompanied by similar mentor-led group meetings for boys and men, and in others there will be livelihood support (support for income-generating activities). This impact evaluation aims to measure the heterogeneous effects of the different variations of the safe space interventions on adolescent girls’ and young women’s social and economic empowerment.

Men and women sit in rectangle on wooden benches in outdoor shelter.
Observing a “husband school”, which is the model for the SWEDD men’s and boys’ groups that will be part of the impact evaluation.

I have been incredibly inspired by my colleagues and the commitment from the Government of Côte d’Ivoire to empower women, spur the demographic transition, and ultimately take advantage of the demographic dividend. I have loved attending workshops where senior male government officials stress the importance of family planning, where the minister of education talks about not only getting girls to school, but keeping them there, and helping them learn, and where community members express an earnest desire to end forced and early marriage in their villages. The study that I have had the privilege of working on will help fill crucial knowledge gaps on what works for empowering women and delaying marriage and childbearing in the Sahel region. We just need to bake a really good cake…