By Shahana Bilalova
As researchers, we frequently find ourselves delving deep into the scientific literature to answer our questions. Recently, I conducted a case survey, a kind of meta-analysis that synthesizes qualitative case narratives from published literature. During this process, coping with missing data became a challenging yet insightful journey, and it led me to take a strategic approach.
With the aim of understanding how well water governance systems address various water-related issues, my goal was clear: to compile data from the empirical studies in our sample. I used cases that we had identified systematically at the beginning of our project. Each case came from one of two problem contexts (termed problématiques in our paper), namely “groundwater exploitation in agriculture” and “surface water pollution”. Although our previous studies had clustered cases into five problématiques, I selected these two because they accounted for a large share of the cases and represented diverse problem contexts.
My initial plan was straightforward: to extract data for each case from the original articles in our sample. However, it did not go as planned, and I ran into a problem familiar to many researchers: incomplete data. Only 12% of cases (10 out of 86) provided complete data. Before addressing the missing data, I chose 20 cases for each problématique, selecting those with the most complete data (40 being the minimum number necessary for qualitative comparative analysis, or QCA, in our study). Given the time and resources needed to fill in missing data, this step was necessary. I then looked into other ways of filling the gaps.
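For readers who keep their coding sheet in a script, the selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the field names (`problematique`, `v1`, `v2`), the use of `None` for a missing value, and both helper functions are hypothetical.

```python
from collections import defaultdict


def completeness(case, variables):
    """Return the share of variables that have a recorded (non-None) value."""
    filled = sum(1 for v in variables if case.get(v) is not None)
    return filled / len(variables)


def select_most_complete(cases, variables, per_group=20, group_key="problematique"):
    """Pick the `per_group` most complete cases within each problem context."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[group_key]].append(case)

    selected = {}
    for name, members in groups.items():
        # Rank by completeness, most complete first; ties keep input order.
        ranked = sorted(members, key=lambda c: completeness(c, variables), reverse=True)
        selected[name] = ranked[:per_group]
    return selected


# Toy data: hypothetical cases with two coded variables each.
cases = [
    {"id": "A", "problematique": "groundwater", "v1": 1, "v2": None},
    {"id": "B", "problematique": "groundwater", "v1": 1, "v2": 0},
    {"id": "C", "problematique": "surface", "v1": None, "v2": None},
    {"id": "D", "problematique": "surface", "v1": 1, "v2": None},
]
selection = select_most_complete(cases, ["v1", "v2"], per_group=1)
```

With `per_group=20`, the same call would return up to 20 cases for each problématique, which is how the 40 cases mentioned above would be assembled.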
(1) Tapping into expert insight: the first strategy
My first idea was to get in touch with the main authors of the articles, given their knowledge of the cases. This approach worked in some instances: I gathered data for another 10 of the 40 cases (33% of the missing information). Authors shared their expert knowledge through surveys, which helped fill in the gaps. Yet some authors did not respond at all, and where articles had been published decades ago, authors often struggled to recall details about the cases. Additionally, one of the cases for which we received an author survey still had one missing piece of information.
(2) The hunt for alternative sources: grey or academic literature
In response, I delved into the literature, both grey and academic, searching for alternative sources to fill in the remaining missing information. The goal was to identify further publications on the given cases that could help with the missing data. I included either literature published within ±5 years of the original article’s publication date or publications that studied governance within the timeframe of the original article. Although this step required close reading, it was often successful in finding the information I was looking for. As a result, I was able to fill in 69% of the missing information (completing information for 16 cases).
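The inclusion rule for alternative sources can be written down as a small check. The sketch below is one possible reading, not the exact rule applied in the study: it assumes that "within the timeframe of the original article" means the two study periods overlap, and the function name, its parameters, and the example years are hypothetical.

```python
def is_eligible(source_year, source_period, article_year, article_period, window=5):
    """Check whether an alternative source qualifies for filling gaps in a case.

    A source qualifies if it was published within `window` years of the
    original article, OR if the period it studies overlaps the period
    covered by the original article (an assumed interpretation).
    Periods are (start_year, end_year) tuples, or None if unknown.
    """
    within_window = abs(source_year - article_year) <= window

    overlaps = False
    if source_period is not None and article_period is not None:
        s_start, s_end = source_period
        a_start, a_end = article_period
        # Two closed intervals overlap when each starts before the other ends.
        overlaps = s_start <= a_end and a_start <= s_end

    return within_window or overlaps


# Hypothetical original article: published 2008, covering governance in 2000-2005.
close_in_time = is_eligible(2012, None, 2008, (2000, 2005))
same_period = is_eligible(2020, (2001, 2004), 2008, (2000, 2005))
too_late = is_eligible(2020, (2010, 2015), 2008, (2000, 2005))
```

Writing the rule out this way makes the screening decision for each candidate source easy to document and repeat.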
(3) The third strategy: replacement cases
However, all these steps still left some cases with missing data. As a last step, I took a pragmatic approach: I replaced the 5 remaining cases that still had incomplete data with others from our sample that shared the same problématique and had more complete data than the other cases in the sample. After replacing the cases, I filled in the missing information for these new cases using the alternative sources.
In conclusion…
Working with missing information in a case survey demands flexibility, creativity, and perseverance. Although reaching out to authors, looking into alternative sources, and using replacement cases aren’t guaranteed solutions, they are effective ways to get around the problems caused by gaps in the data. During this process, I have come to appreciate research’s dynamic nature, where adaptability is just as important as the data itself. Of course, the important thing is that we are transparent about how we handle these gaps. Even though handling missing data can be challenging at times, we should remember that each missing piece presents an opportunity to improve our problem-solving abilities.