In response to the article: A promising study on the long-term effects of deworming, by Jake Marcus (GiveWell)

Currently Giving What We Can supports two charities working on deworming: the well-established Schistosomiasis Control Initiative (SCI), which primarily targets Schistosomiasis, and the promising but less established Deworm the World (DtW), which aims to enable countries to implement deworming programs for Soil Transmitted Helminths.

In October 2014 GiveWell suggested that a new unpublished study by Croke (2014, 'The long run effects of early childhood deworming on literacy and numeracy: Evidence from Uganda') provided promising data on the long-term effects of deworming, indicating that it represented a major update in this field of research. The cluster randomised trial looked at pre-school children in parishes in Eastern Uganda. It compared literacy and numeracy test scores 7-8 years after mass deworming for soil-transmitted helminths (22%-66% of the children in the parishes) with the scores of a control population which had not undergone mass deworming during this time period. The author found improved test scores in those treated with deworming, and concluded that this strengthens the evidence for cognitive benefits of mass deworming in areas of high worm prevalence. Whilst appreciating the strengths of the study, as highlighted by Jake Marcus in GiveWell’s analysis, I examine some of its limitations here.

The Limitations of Croke’s 2014 study

Cluster vs. Individual Randomised controlled trials

One of the main criticisms of Croke’s study lies in the study design adopted: a cluster randomised trial. A cluster randomised trial is used when individual participants cannot be randomised, so groups are randomised instead. For instance, a cluster randomised trial might look at the effects of different teaching styles (A & B) on school classes. This is appropriate because school teachers are not individual tutors; a lesson is delivered to the whole class. Since the single intervention affects a large group and cannot be directed at specific individuals, the important outcome is the effect on the whole class.

The outcome being measured in the Croke study is collective improvement in cognitive ability, which may seem reasonable. However, the use of cluster randomisation is less intuitive here because children are dewormed individually. Randomising individuals would therefore be a better study design.

On the other hand, cluster randomisation could account for any spillover effect, that is, deworming having an additional benefit for children in close proximity to those dewormed. Individually randomised trials cannot take spillover effects into account, so they could underestimate the effect of deworming. Further, randomising individuals would have been substantially more expensive.

Nonetheless, because it cannot follow up participants directly as individuals and instead relies on population-based data, the study is more prone to confounding factors influencing the results. The propensity for confounding in cluster randomised trials arises from baseline differences and from the hidden effects of unmeasured changes to the groups during the study. For instance, the two groups may differ substantially in baseline characteristics which could affect the outcome. In the example above, if the children in class A are more able than those in class B, a finding that class A improved under a different teaching style is confounded by those children being more amenable to learning.
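The point about baseline differences can be illustrated with a small simulation. This is only a sketch under assumed numbers, not a reconstruction of Croke's data: the 48 parishes come from the study, but the 30 children per parish, the size of the parish-level and child-level variation, and the function names are all hypothetical. There is no treatment effect at all in the simulation, so any gap between the two arms is pure chance imbalance.

```python
import random
import statistics


def arm_difference(cluster_randomised, rng, n_clusters=48,
                   children_per_cluster=30, cluster_sd=0.5, child_sd=1.0):
    """Gap in mean baseline score between arms when there is NO treatment effect.

    All numeric defaults except n_clusters are hypothetical.
    """
    # Each parish (cluster) has its own shared baseline shift.
    children = []
    for cluster in range(n_clusters):
        shift = rng.gauss(0.0, cluster_sd)
        for _ in range(children_per_cluster):
            children.append((cluster, shift + rng.gauss(0.0, child_sd)))

    if cluster_randomised:
        # Whole parishes are assigned to an arm.
        treated_clusters = set(rng.sample(range(n_clusters), n_clusters // 2))
        treated = [score for c, score in children if c in treated_clusters]
        control = [score for c, score in children if c not in treated_clusters]
    else:
        # Children are assigned to an arm one by one, ignoring parish.
        shuffled = children[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        treated = [score for _, score in shuffled[:half]]
        control = [score for _, score in shuffled[half:]]
    return statistics.mean(treated) - statistics.mean(control)


def imbalance_spread(cluster_randomised, n_trials=300, seed=0):
    """Standard deviation of the chance gap between arms over many simulated trials."""
    rng = random.Random(seed)
    gaps = [arm_difference(cluster_randomised, rng) for _ in range(n_trials)]
    return statistics.stdev(gaps)


cluster_spread = imbalance_spread(cluster_randomised=True)
individual_spread = imbalance_spread(cluster_randomised=False)
print(f"cluster randomisation:    {cluster_spread:.3f}")
print(f"individual randomisation: {individual_spread:.3f}")
```

Under these assumptions the chance gap between arms is typically larger when whole parishes are randomised, because only 48 units are being shuffled rather than every child. How large the difference is depends entirely on how much parishes differ from one another, which is an assumption here rather than something taken from the study.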

Croke tries to control for potential confounders by showing that there is no difference in socioeconomic status between the groups. But there is a flaw in this method. The socioeconomic data (which was collected independently of the study and only includes 22 of the 48 parishes in the original study) does not apply to the original sample (intended for deworming or control) at the beginning of the study. So it only shows that socioeconomic differences were not present at the end of the study (2010), when the cognitive ability tests were performed. Whilst it is plausible that there were no socioeconomic differences at the beginning of the study, this has not been directly shown in the paper.

Robustness of data

When considering the data, Croke acknowledges the limits of cluster randomisation and so applies further analysis to show that his data is “robust” to obvious potential confounding factors. Croke attempts to show “robustness” by analysing the data with certain factors controlled, such as restricted water access, along with other measures. This involves checking whether the results still stand once a potential confounding variable has been standardised between the two groups.

For instance, if one were comparing death rates between two places in a country, one might find that the rate in place A is much greater than in place B. However, this could be because of a confounding factor: the majority of people living in A are much older than those in B. One can therefore standardise the data, comparing people in the same age categories, to test whether the result still stands once the difference in average age is controlled for in post hoc analysis.
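The death-rate example can be worked through as direct age standardisation. The sketch below uses entirely made-up populations and death counts for the two places, and the function names are illustrative only. The age-specific death rates are deliberately identical in A and B, so any difference in the crude rates comes from age structure alone.

```python
# Hypothetical figures: (population, deaths) per age band in each place.
place_a = {"0-39": (2_000, 4), "40-69": (3_000, 30), "70+": (5_000, 250)}
place_b = {"0-39": (6_000, 12), "40-69": (3_000, 30), "70+": (1_000, 50)}


def crude_rate(place):
    """Deaths per 1,000 people, ignoring age structure."""
    population = sum(pop for pop, _ in place.values())
    deaths = sum(d for _, d in place.values())
    return 1_000 * deaths / population


def standardised_rate(place, standard_population):
    """Deaths per 1,000 if the place had the standard population's age structure."""
    total = sum(standard_population.values())
    expected_deaths = sum(
        (deaths / pop) * standard_population[band]
        for band, (pop, deaths) in place.items()
    )
    return 1_000 * expected_deaths / total


# Use the combined population of both places as the (assumed) standard.
standard = {band: place_a[band][0] + place_b[band][0] for band in place_a}

print("crude A:", crude_rate(place_a), "crude B:", crude_rate(place_b))
print("standardised A:", standardised_rate(place_a, standard),
      "standardised B:", standardised_rate(place_b, standard))
```

With these made-up numbers the crude rates are about 28.4 and 9.2 deaths per 1,000, yet the standardised rates coincide, so the apparent difference is explained entirely by age. Standardisation of this kind can only correct for variables that were actually measured.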

In Croke’s study, one way in which robustness of data is illustrated is by showing that the youngest and oldest children in the test classes have smaller differences in test scores than the intermediate age groups, as they would not have been in the population eligible for deworming 7 years earlier. Whilst it is intuitive that those at the extremes of age would have received fewer rounds of deworming, and so may show smaller differences in cognitive ability, this has not actually been evidenced. The graph provided shows no confidence intervals and does not indicate any statistics used to compare the subgroups, so it cannot be assessed appropriately. Even if such statistics had been used, the study was not powered to detect differences in this subgroup population. In addition, standardising the data cannot account for confounders that are not tested for and may be less obvious.

Other limitations

Further to these limitations, the study assumes that the populations surveyed during deworming and 7-8 years later are comparable, the latter containing the population that was originally dewormed. However, the 2002 Census revealed that a total of 3.1 million out of 23 million persons (13 percent) were enumerated outside their district of birth and hence classified as internal migrants [3]. This suggests that the population sampled in the parishes randomised for deworming or control may not correspond to the populations from which the cognitive function data were collected.

Another issue is that the endpoints for the study (differences in standard error of Maths and English tests) are limited measures of cognitive ability. Further, no reasonable explanation is offered as to why deworming would affect Maths scores more than English scores, as the results indicate. It is also unclear what these differences actually correspond to in absolute terms, on a practical level, for the populations analysed. In addition, other studies have shown increased school attendance, which this study fails to show. The reason for this inconsistency is unclear.

Finally, a crucial gap in the analysis is the absence of evidence on combination deworming: there is no data on whether Schistosomiasis deworming occurred in the interim between Soil Transmitted Helminth deworming and cognitive testing of the populations, or on what other public health interventions occurred.

In conclusion, the study has significant limitations. But this should be seen in the context of the wider difficulties of researching the effects of mass deworming. Cognitive benefits are difficult to measure, especially over significant periods of time. This is made even more difficult when looking at benefits to groups (such as spillover effects), as studying group data introduces various confounding factors. Functional outcomes, and the ability of improved cognitive abilities to dramatically affect the prospects of a child in this context, are difficult to demonstrate, as many hard-to-measure factors contribute. These difficulties explain why controversy in the field of deworming is rife. Because its data is subject to confounders, this study may not be as “promising” as we hoped. Further blog posts regarding how we might re-evaluate the evidence will be posted soon.