Background: Even a well-designed randomized controlled trial (RCT) can produce ambiguous results. This paper highlights a case in which full-sample results from a large-scale RCT in the United Kingdom (UK) differ from results for a sub-sample of survey respondents. Objectives: Our objective is to ascertain the source of the discrepancy in inferences across data sources and, in doing so, to highlight important threats to the reliability of the causal conclusions derived from even the strongest research designs. Research design: The study analyzes administrative data to shed light on the source of the differences between the estimates. We explore the extent to which heterogeneous treatment impacts and survey non-response might explain these differences. We suggest checks that assess the external validity of survey-measured impacts, which in turn provide an opportunity to test the effectiveness of different weighting schemes in removing bias. Subjects: The subjects were 6,787 individuals who participated in a large-scale social policy experiment. Results: Our results are not definitive but suggest that non-response bias is the main source of the inconsistent findings. Conclusions: The results caution against overconfidence in drawing conclusions from RCTs and highlight the need for great care in data collection and analysis. In particular, given the modest size of impacts expected in most RCTs, small discrepancies between data sources can alter the results. Survey data remain important as a source of information on outcomes not recorded in administrative data. However, linking survey and administrative data is strongly recommended whenever possible.