Confounder Parsing for Text Matching
Abstract
In observational studies for policy evaluation, matching is used in service of causal
inference to simulate randomization and thus reduce selection bias that might occur
when treatment assignment differs systematically. This is done by balancing the
distribution of confounding covariates measured before treatments. Matching on
numerical covariates has been done for decades. In recent years, matching on textual covariates has gained popularity. By matching on text data, one can potentially
observe confounding information that cannot be observed in tabular data. Furthermore, when combined with numerical data, matching on text data can potentially
improve the balance of numerical covariates. However, confounder parsing, defined
as the process of removing treatment text from documents to only end up with confounding text, is nontrivial in policy evaluation. This is because policy documents
come in the form of PDFs and typically vary a lot in terms of quality and layout.
There are many different ways in which one could approach confounder parsing and
each approach comes with its own trade-offs. We have investigated whether different
confounder parsing methods influence covariate balance differently. We applied our
methodology to labor issue policies of the International Monetary Fund and measured the impact of these policies on population health. To ensure the relevancy of
our inquiry, we also investigated whether text matching improves covariate balance
on numerical covariates. We find that the covariate balance of our text matching
procedures is relatively unchanged by the different confounder parsing methods.
Moreover, text matching within propensity score calipers improves the covariate
balance, compared to merely using propensity score matching or matching on text
covariates alone. Our results demonstrate that text matching can be valuable in
establishing causal inferences in the domain of policy evaluation. In addition, our
results suggest that the flexibility regarding which confounder parsing method
researchers can choose among increases
Degree
Student essay
Date
2021-07-06
Author
Reichl, Jannes
Rönkkö, Johan
Keywords
Text matching
confounder parsing
causal inference
covariate balance
International Monetary Fund
policy evaluation
Language
eng