I Am More Significant Than a P-value
“The way we work in public health is, we make the best recommendations and decisions based on the best available data.” —Thomas R. Frieden, Former Director of the US Centers for Disease Control and Prevention
Data, both qualitative and quantitative, influences policy decisions surrounding population health (especially minority health). As public health researchers, we rely on evidence-based interventions to reduce the burden of disease in our communities. In order to classify an intervention as “evidence-based,” the data usually have to produce a statistically significant outcome or effect. Epidemiology, oftentimes considered the science of public health, teaches you that you can form conclusions based on arbitrary measurements, such as the great p-value. In epidemiology, a p-value is defined as “the probability of obtaining a result at least as extreme as the one that was actually observed in the biological or clinical experiment or epidemiological study, given that the null hypothesis is true.” (In practice, this gets collapsed into a simple decision rule: reject the null hypothesis if your p-value is less than 0.05.)
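To make that definition concrete, here is a minimal simulation sketch. The numbers are invented for illustration only: suppose we observe 60 cases out of 100 people, and the null hypothesis says the true rate is 50%. The p-value is just the probability, under that null, of seeing a result at least as extreme as ours.

```python
import random

random.seed(42)

# Hypothetical example: 60 cases observed in a group of 100, while the null
# hypothesis claims the true rate is 0.5. (These numbers are made up.)
observed = 60
n = 100
null_rate = 0.5
trials = 20_000

# Simulate the world where the null hypothesis is true, many times over,
# and count how often the simulated result is at least as extreme as the
# one we actually observed (one-sided, for simplicity).
extreme = sum(
    1 for _ in range(trials)
    if sum(random.random() < null_rate for _ in range(n)) >= observed
)
p_value = extreme / trials
print(f"simulated one-sided p-value: {p_value:.3f}")
```

Note what this does and does not tell you: it is a statement about data under an assumed null, not about the lived reality of the people in the sample.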
P-values are among the most misused and misinterpreted measures in health research. Yet the p-value is also one of the “easier” statistical tools to lean on if you want to get your work published or recognized. In the United States, we tend to rely more heavily on what a number can tell us than on the true lived experiences of the people we’re sampling. As implicit as scientific racism can be, we must be aware of our research foundations. When forming conclusions about minority health in particular, researchers tend to undersample the minority population or lack appropriate measurement tools for that sample population.
So why is this important? Let’s use as an example the risk of postpartum depression (PPD) between racial groups. Our null hypothesis is that there is no difference in the risk of postpartum depression between White women and Black women. All things are considered equal in our assumption, and our sample shares similar demographics, geographical location, etc. If I’m interested in observing the association between race and PPD, having a p-value under 0.05 (5%) does not direct me to the “true” relationship between PPD and race if my sample only includes 50 Black women and 200 White women. However, I can now convince you that being a Black woman does not have a significant effect on the development of postpartum depression and that this is the reality of Black women in the United States (generalizing to the population). Yet I haven’t incorporated any social or environmental factors that can influence the risk of developing postpartum depression, or how these factors influence the data. Race is a social construct, not a biological one. Therefore, a person’s health can be (and is) influenced by how that person interacts with their social environment. But remember, I’m telling you that statistics showed there is no difference in risk. When we test our hypothesis using the p-value as our method of measurement, we have to be conscious of the conditions surrounding our study design, analysis, inclusion/exclusion criteria, and so on.
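The sample-size problem above can be sketched numerically. The rates below are invented purely for illustration (they are not real PPD prevalence figures): with 50 women in one group and 200 in the other, a real difference in observed rates can still fail to clear p &lt; 0.05, while the exact same rates in a ten-times-larger sample would be declared “significant.”

```python
import math

def two_prop_pvalue(x1, n1, x2, n2):
    """Two-sided p-value for a standard two-proportion z-test (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical observed rates: 9/50 (18%) vs. 24/200 (12%).
small = two_prop_pvalue(9, 50, 24, 200)
# The identical rates, with ten times the sample size:
large = two_prop_pvalue(90, 500, 240, 2000)
print(f"n=50 vs n=200:    p = {small:.3f}")   # fails to reach 0.05
print(f"n=500 vs n=2000:  p = {large:.4f}")   # crosses the 0.05 threshold
```

The observed difference is the same in both comparisons; only the sample size changed. A “non-significant” result from an undersampled minority group says more about the study design than about the population.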
With more public health work centering health disparities, we’re seeing a surge of ways to change the narrative of minority health. Dr. Thomas LaVeist, an expert health disparities researcher, gave a lecture on how to incorporate a cultural lens into public health research. Researchers oftentimes exclaim that their data is representative, when in actuality it undersamples the minority comparison population. Yet this supposedly inclusive dataset produces publications that in turn create ineffective interventions. This is problematic because we are not taking into account the lived experiences of our study participants. Black women living in Detroit, Michigan are different from Black women living in Los Angeles, California.
There are different social and environmental factors affecting those cities that in turn shape those women’s lived experiences and produce different outcomes. So when research suggests the use of a new evidence-based intervention based on a study conducted on Black women in Denver, Colorado (one that produced statistically significant results!), it feels like an injustice to other communities—communities that really need interventions tailored to their current environment. In research, the idea is that scholarly publications make the world go round. However, because the p-value is so easy to reach for, we develop a bunch of interventions that don’t actually reduce negative health outcomes in minority communities.
So what’s the point of this piece? I am by no means a statistician; I barely consider myself an epidemiologist. I love doing research (I consider myself a research geek), but I want to encourage researchers everywhere to think more critically about how data is presented and the effects it can have on the community. Public health researchers have to dive deep into the questions: “Who is in your sample?” and “What does this really mean?” My p-value “proves” that this is the causal association, but what now? We as researchers should challenge ourselves and ask, “Can I really publish this paper based on XX number of people in my comparison group, when there are actually XX number of (insert minority group) in the country?”