The use of race and ethnicity in sickle cell disease research

Abstract

This study explores practices surrounding the operationalization of ethno-racial categories (ERCs) as confounders in biomedical research, with a focus on sickle cell disease (SCD) as a model. ERCs are often aggregate labels encompassing diverse individuals, which raises questions about their relevance as confounders. Given SCD’s racialization as a “Black” disease, understanding ERC utilization is crucial. This study analyzed 1,105 SCD studies published globally. Data were collected on whether ERC adjustment was employed, regional variations in ERC-adjustment rates, labels used for ERCs, rationales provided for ERC matching, and methods used for ERC determination. 28% of the studies utilized ERC adjustment, with significant regional disparities (p < 0.001). Notably, Western studies showed higher rates of ERC adjustment compared to other regions. However, crucial details such as ERC labels and methodology were frequently missing. Commonly used labels included “African” or “Black.” Only 7% of studies provided explicit rationales for ERC matching, and 70% did not specify the method used for ERC determination. The findings underscore the need to adhere to guidelines on ERC operationalization in biomedicine. The lack of standardized practices raises concerns about potential biases and misinterpretations in research outcomes. Adhering to clear guidelines can mitigate the risk of perpetuating racial stereotypes and inequalities while ensuring research integrity.

Clinical trial number

Not applicable.

Key terms:

ERC

Ethno-racial category

SCD

Sickle Cell Disease

One-country studies

Studies conducted within a single country, with a sample of the population based in that country

Cross-national (comparative) studies

Studies using study populations from different countries

Operationalization of race and ethnicity as confounders

The way very contextualized concepts such as race and ethnicity are being put to use as confounders (i.e., controlled for in the analysis)

Introduction

Race and ethnicity are demographic categories often used in biomedical research. Especially in areas that have been referred to as “the Western world,” they serve as increasingly important axes along which differences in health outcomes are stratified [1]. Biological, socioeconomic, and other environmental exposures play a role in creating and perpetuating (racial) health disparities. Ethno-racial categories (ERCs) are contingent upon time, place, and sociopolitical context. Therefore, unlike other demographic categories such as age and sex assigned at birth, ERCs have proved hard to harmonize internationally [2]. For example, the label “Black” in the UK versus South Africa likely includes individuals or groups with distinct genetic backgrounds, varying access to resources, and exposure to different environmental variables that influence their health. In highly admixed populations, such as in Brazil, ERCs operationalized as biological proxies translate poorly to genetic ancestry [3]. However, biomedical research studies appear to give the contextual nature of ERCs little consideration [4, 5].

Reporting on race- and ethnicity-based differences without context can lead to unintended social and biological reification of these population descriptors. Since the rise of the Black Lives Matter (BLM) movement and the health disparities accentuated by the COVID-19 pandemic, medical journals are increasingly discussing this topic [6, 7]. A recent systematic analysis of UpToDate® articles demonstrated the biologicalization of race in 93·3% of all documents [8]. “Black race” was assumed to correlate with genetics or clinical phenotype, discarding race as a social determinant of health. Aside from being used as demographic descriptors, ERCs are also sometimes considered as confounders and therefore included in the analysis of a clinical study.

Whether an ERC is a confounder depends on the research question [8, 9]. In particular, when considering an ERC as a confounder instead of a mere demographic descriptor, providing context to clarify its association with the study outcome becomes crucial. Several systematic reviews of the use of ERCs in major epidemiological journals reported that 29% of 329 studies that used ERCs and were published between 1995 and 2018 provided a rationale for their use. ERCs used as analytical variables also enter worldwide clinical practice as correction factors in clinical race-adjusted algorithms [5]. A well-known example is the “Black” race correction factor in estimated glomerular filtration rate (eGFR), which has been used in clinical practice worldwide [7]. However, the validity of this eGFR formula, which was originally derived from a study involving Black American participants, becomes more questionable when applied to individuals not identified as African or Black American. These reservations have recently led to the publication of recommendations that suggested removing Black race as a factor in the eGFR formula, and the most recent eGFR formula is indeed “race-free” [10,11,12,13].

Until recently, the decision of whether and how to report on race and ethnicity in biomedical literature was at the discretion of the authors. However, publishers have put forward recommendations advising authors to exercise care and consideration in reporting race and ethnicity. Without offering a standard approach, these guidelines underline the importance of describing categorization methodology and interpreting race- and ethnicity-related study results [14,15,16]. Furthermore, the National Academies of Sciences, Engineering, and Medicine (NASEM) published guidelines on the operationalization of ethnicity and race in biomedical research. According to these guidelines, researchers should justify their use of ERCs, remain transparent, and critically evaluate their approach. Key recommendations include disaggregating data, identifying confounding factors often conflated with ERCs, and adjusting study designs accordingly [17].

Sickle cell disease (SCD) is a historically racialized disease in Western countries. SCD is a hereditary hemoglobinopathy characterized by the formation of dysfunctional erythrocytes [18]. The two hallmarks of SCD are increased hemolysis, which results in chronic anemia, and vaso-occlusion, which results in painful episodes and multisystem organ damage [18]. SCD patients experience an average life expectancy of 54 years in high-resource countries, as well as life-long disabilities [19]. Approximately 90% of the SCD patient population lives in three countries: Nigeria, India, and the Democratic Republic of Congo [20,21,22]. The prevalence of SCD is mostly limited to malaria-endemic regions or the diaspora of these areas, since carriership of the HbS gene confers a survival advantage when infected with Plasmodium falciparum malaria. In the United States and Europe, it is estimated that SCD prevalence is three per 10,000 individuals and one per 10,000, respectively. In the Western context, SCD researchers are faced with a case group racialized as “non-white” [25]. These SCD patients often have a migration background from Sub-Saharan Africa or are descendants of victims of the transatlantic slave trade. Outside Europe and the USA, the significance and relevance of ERCs may differ, as the geographical survival advantage of the HbS gene tends to include various ethnic groups living in malaria-endemic areas [23, 24]. This context could dilute the relevance of including specific ERCs in SCD research from these countries.

Historically, individuals with SCD have faced structural discrimination across multiple domains. In science and science policy, insufficient research funding has hindered progress, delaying the development of novel treatments [26]. In American clinical care, there is a considerable shortage of comprehensive care centers for SCD, which would have the capacity to drastically improve care outcomes by providing holistic care [27]. Furthermore, stigmatization and scrutiny from healthcare professionals are widespread problems, for example around opioid use during pain crises [28, 29]. This is compounded by the fact that SCD symptoms and associated struggles are often not visible to others [30]. Beyond science and healthcare, SCD patients experience interpersonal discrimination based on race and disability, leading to systemic inequities [31]. These include educational disadvantages, employment discrimination, and gaps in insurance coverage, which collectively restrict access to high-quality healthcare on an individual level [29, 32,33,34,35].

All in all, individuals with SCD face significant marginalization, and science must not exacerbate this. The process of essentialization, which reduces complex identities to fixed traits, underpins and reinforces discrimination and is a risk when using race and ethnicity uncritically. Using race-adjusted kidney function estimators, for instance, might overestimate kidney function in individuals racialized as Black who are affected by SCD nephropathy. This might delay referrals to specialist care or consideration for kidney transplantation [7, 36]. Researchers must approach the use of ERCs carefully to prevent reinforcing biases and inequities.

SCD’s high global prevalence spans diverse ethnicities and races. Furthermore, as a multisystem disorder, its impact has been studied across various clinical specialties. These characteristics make SCD an exemplary case for investigating the practice of ERC operationalization as a confounder in biomedical research from a global perspective. It is unknown whether the diversity of ERCs within the SCD patient population is accounted for in biomedical research.

In this study, we analyze patterns in the use of ERCs as confounders (i.e., ERCs that were controlled for in the analysis) in SCD research. Furthermore, possible influences surrounding confounder adjustment of ERCs are explored. We set out to determine the prevalence of ERC-confounder adjustment and its correlation with the adjustment for other covariates. Furthermore, we explore if and how ERCs are contextualized for use as covariates in SCD research publications. As we stand at an important juncture in reporting racial and ethnic categories in the field of biomedicine, this retrospective analysis serves as a critical baseline measurement for evaluating race and ethnicity as categorical constructs in this field.

Methods

Search strategy and screening process

We systematically searched for original, peer-reviewed publications in Embase (via Ovid) and MEDLINE (via PubMed) published between January 1, 2011 and November 8, 2022. The search strategy was created in collaboration with information specialists. The following keywords were used for this search: “sickle cell,” together with descriptions of (specific parts of) study designs, such as “cohort analysis” and “control group” (Supplementary Table 1). This search yielded 5,033 results after the removal of duplicates. Inclusion criteria were original research in the English language and a comparison of cases with controls, where cases were individuals diagnosed with SCD and controls were individuals without SCD. We focused on case-control studies to isolate instances where authors had the opportunity to make deliberate choices regarding ERC adjustment. Records were excluded if they were letters, abstracts, or brief reports. Exclusions were independently screened by two researchers. Any articles where there was uncertainty about inclusion or exclusion were reviewed and discussed by the research team. This resulted in 1,105 articles, which were used for data extraction (Fig. 1). The majority of the selected studies were one-country studies (n = 1,085), i.e., studies conducted within a single country, with the study population originating from that country. For this analysis, more complicated contexts, namely cross-national studies (n = 20), studies including Sub-Saharan African populations, and publications that mention multiracial or multi-ethnic individuals, were described separately.

Fig. 1

PRISMA™ flow diagram of the selection process for the quantitative and qualitative literature analysis of SCD research. Exclusion criteria were articles published before January 1, 2011 or after November 8, 2022, non-English manuscripts, designs other than case-control, and letters or brief reports. Figure created with BioRender

Data extraction

The following outcome data were extracted: year of publication, country, whether confounder adjustment took place based on age, gender or sex, socioeconomic status, ERCs or other categories, whether an explanation was provided for ERC-confounder adjustment, labels used for ERCs, methods used to determine the ERCs of study participants and whether participants with a mixed racial or ethnic background were annotated. The citation rate and CiteScore percentile per article were extracted from Scopus on the 2nd of March in 2023.

Global regions

The one-country studies were categorized into global regions. We separated high-resource world regions, such as Europe and North America, from others because of the distinct challenges faced by racial and ethnic communities in these settings (“Western contexts”). The Caribbean was considered a separate region because of the self-identification of SCD patients as Caribbean. The South Asian region only represented SCD studies with an Indian study population. There were no eligible studies from other world regions such as East Asia and Oceania. For an overview of the grouping of countries of origin of the various studies under specific geographical regions, see Supplementary Table 2. Supplementary Table 3 provides an overview of the number of studies per country of origin.

Data analysis

Prior to analysis, data preprocessing and cleaning steps were performed. Associations between the manuscript being published in a Q1 journal (yes/no), (non-)demographic covariates, and the country of origin of the manuscript were analyzed using chi-square tests, or the Cochran-Armitage trend test for ordinal variables. Binomial generalized linear mixed models with a logit link function (R package lme4, version 1.1.32) were used to examine factors associated with ERC adjustment (yes/no). Fixed effects were the presence of adjustment for socioeconomic status (SES), adjustment for other potential confounders, and the geographical region from which the study participants were recruited. The journal was entered as a random effect. Analyses were performed in R (version 4.2.3), and the package ggplot2 was used for visualization purposes [37].
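To illustrate the trend test named above, the following is a minimal sketch of a two-sided Cochran-Armitage test using the normal approximation. It is written in Python rather than the R used by the authors, it is not the authors' code, and the yearly counts are hypothetical values chosen only to show the mechanics. The function name `cochran_armitage_trend` and the equally spaced default scores are assumptions made for this illustration.

```python
import math


def cochran_armitage_trend(successes, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions.

    successes[i] out of totals[i] units in ordered group i show the outcome.
    Returns (z, p_value) based on the normal approximation.
    """
    if scores is None:
        scores = list(range(len(totals)))  # equally spaced scores 0, 1, 2, ...
    n_total = sum(totals)
    p_bar = sum(successes) / n_total  # pooled proportion under the null
    # Score-weighted deviation of observed from expected successes
    t_stat = sum(s * (x - n * p_bar)
                 for s, x, n in zip(scores, successes, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# HYPOTHETICAL counts for three consecutive years (not taken from the study):
# number of ERC-adjusted studies and number of studies published per year.
adjusted = [10, 20, 30]
published = [100, 100, 100]
z, p = cochran_armitage_trend(adjusted, published)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With real data, `adjusted` and `published` would hold the per-year counts underlying Fig. 2; a flat series of proportions gives z close to zero and a p-value close to one.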

Results

Characteristics of reviewed literature

Among the 1,105 included articles, 1,085 were single-country studies, and 20 were cross-national studies. The countries from which the study populations were sourced are presented in the Supplementary material. The dataset and the analyses can be accessed on GitHub via the following link: https://github.com/AidaasinAyda/SCD_conf_correction.

Prevalence of ERC-confounder adjustment

27% (298/1,085) of one-country studies adjusted for ERCs, with no significant changes during the period 2011–2022 (Cochran-Armitage Trend Test, p = 0·53) (Fig. 2). We also did not find differences in the frequency of ERC-confounder adjustment before and after the instigation of the BLM movement in 2013 (chi square p = 0·25) or surrounding the increased awareness in health disparities during the COVID-19 pandemic in 2020 (chi square p = 0·79).

Fig. 2

Percentage of studies adjusted for ethno-racial categories (ERCs) per year. Percentage of single-country studies that adjusted for ethno-racial categories out of the total number of studies per year

Studies with a North American study population adjusted for ERCs in 57% (175/302) of the articles, and studies with a European study population did so in 45% (58/129). In contrast, ERC-confounder adjustment was performed in 23% (7/30) of studies with Caribbean, 11% (5/40) with South Asian, 16% (23/144) with South American, 8% (16/196) with Middle Eastern and North African, and 6% (14/239) with Sub-Saharan African study populations (Table 1; Fig. 3). We found a significant association between global region and ERC-confounder adjustment (chi-square test p < 0·0001). The odds ratio (OR) for ERC-confounder adjustment in papers from a Western country (Europe and North America), compared with papers from non-Western regions (all other regions), was 10·66 (95% confidence interval [CI] 7·75 to 14·66, p < 0·0001).
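The Western versus non-Western odds ratio can be approximated from the counts given in the text. The sketch below is in Python, is not the authors' code, and rests on two assumptions: the non-Western counts are obtained by subtracting the North American and European counts from the overall totals (298 adjusted out of 1,085 one-country studies), and the interval is a Wald interval on the log odds ratio, which the text does not state explicitly. The function name `odds_ratio_wald` is introduced for this illustration only.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a = group 1 with outcome,  b = group 1 without outcome,
    c = group 2 with outcome,  d = group 2 without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts reported in the text
western_adjusted = 175 + 58   # North America + Europe, ERC-adjusted
western_total = 302 + 129     # North America + Europe, all studies
all_adjusted = 298            # ERC-adjusted one-country studies
all_total = 1085              # all one-country studies

# ASSUMPTION: non-Western counts derived by subtraction from the totals
other_adjusted = all_adjusted - western_adjusted
other_total = all_total - western_total

or_west, ci_low, ci_high = odds_ratio_wald(
    western_adjusted,
    western_total - western_adjusted,
    other_adjusted,
    other_total - other_adjusted,
)
print(f"OR = {or_west:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the result lands close to the reported estimate of 10·66 (95% CI 7·75 to 14·66). An unadjusted odds ratio of this kind ignores clustering by journal, which the mixed-effects model described in the Methods accounts for.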

Fig. 3

Percentage of studies adjusted for ethno-racial category (ERC) per geographical region. Each country indicates the recruitment country. White represents 0% of studies adjusted for ethno-racial category (ERC), while red represents 100%. Countries shown in grey had no included studies

Publication in a Q1 journal (by CiteScore) was associated with ERC-confounder adjustment. Of the 1,085 articles, 1,017 were ranked in Scopus with a CiteScore, and 451 of these were published in a Q1 journal. The OR for publication in a Q1 journal was 2·96 (95% CI 2·25–3·90, chi-square test p < 0·001) for papers that performed ERC-confounder adjustment compared with those that did not.

ERC covariate adjustment compared to other covariates

12% of the one-country studies (127/1,085) adjusted for multiple covariates other than demographic variables. These included covariates such as height, BMI, smoking habit, and parity. Confounder adjustment for these non-demographic variables was less frequent than ERC adjustment (chi-square p < 0·001).

We compared ERC-confounder adjustment relative to other demographic confounders: age, gender/sex, and socioeconomic status. Globally, 65% of papers on single-country studies adjusted for age (605/1,085), 41% for gender or sex (446/1,085) and 5% for socioeconomic status (SES) (59/1,085). In Europe, ERCs were the most frequently used characteristic for confounder adjustment and in North America second most frequent, after age-adjustment. In Sub-Saharan Africa, ERC was numerically less frequently used as a confounder than SES (chi square p = 0·01) (Fig. 4 and Supplementary Table 4).

Fig. 4

Categories of covariate adjustment across geographical regions. Regions refer to recruitment regions. NA = North America, EUR = Europe, SA = South America, MENA = Middle East North Africa, SSA = Sub-Saharan Africa, CAR = Caribbean, SAI = South Asia, SES = Socioeconomic Status, ERC = ethno-racial categories, Other covariates = e.g., height, BMI, smoking, parity. Manuscripts can adjust for multiple covariates; bar lengths do not represent the number of manuscripts per region

Adjusting for SES (OR = 4·32, 95% CI 2·17–8·60, p < 0·001) was a significant predictor of ERC-confounder adjustment, whereas adjusting for variables other than the aforementioned demographic variables was not (OR = 1·55, 95% CI 0·97–2·49, p = 0·069). Furthermore, studies with participants from North America (OR = 4·42, 95% CI 1·77–11·08, p = 0·002) or Europe (OR = 2·56, 95% CI 1·00–6·62, p = 0·049) were more likely to adjust for ERCs than studies with participants from Sub-Saharan Africa (OR = 0·54, 95% CI 0·061–0·49, p < 0·001). In this analysis, the Caribbean was used as the reference group. The conditional R-squared of the mixed-effects model was 0·27.

Contextualization of ERCs

We investigated the contextualization of ERC usage as confounders in the included articles by analyzing their definitions, the rationale for their use, and the context provided for ERCs.

Reporting of ERC ascertainment and labeling

In 76% (226/298) of the studies that adjusted for ERCs, the classification criteria were not described. Of the papers that did describe their methods (n = 72), 61% (44/72) used ERCs from pre-existing databases, 38% (27/72) used self-reported ERCs, and 1% (1/72) used a combination of these methods (Supplementary Table 5). Moreover, 28% (83/298) of the studies that performed ERC-confounder adjustment did not mention which ERC labels were controlled for; they included only a statement that ERC-confounder adjustment had occurred. However, the majority of the studies that specified ERCs (69%, 206/298) used the label “African”, “Black”, or a derivative.

Rationale for ERC adjustment

Of the studies that adjusted for race or ethnicity, 19% (56/298) provided a reason. Of the studies that did not adjust for ERCs, 2% (14/787) gave an explanation, for example, the contested background of using race as a proxy for biological differences, or previous literature pointing out that race is not a relevant confounder for their research question. For the specific rationales extracted from the manuscripts, see Supplementary Tables 6 and 7.

ERC adjustment within more complicated contexts

We also explored contextualization of ERCs in more complex situations.

Cross-national comparative studies

Twenty studies included study populations from several different countries. Two studies were collaborations within Europe, six in Africa, and 13 studies were inter-continental collaborations (Supplementary Table 8). Six of these studies used confounder adjustment. In five publications, the control group was sourced from the same country as the SCD cases. One study used a healthy Congolese reference population as a control cohort while the SCD patients were from France, originating from West or Central Africa, or the West Indies.

ERC-adjusted studies with Sub-Saharan African study populations

Of all included geographic regions, studies with Sub-Saharan African study populations were the least likely to adjust for ERCs. Out of the 14 one-country studies with Sub-Saharan African study populations that adjusted for ERCs, five used the label Black, African, or a derivative: “African”, “Black”, “African ancestry”, “Black African”, and “Indigenous African” (Supplementary box 2). Further refinement into more specific ethnic labels for confounder adjustment did not occur in any article. Labels such as Yoruba and Igbo in publications with a Nigerian study population, or Akan and Ewe in Ghanaian publications, did appear within our search but were only used descriptively. Nine papers did not describe the specific ethnic labels used for confounder adjustment in their studies.

Multiracial or multi-ethnic individuals

In 2% (23/1,085) of all papers, a “mixed-race” category was used. However, in none of these papers was this category used as an ethno-racial label to adjust for confounding. In 57% (13/23) of these papers, ERC adjustment was performed using other ERCs. Of the study groups that included multiracial or multi-ethnic labels, twelve were sourced from Brazil (52% [12/23]), eight from the United States of America (35% [8/23]), and three from the United Kingdom (13% [3/23]).

Table 1 ERC-Confounder adjustment per geographical region (n = 1,085)

Discussion

We found that race and ethnicity were operationalized as confounders in SCD research in nearly one-third of all one-country studies. Studies with a Western study population were more likely to adjust for ERCs than studies with an African study population. Describing the method through which race and ethnicity were determined, and the rationale for their use, is increasingly being encouraged by medical journals [9, 15, 16, 38, 39]. However, our analysis showed that this methodological practice was scarcely applied. In 76% (226/298) of ERC-adjusted studies, the classification criteria were not described. Our findings are especially relevant in SCD research in Western countries, since patients are more frequently racialized as non-white, compared to the general population [40]. We postulate that authors are aware of demographic differences but are often in doubt about whether and how they should apply these differences in their research. Correspondingly, we found that race and ethnicity are often not replaced by more specific covariates but are instead included in parallel with other potential confounding variables. This approach does not suggest a deliberate effort to deconstruct ERCs and replace them with alternative and possibly more suitable covariates.

Since ERCs are highly contextual, it is imperative to provide the context when operationalizing them. Our analysis showed that this was not regarded as standard practice. Unfortunately, we did not find an illustrative example in which ERC operationalization was performed in full accordance with current NASEM guidelines. Justification for the use of ERCs as confounders was often not provided. Also, our analysis showed that research articles from Western countries were more likely to correct for ERCs than research articles from non-Western countries. This might be related to historical and/or sociopolitical influences that have shaped biomedical practice [41,42,43]. For example, the historical impact of colonialism and slavery, as well as the impact of current-day cultural movements such as BLM, might influence current methodological approaches. It seems that the use of ERCs as confounders is reinforced in two ways. First, we found that, whenever manuscripts describe a rationale for ERC adjustment, they often refer to related literature that showed a correlation between ERCs and similar study outcomes. Second, studies that perform ERC adjustment have about three times the odds of being published in high-impact journals. The decontextualized use of ERCs was also found in reviews examining a variety of general clinical and epidemiological journals [4, 5, 44, 45]. This underlines the prevailing uncertainty with which authors navigate this topic.

In SCD research, the relevance of ERCs is often considered since patients with SCD are often racialized as non-white. Nevertheless, in 72% of all papers, and more specifically in 46% of papers with a Western study population, ERCs were not applied as confounders. Assessing whether ERCs are relevant confounders might be complicated by the fact that individuals of non-European descent are often understudied in biomedical research [46, 47]. This results in a lack of knowledge about the implications of race and ethnicity for SCD outcome measures. SCD researchers also experience ethical challenges. There is a risk of biological and social reification of already minoritized populations when reporting and operationalizing ethnic and racial background. ERCs identified as confounders are often converted into race correction factors and applied in clinical algorithms [7, 48, 49]. Even if ERCs are used in research with context, this danger is always present.

Covariates that are responsible for variation in health outcomes are also dependent on the context. Biomedical researchers should therefore engage with social scientists when designing and reporting research.

Confounders, by definition, may distort study outcomes. However, the current, decontextualized use of ERCs in SCD research obscures the specific confounding pathway. Furthermore, this often leads to reinforcing ERCs as biological labels, such as using them as a proxy for genetic ancestry. In 2022, NASEM issued guidelines stating that race and ethnicity are inadequate proxies for human genetic variability. Researchers should try to pinpoint the specific information relevant to their research questions [50]. When studying health disparities and trying to control for genetics, NASEM recommends using the term “genetic similarity” instead of an ethno-racial label [50]. Genetic ancestry is defined as the population origin of a person’s alleles at polymorphic sites and can be estimated against a global reference of diverse individuals [51]. However, in the absence of genetic data, race and ethnicity are very poor proxies of genetic ancestry. Without relevant genetic information, there is an increased risk of defaulting to racial categories [52, 53]. ERCs have been shown to be particularly inadequate for recently admixed populations, such as those labelled “Hispanic or Latino” by the US census [3].

In the absence of more granular data, such as genetic data to support one’s assumptions, it becomes more crucial to consult additional sources that support population-level differences relevant to study outcomes and exposures, such as documented population differences in reference lab values [54]. If such data are not available, researchers should perform literature reviews, sensitivity analyses, or propose well-supported hypotheses for population differences. Secondly, race and ethnicity often intersect significantly with environmental variables. Prior knowledge of how the study population is being affected by these variables (e.g., interpersonal racism, resource-deprived neighborhoods, and air quality) guides researchers towards collecting relevant data. Standardization and harmonization are nearly impossible when relying on race and ethnicity globally, but become more feasible with the implementation of granular measures and the availability of validated methods. The PhenX toolkit offers standard data collection protocols, including questionnaires on perceived discrimination and air quality data extraction [55]. The Neighborhood Atlas provides open data on neighborhood disadvantage in the United States, which can be used in models to study systemic racism [41]. Validated means of data collection, as well as population data themselves, can be combined with more sophisticated models that examine ethno-racial health disparities through the lens of systemic racism, as have been proposed in the literature [56,57,58]. It is essential to advance the field of research by moving beyond the limitations of ethnic or racial categorizations and focusing on the underlying determinants driving health disparities.

For the operationalization of ERC in multiracial or multiethnic participants, it may be relevant to apply different categorization schemes when comparing results in one study. The way in which multiracial participants are processed in the study can change outcome estimates for other ERCs identified as well. In cases with multiracial populations, it is important to try to hypothesize the mechanism that might drive the outcome of interest and how multiracial identity in a particular study population might tie into it [17]. In this context, NASEM has proposed new methods for categorizing multiracial identity in biomedical research [17].

This analysis of the operationalization of ERCs as confounders in a racialized disease such as SCD is a novel contribution to the existing body of literature on the application of ERCs in biomedical research. Previous research on this topic mainly focused on high-income countries or publications in high-impact journals [13]. By examining methodological practices in SCD research, we were able to analyze this from a broader perspective.

The challenge with operationalizing race, ethnicity, or similar factors such as genetic similarity lies fundamentally in the act of categorization itself. In a research environment with limited resources, there is always a trade-off between quality, financial cost, and time investment. Especially when studying participants of color, which is the case in SCD research, this challenge is unlikely to be resolved. Even if genetic ancestry data were available and accessible for research participants, there may still be a residual component, albeit small, reflecting other biological-biological and biological-environment interactions, such as metabolomics, epigenetics, and the microbiome. While the scientific method may never achieve perfection, it must strive to be responsible. If ERCs are considered relevant, providing context should be a requirement.

In conclusion, this review has shown heterogeneity in the use of ERCs in SCD research. These findings might have consequences for ERC-confounder adjustment in biomedical research in general. Before using ERCs, it is of the utmost importance to consider whether a more precise variable is better suited to the research question.

By redirecting the focus toward researching more specific, qualitatively better health determinants, we depart from the troubling trend of continuously relying on race and ethnicity. In this way, biomedicine mitigates the unintended perpetuation of health disparities and draws closer to contributing to outcomes in a more equitable way.

Data availability

Our data is now accessible on GitHub under the GNU General Public License version 3.0 (GPL 3.0), along with the analyses we’ve conducted. This license permits users to freely access, modify, and distribute both the data and analyses. However, any modifications or derivative works must also be released under the GPL 3.0 license. This framework encourages collaboration and transparency while upholding licensing terms. Feel free to explore and utilize our GitHub data repository. URL: https://github.com/AidaasinAyda/SCD_conf_correction.git.

References

  1. Kanakamedala P, Haga SB. Characterization of clinical study populations by race and ethnicity in the biomedical literature. Ethn Dis. 2012;22(1):96.

  2. Mauro M, Allen DS, Dauda B, Molina SJ, Neale BM, Lewis ACF. A scoping review of guidelines for the use of race, ethnicity, and ancestry reveals widespread consensus but also points of ongoing disagreement. Am J Hum Genet. 2022;109(12):2110–25.

  3. Gouveia MH, Meeks KAC, Pua VOB et al. Subcontinental genetic diversity in the All of Us research program: implications for biomedical research. bioRxiv 2025: 2025.01.09.632250.

  4. Lynn-Green EE, Ofoje AA, Lynn-Green RH, Jones DS. Variations in how medical researchers report patient demographics: a retrospective analysis of published articles. eClinicalMedicine. 2023;58:101903.

  5. Martinez RAM, Andrabi N, Goodwin AN, Wilbur RE, Smith NR, Zivich PN. Conceptualization, operationalization, and utilization of race and ethnicity in major epidemiology journals, 1995–2018: A systematic review. Am J Epidemiol. 2022;192(3):483–96.

  6. Nguyen TC, Gathecha E, Kauffman R, Wright S, Harris CM. Healthcare distrust among hospitalised black patients during the COVID-19 pandemic. Postgrad Med J. 2022;98(1161):539–43.

  7. Vyas DA, Eisenstein LG, Jones DS. Hidden in plain Sight — Reconsidering the use of race correction in clinical algorithms. N Engl J Med. 2020;383(9):874–82.

  8. Cerdeña JP, Asabor EN, Plaisime MV, Hardeman RR. Race-based medicine in the point-of-care clinical resource UpToDate: A systematic content analysis. eClinicalMedicine. 2022;52:101581.

  9. Bhopal R. Race and ethnicity: responsible use from epidemiological and public health perspectives. J Law Med Ethics. 2006;34(3):500–7.

  10. van Deventer HE, George JA, Paiker JE, Becker PJ, Katz IJ. Estimating glomerular filtration rate in black South Africans by use of the modification of diet in renal disease and Cockcroft-Gault equations. Clin Chem. 2008;54(7):1197–202.

  11. Bukabau JB, Yayo E, Gnionsahé A, et al. Performance of creatinine- or Cystatin C–based equations to estimate glomerular filtration rate in sub-Saharan African populations. Kidney Int. 2019;95(5):1181–9.

  12. Zanocco JA, Nishida SK, Tiveron Passos M, et al. Race adjustment for estimating glomerular filtration rate is not always necessary. Nephron Extra. 2012;2(1):293–302.

  13. Delgado C, Baweja M, Crews DC, et al. A unifying approach for GFR estimation: recommendations of the NKF-ASN task force on reassessing the inclusion of race in diagnosing kidney disease. J Am Soc Nephrol. 2021;32(12):2994–3015.

  14. AHA Journals. Disparities Research Guidelines. 2021. https://www.ahajournals.org/disparities-research-guidelines?cookieSet=1 (accessed 07-01-2023).

  15. Flanagin A, Frey T, Christiansen SL. Updated guidance on the reporting of race and ethnicity in medical and science journals. JAMA. 2021;326(7):621.

  16. Editorial. Why nature is updating its advice to authors on reporting race or ethnicity. Nature. 2023;616(7956):219.

  17. National Academies of Sciences, Engineering, and Medicine. Rethinking race and ethnicity in biomedical research. Washington, DC: National Academies; 2024.

  18. Houwing ME, de Pagter PJ, van Beers EJ, et al. Sickle cell disease: clinical presentation and management of a global health challenge. Blood Rev. 2019;37:100580.

  19. Lubeck D, Agodoa I, Bhakta N, et al. Estimated life expectancy and income of patients with sickle cell disease compared with those without sickle cell disease. JAMA Netw Open. 2019;2(11):e1915374.

  20. Piel FB, Hay SI, Gupta S, Weatherall DJ, Williams TN. Global burden of sickle cell anaemia in children under five, 2010–2050: modelling based on demographics, excess mortality, and interventions. PLoS Med. 2013;10(7):e1001484.

  21. Kadima BT, Gini Ehungu JL, Ngiyulu RM, Ekulu PM, Aloni MN. High rate of sickle cell anaemia in Sub-Saharan Africa underlines the need to screen all children with severe anaemia for the disease. Acta Paediatr. 2015;104(12):1269–73.

  22. Tshilolo L, Kafando E, Sawadogo M, et al. Neonatal screening and clinical care programmes for sickle cell disorders in sub-Saharan Africa: lessons from pilot studies. Public Health. 2008;122(9):933–41.

  23. U.S. Centers for Disease Control and Prevention. Data & Statistics on Sickle Cell Disease. 02-05-2023. https://www.cdc.gov/ncbddd/sicklecell/data.html#:~:text=In%20the%20United%20States%26text=SCD%20affects%20approximately%20100%2C000%20Americans,sickle%20cell%20trait%20(SCT) (accessed 08-01-2023).

  24. Orphanet. Prevalence of rare diseases: Bibliographic data, 2023.

  25. Smith WR, Valrie C. Structural racism and impact on sickle cell disease: sickle cell lives matter. Hematology/Oncology Clin. 2022;36(6):1063–76.

  26. Farooq F, Strouse JJ. Disparities in foundation and federal support and development of new therapeutics for sickle cell disease and cystic fibrosis. American Society of Hematology Washington, DC; 2018.

  27. Walia R, Fertrin KY, Sabath DE. A winding road to health care equity in sickle cell disease. Clin Lab Med. 2024;44(4):693–704.

  28. Bulgin D, Tanabe P, Jenerette C. Stigma of sickle cell disease: A systematic review. Issues Ment Health Nurs. 2018;39(8):675–86.

  29. Hood AM, Crosby LE, Hanson E, et al. The influence of perceived racial bias and health-related stigma on quality of life among children with sickle cell disease. Ethn Health. 2022;27(4):833–46.

  30. Berghs M, Dyson S, Greene A-M, Atkin K, Morrison V. ‘They can replace you at any time!’: (In)visible hyper-ableism, employment and sickle cell disorders in England. Scandinavian J Disabil Res. 2021;23(1):348–59.

  31. Berghs MJ, Horne F, Yates S, et al. Black sickle cell patients’ lives matter: healthcare, long-term shielding and psychological distress during a racialised pandemic in England–a mixed-methods study. BMJ Open. 2022;12(9):e057141.

  32. Inusa BPD, Jacob E, Dogara L, Anie KA. Racial inequalities in access to care for young people living with pain due to sickle cell disease. Lancet Child Adolesc Health. 2021;5(1):7–9.

  33. Anderson D, Lien K, Agwu C, Ang PS, Abou Baker N. The bias of medicine in sickle cell disease. J Gen Intern Med. 2023;38(14):3247–51.

  34. Kunz JB, Cario H, Grosse R, Jarisch A, Lobitz S, Kulozik AE. The epidemiology of sickle cell disease in Germany following recent large-scale immigration. Pediatr Blood Cancer. 2017;64(7):e26550.

  35. Lee L, Smith-Whitley K, Banks S, Puckrein G. Reducing health care disparities in sickle cell disease: A review. Public Health Rep. 2019;134(6):599–607.

  36. Eneanya ND, Yang W, Reese PP. Reconsidering the consequences of using race to estimate kidney function. JAMA. 2019;322(2):113–4.

  37. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing; 2020.

  38. Lee C. Race and ethnicity in biomedical research: how do scientists construct and explain differences in health? Soc Sci Med. 2009;68(6):1183–90.

  39. Valles SA. Why race and ethnicity are not like other risk factors. Philos Med 2021; 2(1).

  40. Campbell AD, Colombatti R, Andemariam B, et al. An analysis of racial and ethnic backgrounds within the CASiRe international cohort of sickle cell disease patients: implications for disease phenotype and clinical research. J Racial Ethnic Health Disparities. 2021;8:99–106.

  41. Amutah C, Greenidge K, Mante A, et al. Misrepresenting Race — The role of medical schools in propagating physician Bias. N Engl J Med. 2021;384(9):872–8.

  42. Hoberman J. Black and Blue. University of California Press; 2012.

  43. Braun L. Breathing Race into the Machine: The Surprising Career of the Spirometer from Plantation to Genetics. University of Minnesota Press, 2014. Science 2021; 373(6558): 972-.

  44. Buttery SC, Philip KEJ, Alghamdi SM, Williams PJ, Quint JK, Hopkinson NS. Reporting of data on participant ethnicity and socioeconomic status in high-impact medical journals: a targeted literature review. BMJ Open. 2022;12(8):e064276.

  45. Alegria M, Sud S, Steinberg BE, Gai N, Siddiqui A. Reporting of participant race, sex, and socioeconomic status in randomized clinical trials in general medical journals, 2015 vs 2019. JAMA Netw Open. 2021;4(5):e2111516.

  46. Bentley AR, Callier S, Rotimi CN. Diversity and inclusion in genomic research: why the uneven progress? J Community Genet. 2017;8(4):255–66.

  47. Bentley AR, Callier SL, Rotimi CN. Evaluating the promise of inclusion of African ancestry populations in genomics. Npj Genomic Med 2020; 5(1).

  48. J. Q. Ras Vanuit Een medisch-sociologisch perspectief. Ned Tijdschr Geneeskd 2019; 163.

  49. van Weel C, van den Muijsenbergh M. Populatiegerichte Eerstelijnszorg in internationaal perspectief. Tijdschrift Voor Gezondheidswetenschappen. 2019;97(1–2):32–5.

  50. National Academies of Sciences, Engineering, and Medicine. 2023. Using Population Descriptors in Genetics and Genomics Research: A New Framework for an Evolving Field. Washington, DC: The National Academies Press. https://doi.org/10.17226/26902

  51. Mathieson I, Scally A. What is ancestry? PLoS Genet. 2020;16(3):e1008624.

  52. Gois MFB, Sinha T, Spreckels JE, et al. Role of the gut microbiome in mediating lactose intolerance symptoms. Gut. 2022;71(1):215–7.

  53. Fujimura JH, Rajagopalan R. Different differences: the use of ‘genetic ancestry’ versus race in biomedical human genetic research. Soc Stud Sci. 2011;41(1):5–30.

  54. Tahmasebi H, Trajcevski K, Higgins V, Adeli K. Influence of ethnicity on population reference values for biochemical markers. Crit Rev Clin Lab Sci. 2018;55(5):359–75.

  55. Krzyzanowski MC, Ives CL, Jones NL, et al. The PhenX Toolkit: measurement protocols for assessment of social determinants of health. Am J Prev Med. 2023;65(3):534–42.

  56. Howe CJ, Bailey ZD, Raifman JR, Jackson JW. Recommendations for using causal diagrams to study racial health disparities. Am J Epidemiol. 2022;191(12):1981–9.

  57. Duan N, Meng XL, Lin JY, Chen CN, Alegria M. Disparities in defining disparities: statistical conceptual frameworks. Stat Med. 2008;27(20):3941–56.

  58. Swilley-Martinez ME, Coles SA, Miller VE, et al. “We adjusted for race”: now what? A systematic review of utilization and reporting of race in American Journal of Epidemiology and Epidemiology, 2020–2021. Epidemiologic reviews 2023; 45(1): 15-31.

Acknowledgements

The authors wish to thank Wichor Bramer and Sabrina Meertens-Gunput from the Erasmus MC Medical Library for developing and updating the search strategies; Judith Gulpers from the Erasmus University Library and Suat Tuzgöl from researchsoftware.com for making it possible to merge the literature database with their recent citation rates and the recent CiteScores of the journals; Ross Williams for support with creating the figures and for providing critical feedback on the article; Colin Spence for his support in screening a part of the initial results; and Karima Shata and Alana Helberg-Proctor for inspirational and very necessary conversations around the operationalization of race and ethnicity in a biomedical research context. Additionally, the authors are grateful to Yolanda L. Jones, National Institutes of Health Library, for editing assistance.

Funding

AK is supported by a grant from “Het Sikkelcelfonds”. KACM is supported by the NIH Pathway to Independence Award (K99/R00) DK131018 and by the Intramural Research Program of the National Institutes of Health in the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, and the Office of the Director at the National Institutes of Health (1ZIAHG200362). The funding sources did not play a role in the creation or development of the manuscript.

Author information

Contributions

AR, AM, KACM, and AK were involved in the original study design, and drafting the original manuscript. AK provided the original idea and conducted literature screening. MR edited the manuscript, conducted literature screening, and verified the original data. EvW and TP edited the manuscript and reviewed/designed the statistical analyses used in this study. MC contributed by interpreting the results and critically reviewing the manuscript.

Corresponding author

Correspondence to Aida S. Kidane Gebremeskel.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kidane Gebremeskel, A.S., Rab, M.A., van Werkhoven, E.D. et al. The use of race and ethnicity in sickle cell disease research. BMC Med Res Methodol 25, 63 (2025). https://doi.org/10.1186/s12874-025-02513-5

  • DOI: https://doi.org/10.1186/s12874-025-02513-5

Keywords