Saudi Journal for Health Sciences

: 2021  |  Volume : 10  |  Issue : 3  |  Page : 155--159

Bias in early coronavirus disease 2019 research

Fatmah Mahmoud Othman 
 Department Research, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences; King Abdullah International Medical Research Center, Riyadh, Saudi Arabia

Correspondence Address:
Fatmah Mahmoud Othman
College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences. Mail Code 3159 3129, P.O. Box. 3660, Riyadh 11481
Saudi Arabia


In the context of the ongoing global pandemic of coronavirus disease 2019 (COVID-19), most scientific evidence related to disease transmission and clinical outcomes, especially in the first wave, originated from observational studies. These studies have provided a basic understanding of various aspects of the disease, including its clinical manifestations, pathogenesis, diagnosis, and treatment. However, the accuracy and credibility of some of these studies have been questioned because of the presence of bias, which is only occasionally addressed in the published research. In this review, the principal types of bias in COVID-19 research are discussed, namely selection bias and misclassification bias. For this mini literature review, the Medline database was searched to identify relevant articles. Many studies have shown selection bias in sampling their populations, leading to an over- or underestimation of the true results. Understanding the effect of bias in the context of COVID-19 research is important for two reasons. First, it enables a discussion of the consequences of such biases, especially in studies contributing to evidence-based medicine. Second, it helps researchers avoid such bias in future research and in any subsequent infectious disease pandemic. The key points in avoiding such bias are careful study design and care in collecting information on both exposure and outcome; in the real world, however, these are very challenging matters.

How to cite this article:
Othman FM. Bias in early coronavirus disease 2019 research.Saudi J Health Sci 2021;10:155-159

How to cite this URL:
Othman FM. Bias in early coronavirus disease 2019 research. Saudi J Health Sci [serial online] 2021 [cited 2022 Jan 22 ];10:155-159
Available from:



In infectious disease epidemiology, the presence of bias in the design of observational studies distorts the validity of the results.[1],[2] In the context of the coronavirus disease 2019 (COVID-19) pandemic, most of the scientific evidence related to disease transmission and clinical outcomes, especially in the first wave, originated from observational studies.[3],[4],[5],[6] The appearance of a tremendous number of publications provided a base for public health decision-making and policy. Science and biomedical journals responded to the increased number of COVID-19-related papers with fast-track and preprint publication of such reports.[7] In fact, many papers at the beginning of the pandemic were opinion-based: either they contained no original data or they had an inappropriate study design or analysis plan. Although this type of publication seemed appropriate given the novelty of the virus and the time pressure to produce data for health policy, it resulted in a rush to share data that might have rested on methodological error, or in the misinterpretation of such data.[8],[9],[10],[11]

Many research studies raised concerns about the quality of COVID-19 research, especially at the beginning of the pandemic, as well as the risks of expedited science.[10],[12],[13] For example, some quality-related publication standards were relaxed by some journals,[11] as has been discussed in many papers reporting methodological concerns in the COVID-19 literature.[14] In the area of COVID-19 pharmacoepidemiology, the retraction and correction of COVID-19 hydroxychloroquine papers[15] had direct consequences for public health at a very critical time. A review study assessed published articles from four high-impact clinical journals, comparing the methodological and reporting quality of COVID-19 papers with non-COVID-19 papers published during the first 6 months of the pandemic.[4] The authors demonstrated that articles published during the first wave of COVID-19 may have been of lower quality, and therefore at higher risk of bias, than contemporaneous non-COVID research.[4] This potential lack of quality and risk of bias are mainly due to small numbers of study participants, the use of unrepresentative samples, or flawed interpretation of the data. It is therefore important to understand and assess the biases that induce spurious associations in observational data and lead to such low-quality studies. Similarly, Raynaud et al. evaluated COVID-19-related medical research and found that most of it was composed of publications without original data. Peer-reviewed original articles with data showed a high risk of bias and included limited numbers of patients. Of the 713 original articles evaluated, the high proportion of low-quality articles was concerning, as few studies showed a low risk of bias.[16]

Since the quality and accuracy of the data in the COVID-19 literature have been questioned, and given the dynamic nature of this infectious disease, this paper aims to review the possible biases affecting COVID-19 evidence. For this mini literature review, the Medline database was searched to identify relevant articles. Although our understanding of COVID-19 epidemiology has progressed well, the main challenges to relying on observational studies remain; therefore, understanding the potential for bias and its role is necessary for appropriate decision-making. The observational studies mentioned above are particularly susceptible to selection bias (mainly nonrandom sampling bias) and information bias (mainly misclassification bias).

 Definition of Bias

In general, in epidemiology, researchers are interested in describing or examining an association (or lack of association) between exposure and outcome in a population.[1],[2] The interpretation of the results obtained from such studies should consider the effects of chance (sampling error), bias, and confounding. Bias is a consequence of an error in the design or execution of an epidemiological study; the resulting summary measure of effect can therefore be over- or underestimated, depending on how the bias acts. Bias can occur at different stages of a study: while planning the study, collecting the information, analyzing the data, or publishing the results.[2] It is necessary to consider the effect of any potential bias during the design stage and to ensure the entire study is conducted properly, as it is very difficult to correct for bias in the analysis. There are many different types and classifications of bias;[17],[18] however, in the context of observational studies related to the COVID-19 pandemic, this report discusses and highlights sampling bias in seroprevalence studies, misclassification bias, and bias in meta-analysis studies.

The most commonly discussed bias is selection bias, whereby the participant population of the study is systematically different from the target population. The difference(s) can relate to either the exposure or the outcome of interest. Earlier COVID-19 research suffered from various types of bias related to selection strategies, which may have contributed to the inconsistency of the estimates presented in the studies. For instance, epidemiological surveillance studies published at the beginning of the COVID-19 pandemic have been criticized for sampling bias. For any newly emerging infectious disease pandemic, epidemiological surveillance studies are considered the foundation of immediate and long-term strategies for combating the infection and ensuring better reporting systems, both nationally and internationally.[19],[20] However, bias in surveillance studies occurs when the selected study population is systematically different from the target population, or because of increased testing and monitoring in the selected population, whereby the outcome is diagnosed more frequently in the sample than in the general population. Individuals with severe or obvious COVID-19 symptoms are more likely to be tested than individuals with mild or no symptoms; the seroprevalence estimate in a surveillance study may therefore be overestimated and will not reflect the true prevalence of infection or the infection fatality rate in the population.[21],[22] Although this type of bias applies to all epidemiological studies, it has become particularly concerning for COVID-19 studies that do not have clear inclusion or exclusion criteria, and it cannot be corrected by statistical models because it occurs at the study design stage.

The issue of selection bias in COVID-19 surveillance studies has been discussed in many research papers.[23] Researchers have emphasized that testing strategies and protocols varied among countries: some implemented vigorous, wide testing strategies, while others focused on severely ill individuals.[22],[24] This variation in sampling can introduce variation in estimates of infection, recovery, and fatality rates between countries: these can be overestimated if a study uses a narrowly focused approach and underestimated by studies that use a very broad approach.[22] An accurate estimate of the true dynamics and of infection, recovery, and fatality rates is necessary to plan public health policy. However, biased sampling in some early COVID-19 observational research misled the epidemiological modeling of the epidemic and the calculation of key metrics, as most surveillance was biased toward symptomatic patients.[25] Griffith et al. discussed collider bias in the context of COVID-19 studies: restricting analyses to a specific population who have experienced the event, have tested positive, or self-report during voluntary participation introduces bias into the estimates.[3] For example, if the analysis is restricted to patients who have COVID-19, then a risk factor for infection will appear to be associated with any other variable that influences both infection and progression.
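The collider mechanism described above can be demonstrated with a minimal simulation. The scenario and all numbers below are hypothetical illustrations, not estimates from any of the cited studies: two risk factors are generated independently, each raises the chance of hospitalization, and conditioning on hospitalization (the collider) induces a spurious association between them.

```python
import random

random.seed(1)

n = 200_000
# Two independent hypothetical risk factors; each independently raises the
# probability of hospitalisation (the collider).
records = []
for _ in range(n):
    smoker = random.random() < 0.3
    infected = random.random() < 0.1
    p_hosp = 0.02 + 0.10 * smoker + 0.30 * infected
    records.append((smoker, infected, random.random() < p_hosp))

def infection_risk(rows, smoker_value):
    """Proportion infected among rows with the given smoking status."""
    sub = [r for r in rows if r[0] == smoker_value]
    return sum(r[1] for r in sub) / len(sub)

# In the full population the two factors are independent, so the risk
# ratio is close to 1.
rr_all = infection_risk(records, True) / infection_risk(records, False)

# Restricting the analysis to hospitalised patients (conditioning on the
# collider) induces a spurious negative association between the factors.
hospitalised = [r for r in records if r[2]]
rr_hosp = infection_risk(hospitalised, True) / infection_risk(hospitalised, False)

print(f"risk ratio, whole population:  {rr_all:.2f}")   # close to 1.0
print(f"risk ratio, hospitalised only: {rr_hosp:.2f}")  # well below 1.0
```

The restricted analysis suggests that one factor "protects" against the other, purely because both are alternative routes into the hospitalized sample.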

In addition, bias can arise in surveillance studies when the people who take part in or respond to the study differ from those who do not, in which case there is a risk of volunteer and/or responder bias. Numerous observational studies have sampled from patients admitted to hospital, people tested for active infection, or people who volunteered to participate. This bias can lead to either underestimating or overestimating the prevalence of infection. For example, a high prevalence estimate may result because individuals are more likely to take a test if they think they may be positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).[23] Beyond sampling bias introduced at the design stage, the data collection modality has also become an issue: many research studies shifted to web-based or online surveys, which can lead to sampling bias and underrepresentation of high-risk COVID-19 groups.
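The inflation caused by volunteer testing can be sketched in a few lines. The prevalence and testing probabilities below are assumed for illustration only: if infected (often symptomatic) people are more likely to seek a test, the prevalence among those tested far exceeds the true population prevalence.

```python
import random

random.seed(2)

n = 100_000
true_prevalence = 0.05  # assumed true population prevalence

tested = positives_among_tested = 0
for _ in range(n):
    infected = random.random() < true_prevalence
    # Hypothetical assumption: infected people are six times more likely
    # to volunteer for testing than uninfected people.
    p_test = 0.60 if infected else 0.10
    if random.random() < p_test:
        tested += 1
        positives_among_tested += infected

observed = positives_among_tested / tested
print(f"true prevalence:            {true_prevalence:.1%}")
print(f"prevalence among volunteers: {observed:.1%}")  # inflated well above 5%
```

Under these assumed numbers the observed figure is roughly four to five times the true prevalence, which is the pattern described above for convenience samples of test-seekers.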

Furthermore, the accuracy of SARS-CoV-2 antibody testing contributes to the wide variation in seroprevalence estimates.[26] Multiple serological assays have been approved and adopted, with a range of test performance indicators (sensitivity and specificity), and the accuracy of serological tests is unclear for mild or asymptomatic infections. A particular concern is that tests with imperfect sensitivity can underestimate the actual prevalence of positive cases, while tests with imperfect specificity can overestimate the cumulative incidence.[26],[27] Imperfect test performance can cause misclassification bias, as individuals without a prior COVID-19 infection can be misclassified as positive while individuals with a prior infection can be misclassified as negative. Some studies account for this bias by adjusting for the sensitivity and specificity of the serological test using available estimates. However, the data should be interpreted with care because of differences between the population in which the tests were validated and the population involved in the study.[23]
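The sensitivity/specificity adjustment mentioned above is commonly performed with the Rogan-Gladen estimator, which inverts the relationship between apparent and true prevalence. A minimal sketch follows; the survey figure and assay performance values are assumed for illustration and do not come from any cited study.

```python
def rogan_gladen(apparent_prevalence: float,
                 sensitivity: float,
                 specificity: float) -> float:
    """Correct an apparent (test-positive) proportion for imperfect test
    performance: apparent = true*sens + (1 - true)*(1 - spec),
    solved for the true prevalence (Rogan-Gladen estimator)."""
    corrected = (apparent_prevalence + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(corrected, 0.0), 1.0)  # clamp to the valid range [0, 1]

# Illustrative (assumed) numbers: a survey finds 8% test-positive using an
# assay with 90% sensitivity and 98% specificity.
print(f"{rogan_gladen(0.08, 0.90, 0.98):.3f}")  # 0.068
```

Here the corrected prevalence (about 6.8%) is lower than the apparent 8%, because with 98% specificity a nontrivial share of the positives are false. The caveat in the text still applies: if the assay was validated in a different population (for example, hospitalized cases rather than mild infections), the sensitivity and specificity fed into this formula may not hold for the study sample.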

Travel ban bias has also been noted during the COVID-19 pandemic, especially after outbound travel from Wuhan was banned on January 23, 2020. Studies that analyzed the number of cases exported internationally from Wuhan were therefore susceptible to selection bias, whereby their results were underestimated. For instance, one paper found that the epidemic doubling time was 6.4 days (95% credible interval [CrI] 5.8–7.1),[28] while another estimated that the epidemic was doubling in size every 2.9 days (95% CrI, 2–4.1 days).[29] Although the two papers used the same information on coronavirus cases who had traveled from Wuhan, the latter model took into account the restrictions on outbound travel from Wuhan from January 23.[29] Studies during the early stages of the COVID-19 pandemic that did not account for domestic travel measures are therefore likely to have produced biased estimates.[30]
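The practical weight of the 6.4-day versus 2.9-day discrepancy follows from the standard relation between doubling time and exponential growth rate, r = ln(2)/Td. A short calculation shows how differently the two published estimates project forward (the 30-day horizon is chosen purely for illustration):

```python
import math

def growth_rate(doubling_days: float) -> float:
    """Exponential growth rate r (per day) implied by a doubling time,
    from I(t) = I0 * exp(r * t) and the identity r = ln(2) / Td."""
    return math.log(2) / doubling_days

def doubling_time(rate_per_day: float) -> float:
    """Inverse relation: doubling time implied by a growth rate."""
    return math.log(2) / rate_per_day

for td in (6.4, 2.9):  # the two published doubling-time estimates
    r = growth_rate(td)
    factor = math.exp(r * 30)  # cumulative growth over an illustrative 30 days
    print(f"doubling time {td} d -> r = {r:.3f}/day, "
          f"{factor:,.0f}-fold growth in 30 days")
```

Over a single month the slower estimate implies roughly a 26-fold increase, while the faster one implies growth on the order of a thousandfold, which is why accounting (or not) for the travel ban changed the modeled epidemic so drastically.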

Another bias discussed in early COVID-19 observational research is misclassification bias. This bias occurs when errors in the classification of exposure status affect people with and without the disease differently, or when errors in the classification of outcome status affect exposed and unexposed individuals differently. This can result in either an over- or underestimation of the true measure of effect. At the beginning of the COVID-19 pandemic, countries dealt differently with the uncertainty about the nature of the SARS-CoV-2 virus and its related risks. Some decided to take the strictest measures, while others initially considered COVID-19 a new kind of influenza and did not introduce mandatory quarantine or isolation of cases.[31],[32],[33]

In addition, many forms of misclassification bias have been discussed as limitations of observational studies. For instance, some studies used polymerase chain reaction (PCR)-confirmed cases to identify their population when testing was insufficient, which led to bias of varying direction and magnitude. If symptomatic individuals do not seek healthcare or cannot be diagnosed because of insufficient testing availability, as occurred in some countries at the beginning of the COVID-19 pandemic, misclassification bias can lead to underestimation of the infection prevalence. This was observed in studies carried out in countries whose healthcare systems were overwhelmed and where patients were therefore diagnosed with COVID-19 without PCR if they satisfied several other criteria; this type of change in diagnostic criteria happened in the Wuhan area of China.[34],[35] Many studies have thus used different case definitions, whether that of the World Health Organization or others, which may cause inconsistency. This bias can alter the estimates of both infection and fatality rates.[36]

Another example is studies that only include cases detected early in an epidemic, as this can lead to underestimation of the incubation period. People with longer incubation periods may not show symptoms until after the study's observation window and are therefore excluded, which leads to right-truncation bias.
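Right truncation can be made concrete with a small simulation. The distribution parameters and study cutoff below are assumed for illustration (a lognormal incubation period with a true mean of roughly 5.5 days, infections spread over the first 20 days of an outbreak): cases whose symptoms would appear after the cutoff never enter the sample, so the observed mean is biased downward.

```python
import random

random.seed(0)

# Hypothetical lognormal incubation periods (true mean ~5.5 days).
true_incubations = [random.lognormvariate(1.6, 0.45) for _ in range(100_000)]

CUTOFF = 20.0  # the study only includes cases symptomatic by day 20
observed = []
for incubation in true_incubations:
    infection_day = random.uniform(0, CUTOFF)
    if infection_day + incubation <= CUTOFF:  # right truncation at the cutoff
        observed.append(incubation)

mean_true = sum(true_incubations) / len(true_incubations)
mean_observed = sum(observed) / len(observed)
print(f"true mean incubation:           {mean_true:.2f} days")
print(f"observed (right-truncated) mean: {mean_observed:.2f} days")  # biased low
```

Long-incubation cases are systematically missing from the truncated sample, so a naive average of observed incubation periods understates the true mean, exactly the pattern described for studies restricted to early-detected cases.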

 Challenges and Solutions

As highlighted in the previous sections, since the onset of the SARS-CoV-2 outbreak, several studies have been conducted on various aspects of this disease to share knowledge and medical experience at the beginning of the pandemic. Many such studies were considered the starting point for further research and the basis for healthcare policies. However, the data sets used for these studies were liable to selection bias in their samples or misclassification bias in their case ascertainment, both of which led to misleading estimates. This issue has grave consequences for public health policies, future research, and, subsequently, the translation between medical research and clinical practice. Many healthcare professionals depend on what is currently published, and unreliable article quality may lead to inadequate decision-making.

In order to overcome the issue of bias in emerging infectious diseases, whether in future COVID-19 research or research on any further pandemic, representative population surveys or sampling strategies are required to provide reliable evidence. As correcting for bias in the analysis is challenging, the time and effort spent designing a good study are rewarded with results that are more likely to approximate the truth. Where bias cannot be avoided, the researcher should be aware of the direction in which it could affect the results, as this will affect the quality of the published articles. In addition, using standard methods throughout the study to collect all exposure and outcome information helps to avoid problems with bias. Collecting extra information is also advisable, to help establish the likely extent and direction of any bias.

Newly emerging infectious diseases require emergency measures and policies that save lives. Such policies usually depend on surveillance and on medical communication through research; it is therefore important to promote thorough evaluation of medical publications and careful interpretation of data despite the urgency of a pandemic. These recommendations should be considered in any future infectious disease pandemic.

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.


1Delgado-Rodríguez M, Llorca J. Bias. J Epidemiol Community Health 2004;58:635-41.
2Krämer A, Akmatov M, Kretzschmar M. Principles of infectious disease epidemiology. Mod Infect Dis Epidemiol. New York: Springer; 2009. p. 85-99.
3Griffith GJ, Morris TT, Tudball MJ, Herbert A, Mancano G, Pike L, et al. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat Commun 2020;11:5749.
4Quinn TJ, Burton JK, Carter B, Cooper N, Dwan K, Field R, et al. Following the science? Comparison of methodological and reporting quality of covid-19 and other research from the first wave of the pandemic. BMC Med 2021;19:46.
5Takahashi N, Abe R, Hattori N, Matsumura Y, Oshima T, Taniguchi T, et al. Clinical course of a critically ill patient with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). J Artif Organs 2020;23:397-400.
6Sun Y, Koh V, Marimuthu K, Ng OT, Young B, Vasoo S, et al. Epidemiological and clinical predictors of COVID-19. Clin Infect Dis 2020;71:786-92.
7London AJ, Kimmelman J. Against pandemic research exceptionalism. Science 2020;368:476-7.
8Horbach S. Pandemic publishing: Medical journals drastically speed up their publication process for covid-19. Quant Sci Stud 2020;1:1-16.
9Casigliani V, De Nard F, De Vita E, Arzilli G, Grosso FM, Quattrone F, et al. Too much information, too little evidence: Is waste in research fuelling the covid-19 infodemic? BMJ 2020;370:m2672.
10Glasziou PP, Sanders S, Hoffmann T. Waste in covid-19 research. BMJ 2020;369:m1847.
11Karmakar S, Dhar R, Jee B. Covid-19: Research methods must be flexible in a crisis. BMJ 2020;370:m2668.
12Nowakowska J, Sobocińska J, Lewicki M, Lemańska Ż, Rzymski P. When science goes viral: The research response during three months of the COVID-19 outbreak. Biomed Pharmacother 2020;129:110451.
13Gianola S, Jesus TS, Bargeri S, Castellini G. Characteristics of academic publications, preprints, and registered clinical trials on the COVID-19 pandemic. PLoS One 2020;15:e0240123.
14Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ 2020;369:m1328.
15Alexander PE, Debono VB, Mammen MJ, Iorio A, Aryal K, Deng D, et al. COVID-19 coronavirus research has overall low methodological quality thus far: Case in point for chloroquine/hydroxychloroquine. J Clin Epidemiol 2020;123:120-6.
16Raynaud M, Zhang H, Louis K, Goutaudier V, Wang J, Dubourg Q, et al. COVID-19-related medical research: A meta-research and critical appraisal. BMC Med Res Methodol 2021;21:1.
17Sackett DL. Bias in analytic research. J Chronic Dis 1979;32:51-63.
18Steineck G, Ahlbom A. A definition of bias founded on the concept of the study base. Epidemiology 1992;3:477-82.
19Murray J, Cohen AL. Infectious disease surveillance. Int Encycl Public Health 2017;4:222-229.
20Burrell CJ, Howard CR, Murphy FA. Fenner and White's Medical Virology. 5th ed. UK: Elsevier; 2016. p. 1-583.
21Kahn R, Kennedy-Shaffer L, Grad YH, Robins JM, Lipsitch M. Potential biases arising from epidemic dynamics in observational seroprotection studies. Am J Epidemiol 2021;190:328-35.
22Alleva G, Arbia G, Falorsi PD, Nardelli V, Zuliani A. A sample approach to the estimation of the critical parameters of the SARS-CoV-2 epidemics: An operational design. arXiv 2020. Available from: [Last accessed 2021 Jul 06].
23Accorsi EK, Qiu X, Rumpler E, Kennedy-Shaffer L, Kahn R, Joshi K, et al. How to detect and reduce potential sources of biases in studies of SARS-CoV-2 and COVID-19. Eur J Epidemiol 2021;36:179-96.
24Ricoca Peixoto V, Nunes C, Abrantes A. Epidemic surveillance of covid-19: Considering uncertainty and under-ascertainment. Port J Public Health 2020;38:23-9.
25Suhail Y, Afzal J, Kshitiz. Incorporating and addressing testing bias within estimates of epidemic dynamics for SARS-CoV-2. BMC Med Res Methodol 2021;21:11.
26Deeks JJ, Dinnes J, Takwoingi Y, Davenport C, Spijker R, Taylor-Phillips S, et al. Antibody tests for identification of current and past infection with SARS-CoV-2. Cochrane Database Syst Rev 2020;6:CD013652.
27Takahashi S, Greenhouse B, Rodríguez-Barraquer I. Are seroprevalence estimates for severe acute respiratory syndrome coronavirus 2 biased? J Infect Dis 2020;222:1772-5.
28Wu JT, Leung K, Leung GM. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study. Lancet 2020;395:689-97.
29Zhao Q, Chen Y, Small DS. Analysis of the epidemic growth of the early 2019-nCoV outbreak using internationally confirmed cases. medRxiv 2020. doi: 10.1101/2020.02.06.20020941.
30Grépin KA, Ho TL, Liu Z, Marion S, Piper J, Worsnop CZ, et al. Evidence of the effectiveness of travel-related measures during the early phase of the COVID-19 pandemic: A rapid systematic review. BMJ Glob Health 2021;6:e004537.
31Hernández JM. SARS-CoV-2 risk misclassification explains poor COVID-19 management. Lancet 2020;396:1733-4.
32Guarascio F. Coronavirus is Not High Threat to Workers, EU Says, Causing Outcry; 2020. Available from: https://www. coronavirus-is-not-high-threat-to-workers-eu-says-idUSKBN23A1H9. [Last accessed on 2021 Jun 25].
33European Commission. Commission Directive (EU) amending Annex III to Directive 2000/54/EC. Official J European Union 2020;262:21.
34Ai T, Yang Z, Hou H, Zhan C, Chen C, Lv W, et al. Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases. Radiology 2020;296:E32-40.
35Fang Y, Zhang H, Xie J, Lin M, Ying L, Pang P, et al. Sensitivity of chest CT for COVID-19: Comparison to RT-PCR. Radiology 2020;296:E115-7.
36Burstyn I, Goldstein ND, Gustafson P. Towards reduction in bias in epidemic curves due to outcome misclassification through Bayesian analysis of time-series of laboratory test results: Case study of COVID-19 in Alberta, Canada and Philadelphia, USA. BMC Med Res Methodol 2020;20:146.