• Open access
  • Published: 27 October 2021

A narrative review on the validity of electronic health record-based research in epidemiology

  • Milena A. Gianfrancesco 1 &
  • Neal D. Goldstein   ORCID: orcid.org/0000-0002-9597-5251 2  

BMC Medical Research Methodology volume 21, Article number: 234 (2021)

Electronic health records (EHRs) are widely used in epidemiological research, but the validity of the results is dependent upon the assumptions made about the healthcare system, the patient, and the provider. In this review, we identify four overarching challenges in using EHR-based data for epidemiological analysis, with a particular emphasis on threats to validity. These challenges include representativeness of the EHR to a target population, the availability and interpretability of clinical and non-clinical data, and missing data at both the variable and observation levels. Each challenge reveals layers of assumptions that the epidemiologist is required to make, from the point of patient entry into the healthcare system, to the provider documenting the results of the clinical exam and follow-up of the patient longitudinally; all with the potential to bias the results of analysis of these data. Understanding the extent of as well as remediating potential biases requires a variety of methodological approaches, from traditional sensitivity analyses and validation studies, to newer techniques such as natural language processing. Beyond methods to address these challenges, it will remain crucial for epidemiologists to engage with clinicians and informaticians at their institutions to ensure data quality and accessibility by forming multidisciplinary teams around specific research projects.

The proliferation of electronic health records (EHRs) spurred on by federal government incentives over the past few decades has resulted in an adoption rate greater than 80% at hospitals [ 1 ] and close to 90% in office-based practices [ 2 ] in the United States. A natural consequence of the availability of electronic health data is the conduct of research with these data, both observational and experimental [ 3 ], due to lower overhead costs and lower burden of study recruitment [ 4 ]. Indeed, a search on PubMed for publications indexed by the MeSH term “electronic health records” reveals an exponential growth in biomedical literature, especially over the last 10 years, with more than 50,000 publications.

An emerging literature is beginning to recognize the many challenges that still lie ahead in using EHR data for epidemiological investigations. Researchers in Europe identified 13 potential sources of “bias” (bias was defined as a contamination of the data) in EHR-based data covering almost every aspect of care delivery, from selective entrance into the healthcare system, to variation in care and documentation practices, to identification and extraction of the right data for analysis [ 5 ]. Many of the identified contaminants are directly relevant to traditional epidemiological threats to validity [ 4 ]. Data quality has consistently been invoked as a central challenge in EHRs. From a qualitative perspective, healthcare workers have described challenges in the healthcare environment (e.g., heavy workload), imperfect clinical documentation practices, and concerns over data extraction and reporting tools, all of which would impact the quality of data in the EHR [ 6 ]. From a quantitative perspective, researchers have noted limited sensitivity of diagnostic codes in the EHR when relying on discrete codings, noting that, upon manual chart review, free text fields often capture the missed information, motivating such techniques as natural language processing (NLP) [ 7 ]. A systematic review of EHR-based studies also identified data quality as an overarching barrier to the use of EHRs in managing the health of the community, i.e. “population health” [ 8 ]. Encouragingly, this same review also identified more facilitators than barriers to the use of EHRs in public health, suggesting that opportunities outweigh the challenges. Shortreed et al. further explored these opportunities, discussing how EHRs can enhance pragmatic trials, bring additional sophistication to observational studies, aid in predictive modeling, and be linked together to create more comprehensive views of patients’ health [ 9 ]. Yet, as Shortreed and others have noted, significant challenges still remain.

It is our intention with this narrative review to discuss some of these challenges in further detail. In particular, we focus on specific epidemiological threats to validity -- internal and external -- and how EHR-based epidemiological research can exacerbate some of these threats. We note that while there is some overlap in the challenges we discuss with traditional paper-based medical record research that has occurred for decades, the scale and scope of an EHR-based study are often well beyond what was traditionally possible in the manual chart review era, and our applied examples attempt to reflect this. We also describe existing and emerging approaches for remediating these potential biases as they arise. A summary of these challenges may be found in Table 1. Our review is grounded in the healthcare system in the United States, although we expect many of the issues we describe to be applicable regardless of locale; where necessary, we have flagged our comments as specific to the U.S.

Challenge #1: Representativeness

The selection process for how patients are captured in the EHR is complex and a function of geographic, social, demographic, and economic determinants [ 10 ]. This can be termed the catchment of the EHR. For a patient record to appear in the EHR the patient must have been registered in the system, typically to capture their demographic and billing information, and upon a clinical visit, their health details. While this process is not new to clinical epidemiology, what tends to separate EHR-based records from traditional paper-based records is the scale and scope of the data. Patient data may be available for longer periods of time longitudinally, as well as have data corresponding to interactions with multiple, potentially disparate, healthcare systems [ 11 ]. Given the consolidation of healthcare [ 12 ] and aggregated views of multiple EHRs through health information networks or exchanges [ 11 ] the ability to have a complete view of the patients’ total health is increasing. Importantly, the epidemiologist must ascertain whether the population captured within the EHR or EHR-derived data is representative of the population targeted for inference. This is particularly true under the paradigm of population health and inferring the health status of a community from EHR-based records [ 13 ]. For example, a study of Clostridium difficile infection at an urban safety net hospital in Philadelphia, Pennsylvania demonstrated notable differences in risk factors in the hospital’s EHR compared to national surveillance data, suggesting how catchment can influence epidemiologic measures [ 14 ]. Even health-related data captured through health information exchanges may be incomplete [ 15 ].

Several hypothetical study settings can further help the epidemiologist appreciate the relationship between representativeness and validity in EHR research. In the first hypothetical, an EHR-based study is conducted from a single-location federally qualified health center, and in the second hypothetical, an EHR-based study is conducted from a large academic health system. Suppose both studies occur in the same geographic area. It is reasonable to believe the patient populations captured in both EHRs will be quite different and the catchment process could lead to divergent estimates of disease or risk factor prevalence. The large academic health system may be less likely to capture primary care visits, as specialty care may drive the preponderance of patient encounters. However, this is not a bias per se; rather, if the target of inference from these two hypothetical EHR-based studies is the local community, then selection bias becomes a distinct possibility. The epidemiologist must also consider the potential for generalizability and transportability -- two facets of external validity that respectively relate to the extrapolation of study findings to the source population or a different population altogether -- if there are unmeasured effect modifiers, treatment interference, or compound treatments in the community targeted for inference [ 16 ].

There are several approaches for ascertaining representativeness of EHR-based data. Comparing the EHR-derived sample to Census estimates of demography is straightforward but has several important limitations. First, as previously described, the catchment process may be driven by discordant geographical areas, especially for specialty care settings. Second and third, the EHR may have limited or inaccurate information on socioeconomic status, race, and ethnicity that one may wish to compare [ 17 , 18 ], and conversely the Census has limited estimates of health, chiefly disability, fertility, and insurance and payments [ 19 ]. If selection bias is suspected as a result of missing visits in a longitudinal study [ 20 ] or the catchment process in a cross-sectional study [ 21 ], using inverse probability weighting may remediate its influence. Comparing the weighted estimates to the original, non-weighted estimates provides insight into differences in the study participants. In the population health paradigm whereby the EHR is used as a surveillance tool to identify community health disparities [ 13 ], one also needs to be concerned about representativeness. There are emerging approaches for producing such small area community estimates from large observational datasets [ 22 , 23 ]. Conceivably, these approaches may also be useful for identifying issues of representativeness, for example by comparing stratified estimates across sociodemographic or other factors that may relate to catchment. Approaches for issues concerning representativeness specifically as it applies to external validity may be found in these references [ 24 , 25 ].
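The weighting idea above can be illustrated with a minimal sketch. It assumes, hypothetically, that capture in the EHR depends only on one measured stratum (here an age group) and that the size of each stratum in the target population is known from an external source such as the Census. The function name, strata, and counts are illustrative and are not taken from the cited studies.

```python
from collections import Counter


def ipw_prevalence(records, population_counts):
    """Compare unweighted and inverse-probability-weighted prevalence.

    records: list of (stratum, has_condition) tuples from the EHR sample.
    population_counts: dict mapping stratum -> size in the target population
        (assumed known, e.g. from Census estimates).
    """
    sample_counts = Counter(stratum for stratum, _ in records)
    # Estimated probability of being captured in the EHR, per stratum.
    p_capture = {s: sample_counts[s] / population_counts[s] for s in sample_counts}
    # Each patient is weighted by the inverse of their capture probability.
    weights = [1.0 / p_capture[stratum] for stratum, _ in records]
    weighted_cases = sum(w for w, (_, has) in zip(weights, records) if has)
    weighted = weighted_cases / sum(weights)
    unweighted = sum(1 for _, has in records if has) / len(records)
    return unweighted, weighted


# Hypothetical EHR sample that over-represents older patients.
records = (
    [("under_65", True)] * 2 + [("under_65", False)] * 18
    + [("65_plus", True)] * 40 + [("65_plus", False)] * 40
)
population_counts = {"under_65": 600, "65_plus": 400}

unweighted, weighted = ipw_prevalence(records, population_counts)
print(f"unweighted prevalence: {unweighted:.2f}")
print(f"weighted prevalence:   {weighted:.2f}")
```

A large gap between the two estimates suggests that catchment is shaping the crude result. The sketch cannot correct for strata that never appear in the EHR, nor for selection that depends on unmeasured factors.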

Challenge #2: Data availability and interpretation

Sub-challenge #2.1: Billing versus clinical versus epidemiological needs

There is an inherent tension in the use of EHR-based data for research purposes: the EHR was never originally designed for research. In the U.S., the Health Information Technology for Economic and Clinical Health Act, which promoted EHRs as a platform for comparative effectiveness research, was an attempt to address this deficiency [ 26 ]. A brief history of the evolution of the modern EHR reveals a technology that was optimized for capturing health details relevant for billing, scheduling, and clinical record keeping [ 27 ]. As such, the availability of data for fundamental markers of upstream health that are important for identifying inequities, such as socioeconomic status, race, ethnicity, and other social determinants of health (SDOH), may be insufficiently captured in the EHR [ 17 , 18 ]. Similarly, behavioral risk factors, such as being a sexual minority person, have historically been insufficiently recorded as discrete variables. It is only recently that such data are beginning to be captured in the EHR [ 28 , 29 ], or techniques such as NLP have made it possible to extract these details when stored in free text notes (described further in “ Unstructured data: clinical notes and reports ” section).

As an example, assessing clinical morbidities in the EHR may be done on the basis of extracting appropriate International Classification of Diseases (ICD) codes, used for billing and reimbursement in the U.S. These codes are known to have low sensitivity despite high specificity for accurate diagnostic status [ 30 , 31 ]. Expressed as predictive values, which depend upon prevalence, presence of a diagnostic code is a likely indicator of a disease state, whereas absence of a diagnostic code is a less reliable indicator of the absence of that morbidity. There may further be variation by clinical domain in that ICD codes may exist but not be used in some specialties [ 32 ], variation by coding vocabulary such as the use of SNOMED for clinical documentation versus ICD for billing necessitating an ontology mapper [ 33 ], and variation by the use of “rule-out” diagnostic codes resulting in false-positive diagnoses [ 34 , 35 , 36 ]. Related is the notion of upcoding, or the billing of tests, procedures, or diagnoses to receive inflated reimbursement, which, although posited to be problematic in EHRs [ 37 ], was not shown to have occurred in at least one study [ 38 ]. In the U.S., the billing and reimbursement model, such as fee-for-service versus managed care, may result in varying diagnostic code sensitivities and specificities, especially if upcoding is occurring [ 39 ]. In short, there is potential for misclassification of key health data in the EHR.
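The dependence of predictive values on prevalence can be made concrete with a short sketch that applies Bayes' theorem. The sensitivity, specificity, and prevalence values are hypothetical and chosen only to mimic a code with low sensitivity and high specificity; they are not estimates from the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a diagnostic code via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical code performance: low sensitivity, high specificity.
for prevalence in (0.05, 0.10, 0.50):
    ppv, npv = predictive_values(0.50, 0.99, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed inputs, the negative predictive value falls as the condition becomes more common in the patient population, which is one reason the absence of a code is a weaker signal in high-prevalence settings such as specialty clinics.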

Misclassification can potentially be addressed through a validation study (resources permitting) or application of quantitative bias analysis, and there is a rich literature regarding the treatment of misclassified data in statistics and epidemiology. Readers are referred to these texts as a starting point [ 40 , 41 ]. Duda et al. and Shepherd et al. have described an innovative data audit approach applicable to secondary analysis of observational data, such as EHR-derived data, that incorporates the audit error rate directly in the regression analysis to reduce information bias [ 42 , 43 ]. Outside of methodological tricks in the face of imperfect data, researchers must proactively engage with clinical and informatics colleagues to ensure that the right data for the research interests are available and accessible.
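As a minimal sketch of a simple quantitative bias analysis, the function below back-calculates the expected true number of cases from an observed count, given assumed values of sensitivity and specificity (for example, from a validation study). It assumes nondifferential misclassification and fixed bias parameters; the function name and numbers are illustrative, and this is not the audit-based regression approach of Duda et al. and Shepherd et al.

```python
def correct_case_count(observed_positive, total, sensitivity, specificity):
    """Estimate the true number of cases from a misclassified count.

    Solves: observed = Se * true + (1 - Sp) * (total - true) for `true`.
    """
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("classifier must perform better than chance")
    corrected = (observed_positive - (1 - specificity) * total) / denominator
    # Bound the estimate to the range of possible counts.
    return min(max(corrected, 0.0), float(total))


# Hypothetical cohort: 136 of 1,000 patients carry the diagnostic code.
corrected = correct_case_count(136, 1000, sensitivity=0.60, specificity=0.98)
print(f"observed prevalence:  {136 / 1000:.3f}")
print(f"corrected prevalence: {corrected / 1000:.3f}")
```

Repeating the calculation over a plausible range of sensitivity and specificity values shows how strongly the conclusions depend on the assumed error rates.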

Sub-challenge #2.2: Consistency in data and interpretation

For the epidemiologist, abstracting data from the EHR into a research-ready analytic dataset presents a host of complications surrounding data availability, consistency and interpretation. It is easy to conflate the total volume of data in the EHR with data that are usable for research; however, expectations should be tempered. Weiskopf et al. have noted such challenges for the researcher: in their study, less than 50% of patient records had “complete” data for research purposes per their four definitions of completeness [ 44 ]. Decisions made about the treatment of incomplete data can induce selection bias or impact precision of estimates (see Challenges #1, #3, and #4). The COVID-19 pandemic has further demonstrated the challenge of obtaining research data from EHRs across multiple health systems [ 45 ]. On the other hand, EHRs have a key advantage of providing near real-time data as opposed to many epidemiological studies that have a specific endpoint or are retrospective in nature. Such real-time data availability was leveraged during COVID-19 to help healthcare systems manage their pandemic response [ 46 , 47 ]. Logistical and technical issues aside, healthcare and documentation practices are nuanced to their local environments. In fact, researchers have demonstrated how the same research question analyzed in distinct clinical databases can yield different results [ 48 ].

Once the data are obtained, choices regarding operationalization of variables have the potential to induce information bias. Several hypothetical examples can help demonstrate this point. As a first example, differences in laboratory reporting may result in measurement error or misclassification. While the order for a particular laboratory assay is likely consistent within the healthcare system, patients frequently have a choice where to have that order fulfilled. Given the breadth of assays and the reporting differences from lab to lab [ 49 ], it is possible that the researcher working with the raw data may not consider all possible permutations. In other words, there may be lack of consistency in the reporting of the assay results. As a second example, raw clinical data require interpretation to become actionable. A researcher interested in capturing a patient’s Charlson comorbidity index, which is based on 16 potential diagnoses plus the patient’s age [ 50 ], may never find such a variable in the EHR. Rather, this would require operationalization based on the raw data, each of which may be misclassified. Use of such composite measures introduces the notion of “differential item functioning”, whereby a summary indicator of a complexly measured health phenomenon may differ from group to group [ 51 ]. In this case, as opposed to a measurement error bias, this is one of residual confounding in that a key (unmeasured) variable is driving the differences. Remediation of these threats to validity may involve validation studies to determine the accuracy of a particular classifier, sensitivity analysis employing alternative interpretations when the raw data are available, and omitting or imputing biased or latent variables [ 40 , 41 , 52 ]. Importantly, in all cases, the epidemiologist should work with the various health care providers and personnel who have measured and recorded the data present in the EHR, as they likely understand it best.

Furthermore and related to “Billing versus Clinical versus Epidemiological Needs” section, the healthcare system in the U.S. is fragmented with multiple payers, both public and private, potentially exacerbating the data quality issues we describe, especially when linking data across healthcare systems. Single payer systems have enabled large and near-complete population-based studies due to data availability and consistency [ 53 , 54 , 55 ]. Data may also be inconsistent for retrospective longitudinal studies spanning many years if there have been changes to coding standards or practices over time, for example due to the transition from ICD-9 to ICD-10 largely occurring in the mid 2010s or the adoption of the Patient Protection and Affordable Care Act in the U.S. in 2010 with its accompanying changes in billing. Exploratory data analysis may reveal unexpected differences in key variables, by place or time, and recoding, when possible, can enforce consistency.

Sub-challenge #2.3: Unstructured data: clinical notes and reports

There may also be scenarios where structured data fields, while available, are not traditionally or consistently used within a given medical center or by a given provider. For example, adverse events of medications, disease symptoms, and vaccinations or hospitalizations occurring at different facilities or health networks may not always be entered by providers in structured EHR fields. Instead, these types of patient experiences may be more likely to be documented in an unstructured clinical note, report (e.g. pathology or radiology report), or scanned document. Therefore, reliance on structured data to identify and study such issues may result in underestimation and potentially biased results.

Advances in NLP currently allow for information to be extracted from unstructured clinical notes and text fields in a reliable and accurate manner using computational methods. NLP utilizes a range of different statistical, machine learning, and linguistic techniques, and when applied to EHR data, has the potential to facilitate more accurate detection of events not traditionally located or consistently used in structured fields. Various NLP methods can be implemented in medical text analysis, ranging from simplistic and fast term recognition systems to more advanced, commercial NLP systems [ 56 ]. Several studies have successfully utilized text mining to extract information on a variety of health-related issues within clinical notes, such as opioid use [ 57 ], adverse events [ 58 , 59 ], symptoms (e.g., shortness of breath, depression, pain) [ 60 ], and disease phenotype information documented in pathology or radiology reports, including cancer stage, histology, and tumor grade [ 61 ], and lupus nephritis [ 32 ]. It is worth noting that scanned documents involve an additional layer of computation, relying on techniques such as optical character recognition, before NLP can be applied.
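At the simple end of that range, a term recognition system can be sketched with regular expressions alone. The example below is a deliberately naive illustration, assuming English notes, a hand-picked term list, and a short list of negation cues that must precede the term within the same sentence; it is not one of the systems evaluated in the cited studies, and production tools handle abbreviations, misspellings, and context far more carefully.

```python
import re

# Hypothetical negation cues; real systems use much richer lexicons.
NEGATION_CUES = re.compile(r"\b(no|not|denies|denied|without|negative for)\b",
                           re.IGNORECASE)


def find_mentions(note, terms):
    """Return (term, negated) for each term found in each sentence of a note."""
    results = []
    # Naive sentence splitting on periods, semicolons, and line breaks.
    for sentence in re.split(r"[.;\n]+", note):
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            match = re.search(pattern, sentence, re.IGNORECASE)
            if match is None:
                continue
            # A mention is negated if a cue appears earlier in the sentence.
            preceding_text = sentence[:match.start()]
            negated = NEGATION_CUES.search(preceding_text) is not None
            results.append((term, negated))
    return results


note = "Patient reports shortness of breath. Denies chest pain; no fever."
terms = ["shortness of breath", "chest pain", "fever"]
for term, negated in find_mentions(note, terms):
    print(f"{term}: {'negated' if negated else 'affirmed'}")
```

Any such rule set should be validated against manual chart review before its output is used as an analytic variable.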

Hybrid approaches that combine both narrative and structured data, such as ICD codes, to improve accuracy of detecting phenotypes have also demonstrated high performance. Banerji et al. found that using ICD-9 codes to identify allergic drug reactions in the EHR had a positive predictive value of 46%, while an NLP algorithm in conjunction with ICD-9 codes resulted in a positive predictive value of 86%; negative predictive value also increased in the combined algorithm (76%) compared to ICD-9 codes alone (39%) [ 62 ]. In another example, researchers found that the combination of unstructured clinical notes with structured data for prediction tasks involving in-hospital mortality and 30-day hospital readmission outperformed models using either clinical notes or structured data alone [ 63 ]. As we move forward in analyzing EHR data, it will be important to take advantage of the wealth of information buried in unstructured data to assist in phenotyping patient characteristics and outcomes, capture missing confounders used in multivariate analyses, and develop prediction models.
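The evaluation of a hybrid algorithm against chart review can be sketched as follows. The combination rule used here (flag a patient if either the ICD code or the NLP output is positive) is an assumption made for illustration, since the rule used in the cited study is not described above; the eight patient records are fabricated toy data and do not reproduce the reported figures.

```python
def ppv_npv(predicted, gold):
    """Positive and negative predictive value against a gold standard."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Toy data: (ICD code present, NLP mention found, chart review confirms).
patients = [
    (True, True, True), (True, False, False), (False, True, True),
    (False, True, True), (False, False, False), (False, False, False),
    (True, False, True), (False, False, True),
]
gold = [chart for _, _, chart in patients]
icd_only = [icd for icd, _, _ in patients]
hybrid = [icd or nlp for icd, nlp, _ in patients]  # assumed OR rule

for label, predicted in (("ICD only", icd_only), ("ICD or NLP", hybrid)):
    ppv, npv = ppv_npv(predicted, gold)
    print(f"{label}: PPV={ppv:.2f}  NPV={npv:.2f}")
```

An AND rule would instead trade sensitivity for specificity, so the choice of rule should follow from whether false positives or false negatives are more damaging to the analysis at hand.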

Challenge #3: Missing measurements

While clinical notes may be useful to recover incomplete information from structured data fields, it may be the case that certain variables are not collected within the EHR at all. As mentioned above, it is important to remember that EHRs were not developed as a research tool (see “ Billing versus clinical versus epidemiological needs ” section), and important variables often used in epidemiologic research may not be typically included in EHRs including socioeconomic status (education, income, occupation) and SDOH [ 17 , 18 ]. Depending upon the interest of the provider or clinical importance placed upon a given variable, this information may be included in clinical notes. While NLP could be used to capture these variables, because they may not be consistently captured, there may be bias in identifying those with a positive mention as a positive case and those with no mention as a negative case. For example, if a given provider inquires about homelessness of a patient based on knowledge of the patient’s situation or other external factors and documents this in the clinical note, we have greater assurance that this is a true positive case. However, lack of mention of homelessness in a clinical note should not be assumed as a true negative case for several reasons: not all providers may feel comfortable asking about and/or documenting homelessness, they may not deem this variable worth noting, or implicit bias among clinicians may affect what is captured. As a result, such cases (i.e. no mention of homelessness) may be incorrectly identified as “not homeless,” leading to selection bias should a researcher form a cohort exclusively of patients who are identified as homeless in the EHR.

Not adjusting for certain measurements missing from EHR data can also lead to biased results if the measurement is an important confounder. Consider the example of distinguishing between prevalent and incident cases of disease when examining associations between disease treatments and patient outcomes [ 64 ]. The first date of an ICD code entered for a given patient may not necessarily be the true date of diagnosis, but rather documentation of an existing diagnosis. This limits the ability to adjust for disease duration, which may be an important confounder in studies comparing various treatments with patient outcomes over time, and may also lead to reverse causality if disease sequelae are assumed to be risk factors.

Methods to supplement EHR data with external data have been used to capture missing information. These methods may include imputation if information (e.g. race, lab values) is collected on a subset of patients within the EHR. It is important to examine whether missingness occurs completely at random or at random (“ignorable”), or not at random (“non-ignorable”), using the data available to determine factors associated with missingness, which will also inform the best imputation strategy to pursue, if any [ 65 , 66 ]. As an example, suppose we are interested in ascertaining a patient's BMI from the EHR. If men were less likely to have BMI measured than women, the probability of missing data (BMI) depends on the observed data (gender) and may therefore be predictable and imputable. On the other hand, suppose underweight individuals were less likely to have BMI measured; the probability of missing data depends on its own value, and as such is non-predictable and may require a validation study to confirm. As an alternative to imputing missing data, surrogate measures may be used, such as inferring area-based SES indicators, including median household income, percent poverty, or area deprivation index, by zip code [ 67 , 68 ]. Lastly, validation studies utilizing external datasets may prove helpful, such as supplementing EHR data with claims data that may be available for a subset of patients (see Challenge #4).
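A first step in examining the missingness mechanism is to tabulate how often a variable is missing within levels of an observed variable, as in the BMI and gender example. The sketch below assumes records are dictionaries with hypothetical keys `gender` and `bmi`, where a missing BMI is stored as `None`; the data are invented for illustration.

```python
def missingness_by_group(records, group_key, value_key):
    """Proportion of records with a missing value, within each group."""
    totals, missing = {}, {}
    for record in records:
        group = record[group_key]
        totals[group] = totals.get(group, 0) + 1
        if record.get(value_key) is None:
            missing[group] = missing.get(group, 0) + 1
    return {group: missing.get(group, 0) / totals[group] for group in totals}


# Hypothetical extract: BMI is recorded less often for men.
records = [
    {"gender": "M", "bmi": None}, {"gender": "M", "bmi": 27.1},
    {"gender": "M", "bmi": None}, {"gender": "M", "bmi": None},
    {"gender": "F", "bmi": 22.4}, {"gender": "F", "bmi": 31.0},
    {"gender": "F", "bmi": None}, {"gender": "F", "bmi": 24.8},
]
for group, proportion in missingness_by_group(records, "gender", "bmi").items():
    print(f"{group}: {proportion:.0%} missing BMI")
```

A difference across groups is consistent with missingness that depends on observed data. No tabulation of this kind can rule out missingness that depends on the unobserved BMI value itself, which is why external validation is needed in that case.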

As EHRs are increasingly being used for research, there are active pushes to include more structured data fields that are important to population health research, such as SDOH [ 69 ]. Inclusion of such factors is likely to result in improved patient care and outcomes, through increased precision in disease diagnosis, more effective shared decision making, identification of risk factors, and tailoring services to a given population’s needs [ 70 ]. In fact, a recent review found that when individual level SDOH were included in predictive modeling, they overwhelmingly improved performance in medication adherence, risk of hospitalization, 30-day rehospitalizations, suicide attempts, and other healthcare services [ 71 ]. Whether or not these fields will be utilized after their inclusion in the EHR may ultimately depend upon federal and state incentives, as well as support from local stakeholders, and this does not address historic, retrospective analyses of these data.

Challenge #4: Missing visits

Beyond missing variable data that may not be captured during a clinical encounter, either through structured data or clinical notes, there also may be missing information for a patient as a whole. This can occur in a variety of ways; for example, a patient may have one or two documented visits in the EHR and then never be seen again (i.e. right censoring due to loss to follow-up), or a patient may be referred from elsewhere to seek specialty care, with no information captured regarding other external issues (i.e. left censoring). This may be especially common in circumstances where a given EHR is more likely to capture specialty clinics versus primary care (see Challenge #1). A third scenario may include patients who appear, then are not observed for a long period of time, and then reappear: this case is particularly problematic as it may appear the patient was never lost to follow-up but simply had fewer visits. In any of these scenarios, a researcher will lack a holistic view of the patient’s experiences, diagnoses, results, and more. As discussed above, assuming absence of a diagnostic code as absence of disease may lead to information and/or selection bias. Further, it has been demonstrated that one key source of bias in EHRs is “informed presence” bias, where those with more medical encounters are more likely to be diagnosed with various conditions (similar to Berkson’s bias) [ 72 ].
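The third scenario, in which a patient disappears and later reappears, can be screened for by examining the intervals between consecutive encounters. The sketch below assumes that encounter dates are available per patient and uses an arbitrary threshold of 365 days; both the threshold and the function names are illustrative choices, not recommendations from the literature cited here.

```python
from datetime import date


def longest_gap_days(visit_dates):
    """Longest interval, in days, between consecutive visits (0 if < 2 visits)."""
    ordered = sorted(visit_dates)
    gaps = ((later - earlier).days for earlier, later in zip(ordered, ordered[1:]))
    return max(gaps, default=0)


def has_long_gap(visit_dates, threshold_days=365):
    """Flag patients whose record contains a gap longer than the threshold."""
    return longest_gap_days(visit_dates) > threshold_days


# Hypothetical patient who reappears two years after the second visit.
visits = [date(2018, 1, 10), date(2018, 3, 10), date(2020, 3, 10)]
print("longest gap (days):", longest_gap_days(visits))
print("flag for review:", has_long_gap(visits))
```

Flagged records can then be examined for whether care was received elsewhere during the gap, rather than being treated as continuous follow-up with few visits.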

Several solutions to these issues have been proposed. For example, it is common for EHR studies to condition on observation time (i.e. ≥n visits required for eligibility into the cohort); however, this may exclude a substantial number of patients with certain characteristics, incurring a selection bias or limiting the generalizability of study findings (see Challenge #1). Other strategies attempt to account for missing visit biases through longitudinal imputation approaches; for example, if a patient missed a visit, a disease activity score can be imputed for that point in time, given other data points [ 73 , 74 ]. Surrogate measures may also be used to infer patient outcomes, such as controlling for “informative” missingness as an indicator variable or using actual number of missed visits that were scheduled as a proxy for external circumstances influencing care [ 20 ]. To address “informed presence” bias described above, conditioning on the number of health-care encounters may be appropriate [ 72 ]. Understanding the reason for the missing visit may help identify the best course of action and before imputing, one should be able to identify the type of missingness, whether “informative” or not [ 65 , 66 ]. For example, if distance to a healthcare location is related to appointment attendance, being able to account for this in analysis would be important: researchers have shown how the catchment of a healthcare facility can induce selection bias [ 21 ]. Relatedly, as telehealth becomes more common, fueled by the COVID-19 pandemic [ 75 , 76 ], virtual visits may generate missingness of data recorded in the presence of a provider (e.g., blood pressure if the patient does not have access to a sphygmomanometer; see Challenge #3), or necessitate a stratified analysis by visit type to assess for effect modification.
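Before adopting a minimum-visit eligibility rule, it can be informative to compare the patients it retains with those it excludes. The following sketch assumes each patient record carries a visit count and a binary characteristic (here a hypothetical `uninsured` flag); the field names, threshold, and data are illustrative only.

```python
def split_by_min_visits(patients, min_visits):
    """Partition patients into (eligible, excluded) under a >= n visits rule."""
    eligible = [p for p in patients if p["n_visits"] >= min_visits]
    excluded = [p for p in patients if p["n_visits"] < min_visits]
    return eligible, excluded


def proportion_with(patients, key):
    """Share of patients for whom a binary characteristic is true."""
    if not patients:
        return None
    return sum(1 for p in patients if p[key]) / len(patients)


# Hypothetical cohort in which uninsured patients tend to have fewer visits.
patients = [
    {"n_visits": 1, "uninsured": True}, {"n_visits": 2, "uninsured": True},
    {"n_visits": 1, "uninsured": False}, {"n_visits": 5, "uninsured": False},
    {"n_visits": 4, "uninsured": False}, {"n_visits": 3, "uninsured": True},
    {"n_visits": 6, "uninsured": False},
]
eligible, excluded = split_by_min_visits(patients, min_visits=3)
print(f"uninsured among eligible: {proportion_with(eligible, 'uninsured'):.2f}")
print(f"uninsured among excluded: {proportion_with(excluded, 'uninsured'):.2f}")
```

A marked imbalance indicates that the eligibility rule is associated with patient characteristics, which would need to be addressed through weighting, sensitivity analysis, or a narrower statement of the population to which the findings apply.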

Another common approach is to supplement EHR information with external data sources, such as insurance claims data, when available. Unlike a given EHR, claims data are able to capture a patient’s interaction with the health care system across organizations, and additionally include pharmacy data such as if a prescription was filled or refilled. Often researchers examine a subset of patients eligible for Medicaid/Medicare and compare what is documented in claims with information available in the EHR [ 77 ]. That is, are there additional medications, diagnoses, or hospitalizations found in the claims dataset that were not present in the EHR? In a study by Franklin et al., researchers utilized a linked database of Medicare Advantage claims and comprehensive EHR data from a multi-specialty outpatient practice to determine which dataset would be more accurate in predicting medication adherence [ 77 ]. They found that both datasets were comparable in identifying those with poor adherence, though each dataset incorporated different variables.

While validation studies such as those using claims data allow researchers to gain an understanding as to how accurate and complete a given EHR is, this understanding may be limited to the specific subpopulation examined (i.e. those eligible for Medicaid, or those over 65 years for Medicare). One study examined congruence between the EHR of a community health center and Medicaid claims with respect to diabetes [ 78 ]. They found that patients who were older, male, Spanish-speaking, above the federal poverty level, or who had discontinuous insurance were more likely to have services documented in the EHR as compared to Medicaid claims data. Therefore, while claims data may help supplement and validate information in the EHR, on their own they may underestimate care in certain populations.
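For a linked patient, the comparison between sources reduces to asking which items appear in both, only in the EHR, or only in claims. A minimal sketch follows, assuming that items such as medications have already been mapped to a common vocabulary, which in practice is a substantial task of its own; the medication names are hypothetical.

```python
def compare_sources(ehr_items, claims_items):
    """Classify items by whether they appear in the EHR, claims, or both."""
    ehr, claims = set(ehr_items), set(claims_items)
    return {
        "both": ehr & claims,
        "ehr_only": ehr - claims,
        "claims_only": claims - ehr,
    }


# Hypothetical medication lists for one linked patient.
ehr_medications = ["metformin", "lisinopril", "atorvastatin"]
claims_medications = ["metformin", "atorvastatin", "sertraline"]

for source, items in compare_sources(ehr_medications, claims_medications).items():
    print(f"{source}: {sorted(items)}")
```

Summarizing the `ehr_only` and `claims_only` counts across patients, and by patient characteristics, indicates where each source under-captures care.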

Research utilizing EHR data has undoubtedly positively impacted the field of public health through its ability to provide large-scale, longitudinal data on a diverse set of patients, and will continue to do so in the future as more epidemiologists take advantage of this data source. EHR data’s ability to capture individuals that traditionally aren’t included in clinical trials, cohort studies, and even claims datasets allows researchers to measure longitudinal outcomes in patients and perhaps change the understanding of potential risk factors.

However, as outlined in this review, there are important caveats to EHR analysis that need to be taken into account; failure to do so may threaten study validity. The representativeness of EHR data depends on the catchment area of the center and the corresponding target population. Tools are available to evaluate and remedy these issues, which are critical to study validity as well as to extrapolation of study findings. Data availability and interpretation, missing measurements, and missing visits are also key challenges, as EHRs were not specifically developed for research purposes, despite their common use for such. Taking advantage of all available EHR data, whether in structured fields or in unstructured fields accessed through NLP, will be important in understanding the patient experience and identifying key phenotypes. Beyond methods to address these concerns, it will remain crucial for epidemiologists and data analysts to engage with clinicians and informaticians at their institutions to ensure data quality and accessibility by forming multidisciplinary teams around specific research projects. Lastly, integration across multiple EHRs, or the use of datasets that encompass multi-institutional EHR records, adds an additional layer of data quality and validity issues, with the potential to exacerbate the above-stated challenges found within a single EHR. At minimum, such studies should account for correlated errors [ 79 , 80 ], and investigate whether modularization, or submechanisms that determine whether data are observed or missing in each EHR, exist [ 65 ].

The identified challenges may also apply to secondary analysis of other large healthcare databases, such as claims data, although it is important not to conflate the two types of data. EHR data are driven by clinical care, whereas claims data are driven by the reimbursement process, in which there is a financial incentive to capture diagnoses, procedures, and medications [ 48 ]. The source of data likely influences the availability, accuracy, and completeness of data. The fundamental representation of data may also differ, as a record in a claims database corresponds to a “claim” as opposed to an “encounter” in the EHR. As such, the representativeness of the database populations, the sensitivity and specificity of variables, and the mechanisms of missingness in claims data may differ from those in EHR data. One study that evaluated pediatric quality care measures, such as BMI, noted inferior sensitivity based on claims data alone [ 81 ]. Linking claims data to EHR data has been proposed to enhance study validity, but many of the caveats raised herein still apply [ 82 ].

Although we focused on epidemiological challenges related to study validity, there are other important considerations for researchers working with EHR data. Privacy and security of data as well as institutional review board (IRB) or ethics board oversight of EHR-based studies should not be taken for granted. For researchers in the U.S., Goldstein and Sarwate described Health Insurance Portability and Accountability Act (HIPAA)-compliant approaches to ensure the privacy and security of EHR data used in epidemiological research, and presented emerging approaches to analyses that separate the data from analysis [ 83 ]. The IRB oversees the data collection process for EHR-based research and through the HIPAA Privacy Rule these data typically do not require informed consent provided they are retrospective and reside at the EHR’s institution [ 84 ]. Such research will also likely receive an exempt IRB review provided subjects are non-identifiable.


As EHRs are increasingly being used for research, epidemiologists can take advantage of the many tools and methods that already exist and apply them to the key challenges described above. By being aware of the limitations that the data present and proactively addressing them, EHR studies will be more robust, informative, and important to the understanding of health and disease in the population.

Availability of data and materials

All data and materials used in this review are described herein.


BMI: Body Mass Index

EHR: Electronic Health Record

ICD: International Classification of Diseases

IRB: Institutional review board/ethics board

HIPAA: Health Insurance Portability and Accountability Act

NLP: Natural Language Processing

SDOH: Social Determinants of Health

SES: Socioeconomic Status

Adler-Milstein J, Holmgren AJ, Kralovec P, et al. Electronic health record adoption in US hospitals: the emergence of a digital “advanced use” divide. J Am Med Inform Assoc. 2017;24(6):1142–8.

Office of the National Coordinator for Health Information Technology. ‘Office-based physician electronic health record adoption’, Health IT quick-stat #50. dashboard.healthit.gov/quickstats/pages/physician-ehr-adoption-trends.php . Accessed 15 Jan 2019.

Cowie MR, Blomster JI, Curtis LH, et al. Electronic health records to facilitate clinical research. Clin Res Cardiol. 2017;106(1):1–9.

Casey JA, Schwartz BS, Stewart WF, et al. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016;37:61–81.

Verheij RA, Curcin V, Delaney BC, et al. Possible sources of bias in primary care electronic health record data use and reuse. J Med Internet Res. 2018;20(5):e185.

Ni K, Chu H, Zeng L, et al. Barriers and facilitators to data quality of electronic health records used for clinical research in China: a qualitative study. BMJ Open. 2019;9(7):e029314.

Coleman N, Halas G, Peeler W, et al. From patient care to research: a validation study examining the factors contributing to data quality in a primary care electronic medical record database. BMC Fam Pract. 2015;16:11.

Kruse CS, Stein A, Thomas H, et al. The use of electronic health records to support population health: a systematic review of the literature. J Med Syst. 2018;42(11):214.

Shortreed SM, Cook AJ, Coley RY, et al. Challenges and opportunities for using big health care data to advance medical science and public health. Am J Epidemiol. 2019;188(5):851–61.

In: Smedley BD, Stith AY, Nelson AR, editors. Unequal treatment: confronting racial and ethnic disparities in health care. Washington (DC) 2003.

Chaudhry B, Wang J, Wu S, et al. Systematic review: impact of health information technology on quality, efficiency, and costs of medical care. Ann Intern Med. 2006;144(10):742–52.

Cutler DM, Scott Morton F. Hospitals, market share, and consolidation. JAMA. 2013;310(18):1964–70.

Cocoros NM, Kirby C, Zambarano B, et al. RiskScape: a data visualization and aggregation platform for public health surveillance using routine electronic health record data. Am J Public Health. 2021;111(2):269–76.

Vader DT, Weldie C, Welles SL, et al. Hospital-acquired Clostridioides difficile infection among patients at an urban safety-net hospital in Philadelphia: demographics, neighborhood deprivation, and the transferability of national statistics. Infect Control Hosp Epidemiol. 2020;42:1–7.

Dixon BE, Gibson PJ, Frederickson Comer K, et al. Measuring population health using electronic health records: exploring biases and representativeness in a community health information exchange. Stud Health Technol Inform. 2015;216:1009.

Hernán MA, VanderWeele TJ. Compound treatments and transportability of causal inference. Epidemiology. 2011;22(3):368–77.

Casey JA, Pollak J, Glymour MM, et al. Measures of SES for electronic health record-based research. Am J Prev Med. 2018;54(3):430–9.

Polubriaginof FCG, Ryan P, Salmasian H, et al. Challenges with quality of race and ethnicity data in observational databases. J Am Med Inform Assoc. 2019;26(8-9):730–6.

U.S. Census Bureau. Health. Available at: https://www.census.gov/topics/health.html . Accessed 19 Jan 2021.

Gianfrancesco MA, McCulloch CE, Trupin L, et al. Reweighting to address nonparticipation and missing data bias in a longitudinal electronic health record study. Ann Epidemiol. 2020;50:48–51 e2.

Goldstein ND, Kahal D, Testa K, Burstyn I. Inverse probability weighting for selection bias in a Delaware community health center electronic medical record study of community deprivation and hepatitis C prevalence. Ann Epidemiol. 2021;60:1–7.

Gelman A, Lax J, Phillips J, et al. Using multilevel regression and poststratification to estimate dynamic public opinion. Unpublished manuscript, Columbia University. 2016 Sep 11. Available at: http://www.stat.columbia.edu/~gelman/research/unpublished/MRT(1).pdf . Accessed 22 Jan 2021.

Quick H, Terloyeva D, Wu Y, et al. Trends in tract-level prevalence of obesity in Philadelphia by race-ethnicity, space, and time. Epidemiology. 2020;31(1):15–21.

Lesko CR, Buchanan AL, Westreich D, Edwards JK, Hudgens MG, Cole SR. Generalizing study results: a potential outcomes perspective. Epidemiology. 2017;28(4):553–61.

Westreich D, Edwards JK, Lesko CR, Stuart E, Cole SR. Transportability of trial results using inverse odds of sampling weights. Am J Epidemiol. 2017;186(8):1010–4.

Congressional Research Services (CRS). The Health Information Technology for Economic and Clinical Health (HITECH) Act. 2009. Available at: https://crsreports.congress.gov/product/pdf/R/R40161/9 . Accessed Jan 22 2021.

Hersh WR. The electronic medical record: Promises and problems. Journal of the American Society for Information Science. 1995;46(10):772–6.

Collecting sexual orientation and gender identity data in electronic health records: workshop summary. Washington (DC) 2013.

Committee on the Recommended Social and Behavioral Domains and Measures for Electronic Health Records; Board on Population Health and Public Health Practice; Institute of Medicine. Capturing social and behavioral domains and measures in electronic health records: phase 2. Washington (DC): National Academies Press (US); 2015.

Goff SL, Pekow PS, Markenson G, et al. Validity of using ICD-9-CM codes to identify selected categories of obstetric complications, procedures and co-morbidities. Paediatr Perinat Epidemiol. 2012;26(5):421–9.

Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58(4):323–37.

Gianfrancesco MA. Application of text mining methods to identify lupus nephritis from electronic health records. Lupus Science & Medicine. 2019;6:A142.

National Library of Medicine. SNOMED CT to ICD-10-CM Map. Available at: https://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd10cm.html . Accessed 2 Jul 2021.

Klabunde CN, Harlan LC, Warren JL. Data sources for measuring comorbidity: a comparison of hospital records and medicare claims for cancer patients. Med Care. 2006;44(10):921–8.

Burles K, Innes G, Senior K, Lang E, McRae A. Limitations of pulmonary embolism ICD-10 codes in emergency department administrative data: let the buyer beware. BMC Med Res Methodol. 2017;17(1):89.

Asgari MM, Wu JJ, Gelfand JM, Salman C, Curtis JR, Harrold LR, et al. Validity of diagnostic codes and prevalence of psoriasis and psoriatic arthritis in a managed care population, 1996-2009. Pharmacoepidemiol Drug Saf. 2013;22(8):842–9.

Hoffman S, Podgurski A. Big bad data: law, public health, and biomedical databases. J Law Med Ethics. 2013;41(Suppl 1):56–60.

Adler-Milstein J, Jha AK. Electronic health records: the authors reply. Health Aff. 2014;33(10):1877.

Geruso M, Layton T. Upcoding: evidence from Medicare on squishy risk adjustment. J Polit Econ. 2020;128(3):984–1026.

Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data. New York: Springer-Verlag New York; 2009.

Gustafson P. Measurement error and misclassification in statistics and epidemiology: impacts and Bayesian adjustments. Boca Raton: Chapman and Hall/CRC; 2004.

Duda SN, Shepherd BE, Gadd CS, et al. Measuring the quality of observational study data in an international HIV research network. PLoS One. 2012;7(4):e33908.

Shepherd BE, Yu C. Accounting for data errors discovered from an audit in multiple linear regression. Biometrics. 2011;67(3):1083–91.

Weiskopf NG, Hripcsak G, Swaminathan S, et al. Defining and measuring completeness of electronic health records for secondary use. J Biomed Inform. 2013;46(5):830–6.

Kaiser Health News. As coronavirus strikes, crucial data in electronic health records hard to harvest. Available at: https://khn.org/news/as-coronavirus-strikes-crucial-data-in-electronic-health-records-hard-to-harvest/ . Accessed 15 Jan 2021.

Reeves JJ, Hollandsworth HM, Torriani FJ, Taplitz R, Abeles S, Tai-Seale M, et al. Rapid response to COVID-19: health informatics support for outbreak management in an academic health system. J Am Med Inform Assoc. 2020;27(6):853–9.

Grange ES, Neil EJ, Stoffel M, Singh AP, Tseng E, Resco-Summers K, et al. Responding to COVID-19: The UW medicine information technology services experience. Appl Clin Inform. 2020;11(2):265–75.

Madigan D, Ryan PB, Schuemie M, et al. Evaluating the impact of database heterogeneity on observational study results. Am J Epidemiol. 2013;178(4):645–51.

Lippi G, Mattiuzzi C. Critical laboratory values communication: summary recommendations from available guidelines. Ann Transl Med. 2016;4(20):400.

Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83.

Jones RN. Differential item functioning and its relevance to epidemiology. Curr Epidemiol Rep. 2019;6:174–83.

Edwards JK, Cole SR, Troester MA, Richardson DB. Accounting for misclassified outcomes in binary regression models using multiple imputation with internal validation data. Am J Epidemiol. 2013;177(9):904–12.

Satkunasivam R, Klaassen Z, Ravi B, Fok KH, Menser T, Kash B, et al. Relation between surgeon age and postoperative outcomes: a population-based cohort study. CMAJ. 2020;192(15):E385–92.

Melamed N, Asztalos E, Murphy K, Zaltz A, Redelmeier D, Shah BR, et al. Neurodevelopmental disorders among term infants exposed to antenatal corticosteroids during pregnancy: a population-based study. BMJ Open. 2019;9(9):e031197.

Kao LT, Lee HC, Lin HC, Tsai MC, Chung SD. Healthcare service utilization by patients with obstructive sleep apnea: a population-based study. PLoS One. 2015;10(9):e0137459.

Jung K, LePendu P, Iyer S, Bauer-Mehren A, Percha B, Shah NH. Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. J Am Med Inform Assoc. 2015;22(1):121–31.

Canan C, Polinski JM, Alexander GC, et al. Automatable algorithms to identify nonmedical opioid use using electronic data: a systematic review. J Am Med Inform Assoc. 2017;24(6):1204–10.

Iqbal E, Mallah R, Jackson RG, et al. Identification of adverse drug events from free text electronic patient records and information in a large mental health case register. PLoS One. 2015;10(8):e0134208.

Rochefort CM, Verma AD, Eguale T, et al. A novel method of adverse event detection can accurately identify venous thromboembolisms (VTEs) from narrative electronic health record data. J Am Med Inform Assoc. 2015;22(1):155–65.

Koleck TA, Dreisbach C, Bourne PE, et al. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc. 2019;26(4):364–79.

Wang L, Luo L, Wang Y, et al. Natural language processing for populating lung cancer clinical research data. BMC Med Inform Decis Mak. 2019;19(Suppl 5):239.

Banerji A, Lai KH, Li Y, et al. Natural language processing combined with ICD-9-CM codes as a novel method to study the epidemiology of allergic drug reactions. J Allergy Clin Immunol Pract. 2020;8(3):1032–1038.e1.

Zhang D, Yin C, Zeng J, et al. Combining structured and unstructured data for predictive models: a deep learning approach. BMC Med Inform Decis Mak. 2020;20(1):280.

Farmer R, Mathur R, Bhaskaran K, Eastwood SV, Chaturvedi N, Smeeth L. Promises and pitfalls of electronic health record analysis. Diabetologia. 2018;61:1241–8.

Haneuse S, Arterburn D, Daniels MJ. Assessing missing data assumptions in EHR-based studies: a complex and underappreciated task. JAMA Netw Open. 2021;4(2):e210184.

Groenwold RHH. Informative missingness in electronic health record systems: the curse of knowing. Diagn Progn Res. 2020;4:8.

Berkowitz SA, Traore CY, Singer DE, et al. Evaluating area-based socioeconomic status indicators for monitoring disparities within health care systems: results from a primary care network. Health Serv Res. 2015;50(2):398–417.

Kind AJH, Buckingham WR. Making neighborhood-disadvantage metrics accessible - the neighborhood atlas. N Engl J Med. 2018;378(26):2456–8.

Cantor MN, Thorpe L. Integrating data on social determinants of health into electronic health records. Health Aff. 2018;37(4):585–90.

Adler NE, Stead WW. Patients in context--EHR capture of social and behavioral determinants of health. N Engl J Med. 2015;372(8):698–701.

Chen M, Tan X, Padman R. Social determinants of health in electronic health records and their impact on analysis and risk prediction: a systematic review. J Am Med Inform Assoc. 2020;27(11):1764–73.

Goldstein BA, Bhavsar NA, Phelan M, et al. Controlling for informed presence bias due to the number of health encounters in an electronic health record. Am J Epidemiol. 2016;184(11):847–55.

Petersen I, Welch CA, Nazareth I, et al. Health indicator recording in UK primary care electronic health records: key implications for handling missing data. Clin Epidemiol. 2019;11:157–67.

Li R, Chen Y, Moore JH. Integration of genetic and clinical information to improve imputation of data missing from electronic health records. J Am Med Inform Assoc. 2019;26(10):1056–63.

Koonin LM, Hoots B, Tsang CA, Leroy Z, Farris K, Jolly T, et al. Trends in the use of telehealth during the emergence of the COVID-19 pandemic - United States, January-March 2020. MMWR Morb Mortal Wkly Rep. 2020;69(43):1595–9.

Barnett ML, Ray KN, Souza J, Mehrotra A. Trends in telemedicine use in a large commercially insured population, 2005-2017. JAMA. 2018;320(20):2147–9.

Franklin JM, Gopalakrishnan C, Krumme AA, et al. The relative benefits of claims and electronic health record data for predicting medication adherence trajectory. Am Heart J. 2018;197:153–62.

Devoe JE, Gold R, McIntire P, et al. Electronic health records vs Medicaid claims: completeness of diabetes preventive care data in community health centers. Ann Fam Med. 2011;9(4):351–8.

Schmajuk G, Li J, Evans M, Anastasiou C, Izadi Z, Kay JL, et al. RISE registry reveals potential gaps in medication safety for new users of biologics and targeted synthetic DMARDs. Semin Arthritis Rheum. 2020;50(6):1542–8.

Izadi Z, Schmajuk G, Gianfrancesco M, Subash M, Evans M, Trupin L, et al. Rheumatology Informatics System for Effectiveness (RISE) practices see significant gains in rheumatoid arthritis quality measures. Arthritis Care Res. 2020. https://doi.org/10.1002/acr.24444 .

Angier H, Gold R, Gallia C, Casciato A, Tillotson CJ, Marino M, et al. Variation in outcomes of quality measurement by data source. Pediatrics. 2014;133(6):e1676–82.

Lin KJ, Schneeweiss S. Considerations for the analysis of longitudinal electronic health records linked to claims data to study the effectiveness and safety of drugs. Clin Pharmacol Ther. 2016;100(2):147–59.

Goldstein ND, Sarwate AD. Privacy, security, and the public health researcher in the era of electronic health record research. Online J Public Health Inform. 2016;8(3):e207.

U.S. Department of Health and Human Services (HHS). 45 CFR 46. http://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/index.html .


The authors thank Dr. Annemarie Hirsch, Department of Population Health Sciences, Geisinger, for assistance in conceptualizing an earlier version of this work.

Research reported in this publication was supported in part by the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health under Award Number K01AR075085 (to MAG) and the National Institute Of Allergy And Infectious Diseases of the National Institutes of Health under Award Number K01AI143356 (to NDG). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and affiliations.

Division of Rheumatology, University of California School of Medicine, San Francisco, CA, USA

Milena A. Gianfrancesco

Department of Epidemiology and Biostatistics, Drexel University Dornsife School of Public Health, 3215 Market St., Philadelphia, PA, 19104, USA

Neal D. Goldstein


Both authors conceptualized, wrote, and approved the final submitted version.

Corresponding author

Correspondence to Neal D. Goldstein .

Ethics declarations

Ethics approval and consent to participate.

Not applicable

Consent for publication

Competing interests.

The authors have no competing interests to declare

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Gianfrancesco, M.A., Goldstein, N.D. A narrative review on the validity of electronic health record-based research in epidemiology. BMC Med Res Methodol 21 , 234 (2021). https://doi.org/10.1186/s12874-021-01416-5

Received : 02 July 2021

Accepted : 28 September 2021

Published : 27 October 2021

DOI : https://doi.org/10.1186/s12874-021-01416-5

  • Electronic health records
  • Data quality
  • Secondary analysis

BMC Medical Research Methodology

ISSN: 1471-2288

  • Research article
  • Open access
  • Published: 04 September 2014

Implementing electronic health records in hospitals: a systematic literature review

  • Albert Boonstra 1 ,
  • Arie Versluis 2 &
  • Janita F J Vos 1  

BMC Health Services Research volume 14, Article number: 370 (2014)

The literature on implementing Electronic Health Records (EHR) in hospitals is very diverse. The objective of this study is to create an overview of the existing literature on EHR implementation in hospitals and to identify generally applicable findings and lessons for implementers.

A systematic literature review of empirical research on EHR implementation was conducted. Databases used included Web of Knowledge, EBSCO, and Cochrane Library. Relevant references in the selected articles were also analyzed. Search terms included Electronic Health Record (and synonyms), implementation, and hospital (and synonyms). Articles had to meet the following requirements: (1) written in English, (2) full text available online, (3) based on primary empirical data, (4) focused on hospital-wide EHR implementation, and (5) satisfying established quality criteria.

Of the 364 initially identified articles, this study analyzes the 21 articles that met the requirements. From these articles, 19 interventions were identified that are generally applicable and these were placed in a framework consisting of the following three interacting dimensions: (1) EHR context, (2) EHR content, and (3) EHR implementation process.


Although EHR systems are anticipated as having positive effects on the performance of hospitals, their implementation is a complex undertaking. This systematic review reveals reasons for this complexity and presents a framework of 19 interventions that can help overcome typical problems in EHR implementation. This framework can function as a reference for implementers in developing effective EHR implementation strategies for hospitals.

In recent years, Electronic Health Records (EHRs) have been implemented by an ever-increasing number of hospitals around the world. There have, for example, been initiatives, often driven by government regulations or financial incentives, in the USA [ 1 ], the United Kingdom [ 2 ] and Denmark [ 3 ]. EHR implementation initiatives tend to be driven by the promise of enhanced integration and availability of patient data [ 4 ], by the need to improve efficiency and cost-effectiveness [ 5 ], by a changing doctor-patient relationship toward one where care is shared by a team of health care professionals [ 5 ], and/or by the need to deal with a more complex and rapidly changing environment [ 6 ].

EHR systems have various forms, and the term can relate to a broad range of electronic information systems used in health care. EHR systems can be used in individual organizations, as interoperating systems in affiliated health care units, on a regional level, or nationwide [ 1 , 2 ]. Health care units that use EHRs include hospitals, pharmacies, general practitioner surgeries, and other health care providers [ 7 ].

The implementation of hospital-wide EHR systems is a complex matter involving a range of organizational and technical factors including human skills, organizational structure, culture, technical infrastructure, financial resources, and coordination [ 8 , 9 ]. As Grimson et al. [ 5 ] argue, implementing information systems (IS) in hospitals is more challenging than elsewhere because of the complexity of medical data, data entry problems, security and confidentiality concerns, and a general lack of awareness of the benefits of Information Technology (IT). Boonstra and Govers [ 10 ] provide three reasons why hospitals differ from many other industries, and these differences might also affect EHR implementations. The first reason is that hospitals have multiple objectives, such as curing and caring for patients, and educating new physicians and nurses. Second, hospitals have complicated and highly varied structures and processes. Third, hospitals have a varied workforce including medical professionals who possess high levels of expertise, power, and autonomy. These distinct characteristics justify a study that focuses on the identification and analysis of the findings of previous studies on EHR implementation in hospitals.

Study aim, theoretical framework, and terminology

In dealing with the complexity of EHR implementation in hospitals, it is helpful to know which factors are seen as important in the literature and to capture the existing knowledge on EHR implementation in hospitals. As such, the objective of this research is to identify, categorize, and analyze the existing findings in the literature on EHR implementation processes in hospitals. This could contribute to greater insight into the underlying patterns and complex relationships involved in EHR implementation and could identify ways to tackle EHR implementation problems. In other words, this study focusses on the identification of factors that determine the progress of EHR implementation in hospitals. The motives behind implementing EHRs in hospitals and the effects on performance of implemented EHR systems are beyond the scope of this paper.

To our knowledge, there have been no systematic reviews of the literature concerning EHR implementation in hospitals, and this article therefore fills that gap. Two interesting related review studies on EHR implementation are Keshavjee et al. [ 11 ] and McGinn et al. [ 12 ]. The study of Keshavjee et al. [ 11 ] develops a literature-based integrative framework for EHR implementation. McGinn et al. [ 12 ] adopt an exclusive user perspective on EHR, and their study is limited to Canada and countries with comparable socio-economic levels. Neither study is explicitly focused on hospitals; both include other contexts such as small clinics and national or regional EHR initiatives.

This systematic review is explicitly focused on hospital-wide, single hospital EHR implementations and identifies empirical studies (that include collected primary data) that reflect this situation. The categorization of the findings from the selected articles draws on Pettigrew’s framework for understanding strategic change [ 13 ]. This model has been widely applied in case study research into organizational contexts [ 14 ], as well as in studies on the implementation of health care innovations [ 15 ]. It generates insights by analyzing three interactive dimensions (context, content, and process) that together shape organizational change. Pettigrew’s framework [ 13 ] is seen as applicable because implementing an EHR artefact is an organization-wide effort. This framework was specifically selected for its focus on organizational change, its ease of understanding, and its relatively general dimensions allowing a broad range of findings to be included. The framework structures and focusses the analysis of the findings from the selected articles.

An organization’s context can be divided into internal and external components. External context refers to the social, economic, political, and competitive environments in which an organization operates. The internal context refers to the structure, culture, resources, capabilities, and politics of an organization. The content covers the specific areas of the transformation under examination. In an EHR implementation, these are the EHR system itself (both hardware and software), the work processes, and everything related to these (e.g. social conditions). The process dimension concerns the processes of change, made up of the plans, actions, reactions, and interactions of the stakeholders, rather than work processes in general. It is important to note that Pettigrew [ 13 ] does not see strategic change as a rational analytical process but rather as an iterative, continuous, multilevel process. This highlights that the outcome of an organizational change will be determined by the context, content, and process of that change. The framework with its three categories, shown in Figure  1 , illustrates the conceptual model used to categorize the findings of this systematic literature review.

Figure 1. Pettigrew’s framework [ 13 ] and the corresponding categories.

In the literature, several terms are used to refer to electronic medical information systems. In this article, the term Electronic Health Record (EHR) is used throughout. Commonly used terms identified by ISO (the International Organization for Standardization) [ 16 ] plus another not identified by ISO are outlined below and used in our search. ISO considers Electronic Health Record (EHR) to be an overall term for “a repository of information regarding the health status of a subject of care, in computer processable form” [ 16 ], p. 13. ISO uses different terms to describe various types of EHRs. These include Electronic Medical Record (EMR), which is similar to an EHR but restricted to the medical domain. The terms Electronic Patient Record (EPR) and Computerized Patient Record (CPR) are also identified. Häyrinen et al. [ 17 ] view both terms as having the same meaning and referring to a system that contains clinical information from a particular hospital. Another term seen is Electronic Healthcare Record (EHCR), which refers to a system that contains all the available health information on a patient [ 17 ] and can thus be seen as synonymous with EHR [ 16 ]. A term often found in the literature is Computerized Physician Order Entry (CPOE). Although this term is not mentioned by ISO [ 16 ] or by Häyrinen et al. [ 17 ], we included CPOE for three reasons. First, it is considered by many to be a key hospital-wide function of an EHR system, e.g. [ 8 , 18 ]. Second, from a preliminary analysis of our initial results, we found that, from the perspective of the implementation process, comparable issues and factors emerged from both CPOEs and EHRs. Third, the implementation of a comprehensive electronic medical record requires physicians to make direct order entries [ 19 ]. Kaushal et al. define a CPOE as “a variety of computer-based systems that share the common features of automating the medication ordering process and that ensure standardized, legible, and complete orders” [ 18 ], p. 1410. Other terms found in the literature were not included in this review as they were considered either irrelevant or too broadly defined. Examples of such terms are Electronic Client Record (ECR), Personal Health Record (PHR), Digital Medical Record (DMR), Health Information Technology (HIT), and Clinical Information System (CIS).

Search strategies

In order for a systematic literature review to be comprehensive, it is essential that all terms relevant to the aim of the research are covered in the search. Further, we need to include relevant synonyms and related terms, both for electronic medical information systems and for hospitals. By adding an * to the end of a term, the search engines pick out other forms of that term, and by enclosing words in quotation marks (“ ”) one ensures that only the complete phrase is searched for. Further, by including a ? as a wildcard character within a term, spelling variants such as “computerized” and “computerised” are both included in the search.
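The wildcard conventions described above can be illustrated with a small sketch. It assumes that * stands for zero or more word characters, that ? stands for exactly one character, and that quoted terms are matched as whole phrases; actual search engines may interpret these symbols differently, and the helper names `wildcard_to_regex` and `matches` are hypothetical.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a search term with * and ? wildcards into a compiled regex.

    Assumed semantics (search engines differ): '*' is zero or more word
    characters, '?' is exactly one word character, and surrounding
    quotation marks only mark the term as a phrase.
    """
    phrase = term.strip('"“”')
    parts = []
    for ch in phrase:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    # Word boundaries keep the term from matching inside a longer word.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def matches(term: str, text: str) -> bool:
    """Return True if the wildcard term occurs anywhere in the text."""
    return wildcard_to_regex(term).search(text) is not None
```

For example, `matches('"Computeri?ed Patient Record*"', "computerised patient records")` holds under these assumptions, whereas `matches("hospital*", "hospice")` does not.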

The search used three categories of keywords. The first category included the following terms as approximate synonyms for hospital: “hospital*”, “healthcare”, and “clinic*”. The second category concerned implementation and included the term “implement*”. For the third category, electronic medical information systems, the following search terms were used: “Electronic Health Record*”, “Electronic Patient Record*”, “Electronic Medical Record*”, “Computeri?ed Patient Record*”, “Electronic Healthcare Record*”, “Computeri?ed Physician Order Entry”.
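As a rough illustration of how the three keyword categories might be combined, the sketch below pairs one term from each category into a single query. The AND operator and the function name `build_strategies` are assumptions; the exact query syntax of each search engine is not given in the text.

```python
from itertools import product

# Terms as listed in the three keyword categories.
hospital_terms = ["hospital*", "healthcare", "clinic*"]
implementation_terms = ["implement*"]
system_terms = [
    '"Electronic Health Record*"',
    '"Electronic Patient Record*"',
    '"Electronic Medical Record*"',
    '"Computeri?ed Patient Record*"',
    '"Electronic Healthcare Record*"',
    '"Computeri?ed Physician Order Entry"',
]


def build_strategies(*categories):
    """Return one AND-combined query per combination of terms.

    Joining with AND is an assumption about the engines' boolean syntax.
    """
    return [" AND ".join(combo) for combo in product(*categories)]


strategies = build_strategies(hospital_terms, implementation_terms, system_terms)
```

Under this assumption the three categories yield 3 × 1 × 6 = 18 combinations.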

This relatively large set of keywords was necessary to ensure that articles were not missed in the search, and required a large number of search strategies to cover all those keywords. As we were seeking papers about the implementation of electronic medical information systems in hospitals , the search strategies included the terms shown in Table  1 .

The following three search engines were chosen based on their relevance to the field and their accessibility by the researcher: Web of Knowledge, EBSCO, and The Cochrane Library. Most search engines use several databases but not all of them were relevant for this research as they serve a wide range of fields. Appendix A provides an overview of the databases used. The reference lists included in articles that met the selection criteria were checked for other possibly relevant studies that had not been identified in the database search.

The articles identified from the various search strategies had to be academic peer-reviewed articles if they were to be included in our review. Further, they were assessed and had to satisfy the following criteria to be included: (1) written in English, (2) full text available online, (3) based on primary empirical data, (4) focused on hospital-wide EHR implementation, and (5) meeting established quality criteria. A long list of abstracts was generated. Two of the authors independently reviewed all the abstracts, eliminated duplicates, and shortlisted abstracts for detailed review. When opinions differed, a final decision over inclusion was made following a discussion between the researchers.

Data analysis

The quality of the articles that survived this filtering was assessed by the first two authors using the Standard Quality Assessment Criteria for Evaluating Primary Research Papers [ 20 ]. In other words, the quality of the articles was jointly assessed by evaluating whether specific criteria had been addressed, resulting in a rating of 2 (fully addressed), 1 (partly addressed), or 0 (not addressed) for each criterion. Different questions are posed for qualitative and quantitative research and, in the event of a mixed-method study, both questionnaires were used. Papers were included if they received at least half of the total possible points, admittedly a relatively liberal cut-off point given comments in the Standard Quality Assessment Criteria for Evaluating Primary Research Papers [ 20 ].
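The inclusion rule can be written down as a short sketch. It assumes that each applicable criterion receives a rating of 0, 1, or 2 and that the cut-off is half of the maximum attainable score; the function name and the example ratings are hypothetical, and how the two questionnaires are combined for a mixed-method study is not modelled.

```python
def passes_quality_threshold(ratings, cutoff=0.5):
    """Decide inclusion from per-criterion ratings.

    ratings: one score per applicable criterion, each 2 (fully addressed),
    1 (partly addressed), or 0 (not addressed).
    cutoff: required share of the maximum attainable score.
    """
    if not ratings:
        raise ValueError("at least one criterion must be rated")
    if any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("each rating must be 0, 1, or 2")
    max_points = 2 * len(ratings)
    return sum(ratings) >= cutoff * max_points


# Hypothetical ratings for two papers on a four-item checklist.
paper_a = [2, 1, 0, 1]  # 4 of 8 points: exactly at the cut-off, included
paper_b = [1, 0, 0, 1]  # 2 of 8 points: below the cut-off, excluded
```

Because the comparison is "at least half", a paper scoring exactly half of the possible points is still included.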

The next step was to extract the findings of the reviewed articles and to analyze these with the aim of reaching general findings on the implementation of EHR systems in hospitals. Categorizing these general findings can increase clarity. The earlier introduced conceptual model, based on Pettigrew’s framework for understanding strategic change, includes three categories: context (A), content (B), and process (C). As our review is specifically aimed at identifying findings related to the implementation process, possible motives for introducing such a system, as well as its effects and outcomes, are outside its scope. The authors held frequent discussions between themselves to discuss the meaning and the categorization of the general findings.

Paper selection

Applying the 18 search strategies listed in Table 1 with the various search engines resulted in 364 articles being identified. The searches were carried out on 12 March 2013 for search strategies 1–15 and on 18 April 2013 for search strategies 16–18. The latter three strategies were added following a preliminary analysis of the first set of results which highlighted several other terms and descriptions for information technology in health care. Not surprisingly, many duplicates were included in the 364 articles, both within and between search engines. Using the Refworks functions for identifying exact and close duplicates, 160 duplicates were found. However, this procedure did not identify all the duplicates present and the second author carried out a manual check that identified an additional 23 duplicates. When removing duplicates, we retained the link to the first search engine that identified the article and, as the Web of Knowledge was the first search engine used, most articles appear to have stemmed from this search engine. This left 181 different articles which were screened on title and abstract to check whether they met the selection criteria. When this was uncertain, the contents of the paper were further investigated. This screening resulted in just 13 articles that met all the selection criteria. We then performed two additional checks for completeness. First, checking the references of these articles identified another nine articles. Second, as suggested by the referees of this paper, we also used the term “introduc*” instead of “implement*”, together with the other two original categories of terms, and the term “provider” instead of “physician”, as part of CPOE. Each of these two searches identified one additional article (see Table 1 ). Of these resulting 24 articles, two proved to be almost identical so one was excluded, resulting in 23 articles for a final quality assessment. The results of the quality assessment can be found in Appendix B. The results show that two articles failed to meet the quality threshold and so 21 articles remained for in-depth analysis. Figure 2 displays the steps taken in this selection procedure.
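The counts reported in this selection procedure can be checked with a few lines of arithmetic. The sketch below only recomputes the stages from the numbers given in the text; the function and key names are illustrative.

```python
def selection_funnel():
    """Recompute each stage of the selection from the reported counts."""
    identified = 364          # hits across all search strategies and engines
    auto_duplicates = 160     # found with the reference manager
    manual_duplicates = 23    # found by manual check
    screened = identified - auto_duplicates - manual_duplicates

    met_criteria = 13         # after title and abstract screening
    from_references = 9       # added from reference lists
    from_extra_searches = 2   # "introduc*" and "provider" searches, one each
    candidates = met_criteria + from_references + from_extra_searches

    assessed = candidates - 1  # one of two near-identical articles excluded
    included = assessed - 2    # two articles below the quality threshold

    return {
        "screened": screened,
        "candidates": candidates,
        "assessed": assessed,
        "included": included,
    }


funnel = selection_funnel()
```

Each intermediate value agrees with the figure stated in the text: 181 screened, 24 candidates, 23 assessed, and 21 included.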

Figure 2. Selection procedure.

To provide greater insight into the context and nature of the 21 remaining articles, an overview is provided in Table 2 . All the studies except one were published after 2000. This reflects the recent increase in effort to implement organization-wide information systems, such as EHR systems, and also increasing incentives from governments to make use of EHR systems in hospitals. Of the 21 studies, 14 can be classified as qualitative, 6 as quantitative, and 1 as a mixed-method study. Most studies were conducted in the USA, with eight in various European countries. Teaching and non-teaching hospitals are almost equally the subject of inquiry, and some researchers have focused on specific types of hospitals such as rural, critical access, or psychiatric hospitals. Ten of the articles were in journals with a five-year impact factor in the Journal Citation Reports 2011 database. The number of citations varies widely between articles, although newer studies have had fewer opportunities to be cited.

Theoretical perspectives of reviewed articles

In research, it is common to use theoretical frameworks when designing an academic study [ 41 ]. Theoretical frameworks provide a way of thinking about and looking at the subject matter and describe the underlying assumptions about the nature of the subject matter [ 42 ]. By building on existing theories, research becomes focused in aiming to enrich and extend the existing knowledge in that particular field [ 42 ]. To provide a more thorough understanding of the selected articles, their theoretical frameworks, if present, are outlined in Table  3 .

It is striking that no specific theoretical frameworks have been used in the research leading to 13 of the 21 selected articles. Most articles simply state their objective as gaining insight into certain aspects of EHR implementation (as shown in Table  1 ) and do not use a particular theoretical approach to identify and categorize findings. As such, these articles add knowledge to the field of EHR implementation but do not attempt to extend existing theories.

Aarts et al. [ 21 ] introduce the notion of the sociotechnical approach: emphasizing the importance of focusing both on the social aspects of an EHR implementation and on the technical aspects of the system. Using the concept of emergent change, they argue that an implementation process is far from linear and predictable due to the contingencies and the organizational complexity that influences the process. A sociotechnical approach and the concept of emergent change are also included in the theoretical framework of Takian et al. [ 37 ]. Aarts et al. [ 21 ] elaborate on the sociotechnical approach when stating that the fit between work processes and the information technology determines the success of the implementation. Aarts and Berg [ 22 ] introduce a model of success or failure in information system implementation. They see creating synergy among the medical work practices, the information system, and the hospital organization as necessary for implementation, and argue that this will only happen if sufficient people accept a change in work practices. Cresswell et al.’s study [ 26 ] is also influenced by sociotechnical principles and draws on Actor-Network Theory. Gastaldi et al. [ 28 ] perceive Electronic Health Records as knowledge management systems and question how such systems can be used to develop knowledge assets. Katsma et al. [ 31 ] focus on implementation success and elaborate on the notion that implementation success is determined by system quality and acceptance through participation. As such, they adopt more of a social view on implementation success rather than a sociotechnical approach. Rivard et al. [ 34 ] examine the difficulties in EHR implementation from a cultural perspective. 
They not only view culture as a set of assumptions shared by an entire collective (an integration perspective) but also expect subcultures to exist (a differentiation perspective), as well as individual assumptions not shared by a specific (sub-) group (a fragmentation perspective). Ford et al. [ 27 ] focus on an entirely different topic and investigate the IT adoption strategies of hospitals using a framework that identifies three strategies. These are the single-vendor strategy (in which all IT is purchased from a single vendor), the best-of-breed strategy (integrating IT from multiple vendors), and the best-of-suite strategy (a hybrid approach using a focal system from one vendor as the basis plus other applications from other vendors).

To summarize, the articles by Aarts et al. [ 21 ], Aarts and Berg [ 22 ], Cresswell et al. [ 26 ], and Takian et al. [ 37 ] apply a sociotechnical framework to focus their research. Gastaldi et al. [ 28 ] see EHRs as a means to renew organizational capabilities. Katsma et al. [ 31 ] use a social framework by focusing on the relevance of an IT system as perceived by the user and the participation of users in the implementation process. Rivard et al. [ 34 ] analyze how organizational cultures can be receptive to EHR implementation. Ford et al. [ 27 ] look at adoption strategies, leading them to focus on the selection procedure for Electronic Health Records. The 13 other studies did not use an explicit theoretical lens in their research.

Implementation-related findings

The process of categorization started by assessing whether a specific finding from a study should be placed in Category A, B, or C. Thirty findings were placed in Category A (context), 31 in Category B (content), and 66 in Category C (process). Comparing and combining the specific findings resulted in several general findings within each category. The general findings are each given a code (category character plus number) and the related code is indicated alongside each specific finding in Appendix C. Findings that were only seen in one article, and thus were lacking support, were discarded.
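The rule of discarding findings supported by only one article can be sketched as a simple grouping step. The data structure, the function name, and the code "Z9" with article 99 are hypothetical and serve only to show the filter; the A2 and A5 pairs use the article numbers cited for those findings in the text.

```python
from collections import defaultdict


def supported_findings(specific_findings, min_articles=2):
    """Group (general_code, article_id) pairs and keep well-supported codes.

    A general finding is kept only if at least `min_articles` distinct
    articles contribute a specific finding to it.
    """
    articles_by_code = defaultdict(set)
    for code, article in specific_findings:
        articles_by_code[code].add(article)
    return {
        code: sorted(articles)
        for code, articles in articles_by_code.items()
        if len(articles) >= min_articles
    }


# Specific findings per category, as reported in the text.
category_counts = {"A": 30, "B": 31, "C": 66}
total_specific = sum(category_counts.values())  # 127

# Illustrative pairs; "Z9" / 99 is a made-up single-source finding.
pairs = [("A2", 32), ("A2", 33), ("A2", 28), ("A5", 19), ("A5", 25), ("Z9", 99)]
kept = supported_findings(pairs)
```

Here `kept` retains A2 and A5 and drops the single-source code, mirroring the rule that findings seen in only one article were discarded.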

Category A - context

The context category of an EHR implementation process includes both internal variables (such as resources, capabilities, culture, and politics) and external variables (such as economic, political, and social variables). Six general findings were identified, all but one related to internal variables. An overview of the findings and corresponding articles can be found in Table  4 . The lack of general findings related to external variables reflects our decision to exclude the underlying reasons (e.g. political or social pressures) for implementing an EHR system from this review. Similarly, internal findings related to aspects such as perceived financial benefits or improved quality of care, are outside our scope.

A1: Large (or system-affiliated), urban, not-for-profit, and teaching hospitals are more likely to have implemented an EHR system due to having greater financial capabilities, a greater change readiness, and less focus on profit

The research reviewed shows that larger or system-affiliated hospitals are more likely to have implemented an EHR system, and that this can be explained by their easier access to the large financial resources required. Larger hospitals have more financial resources than smaller hospitals [ 30 ] and system-affiliated hospitals can share costs [ 27 ]. Hospitals situated in urban areas more often have an EHR system than rural hospitals, which is attributed to less knowledge of EHR systems and less support from medical staff in rural hospitals [ 29 ]. Not-for-profit hospitals more often, and teaching hospitals slightly more often, have a fully implemented EHR system than private hospitals. This is attributed to the latter’s wait-and-see approach and to the more progressive, change-ready nature of public and teaching hospitals [ 27 , 32 ].

A2: EHR implementation requires the selection of a mature vendor who is committed to providing a system that fits the hospital’s specific needs

Although this finding is not a great surprise, it is relevant to discuss it further. A hospital selecting its own vendor can ensure that the system will match the specific needs of that hospital [ 32 ]. Further, it is important to deal with a vendor that has proven itself on the EHR market with mature and successful products. The vendor must also be able to identify hospital workflows and adapt its product accordingly, and be committed to a long-term trusting relationship with the hospital [ 33 ]. With this in mind, the initial price of the system should not be the overriding consideration: the organization should be willing to avoid purely cost-oriented vendors [ 28 ], as costs soon mount if problems arise.

A3: The presence of hospital staff with previous experience of health information technology increases the likelihood of EHR implementation as less uncertainty is experienced by the end-users

In order to be able to work with an EHR system, users must be capable of using information technology such as computers and have adequate typing skills [ 19 , 32 ]. Knowledge of, and previous experience with, EHR systems or other medical information systems reduces uncertainty and disturbance for users, and this results in a more positive attitude towards the system [ 29 , 32 , 37 , 38 ].

A4: An organizational culture that supports collaboration and teamwork fosters EHR implementation success because trust between employees is higher

The influence of organizational culture on the success of organizational change is addressed in almost all the popular approaches to change management, as well as in several of the articles in this literature review. Ash et al. [ 23 , 24 ] and Scott et al. [ 35 ] highlight that a strong culture with a history of collaboration, teamwork, and trust between different stakeholder groups minimizes resistance to change. Boyer et al. [ 25 ] suggest creating a favorable culture that is more adaptive to EHR implementation. However, creating a favorable culture is not necessarily easy: a comprehensive approach including incentives, resource allocation, and a responsible team was used in the example of Boyer et al. [ 25 ].

A5: EHR implementation is most likely in an organization with little bureaucracy and considerable flexibility as changes can be rapidly made

A highly bureaucratic organizational structure hampers change: it slows the process and often leads to inter-departmental conflict [ 19 ]. Specifically, appointing a multidisciplinary team to deal with EHR-related issues can prevent conflict and stimulate collaboration [ 25 ].

A6: EHR system implementation is difficult because cure and care activities must be ensured at all times

During the process of implementing an EHR system, it is of the utmost importance that all relevant information is always available [ 28 , 34 , 39 ]. Ensuring the continuity of quality care while implementing an EHR system is difficult and is an important distinction from many other IT implementations.

Category B - content

The content of the EHR implementation process consists of the EHR system and the corresponding objectives, assumptions, and complementary services. Table  5 lists the five extracted general findings. These focus on both the hardware and software of the EHR system, and its relation to work practices and privacy.

B1: Creating a fit by adapting both the technology and work practices is a key factor in the implementation of EHR

This finding elaborates on the sociotechnical approach identified in the earlier section on the theories adopted in the articles. Several authors [ 21 , 26 , 31 , 37 ] make clear that creating a fit between the EHR system and the existing work practices requires an initial acknowledgement that an EHR implementation is not just a technical project and that existing work practices will change due to the new system. By customizing and adapting the system to meet specific needs, users will become more open to using it [ 19 , 26 , 28 ].

B2: Hardware availability and system reliability, in terms of speed, availability, and a lack of failures, are necessary to ensure EHR use

In several articles, authors highlight the importance of having sufficient hardware. A system can only be used if it is available to the users, and a system will only be used if it works without problems. Ash et al. [ 24 ], Scott et al. [ 35 ], and Weir et al. [ 19 ] refer to the speed of the system as well as to the availability of a sufficient number of adequate terminals (see also [ 40 ]) in various locations. Systems must be logically structured [ 29 ], reliable [ 32 ], and provide safe information access [ 37 ]. Boyer et al. [ 25 ] also mention the importance of technical aspects but add that these are not sufficient for EHR implementation.

B3: To ensure EHR implementation, the software needs to be user-friendly with regard to ease of use, efficiency in use, and functionality

Some authors distinguish between technical availability and reliability, and the user-friendliness of the software [ 19 , 24 , 32 ]. They argue that it is not sufficient for a system to be available and reliable: it should also be easy and efficient to use, and provide the functionality required for medical staff to give good care. If a system fails to do this, staff will not use the system and will stick to their old ways of working.

B4: An EHR implementation should contain adequate safeguards for patient privacy and confidentiality

Concerns over privacy and confidentiality are recognized by Boyer et al. [ 25 ] and Houser and Johnson [ 29 ] and are considered as a barrier to EHR implementation. Yoon-Flannery et al. [ 40 ] and Takian et al. [ 37 ] also recognize the importance of patient privacy and the need to address this issue by providing training and creating adequate safeguards.

B5: EHR implementation requires a vendor who is willing to adapt its product to hospital work processes

A vendor must be responsive and enable the hospital to develop its product to ensure a good and usable EHR system [ 32 , 33 ]. By so doing, dependence on the vendor decreases and concerns that arise within the hospital can be addressed [ 32 ]. This finding is related to A2 in the sense that an experienced, cooperative, and flexible vendor is needed to deal with the range of interest groups found in hospitals.

Category C - process

This category refers to the actual process of implementing the EHR system. Variables considered are time, change approach, and change management. In our review, this category produced the largest number of general findings (see Table  6 ), as might be expected given our focus on the implementation process. EHR implementation often leads to anxiety, uncertainty, and concerns about a possible negative impact of the EHR on work processes and quality. The process findings, including leadership, resource availability, communication and participation are explicitly aimed at overcoming resistance to EHR implementation. These interventions help to create a positive atmosphere of goal directedness, co-creation and partnership.

C1: Due to their influential position, management’s active involvement and support is positively associated with EHR implementation, and also counterbalances the physicians’ medical dominance

Several authors note the important role that managers play in EHR implementation. Whereas some authors refer to supportive leadership [ 19 , 24 ], others emphasize that strong and active management involvement is needed [ 25 , 32 – 35 ]. Strong leadership is relevant as it effectively counterbalances the physicians’ medical dominance. For instance, Rivard et al. [ 34 ] observe that physicians’ medical dominance and the status and autonomy of other health professionals hinder collaboration and teamwork, and that this complicates EHR implementation. Poon et al. [ 33 ] acknowledge this aspect and argue for strong leadership in order to deal with the otherwise dominant physicians. They also claim that leaders have to set an example and use the system themselves. At the same time, it is motivating that the implementation is managed by leaders who are recognized by the medical staff, for instance by head nurses and physicians or by former physicians and nurses [ 25 , 33 ]. Ovretveit et al. [ 32 ] argue that it helps the implementation if senior management repeatedly declares the EHR implementation to be of the highest priority and supports this with sufficient financial and human resources. Poon et al. [ 33 ] add to this by highlighting that, especially during uncertainties and setbacks, the common vision that guides the EHR implementation has to be communicated to hospital staff. Sufficient human resources include the selection of competent and experienced project leaders who are familiar with EHR implementation. Scott et al. [ 35 ] identify leadership styles for different phases: participatory leadership is valued in selection decisions, whereas a more hierarchical leadership style is preferable in the actual implementation.

C2: Participation of clinical staff in the implementation process increases support for and acceptance of the EHR implementation

Participation of end-users (the clinical staff) generates commitment and enables problems to be quickly solved [ 25 , 26 , 36 ]. Especially because it is very unlikely that the system will be perfect for all, it is important that the clinical staff become the owners, rather than customers, of the system. Clinical staff should participate at all levels and in all steps [ 19 , 28 , 32 , 36 ] from initial system selection onwards [ 35 ]. Ovretveit et al. [ 32 ] propose that this involvement should have an extensive timeframe, starting in the early stages of implementation, when initial vendor requirements are formulated (‘consultation before implementation’), through to the beginning of the use phase. Creating multidisciplinary work groups which determine the content of the EHR and the rules regarding the sharing of information contributes to EHR acceptance [ 25 ] and ensures realistic approaches acceptable to the clinical staff [ 36 ].

C3: Training end-users and providing real-time support is important for EHR implementation success

Frequently, the end-users of a new EHR system lack experience with the specific EHR system or with EHR systems in general. Although it is increasingly hard to imagine society or workplaces without IT, a large specific system, such as an EHR, still requires considerable training on how to use it properly. The importance of training is often underestimated, and inadequate training will create a barrier to EHR use [ 19 , 29 ]. Consequently, adequate training, of appropriate quantity and quality, must be provided at the right times and locations [ 19 , 32 , 36 ]. Simon et al. [ 36 ] add to this the importance of real-time support, preferably provided by peers and super-users.

C4: A comprehensive implementation strategy, offering both clear guidance and room for emergent change, is needed for implementing an EHR system

Several articles highlight aspects of an EHR implementation strategy. A good strategy facilitates EHR implementation [ 19 , 25 ] and consists of careful planning and preparation [ 36 ], a sustainable business plan, effective communication [ 28 , 40 ] and mandatory implementation [ 19 ]. Emergent change is perceived as a key characteristic of EHR implementation in complex organizations such as hospitals [ 21 ], and this suggests an implementation approach based on a development paradigm [ 31 ], which may initially even involve parallel use of paper [ 26 ]. The notion of emergent change has been variously applied, including in the theoretical frameworks of Aarts et al. [ 21 ] and Katsma et al. [ 31 ]. These studies recognize that EHR implementation is relatively unpredictable due to unforeseen contingencies for which one cannot plan. With their emphasis on emergent change with unpredictable outcomes, Aarts et al. [ 21 ] make a case for acknowledging that unexpected and unplanned contingencies will influence the implementation process. They argue that the changes resulting from these contingencies often manifest themselves unexpectedly and must then be dealt with. Additionally, Takian et al. [ 37 ] state that it is crucial to contextualize an EHR implementation so as to be better prepared for unexpected changes.

C5: Establishing an interdisciplinary implementation group consisting of developers, members of the IT department, and end-users fosters EHR implementation success

In line with the arguments for management support and for the participation of clinical staff, Ovretveit et al. [ 32 ], Simon et al. [ 36 ] and Weir et al. [ 19 ] build a case for using an interdisciplinary implementation group. By having all the direct stakeholders working together, a better EHR system can be delivered faster and with fewer problems.

C6: Resistance of clinical staff, in particular of physicians, is a major barrier to EHR implementation, but can be reduced by addressing their concerns

Clinical staff’s attitude is a crucial factor in EHR implementation [ 36 ]. Particularly, the physicians constitute an important group in hospitals. As such, their possible resistance to EHR implementation will form a major barrier [ 29 , 33 ] and may lead to workarounds [ 26 ]. Whether physicians accept or reject an EHR implementation depends on their acceptance of their work practices being transformed [ 22 ]. The likelihood of acceptance will be increased if implementers address the concerns of physicians [ 24 , 28 , 32 , 33 ], but also of other members of clinical staff [ 36 ].

C7: Identifying champions among clinical staff reduces resistance

The previous finding already elaborated on clinical staff resistance and suggested reducing this by addressing their concerns. Another way to reduce their resistance is related to the process of implementation and involves identifying physician champions, typically physicians that are well respected due to their knowledge and contacts [ 32 , 33 ]. Simon et al. [ 36 ] emphasize the importance of identifying champions among each stakeholder group. These champions can provide reassurance to their peers.

C8: Assigning a sufficient number of staff and other resources to the EHR implementation process is important in adequately implementing the system

Implementing a large EHR system requires considerable resources, including human ones. Assigning appropriate people, such as super-users [ 36 ], and a sufficient number of them, to that process will increase the likelihood of success [ 19 , 32 , 33 , 36 ]. Further, it is important to have sufficient time and financial resources [ 26 , 32 ]. This finding is also relevant in relation to finding A6 (ensuring good care during organizational change).

These 19 general findings have been identified from the individual findings within the 21 analyzed articles. These findings are all related to one of the three main and interacting dimensions of the framework: six to context, five to content, and eight to process. This identification and explanation of the general findings concludes the results section of this systematic literature review and forms the basis for the discussion below.

This review of the existing academic literature sheds light on the current knowledge regarding EHR implementation. The 21 selected articles all originate from North America or Europe, perhaps reflecting a greater governmental attention to EHR implementation in these regions and, of course, our inclusion of only articles written in English. Two articles were rejected for quality reasons [ 43 , 44 ] (see Appendix B). All but one of the selected articles have been published since 2000, reflecting the growing interest in implementing EHR systems in hospitals. Eight articles built their research on a theoretical framework, four of which use the same general lens of the sociotechnical approach [ 21 , 22 , 26 , 37 ]. Katsma et al. [ 31 ] and Rivard et al. [ 34 ] focus more on the social and cultural aspects of EHR implementation, the former on the relevance for, and participation of, users, the latter on three different cultural perspectives. Ford et al. [ 27 ] researched adoption strategies for EHR systems and Gastaldi et al. [ 28 ] consider them as a means to renew organizational capabilities. It is notable that the other reviewed articles did not use a theoretical framework to analyze EHR implementation and made no attempt to elaborate on existing theories.

A total of 127 findings were extracted from the articles, and these findings were categorized using Pettigrew’s framework for strategic change [ 13 ] as a conceptual model including the three dimensions of context, content, and process. To ensure a tight focus, the scope of the review was explicitly limited to findings related to the EHR implementation process, thus excluding the reasons for, barriers to, and outcomes of an EHR implementation.

Some of the findings require further interpretation. Contextual finding A1 relates to the demographics of a hospital. One of the assertions is that privately owned hospitals are less likely than public hospitals to invest in an EHR. The former apparently perceive the costs of EHR implementation to outweigh the benefits. This seems remarkable given the general belief that information technology increases efficiency and reduces process costs, thereby more than compensating for the high initial investments. It is, however, important to note that the literature on EHR is ambivalent when it comes to efficiency; several authors record a decrease in the efficiency of work practices [ 25 , 33 , 35 , 38 ], whereas others mention an increase [ 29 , 31 ]. Finding A2 is a reminder of the importance of carefully selecting an appropriate vendor, taking into account experience with the EHR market and the maturity of their products rather than, for example, focussing on the cost price of the system. Given the huge investment costs, the price of an EHR system tends to have a major influence on vendor selection, an aspect that is also promoted by the current European tendering regulations that oblige (semi-) public institutions, like many hospitals, to select the lowest bidder, or the bidder that is economically the most preferable [ 45 ]. The finding that EHR system implementation is difficult because good medical care needs to be ensured at all times (A6) also deserves mention. Essentially, many system implementations in hospitals are different from IT implementations in other contexts because human lives are at stake in hospitals. This not only complicates the implementation process because medical work practices have to continue, it also requires a system to be reliable from the moment it is launched.

The findings regarding the content of the EHR system (Category B) highlight the importance of a suitable software product. A well-defined selection process of the software package and its associated vendor (discussed in A2) is seen as critical (B5). Selection should be based on a careful requirements analysis and an analysis of the experience and quality of the vendor. An important requirement is a sufficient degree of flexibility to customize and adapt the software to meet the needs of users and the work practices of the hospital (finding B1). At the same time the software product should challenge the hospital to rethink and improve its processes. A crucial condition for the acceptance by the diverse user groups of hospitals is the robustness of the EHR system in terms of availability, speed, reliability and flexibility (B2). This also requires adequate hardware in terms of access to computers, and mobile equipment to enable availability at all the locations of the hospital. Perceived ease of use of the system (B3) and the protection of patients’ privacy (B4) are other content factors that can make or break EHR implementation in hospitals.

The findings on the implementation process, our Category C, highlight four aspects that are commonly mentioned in change management approaches as important success factors in organizational change. The active involvement and support of management (C1), the participation of clinical staff (C2), a comprehensive implementation strategy (C4), and using an interdisciplinary implementation group (C5) correspond with three of the ten guidelines offered by Kanter et al. [ 46 ]. These three guidelines are: (1) support a strong leader role; (2) communicate, involve people, and be honest; and (3) craft an implementation plan. As the implementation of an EHR system is an organizational change process it is no surprise that these commonalities are identified in several of the analyzed articles. Three Category C findings (C2, C6, and C7) concern dealing with clinical staff given their powerful positions and potential resistance. Physicians are the most influential medical care providers, and their resistance can delay an EHR implementation [ 23 ], lead to at least some of it being dropped [ 21 , 22 , 34 ], or to it not being implemented at all [ 33 ]. Thus, there is ample evidence of the crucial importance of physicians’ acceptance of an EHR for it to be implemented. This means that clinicians and other key personnel should be highly engaged and motivated to contribute to the EHR implementation. Prompt feedback on requests, high-quality support during the implementation, and an EHR that clearly supports clinical work are key issues that contribute to a motivated clinical staff.

Analyzing and comparing the findings enables us to categorize them in terms of subject matter (see Table  7 ). By categorizing the findings in terms of subject, and by totaling the number of articles related to the individual findings on that subject, one can deduce how much attention has been given in the literature to the different topics. This analysis highlights that the involvement of physicians in the implementation process, the quality of the system, and a comprehensive implementation strategy are considered the crucial elements in EHR implementation.
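The tallying described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject labels, the article identifiers, and the assignment of findings to subjects are hypothetical placeholders (Table 7 is not reproduced here), and an article that supports two findings on the same subject is counted once per finding.

```python
from collections import defaultdict

# Hypothetical input: finding code -> (subject, identifiers of supporting articles).
# The codes echo the paper's labels, but subjects and articles are made up.
findings = {
    "C2": ("physician involvement", {"a1", "a2", "a3"}),
    "C6": ("physician involvement", {"a2", "a4"}),
    "B2": ("system quality", {"a1", "a5"}),
    "C4": ("implementation strategy", {"a3"}),
}


def attention_by_subject(findings):
    """Total, per subject, the number of articles behind each finding on that subject."""
    totals = defaultdict(int)
    for subject, articles in findings.values():
        totals[subject] += len(articles)
    return dict(totals)


totals = attention_by_subject(findings)
# Subjects ordered from most to least attention in the (hypothetical) literature.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Counting distinct articles per subject instead (a set union rather than a sum) would be an equally defensible choice; the text does not say which one was used.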

Notwithstanding the useful results, this review and analysis has some limitations. Although we carefully developed and executed the search strategy, we cannot be sure that we found all the relevant articles. Since we focused narrowly on keywords, and these had to be part of an article’s title, we could have excluded relevant articles that used different terminology in their titles. Although searching the reference lists of identified articles did result in several additional articles, some relevant articles might still have been missed. Another limitation is the exclusion of publications in languages other than English. Further, the selection and categorization of specific findings, and the subsequent extraction of general findings, is subjective and depends on the interpretations of the authors, and other researchers might have made different choices. A final limitation is inherent to literature reviews in that the authors of the studies included may have had different motives and aims, and used different methods and interpretative means, in drawing their conclusions.

The existing literature fails to provide evidence of there being a comprehensive approach to implementing EHR systems in hospitals that integrates relevant aspects into an ‘EHR change approach’. The literature is diffuse, and articles seldom build on earlier ones to increase the theoretical knowledge on EHR implementation, notable exceptions being Aarts et al. [ 21 ], Aarts and Berg [ 22 ], Cresswell et al. [ 26 ], and Takian et al. [ 37 ]. The earlier discussion on the various results summarizes the existing knowledge and reveals gaps in the knowledge associated with EHR implementation. The number of EHR implementations in hospitals is growing, as well as the body of literature on this subject. This systematic review of the literature has produced 19 general findings on EHR implementation, which were each placed in one of three categories. A number of these general findings are in line with the wider literature on change management, and others relate to the specific nature of EHR implementation in hospitals.

The findings presented in this article can be viewed as an overview of important subjects that should be addressed in implementing an EHR system. It is clear that EHR systems have particular complexities and should be implemented with great care, and with attention given to context, content, and process issues and to interactions between these issues. As such, we have achieved our research goal by creating a systematic review of the literature on EHR implementation. This paper’s academic contribution is in providing an overview of the existing literature with regard to important factors in EHR implementation in hospitals. Academics interested in this specific field can now more easily access knowledge on EHR implementation in hospitals and can use this article as a starting point and build on the existing knowledge. The managerial contribution lies in the general findings that can be applied as guidelines when implementing EHR in hospitals. We have not set out to provide a single blueprint for implementing an EHR system, but rather to provide guidelines and to highlight points that deserve attention. Recognizing and addressing these aspects can increase the likelihood of getting an EHR system successfully implemented.

Appendix A - List of databases

This appendix provides an overview of all databases included in the used search engines. The databases in italic were excluded for the research as these databases focus on fields not relevant for the subject of EHR implementations.

Web of Knowledge

Web of Science

Biological Abstracts

Journal Citation Reports

Academic Search Premier

AMED - The Allied and Complementary Medicine Database

America: History & Life

American Bibliography of Slavic and East European Studies

Arctic & Antarctic Regions

Art Full Text (H.W. Wilson)

Art Index Retrospective (H.W. Wilson)

ATLA Religion Database with ATLASerials

Business Source Premier

Communication & Mass Media Complete

eBook Collection (EBSCOhost)

Funk & Wagnalls New World Encyclopedia

Historical Abstracts

L’Année philologique

Library, Information Science & Technology Abstracts

MAS Ultra - School Edition

Military & Government Collection

MLA Directory of Periodicals

MLA International Bibliography

New Testament Abstracts

Old Testament Abstracts

Philosopher’s Index

Primary Search



Psychology and Behavioral Sciences Collection

Regional Business News

Research Starters - Business

RILM Abstracts of Music Literature

The Cochrane Library

Cochrane Database of Systematic Reviews

Cochrane Central Register of Controlled Trials

Cochrane Methodology Register

Database of Abstracts of Reviews of Effects

Health Technology Assessment Database

NHS Economic Evaluation Database

About The Cochrane Collaboration

Appendix B - Quality assessment

The quality of the articles was assessed with the Standard Quality Assessment Criteria for Evaluating Primary Research Papers [ 20 ]. Assessment was done by questioning whether particular criteria had been addressed, resulting in a rating of 2 (completely addressed), 1 (partly addressed), or 0 (not addressed) points. Table 8 provides an overview of the scores of the articles (per question) for qualitative studies, Table 9 for quantitative studies, and Table 10 for mixed methods studies. Articles were included if they scored 50% or more of the total number of points possible. Based on this assessment, two articles were excluded from the review.
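The inclusion rule can be illustrated with a minimal Python sketch. Only the 2/1/0 rating scale and the 50% threshold come from the text; the function names, the example ratings, and the assumption that every listed criterion applies to the article are hypothetical.

```python
from typing import Sequence

# Rating scale from the text: 2 = completely addressed,
# 1 = partly addressed, 0 = not addressed.
MAX_POINTS_PER_CRITERION = 2
INCLUSION_THRESHOLD = 0.5  # at least 50% of the points possible


def quality_score(ratings: Sequence[int]) -> float:
    """Return the fraction of possible points an article obtained.

    Assumption: every criterion in `ratings` applies to the article.
    """
    if not ratings:
        raise ValueError("at least one criterion rating is required")
    if any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("each rating must be 0, 1, or 2")
    return sum(ratings) / (MAX_POINTS_PER_CRITERION * len(ratings))


def is_included(ratings: Sequence[int]) -> bool:
    """Apply the 50% inclusion rule to one article's ratings."""
    return quality_score(ratings) >= INCLUSION_THRESHOLD


# Hypothetical ratings for two articles (illustrative values only).
strong_article = [2, 2, 1, 2, 1, 0, 2, 1, 2, 2]
weak_article = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]

print(is_included(strong_article))
print(is_included(weak_article))
```

An article scoring exactly half of the possible points is included under this reading of "50% or higher".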

Appendix C - All findings

Table  11 displays all findings from the selected articles. The category number is related to the general finding as discussed in the Results section.

Abramson EL, McGinnis S, Edwards A, Maniccia DM, Moore J, Kaushal R: Electronic health record adoption and health information exchange among hospitals in New York State. J Eval Clin Pract. 2011, 18: 1156-1162.

Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, Sheikh A: Implementation and adoption of nationwide electronic health records in secondary care in England: qualitative analysis of interim results from a prospective national evaluation. Br Med J. 2010, 341: c4564-10.1136/bmj.c4564.

Rigsrevisionen: Extract from the report to the Public Accounts Committee on the implementation of electronic patient records at Danish hospitals. 2011, http://uk.rigsrevisionen.dk/media/1886186/4-2010.pdf , 2011

Hartswood M, Procter R, Rouncefield M, Slack R: Making a Case in Medical Work: Implications for the Electronic Medical Record. Comput Supported Coop Work. 2003, 12: 241-266. 10.1023/A:1025055829026.

Grimson J, Grimson W, Hasselbring W: The SI Challenge in Health Care. Commun ACM. 2000, 43 (6): 49-55.

Mantzana V, Themistocleous M, Irani Z, Morabito V: Identifying healthcare actors involved in the adoption of information systems. Eur J Inf Syst. 2007, 16: 91-102. 10.1057/palgrave.ejis.3000660.

Boonstra A, Boddy D, Bell S: Stakeholder management in IOS projects: analysis of an attempt to implement an electronic patient file. Eur J Inf Syst. 2008, 17 (2): 100-111. 10.1057/ejis.2008.2.

Jha A, DesRoches CM, Campbell EG, Donelan K, Rao SR, Ferris TF, Shields A, Rosenbaum S, Blumenthal D: Use of Electronic Health Records in US hospitals. N Engl J Med. 2009, 360: 1628-1638. 10.1056/NEJMsa0900592.

Heeks R: Health information systems: Failure, success and improvisation. Int J Med Inform. 2006, 75: 125-137. 10.1016/j.ijmedinf.2005.07.024.

Boonstra A, Govers MJ: Understanding ERP system implementation in a hospital by analysing stakeholders. N Technol Work Employ. 2009, 24 (2): 177-193. 10.1111/j.1468-005X.2009.00227.x.

Keshavjee K, Bosomworth J, Copen J, Lai J, Kucukyazici B, Liani R, Holbrook AM: Best practices in EMR implementation: a systematic review. Proceed of the 11th International Symposium on Health Information Management Research – iSHIMR. 2006, 1-15.

McGinn CA, Grenier S, Duplantie J, Shaw N, Sicotte C, Mathieu L, Leduc Y, Legare F, Gagnon MP: Comparison of user groups’ perspectives of barriers and facilitators to implementing EHR – a systematic review. BMC Med. 2011, 9: 46-10.1186/1741-7015-9-46.

Pettigrew AM: Context and action in the transformation of the firm. J Manag Stud. 1987, 24 (6): 649-670. 10.1111/j.1467-6486.1987.tb00467.x.

Hartley J: Case Study Research. Chapter 26. Essential Guide to Qualitative Methods in Organizational Research. Edited by: Cassel C, Symon G. 2004, London: Sage

Hage E, Roo JP, Offenbeek MAG, Boonstra A: Implementation factors and their effect on e-health service adoption in rural communities: a systematic literature review. BMC Health Serv Res. 2013, 13 (19): 1-16.

ISO: Health informatics: Electronic health record - Definition, scope and context. Draft Tech Report. 2004, 03-16. ISO/DTR 20514. available at https://www.iso.org/obp/ui/#iso:std:iso:tr:20514:ed-1:v1:en

Häyrinen K, Saranto K, Nykänen P: Definition, structure, content, use and impacts of electronic health records: A review of the research literature. Int J Med Inform. 2008, 77: 291-304. 10.1016/j.ijmedinf.2007.09.001.

Kaushal R, Shojania KG, Bates DW: Effects of computerized physician order entry and clinical decision support systems on medication safety: a systematic review. Arch Intern Med. 2003, 163 (12): 1409-1416. 10.1001/archinte.163.12.1409.

Weir C, Lincoln M, Roscoe D, Turner C, Moreshead G: Dimensions associated with successful implementation of a hospital based integrated order entry system. Proc Annu Symp Comput Appl [Sic] in Med Care Symp Comput Appl Med Care. 1994, 653: 7.

Kmet LM, Lee RC, Cook LS: Standard quality assessment criteria for evaluating primary research papers from a variety of fields. 2004, Alberta Heritage Foundation for Medical Research, http://www.ihe.ca/documents/HTA-FR13.pdf .

Aarts J, Doorewaard H, Berg M: Understanding implementation: The case of a computerized physician order entry system in a large dutch university medical center. J Am Med Inform Assoc. 2004, 11 (3): 207-216. 10.1197/jamia.M1372.

Aarts J, Berg M: Same systems, different outcomes - Comparing the implementation of computerized physician order entry in two Dutch hospitals. Methods Inf Med. 2006, 45 (1): 53-61.

Ash J, Gorman P, Lavelle M, Lyman J, Fournier L: Investigating physician order entry in the field: lessons learned in a multi-center study. Stud Health Technol Inform. 2001, 84 (2): 1107-1111.

Ash JS, Gorman PN, Lavelle M, Payne TH, Massaro TA, Frantz GL, Lyman JA: A cross-site qualitative study of physician order entry. J Am Med Inform Assoc. 2003, 10 (2): 188-200. 10.1197/jamia.M770.

Boyer L, Samuelian J, Fieschi M, Lancon C: Implementing electronic medical records in a psychiatric hospital: A qualitative study. Int J Psychiatry Clin Pract. 2010, 14 (3): 223-227. 10.3109/13651501003717243.

Cresswell KM, Worth A, Sheikh A: Integration of a nationally procured electronic health record system into user work practices. BMC Med Inform Decis Mak. 2012, 12: 15-10.1186/1472-6947-12-15.

Ford EW, Menachemi N, Huerta TR, Yu F: Hospital IT Adoption Strategies Associated with Implementation Success: Implications for Achieving Meaningful Use. J Healthc Manag. 2010, 55 (3): 175-188.

Gastaldi L, Lettieri E, Corso M, Masella C: Performance improvement in hospitals: leveraging on knowledge assets dynamics through the introduction of an electronic medical record. Meas Bus Excell. 2012, 16 (4): 14-30. 10.1108/13683041211276410.

Houser SH, Johnson LA: Perceptions regarding electronic health record implementation among health information management professionals in Alabama: a statewide survey and analysis. Perspect Health Inf Manage/AHIMA, Am Health Inf Manage Assoc. 2008, 5: 6-6.

Jaana M, Ward MM, Bahensky JA: EMRs and Clinical IS Implementation in Hospitals: A Statewide Survey. J Rural Health. 2012, 28: 34-43. 10.1111/j.1748-0361.2011.00386.x.

Katsma CP, Spil TAM, Ligt E, Wassenaar A: Implementation and use of an electronic health record: measuring relevance and participation in four hospitals. Int J Healthc Technol Manag. 2007, 8 (6): 625-643. 10.1504/IJHTM.2007.014194.

Ovretveit J, Scott T, Rundall TG, Shortell SM, Brommels M: Improving quality through effective implementation of information technology in healthcare. Int J Qual Health Care. 2007, 19 (5): 259-266. 10.1093/intqhc/mzm031.

Poon EG, Blumenthal D, Jaggi T, Honour MM, Bates DW, Kaushal R: Overcoming barriers to adopting and implementing computerized physician order entry systems in US hospitals. Health Aff. 2004, 23 (4): 184-190. 10.1377/hlthaff.23.4.184.

Rivard S, Lapointe L, Kappos A: An Organizational Culture-Based Theory of Clinical Information Systems Implementation in Hospitals. J Assoc Inf Syst. 2011, 12 (2): 123-162.

Scott JT, Rundall TG, Vogt TM, Hsu J: Kaiser Permanente’s experience of implementing an electronic medical record: a qualitative study. Br Med J. 2005, 331 (7528): 1313-1316. 10.1136/bmj.38638.497477.68.

Simon SR, Keohane CA, Amato M, Coffey M, Cadet M, Zimlichman E: Lessons learned from implementation of computerized provider order entry in 5 community hospitals: a qualitative study. BMC Med Inform Decis Mak. 2013, 13: 67-10.1186/1472-6947-13-67.

Takian A, Sheikh A, Barber N: We are bitter, but we are better off: case study of the implementation of an electronic health record system into a mental health hospital in England. BMC Health Serv Res. 2012, 12: 484-10.1186/1472-6963-12-484.

Ward MM, Vartak S, Schwichtenberg T, Wakefield DS: Nurses’ Perceptions of How Clinical Information System Implementation Affects Workflow and Patient Care. Cin-Comput Inform Nurs. 2011, 29 (9): 502-511. 10.1097/NCN.0b013e31822b8798.

Ward MM, Vartak S, Loes JL, O’Brien J, Mills TR, Halbesleben JRB, Wakefield DS: CAH Staff Perceptions of a Clinical Information System Implementation. Am J Manage Care. 2012, 18 (5): 244-252.

Yoon-Flannery K, Zandieh SO, Kuperman GJ, Langsam DJ, Hyman D, Kaushal R: A qualitative analysis of an electronic health record (EHR) implementation in an academic ambulatory setting. Inform Prim Care. 2008, 16 (4): 277-284.

Van Aken J, Berends H, Van der Bij H: Problem solving in organizations. 2012, New York, USA: Cambridge University Press

Botha ME: Theory development in perspective: the role of conceptual frameworks and models in theory development. J Adv Nurs. 1989, 14 (1): 49-55. 10.1111/j.1365-2648.1989.tb03404.x.

Spetz J, Keane D: Information Technology Implementation in a Rural Hospital: A Cautionary Tale. J Healthc Manag. 2009, 54 (5): 337-347.

Massaro TA: Introducing Physician Order Entry at a Major Academic Medical-Center. Impact Organ Culture Behav Acad Med. 1993, 68 (1): 20-25.

Lundberg S, Bergman M: Tender evaluation and supplier selection methods in public procurement. J Purch Supply Manage. 2013, 19 (2): 73-83. 10.1016/j.pursup.2013.02.003.

Kanter RM, Stein BA, Jick TD: The Challenge of Organizational Change. 1992, New York, USA: Free Press

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6963/14/370/prepub


We acknowledge the Master degree program Change Management at the University of Groningen for supporting this study. We also thank the referees for their valuable comments.

Author information

Authors and affiliations.

Faculty of Economics and Business, University of Groningen, Groningen, The Netherlands

Albert Boonstra & Janita F J Vos

Deloitte Consulting, Amsterdam, The Netherlands

Arie Versluis

Corresponding author

Correspondence to Albert Boonstra .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors’ contributions

AB and JV established the research design and made significant contributions to the interpretation of the results. They supervised AV throughout the study, and participated in writing the final version of this paper. AV contributed substantially to the selection and analysis of included papers, and wrote a preliminary draft of this article. All authors have read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Boonstra, A., Versluis, A. & Vos, J.F.J. Implementing electronic health records in hospitals: a systematic literature review. BMC Health Serv Res 14, 370 (2014). https://doi.org/10.1186/1472-6963-14-370

Received: 23 September 2013

Accepted: 11 August 2014

Published: 04 September 2014

DOI: https://doi.org/10.1186/1472-6963-14-370

  • Clinical Staff
  • Electronic Health Record
  • Electronic Patient Record
  • Health Information Technology
  • Computerize Physician Order Entry

BMC Health Services Research

ISSN: 1472-6963

  • Systematic review
  • Open access
  • Published: 19 February 2024

‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice

  • Annette Boaz   ORCID: orcid.org/0000-0003-0557-1294 1 ,
  • Juan Baeza 2 ,
  • Alec Fraser   ORCID: orcid.org/0000-0003-1121-1551 2 &
  • Erik Persson 3  

Implementation Science volume 19, Article number: 15 (2024)

The gap between research findings and clinical practice is well documented and a range of strategies have been developed to support the implementation of research into clinical practice. The objective of this study was to update and extend two previous reviews of systematic reviews of strategies designed to implement research evidence into clinical practice.

We developed a comprehensive systematic literature search strategy based on the terms used in the previous reviews to identify studies that looked explicitly at interventions designed to turn research evidence into practice. The search was performed in June 2022 in four electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched from January 2010 up to June 2022 and applied no language restrictions. Two independent reviewers appraised the quality of included studies using a quality assessment checklist. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Data were synthesised using descriptive and narrative techniques to identify themes and patterns linked to intervention strategies, targeted behaviours, study settings and study outcomes.

We identified 32 reviews conducted between 2010 and 2022. The reviews are mainly of multi-faceted interventions ( n  = 20) although there are reviews focusing on single strategies (ICT, educational, reminders, local opinion leaders, audit and feedback, social media and toolkits). The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Furthermore, a lot of nuance lies behind these headline findings, and this is increasingly commented upon in the reviews themselves.

Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been identified. We need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of research perspectives (including social science) in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed.

Contribution to the literature

Considerable time and money is invested in implementing and evaluating strategies to increase the implementation of research into clinical practice.

The growing body of evidence is not providing the anticipated clear lessons to support improved implementation.

Instead what is needed is better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice.

This would involve a more central role in implementation science for a wider range of perspectives, especially from the social, economic, political and behavioural sciences and for greater use of different types of synthesis, such as realist synthesis.


The gap between research findings and clinical practice is well documented and a range of interventions has been developed to increase the implementation of research into clinical practice [ 1 , 2 ]. In recent years researchers have worked to improve the consistency in the ways in which these interventions (often called strategies) are described to support their evaluation. One notable development has been the emergence of Implementation Science as a field focusing explicitly on “the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice” ([ 3 ] p. 1). The work of implementation science focuses on closing, or at least narrowing, the gap between research and practice. One contribution has been to map existing interventions, identifying 73 discrete strategies to support research implementation [ 4 ] which have been grouped into 9 clusters [ 5 ]. The authors note that they have not considered the evidence of effectiveness of the individual strategies and that a next step is to understand better which strategies perform best in which combinations and for what purposes [ 4 ]. Other authors have noted that there is also scope to learn more from other related fields of study such as policy implementation [ 6 ] and to draw on methods designed to support the evaluation of complex interventions [ 7 ].

The increase in activity designed to support the implementation of research into practice and improvements in reporting provided the impetus for an update of a review of systematic reviews of the effectiveness of interventions designed to support the use of research in clinical practice [ 8 ], which was itself an update of the review conducted by Grimshaw and colleagues in 2001. The 2001 review [ 9 ] identified 41 reviews considering a range of strategies, from educational interventions, audit and feedback and computerised decision support to financial incentives and combined interventions. The authors concluded that all the interventions had the potential to promote the uptake of evidence in practice, although no one intervention seemed to be more effective than the others in all settings. They concluded that combined interventions were more likely to be effective than single interventions. The 2011 review identified a further 13 systematic reviews containing 313 discrete primary studies. Consistent with the previous review, four main strategy types were identified: audit and feedback; computerised decision support; opinion leaders; and multi-faceted interventions (MFIs). Nine of the reviews reported on MFIs. The review highlighted the small effects of single interventions such as audit and feedback, computerised decision support and opinion leaders. MFIs claimed an improvement in effectiveness over single interventions, although effect sizes remained small to moderate and this improvement in effectiveness relating to MFIs has been questioned in a subsequent review [ 10 ]. In updating the review, we anticipated a larger pool of reviews and an opportunity to consolidate learning from more recent systematic reviews of interventions.

This review updates and extends our previous review of systematic reviews of interventions designed to implement research evidence into clinical practice. To identify potentially relevant peer-reviewed research papers, we developed a comprehensive systematic literature search strategy based on the terms used in the Grimshaw et al. [ 9 ] and Boaz, Baeza and Fraser [ 8 ] overview articles. To ensure optimal retrieval, our search strategy was refined with support from an expert university librarian, considering the ongoing improvements in the development of search filters for systematic reviews since our first review [ 11 ]. We also wanted to include technology-related terms (e.g. apps, algorithms, machine learning, artificial intelligence) to find studies that explored interventions based on the use of technological innovations as mechanistic tools for increasing the use of evidence into practice (see Additional file 1 : Appendix A for full search strategy).

The search was performed in June 2022 in the following electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched for articles published since the 2011 review. We searched from January 2010 up to June 2022 and applied no language restrictions. Reference lists of relevant papers were also examined.

We uploaded the results to EPPI-Reviewer, a web-based tool that facilitated semi-automation of the screening process and removal of duplicate studies. We made particular use of a priority screening function to reduce screening workload and avoid ‘data deluge’ [ 12 ]. One reviewer screened a smaller number of records ( n = 1200) to train the software, through machine learning, to predict whether a given record was more likely to be relevant or irrelevant, thus pulling the relevant studies towards the beginning of the screening process. This automation did not replace manual work but helped the reviewer to identify eligible studies more quickly. During the selection process, we included studies that looked explicitly at interventions designed to turn research evidence into practice. Studies were included if they met the following pre-determined inclusion criteria:

The study was a systematic review

Search terms were included

Focused on the implementation of research evidence into practice

The methodological quality of the included studies was assessed as part of the review
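The priority screening step described before the inclusion criteria can be illustrated with a minimal sketch. The classifier that EPPI-Reviewer actually uses is not described here, so the naive Bayes scorer below is an assumption chosen for simplicity, and the example records and function names (`train`, `relevance_score`, `prioritise`) are hypothetical. It shows only the general idea: a reviewer labels a small set of records, a model learns from them, and the unscreened records are reordered so that likely relevant ones are seen first.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a record's title/abstract and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled):
    """Fit a tiny naive Bayes model from (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labelled:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(text, model):
    """Log-odds that a record is relevant (higher means screen sooner)."""
    word_counts, doc_counts, vocab = model
    # Prior log-odds, with add-one smoothing on the class counts.
    score = math.log((doc_counts[True] + 1) / (doc_counts[False] + 1))
    totals = {c: sum(word_counts[c].values()) for c in (True, False)}
    for tok in tokenize(text):
        if tok not in vocab:
            continue  # words never seen in training carry no evidence
        p_rel = (word_counts[True][tok] + 1) / (totals[True] + len(vocab))
        p_irr = (word_counts[False][tok] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score


def prioritise(unscreened, model):
    """Order unscreened records from most to least likely relevant."""
    return sorted(unscreened, key=lambda t: relevance_score(t, model), reverse=True)


# Hypothetical hand-screened training records (invented for illustration).
labelled = [
    ("systematic review of audit and feedback to implement guidelines", True),
    ("overview of strategies for implementing research evidence in practice", True),
    ("case report of a rare dermatological condition", False),
    ("laboratory study of protein folding in yeast", False),
]
model = train(labelled)

# Hypothetical unscreened records; the likely relevant one should move to the front.
queue = prioritise(
    [
        "protein expression in yeast cells",
        "systematic review of reminders to implement evidence in practice",
    ],
    model,
)
print(queue[0])
```

In this sketch the ranking only changes the order in which records reach the reviewer; every record still has to be judged by a person, consistent with the point that automation did not replace manual work.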

Study populations included healthcare providers and patients. The EPOC taxonomy [ 13 ] was used to categorise the strategies. The EPOC taxonomy has four domains: delivery arrangements, financial arrangements, governance arrangements and implementation strategies. The implementation strategies domain includes 20 strategies targeted at healthcare workers. Numerous EPOC strategies were assessed in the review including educational strategies, local opinion leaders, reminders, ICT-focused approaches and audit and feedback. Some strategies that did not fit easily within the EPOC categories were also included. These were social media strategies and toolkits, and multi-faceted interventions (MFIs) (see Table  2 ). Some systematic reviews included comparisons of different interventions while other reviews compared one type of intervention against a control group. Outcomes related to improvements in health care processes or patient well-being. Numerous individual study types (RCT, CCT, BA, ITS) were included within the systematic reviews.

We excluded papers that:

Focused on changing patient rather than provider behaviour

Had no demonstrable outcomes

Made unclear or no reference to research evidence

The last of these criteria was sometimes difficult to judge, and there was considerable discussion amongst the research team as to whether the link between research evidence and practice was sufficiently explicit in the interventions analysed. As we discussed in the previous review [ 8 ], in the field of healthcare the principle of evidence-based practice is widely acknowledged, and tools to change behaviour such as guidelines are often seen to be an implicit codification of evidence, despite the fact that this is not always the case.

Reviewers employed a two-stage process to select papers for inclusion. First, all titles and abstracts were screened by one reviewer to determine whether the study met the inclusion criteria. Two papers [ 14 , 15 ] were identified that fell just before the 2010 cut-off. As they were not identified in the searches for the first review [ 8 ], they were included and progressed to assessment. Each paper was rated as include, exclude or maybe. Second, the full texts of 111 relevant papers were assessed independently by at least two authors. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Thirty-two papers met the inclusion criteria and proceeded to data extraction. The study selection procedure is documented in a PRISMA literature flow diagram (see Fig.  1 ). We were able to include French, Spanish and Portuguese papers in the selection, reflecting the language skills in the study team, but none of the papers identified met the inclusion criteria. Other non-English language papers were excluded.

Fig. 1. PRISMA flow diagram. Source: authors

One reviewer extracted data on strategy type, number of included studies, local, target population, effectiveness and scope of impact from the included studies. Two reviewers then independently read each paper and noted key findings and broad themes of interest, which were then discussed amongst the wider authorial team. Two independent reviewers appraised the quality of included studies using a Quality Assessment Checklist based on Oxman and Guyatt [ 16 ] and Francke et al. [ 17 ]. Each study was assigned a quality score ranging from 1 (extensive flaws) to 7 (minimal flaws) (see Additional file 2 : Appendix B). All disagreements were resolved through discussion. Studies were not excluded in this updated overview based on methodological quality as we aimed to reflect the full extent of current research into this topic.

The extracted data were synthesised using descriptive and narrative techniques to identify themes and patterns in the data linked to intervention strategies, targeted behaviours, study settings and study outcomes.

Thirty-two studies were included in the systematic review. Table 1 provides a detailed overview of the included systematic reviews comprising reference, strategy type, quality score, number of included studies, local, target population, effectiveness and scope of impact (see Table 1 at the end of the manuscript). Overall, the quality of the studies was high. Twenty-three studies scored 7, six studies scored 6, one study scored 5, one study scored 4 and one study scored 3. The primary focus of the review was on reviews of effectiveness studies, but a small number of reviews did include data from a wider range of methods, including qualitative studies, which added to the analysis in the papers [ 18 , 19 , 20 , 21 ]. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. In this section, we discuss the different EPOC-defined implementation strategies in turn. Interestingly, we found only two ‘new’ approaches in this review that did not fit into the existing EPOC approaches. These are a review focused on the use of social media and a review considering toolkits. In addition to single interventions, we also discuss multi-faceted interventions. These were the most common intervention approach overall. A summary is provided in Table  2 .
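As a quick consistency check on the quality scores reported above, the tally can be reproduced in a few lines. This is only an illustrative sketch: the counts come from the text, but the mean and median are summary figures computed here (about 6.5 and 7 respectively), not values reported by the authors.

```python
from collections import Counter
from statistics import median

# Quality scores as reported in the text: score -> number of reviews.
score_counts = Counter({7: 23, 6: 6, 5: 1, 4: 1, 3: 1})

# The counts should add up to the 32 included reviews.
total = sum(score_counts.values())

# Share of reviews at each score, highest score first.
for score in sorted(score_counts, reverse=True):
    n = score_counts[score]
    print(f"score {score}: {n:2d} reviews ({n / total:.0%})")

# Summary figures derived here for illustration (not reported in the source).
mean_score = sum(score * n for score, n in score_counts.items()) / total
median_score = median(score_counts.elements())
print(f"total={total}, mean={mean_score:.2f}, median={median_score}")
```

The tally confirms that the five score groups account for all 32 reviews, with 29 of them scoring 6 or 7.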

Educational strategies

The overview identified three systematic reviews focusing on educational strategies. Grudniewicz et al. [ 22 ] explored the effectiveness of printed educational materials on primary care physician knowledge, behaviour and patient outcomes and concluded they were not effective in any of these aspects. Koota, Kääriäinen and Melender [ 23 ] focused on educational interventions promoting evidence-based practice among emergency room/accident and emergency nurses and found that interventions involving face-to-face contact led to significant or highly significant effects on patient benefits and emergency nurses’ knowledge, skills and behaviour. Interventions using written self-directed learning materials also led to significant improvements in nurses’ knowledge of evidence-based practice. Although the quality of the studies was high, the review primarily included small studies with low response rates, and many of them relied on self-assessed outcomes; consequently, the strength of the evidence for these outcomes is modest. Wu et al. [ 20 ] questioned if educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes. Although based on evaluation projects and qualitative data, their results also suggest that positive changes on patient outcomes can be made following the implementation of specific evidence-based approaches (or projects). The differing positive outcomes for educational strategies aimed at nurses might indicate that the target audience is important.

Local opinion leaders

Flodgren et al. [ 24 ] was the only systematic review focusing solely on opinion leaders. The review found that local opinion leaders alone, or in combination with other interventions, can be effective in promoting evidence-based practice, but this varies both within and between studies and the effect on patient outcomes is uncertain. The review found that, overall, any intervention involving opinion leaders probably improves healthcare professionals’ compliance with evidence-based practice, but the effect varies within and across studies. However, how opinion leaders had an impact could not be determined because insufficient details were provided, illustrating that reporting specific details in published studies is important if effective methods of increasing evidence-based practice are to be spread across a system. The usefulness of this review is questionable because it cannot provide evidence of what makes an effective opinion leader, whether teams of opinion leaders or a single opinion leader are most effective, or which methods used by opinion leaders are most effective.

Reminders

Pantoja et al. [ 26 ] was the only systematic review included in the overview that focused solely on manually generated reminders delivered on paper. The review explored how these affected professional practice and patient outcomes. The review concluded that manually generated reminders delivered on paper as a single intervention probably led to small to moderate increases in adherence to clinical recommendations, and they could be used as a single quality improvement intervention. However, the authors indicated that this intervention would make little or no difference to patient outcomes. The authors state that such a low-tech intervention may be useful in low- and middle-income countries where paper records are more likely to be the norm.

ICT-focused approaches

The three ICT-focused reviews [ 14 , 27 , 28 ] showed mixed results. Jamal, McKenzie and Clark [ 14 ] explored the impact of health information technology on the quality of medical and health care. They examined the impact of electronic health records, computerised provider order entry, or decision support systems. The review showed a positive improvement in adherence to evidence-based guidelines but not in patient outcomes. The number of studies included in the review was low, so a conclusive recommendation could not be reached based on this review. Similarly, Brown et al. [ 28 ] found that technology-enabled knowledge translation interventions may improve knowledge of health professionals, but all eight studies raised concerns of bias. The De Angelis et al. [ 27 ] review was more promising, reporting that ICT can be a good way of disseminating clinical practice guidelines but concluding that it is unclear which type of ICT method is the most effective.

Audit and feedback

Sykes, McAnuff and Kolehmainen [ 29 ] examined whether audit and feedback were effective in dementia care and concluded that it remains unclear which ingredients of audit and feedback are successful as the reviewed papers illustrated large variations in the effectiveness of interventions using audit and feedback.

Non-EPOC listed strategies: social media, toolkits

There were two new (non-EPOC listed) intervention types identified in this review compared to the 2011 review — fewer than anticipated. We categorised a third — ‘care bundles’ [ 36 ] as a multi-faceted intervention due to its description in practice and a fourth — ‘Technology Enhanced Knowledge Transfer’ [ 28 ] was classified as an ICT-focused approach. The first new strategy was identified in Bhatt et al.’s [ 30 ] systematic review of the use of social media for the dissemination of clinical practice guidelines. They reported that the use of social media resulted in a significant improvement in knowledge and compliance with evidence-based guidelines compared with more traditional methods. They noted that a wide selection of different healthcare professionals and patients engaged with this type of social media and its global reach may be significant for low- and middle-income countries. This review was also noteworthy for developing a simple stepwise method for using social media for the dissemination of clinical practice guidelines. However, it is debatable whether social media can be classified as an intervention or just a different way of delivering an intervention. For example, the review discussed involving opinion leaders and patient advocates through social media. However, this was a small review that included only five studies, so further research in this new area is needed. Yamada et al. [ 31 ] draw on 39 studies to explore the application of toolkits, 18 of which had toolkits embedded within larger KT interventions, and 21 of which evaluated toolkits as standalone interventions. The individual component strategies of the toolkits were highly variable though the authors suggest that they align most closely with educational strategies. 
The authors conclude that toolkits, as either standalone strategies or as part of MFIs, hold some promise for facilitating evidence use in practice but caution that the quality of many of the included primary studies is considered weak, which limits these findings.

Multi-faceted interventions

The majority of the systematic reviews ( n  = 20) reported on more than one intervention type. Some of these systematic reviews focus exclusively on multi-faceted interventions, whilst others compare different single or combined interventions aimed at achieving similar outcomes in particular settings. While these two approaches are often described in a similar way, they are actually quite distinct from each other as the former report how multiple strategies may be strategically combined in pursuance of an agreed goal, whilst the latter report how different strategies may be incidentally used in sometimes contrasting settings in the pursuance of similar goals. Ariyo et al. [ 35 ] helpfully summarise five key elements often found in effective MFI strategies in LMICs — but which may also be transferrable to HICs. First, effective MFIs encourage a multi-disciplinary approach acknowledging the roles played by different professional groups to collectively incorporate evidence-informed practice. Second, they utilise leadership drawing on a wide set of clinical and non-clinical actors including managers and even government officials. Third, multiple types of educational practices are utilised — including input from patients as stakeholders in some cases. Fourth, protocols, checklists and bundles are used — most effectively when local ownership is encouraged. Finally, most MFIs included an emphasis on monitoring and evaluation [ 35 ]. In contrast, other studies offer little information about the nature of the different MFI components of included studies which makes it difficult to extrapolate much learning from them in relation to why or how MFIs might affect practice (e.g. [ 28 , 38 ]). Ultimately, context matters, which some review authors argue makes it difficult to say with real certainty whether single or MFI strategies are superior (e.g. [ 21 , 27 ]). 
Taking all the systematic reviews together we may conclude that MFIs appear to be more likely to generate positive results than single interventions (e.g. [ 34 , 45 ]) though other reviews should make us cautious (e.g. [ 32 , 43 ]).

While multi-faceted interventions still seem to be more effective than single-strategy interventions, there were important distinctions between how the results of reviews of MFIs are interpreted in this review as compared to the previous reviews [ 8 , 9 ], reflecting greater nuance and debate in the literature. This was particularly noticeable where the effectiveness of MFIs was compared to single strategies, reflecting developments widely discussed in previous studies [ 10 ]. We found that most systematic reviews are bounded by their clinical, professional, spatial, system, or setting criteria and often seek to draw out implications for the implementation of evidence in their areas of specific interest (such as nursing or acute care). Frequently this means combining all relevant studies to explore the respective foci of each systematic review. Therefore, most reviews we categorised as MFIs actually include highly variable numbers and combinations of intervention strategies and highly heterogeneous original study designs. This makes statistical analyses of the type used by Squires et al. [ 10 ] on the three reviews in their paper not possible. Further, it also makes extrapolating findings and commenting on broad themes complex and difficult. This may suggest that future research should shift its focus from merely examining ‘what works’ to ‘what works where and what works for whom’ — perhaps pointing to the value of realist approaches to these complex review topics [ 48 , 49 ] and other more theory-informed approaches [ 50 ].

Some reviews have a relatively small number of studies (i.e. fewer than 10) and the authors are often understandably reluctant to engage with wider debates about the implications of their findings. Other larger studies do engage in deeper discussions about internal comparisons of findings across included studies and also contextualise these in wider debates. Some of the most informative studies (e.g. [ 35 , 40 ]) move beyond EPOC categories and contextualise MFIs within wider systems thinking and implementation theory. This distinction between MFIs and single interventions can actually be very useful as it offers lessons about the contexts in which individual interventions might have bounded effectiveness (i.e. educational interventions for individual change). Taken as a whole, this may also then help in terms of how and when to conjoin single interventions into effective MFIs.

In the two previous reviews, a consistent finding was that MFIs were more effective than single interventions [ 8 , 9 ]. However, like Squires et al. [ 10 ], this overview is more equivocal on this important issue. There are four points which may help account for the differences in findings in this regard. Firstly, the diversity of the systematic reviews in terms of clinical topic or setting is an important factor. Secondly, there is heterogeneity of the studies within the included systematic reviews themselves. Thirdly, there is a lack of consistency with regard to the definition of MFIs and the strategies included within them. Finally, there are epistemological differences across the papers and the reviews. This means that the results that are presented depend on the methods used to measure, report, and synthesise them. For instance, some reviews highlight that education strategies can be useful to improve provider understanding, but without wider organisational or system-level change they may struggle to deliver sustained transformation [ 19 , 44 ].

It is also worth highlighting the importance of the theory of change underlying the different interventions. Where authors of the systematic reviews draw on theory, there is space to discuss/explain findings. We note a distinction between theoretical and atheoretical systematic review discussion sections. Atheoretical reviews tend to present acontextual findings (for instance, one study found very positive results for one intervention, and this gets highlighted in the abstract) whilst theoretically informed reviews attempt to contextualise and explain patterns within the included studies. Theory-informed systematic reviews seem more likely to offer more profound and useful insights (see [ 19 , 35 , 40 , 43 , 45 ]). We find that the most insightful systematic reviews of MFIs engage in theoretical generalisation — they attempt to go beyond the data of individual studies and discuss the wider implications of the findings of the studies within their reviews drawing on implementation theory. At the same time, they highlight the active role of context and the wider relational and system-wide issues linked to implementation. It is these types of investigations that can help providers further develop evidence-based practice.

This overview has identified a small, but insightful set of papers that interrogate and help theorise why, how, for whom, and in which circumstances it might be the case that MFIs are superior (see [ 19 , 35 , 40 ] once more). At the level of this overview — and in most of the systematic reviews included — it appears to be the case that MFIs struggle with the question of attribution. In addition, there are other important elements that are often unmeasured, or unreported (e.g. costs of the intervention — see [ 40 ]). Finally, the stronger systematic reviews [ 19 , 35 , 40 , 43 , 45 ] engage with systems issues, human agency and context [ 18 ] in a way that was not evident in the systematic reviews identified in the previous reviews [ 8 , 9 ]. The earlier reviews lacked any theory of change that might explain why MFIs might be more effective than single ones — whereas now some systematic reviews do this, which enables them to conclude that sometimes single interventions can still be more effective.

As Nilsen et al. ([ 6 ] p. 7) note ‘Study findings concerning the effectiveness of various approaches are continuously synthesized and assembled in systematic reviews’. We may have gone as far as we can in understanding the implementation of evidence through systematic reviews of single and multi-faceted interventions and the next step would be to conduct more research exploring the complex and situated nature of evidence used in clinical practice and by particular professional groups. This would further build on the nuanced discussion and conclusion sections in a subset of the papers we reviewed. This might also support the field to move away from isolating individual implementation strategies [ 6 ] to explore the complex processes involving a range of actors with differing capacities [ 51 ] working in diverse organisational cultures. Taxonomies of implementation strategies do not fully account for the complex process of implementation, which involves a range of different actors with different capacities and skills across multiple system levels. There is plenty of work to build on, particularly in the social sciences, which currently sits at the margins of debates about evidence implementation (see for example, Normalisation Process Theory [ 52 ]).

There are several changes that we have identified in this overview of systematic reviews in comparison to the review we published in 2011 [ 8 ]. A consistent and welcome finding is that the overall quality of the systematic reviews themselves appears to have improved between the two reviews, although this is not reflected upon in the papers. This is exhibited through better, clearer reporting mechanisms in relation to the mechanics of the reviews, alongside a greater attention to, and deeper description of, how potential biases in included papers are discussed. Additionally, there is an increased, but still limited, inclusion of original studies conducted in low- and middle-income countries as opposed to just high-income countries. Importantly, we found that many of these systematic reviews are attuned to, and comment upon, the contextual distinctions of pursuing evidence-informed interventions in health care settings in different economic settings. Furthermore, systematic reviews included in this updated article cover a wider set of clinical specialities (both within and beyond hospital settings) and have a focus on a wider set of healthcare professions, discussing similarities, differences and inter-professional challenges faced therein, compared to the earlier reviews. These wider ranges of studies highlight that a particular intervention or group of interventions may work well for one professional group but be ineffective for another. This diversity of study settings allows us to consider the important role context (in its many forms) plays in implementing evidence into practice. Examining the complex and varied context of health care will help us address what Nilsen et al. ([ 6 ] p. 1) described as, ‘society’s health problems [that] require research-based knowledge acted on by healthcare practitioners together with implementation of political measures from governmental agencies’.
This will help us shift implementation science to move, ‘beyond a success or failure perspective towards improved analysis of variables that could explain the impact of the implementation process’ ([ 6 ] p. 2).

This review brings together 32 papers considering individual and multi-faceted interventions designed to support the use of evidence in clinical practice. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been conducted. As a whole, this substantial body of knowledge struggles to tell us more about the use of individual interventions and MFIs than: ‘it depends’. To really move forwards in addressing the gap between research evidence and practice, we may need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of perspectives, especially from the social, economic, political and behavioural sciences in primary studies, and diversifying the types of synthesis undertaken to include approaches such as realist synthesis, which facilitate exploration of the context in which strategies are employed. Harvey et al. [ 53 ] suggest that when context is likely to be critical to implementation success there are a range of primary research approaches (participatory research, realist evaluation, developmental evaluation, ethnography, quality/rapid cycle improvement) that are likely to be appropriate and insightful. While these approaches often form part of implementation studies in the form of process evaluations, they are usually relatively small scale in relation to implementation research as a whole. As a result, the findings often do not make it into the subsequent systematic reviews.
This review provides further evidence that we need to bring qualitative approaches in from the periphery to play a central role in many implementation studies and subsequent evidence syntheses. It would be helpful for systematic reviews, at the very least, to include more detail about the interventions and their implementation in terms of how and why they worked.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


Abbreviations

BA: Before and after study

CCT: Controlled clinical trial

EPOC: Effective Practice and Organisation of Care

HICs: High-income countries

ICT: Information and Communications Technology

ITS: Interrupted time series

KT: Knowledge translation

LMICs: Low- and middle-income countries

RCT: Randomised controlled trial

Grol R, Grimshaw J. From best evidence to best practice: effective implementation of change in patients’ care. Lancet. 2003;362:1225–30. https://doi.org/10.1016/S0140-6736(03)14546-1 .

Green LA, Seifert CM. Translation of research into practice: why we can’t “just do it.” J Am Board Fam Pract. 2005;18:541–5. https://doi.org/10.3122/jabfm.18.6.541 .

Eccles MP, Mittman BS. Welcome to Implementation Science. Implement Sci. 2006;1:1–3. https://doi.org/10.1186/1748-5908-1-1 .

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:2–14. https://doi.org/10.1186/s13012-015-0209-1 .

Waltz TJ, Powell BJ, Matthieu MM, Damschroder LJ, et al. Use of concept mapping to characterize relationships among implementation strategies and assess their feasibility and importance: results from the Expert Recommendations for Implementing Change (ERIC) study. Implement Sci. 2015;10:1–8. https://doi.org/10.1186/s13012-015-0295-0 .

Nilsen P, Ståhl C, Roback K, et al. Never the twain shall meet? - a comparison of implementation science and policy implementation research. Implementation Sci. 2013;8:2–12. https://doi.org/10.1186/1748-5908-8-63 .

Rycroft-Malone J, Seers K, Eldh AC, et al. A realist process evaluation within the Facilitating Implementation of Research Evidence (FIRE) cluster randomised controlled international trial: an exemplar. Implementation Sci. 2018;13:1–15. https://doi.org/10.1186/s13012-018-0811-0 .

Boaz A, Baeza J, Fraser A, European Implementation Score Collaborative Group (EIS). Effective implementation of research into practice: an overview of systematic reviews of the health literature. BMC Res Notes. 2011;4:212. https://doi.org/10.1186/1756-0500-4-212 .

Grimshaw JM, Shirran L, Thomas R, Mowatt G, Fraser C, Bero L, et al. Changing provider behavior – an overview of systematic reviews of interventions. Med Care. 2001;39(8 Suppl 2):II2–45.

Squires JE, Sullivan K, Eccles MP, et al. Are multifaceted interventions more effective than single-component interventions in changing health-care professionals’ behaviours? An overview of systematic reviews. Implement Sci. 2014;9:1–22. https://doi.org/10.1186/s13012-014-0152-6 .

Salvador-Oliván JA, Marco-Cuenca G, Arquero-Avilés R. Development of an efficient search filter to retrieve systematic reviews from PubMed. J Med Libr Assoc. 2021;109:561–74. https://doi.org/10.5195/jmla.2021.1223 .

Thomas JM. Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation? OA Evid Based Med. 2013;1:1–6.

Effective Practice and Organisation of Care (EPOC). The EPOC taxonomy of health systems interventions. EPOC Resources for review authors. Oslo: Norwegian Knowledge Centre for the Health Services; 2016. epoc.cochrane.org/epoc-taxonomy . Accessed 9 Oct 2023.

Jamal A, McKenzie K, Clark M. The impact of health information technology on the quality of medical and health care: a systematic review. Health Inf Manag. 2009;38:26–37. https://doi.org/10.1177/183335830903800305 .

Menon A, Korner-Bitensky N, Kastner M, et al. Strategies for rehabilitation professionals to move evidence-based knowledge into practice: a systematic review. J Rehabil Med. 2009;41:1024–32. https://doi.org/10.2340/16501977-0451 .

Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol. 1991;44:1271–8. https://doi.org/10.1016/0895-4356(91)90160-b .

Francke AL, Smit MC, de Veer AJ, et al. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med Inform Decis Mak. 2008;8:1–11. https://doi.org/10.1186/1472-6947-8-38 .

Jones CA, Roop SC, Pohar SL, et al. Translating knowledge in rehabilitation: systematic review. Phys Ther. 2015;95:663–77. https://doi.org/10.2522/ptj.20130512 .

Scott D, Albrecht L, O’Leary K, Ball GDC, et al. Systematic review of knowledge translation strategies in the allied health professions. Implement Sci. 2012;7:1–17. https://doi.org/10.1186/1748-5908-7-70 .

Wu Y, Brettle A, Zhou C, Ou J, et al. Do educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes? A systematic review. Nurse Educ Today. 2018;70:109–14. https://doi.org/10.1016/j.nedt.2018.08.026 .

Yost J, Ganann R, Thompson D, Aloweni F, et al. The effectiveness of knowledge translation interventions for promoting evidence-informed decision-making among nurses in tertiary care: a systematic review and meta-analysis. Implement Sci. 2015;10:1–15. https://doi.org/10.1186/s13012-015-0286-1 .

Grudniewicz A, Kealy R, Rodseth RN, Hamid J, et al. What is the effectiveness of printed educational materials on primary care physician knowledge, behaviour, and patient outcomes: a systematic review and meta-analyses. Implement Sci. 2015;10:2–12. https://doi.org/10.1186/s13012-015-0347-5 .

Koota E, Kääriäinen M, Melender HL. Educational interventions promoting evidence-based practice among emergency nurses: a systematic review. Int Emerg Nurs. 2018;41:51–8. https://doi.org/10.1016/j.ienj.2018.06.004 .

Flodgren G, O’Brien MA, Parmelli E, et al. Local opinion leaders: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD000125.pub5 .

Arditi C, Rège-Walther M, Durieux P, et al. Computer-generated reminders delivered on paper to healthcare professionals: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2017. https://doi.org/10.1002/14651858.CD001175.pub4 .

Pantoja T, Grimshaw JM, Colomer N, et al. Manually-generated reminders delivered on paper: effects on professional practice and patient outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD001174.pub4 .

De Angelis G, Davies B, King J, McEwan J, et al. Information and communication technologies for the dissemination of clinical practice guidelines to health professionals: a systematic review. JMIR Med Educ. 2016;2:e16. https://doi.org/10.2196/mededu.6288 .

Brown A, Barnes C, Byaruhanga J, McLaughlin M, et al. Effectiveness of technology-enabled knowledge translation strategies in improving the use of research in public health: systematic review. J Med Internet Res. 2020;22:e17274. https://doi.org/10.2196/17274 .

Sykes MJ, McAnuff J, Kolehmainen N. When is audit and feedback effective in dementia care? A systematic review. Int J Nurs Stud. 2018;79:27–35. https://doi.org/10.1016/j.ijnurstu.2017.10.013 .

Bhatt NR, Czarniecki SW, Borgmann H, et al. A systematic review of the use of social media for dissemination of clinical practice guidelines. Eur Urol Focus. 2021;7:1195–204. https://doi.org/10.1016/j.euf.2020.10.008 .

Yamada J, Shorkey A, Barwick M, Widger K, et al. The effectiveness of toolkits as knowledge translation strategies for integrating evidence into clinical care: a systematic review. BMJ Open. 2015;5:e006808. https://doi.org/10.1136/bmjopen-2014-006808 .

Afari-Asiedu S, Abdulai MA, Tostmann A, et al. Interventions to improve dispensing of antibiotics at the community level in low and middle income countries: a systematic review. J Glob Antimicrob Resist. 2022;29:259–74. https://doi.org/10.1016/j.jgar.2022.03.009 .

Boonacker CW, Hoes AW, Dikhoff MJ, Schilder AG, et al. Interventions in health care professionals to improve treatment in children with upper respiratory tract infections. Int J Pediatr Otorhinolaryngol. 2010;74:1113–21. https://doi.org/10.1016/j.ijporl.2010.07.008 .

Al Zoubi FM, Menon A, Mayo NE, et al. The effectiveness of interventions designed to increase the uptake of clinical practice guidelines and best practices among musculoskeletal professionals: a systematic review. BMC Health Serv Res. 2018;18:2–11. https://doi.org/10.1186/s12913-018-3253-0 .

Ariyo P, Zayed B, Riese V, Anton B, et al. Implementation strategies to reduce surgical site infections: a systematic review. Infect Control Hosp Epidemiol. 2019;3:287–300. https://doi.org/10.1017/ice.2018.355 .

Borgert MJ, Goossens A, Dongelmans DA. What are effective strategies for the implementation of care bundles on ICUs: a systematic review. Implement Sci. 2015;10:1–11. https://doi.org/10.1186/s13012-015-0306-1 .

Cahill LS, Carey LM, Lannin NA, et al. Implementation interventions to promote the uptake of evidence-based practices in stroke rehabilitation. Cochrane Database Syst Rev. 2020. https://doi.org/10.1002/14651858.CD012575.pub2 .

Pedersen ER, Rubenstein L, Kandrack R, Danz M, et al. Elusive search for effective provider interventions: a systematic review of provider interventions to increase adherence to evidence-based treatment for depression. Implement Sci. 2018;13:1–30. https://doi.org/10.1186/s13012-018-0788-8 .

Jenkins HJ, Hancock MJ, French SD, Maher CG, et al. Effectiveness of interventions designed to reduce the use of imaging for low-back pain: a systematic review. CMAJ. 2015;187:401–8. https://doi.org/10.1503/cmaj.141183 .

Bennett S, Laver K, MacAndrew M, Beattie E, et al. Implementation of evidence-based, non-pharmacological interventions addressing behavior and psychological symptoms of dementia: a systematic review focused on implementation strategies. Int Psychogeriatr. 2021;33:947–75. https://doi.org/10.1017/S1041610220001702 .

Noonan VK, Wolfe DL, Thorogood NP, et al. Knowledge translation and implementation in spinal cord injury: a systematic review. Spinal Cord. 2014;52:578–87. https://doi.org/10.1038/sc.2014.62 .

Albrecht L, Archibald M, Snelgrove-Clarke E, et al. Systematic review of knowledge translation strategies to promote research uptake in child health settings. J Pediatr Nurs. 2016;31:235–54. https://doi.org/10.1016/j.pedn.2015.12.002 .

Campbell A, Louie-Poon S, Slater L, et al. Knowledge translation strategies used by healthcare professionals in child health settings: an updated systematic review. J Pediatr Nurs. 2019;47:114–20. https://doi.org/10.1016/j.pedn.2019.04.026 .

Bird ML, Miller T, Connell LA, et al. Moving stroke rehabilitation evidence into practice: a systematic review of randomized controlled trials. Clin Rehabil. 2019;33:1586–95. https://doi.org/10.1177/0269215519847253 .

Goorts K, Dizon J, Milanese S. The effectiveness of implementation strategies for promoting evidence informed interventions in allied healthcare: a systematic review. BMC Health Serv Res. 2021;21:1–11. https://doi.org/10.1186/s12913-021-06190-0 .

Zadro JR, O’Keeffe M, Allison JL, Lembke KA, et al. Effectiveness of implementation strategies to improve adherence of physical therapist treatment choices to clinical practice guidelines for musculoskeletal conditions: systematic review. Phys Ther. 2020;100:1516–41. https://doi.org/10.1093/ptj/pzaa101 .

Van der Veer SN, Jager KJ, Nache AM, et al. Translating knowledge on best practice into improving quality of RRT care: a systematic review of implementation strategies. Kidney Int. 2011;80:1021–34. https://doi.org/10.1038/ki.2011.222 .

Pawson R, Greenhalgh T, Harvey G, et al. Realist review–a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy. 2005;10Suppl 1:21–34. https://doi.org/10.1258/1355819054308530 .

Rycroft-Malone J, McCormack B, Hutchinson AM, et al. Realist synthesis: illustrating the method for implementation research. Implementation Sci. 2012;7:1–10. https://doi.org/10.1186/1748-5908-7-33 .

Johnson MJ, May CR. Promoting professional behaviour change in healthcare: what interventions work, and why? A theory-led overview of systematic reviews. BMJ Open. 2015;5:e008592. https://doi.org/10.1136/bmjopen-2015-008592 .

Metz A, Jensen T, Farley A, Boaz A, et al. Is implementation research out of step with implementation practice? Pathways to effective implementation support over the last decade. Implement Res Pract. 2022;3:1–11. https://doi.org/10.1177/26334895221105585 .

May CR, Finch TL, Cornford J, Exley C, et al. Integrating telecare for chronic disease management in the community: What needs to be done? BMC Health Serv Res. 2011;11:1–11. https://doi.org/10.1186/1472-6963-11-131 .

Harvey G, Rycroft-Malone J, Seers K, Wilson P, et al. Connecting the science and practice of implementation – applying the lens of context to inform study design in implementation research. Front Health Serv. 2023;3:1–15. https://doi.org/10.3389/frhs.2023.1162762 .

Download references


The authors would like to thank Professor Kathryn Oliver for her support in planning the review, Professor Steve Hanney for reading and commenting on the final manuscript, and the staff at LSHTM library for their support in planning and conducting the literature search.

This study was supported by LSHTM’s Research England QR strategic priorities funding allocation and the National Institute for Health and Care Research (NIHR) Applied Research Collaboration South London (NIHR ARC South London) at King’s College Hospital NHS Foundation Trust. Grant number NIHR200152. The views expressed are those of the author(s) and not necessarily those of the NIHR, the Department of Health and Social Care or Research England.

Author information

Authors and affiliations.

Health and Social Care Workforce Research Unit, The Policy Institute, King’s College London, Virginia Woolf Building, 22 Kingsway, London, WC2B 6LE, UK

Annette Boaz

King’s Business School, King’s College London, 30 Aldwych, London, WC2B 4BG, UK

Juan Baeza & Alec Fraser

Federal University of Santa Catarina (UFSC), Campus Universitário Reitor João Davi Ferreira Lima, Florianópolis, SC, 88.040-900, Brazil

Erik Persson


AB led the conceptual development and structure of the manuscript. EP conducted the searches and data extraction. All authors contributed to screening and quality appraisal. EP and AF wrote the first draft of the methods section. AB, JB and AF performed result synthesis and contributed to the analyses. AB wrote the first draft of the manuscript and incorporated feedback and revisions from all other authors. All authors revised and approved the final manuscript.

Corresponding author

Correspondence to Annette Boaz .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Appendix A.

Additional file 2: Appendix B.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Boaz, A., Baeza, J., Fraser, A. et al. ‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice. Implementation Sci 19 , 15 (2024). https://doi.org/10.1186/s13012-024-01337-z

Received : 01 November 2023

Accepted : 05 January 2024

Published : 19 February 2024

DOI : https://doi.org/10.1186/s13012-024-01337-z

  • Implementation
  • Interventions
  • Clinical practice
  • Research evidence
  • Multi-faceted

Implementation Science

ISSN: 1748-5908

Open Access


Research Article

Electronic Medical Records implementation in hospital: An empirical investigation of individual and organizational determinants

Contributed equally to this work with: Anna De Benedictis, Emanuele Lettieri

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Current address: Department of Healthcare Professions, University Hospital Campus Bio-Medico, Rome, Italy

Affiliations Department of Healthcare Professions, University Hospital Campus Bio-Medico, Rome, Italy, Faculty of Medicine & Surgery, University Campus Bio-Medico, Rome, Italy

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Current address: Department of Economics, Management and Industrial Engineering, Politecnico of Milan, Milan, Italy

Affiliation Department of Economics, Management and Industrial Engineering, Politecnico of Milan, Milan, Italy

Roles Conceptualization, Data curation, Methodology, Writing – review & editing

Roles Conceptualization, Methodology, Writing – review & editing

Roles Formal analysis, Investigation, Project administration

Affiliation Department of Healthcare Professions, University Hospital Campus Bio-Medico, Rome, Italy

Roles Conceptualization, Writing – review & editing

  • Anna De Benedictis, 
  • Emanuele Lettieri, 
  • Luca Gastaldi, 
  • Cristina Masella, 
  • Alessia Urgu, 
  • Daniela Tartaglini


  • Published: June 4, 2020
  • https://doi.org/10.1371/journal.pone.0234108

The implementation of hospital-wide Electronic Medical Records (EMRs) is still an unsolved quest for many hospital managers. EMRs have long been considered a key factor for improving healthcare quality and safety, reducing adverse events for patients, decreasing costs, optimizing processes, improving clinical research and obtaining best clinical performances. However, hospitals continue to experience resistance from professionals to accepting EMRs. This study combines institutional and individual factors to explain which determinants can trigger or inhibit EMR implementation in hospitals, and which variables managers can exploit to guide professionals’ behaviours. Data were collected through a survey administered to physicians and nurses in an Italian University Hospital in Rome. A total of 114 high-quality responses were received. Results show that both physicians and nurses expect many benefits from the use of EMRs. In particular, respondents believe that EMRs will have a positive impact on quality, efficiency and effectiveness of care; handover communication between healthcare workers; teaching, tutoring and research activities; and control over their own tasks. Moreover, data show an interplay between individual and institutional determinants: normative factors directly affect perceived usefulness (C = 0.30 **), perceived ease of use (C = 0.26 **) and intention to use EMRs (C = 0.33 **), regulative factors affect the intention to use EMRs (C = -0.21 **), and perceived usefulness directly affects the intention to use EMRs (C = 0.33 **). The analysis shows that the key determinants of the intention to use EMRs are normative (peer influence) and individual (perceived usefulness), and that perceived usefulness also works as a mediator between normative factors and intention to use EMRs. Management can therefore leverage power users to motivate, generate and manage change.

Citation: De Benedictis A, Lettieri E, Gastaldi L, Masella C, Urgu A, Tartaglini D (2020) Electronic Medical Records implementation in hospital: An empirical investigation of individual and organizational determinants. PLoS ONE 15(6): e0234108. https://doi.org/10.1371/journal.pone.0234108

Editor: Stefano Triberti, University of Milan, ITALY

Received: September 16, 2019; Accepted: May 19, 2020; Published: June 4, 2020

Copyright: © 2020 De Benedictis et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.


Healthcare is among the most complex and fast-moving industries. New digital technologies are constantly being developed, all with the potential to support clinical practice by bringing many advantages into the healthcare sector [ 1 ]. Nevertheless, the healthcare industry has lagged behind other sectors in the adoption of Information Technology (IT) in the workplace [ 2 ]. Electronic Medical Records (EMRs) have long been considered a key factor for improving healthcare quality and safety, reducing adverse events for patients, decreasing costs, optimizing processes, improving clinical research and obtaining best clinical performance [e.g., 3 – 5 ]. However, the pace of adoption of EMRs, as of other digital technologies, in healthcare continues to lag [ 2 , 6 ], and hospitals continue to experience resistance from professionals to accepting digital technology [ 7 ]. Though many research and development programs exist and venture capital investment has been growing, successful IT projects in healthcare continue to be rare, and a plan to accelerate innovation is needed, beginning with a diagnosis of the problem [ 2 ]. Some studies have analyzed both individual and organizational factors that affect the acceptance and implementation of technology [ 8 ], but they have generated mixed results [ 9 ]. Indeed, the mechanisms that drive the adoption and implementation of IT in hospitals remain unclear. Organizational studies conceive organizations as strongly institutionalized settings in which individual behaviours are influenced by regulations, social norms and cultural systems [ 10 , 11 ]. In contrast, Information Science has mostly adopted user acceptance models, which emphasise individuals’ rational and volitional assessment of the costs and benefits they would attain from the new digital technology [ 11 ].

Hospitals are highly institutionalized and regulated contexts, in terms of regulatory oversight and professional roles, and are operationally and technically complex [ 12 ]. Physicians and nurses have a high level of professionalism, and they often affiliate within their specialities via professional training and participation in speciality-focused organizations [ 13 ]. Successful adoption or perceived usefulness of EMRs by others within their specialities may influence hospital professionals’ decisions, particularly if they are uncertain about individual benefits. Nevertheless, the majority of academic research on IT adoption in healthcare has focused on the individual level [ 14 ]. The most widely used model to explore issues related to the acceptance of technology is the Technology Acceptance Model (TAM) [ 15 ], which identifies two main antecedents: the perceived usefulness and the perceived ease of use of technology. The TAM has been validated in multiple settings [e.g. 16 – 18 ]. In its basic framework, the end user’s attitudes and perceptions regarding the use of new technology determine the user’s behavioural intention to use it. Institutional theory, instead, is based on the assumption that individual behaviours are modelled by regulations, social norms and meaning systems, and that institutions embodied in routines rely on automatic cognition and uncritical processing of existing schemata, privileging consistency with stereotypes and speed over accuracy [ 19 ]. Thus, in this theory, normative and cultural conditions are co-determinants of the adoption of new technologies [ 20 ]. The use of institutional theory in Information Science is rare compared to other fields such as organization science [ 21 ]. However, several studies have used an institutional approach to explore the adoption of technology, considering institutional forces as crucial to shaping organizational actions and the opinions of decision-makers [ 22 , 23 , 24 ].

Both institutional theory and user acceptance models have independently tried to incorporate elements of the other theory to enrich their explanatory power [ 2 ]. User acceptance models have incorporated the direct effects of social influences and organizational conditions on individuals’ behavioural intention [ 25 , 26 ], and institutional studies have demonstrated that even when professionals are subject to institutional influences, their self-determination plays an important role even in highly-institutionalized and regulated settings such as hospitals [ 27 ]. Previous studies about technology acceptance and adoption compared individual and social levels including environmental factors [ 22 , 28 – 30 ], typically based on the diffusion of innovation theory (DOI) [ 31 ] or the TOE (technology, organization, and environment) framework [ 32 ]. Moreover, only a few studies have tested both explanations (institutional and individual) in an integrative framework [ 23 ] to explain the behaviour of organizations.

The main purpose of this study was to explore the main determinants of hospital professionals’ intention to use EMRs through a novel theoretical model that combines organizational theories and technology acceptance models. By combining these theories, this study investigated the interplay between organizational and individual factors, offering novel insights into how and to what extent this interplay might trigger or inhibit hospital professionals’ acceptance of digital technology. This study focused on perceived usefulness and perceived ease of use as explanatory factors at the individual level, and on inter-hospital normative and regulative forces as explanatory factors at the organizational level. Intention to use was preferred to repetitive use as the dependent variable because of the still relatively low adoption rate of EMRs in many countries such as Italy, where this study is located. In the specific case of Italy, a recent report issued by the Politecnico di Milano within the research activities of the Permanent Observatory of Digital Transformation in Health Care [ 33 ] pointed out that only 53% of Italian hospitals have in place an EMR for therapy management, only 30% have in place an EMR that collects vital parameters and informed consent, and only 19% have in place an EMR that supports clinical decision-making. A large number of Italian hospitals, as well as hospitals in other countries that are still lagging in the adoption of EMRs, are therefore expected to commit to adopting EMRs in the coming years, and understanding which individual or organizational factors might shape hospital professionals’ intention to use EMRs might contribute to the successful adoption and implementation of these and other digital technologies. The results of this study might thus be valuable for hospital managers and professionals of different countries who are going to invest in the digital transformation of their hospitals.

Material and methods

Ethics statement.

The study was approved by the Ethics Board of the University Hospital Campus Bio-Medico of Rome (approval number: 61/16 OSS ComEt CBM), and written consent was obtained from the professionals involved in the study.

Theoretical background

To evaluate the potential interplay between individual and institutional variables, a research framework was created ( Fig 1 ). The framework integrates into a coherent view two theories that belong to two different bodies of literature:

  • The Technology Acceptance Model (TAM), from Information Science, that has been widely used in the last decades in healthcare to understand what leads professionals or patients to accept or reject Information Technology [ 15 ];
  • The Institutional Theory, from Public Management, that has been largely adopted in the last decades to assess how institutional factors shape professionals’ behaviours [ 34 – 36 ].


Technology acceptance model.

Davis introduced the TAM in 1989 [ 15 ]. The main problem raised by the author was to understand what leads people to accept or reject Information Technology. In this regard, two main variables were identified: perceived usefulness and perceived ease of use. Perceived usefulness measures “the degree to which a person believes that using a particular system would enhance his or her job performance” [ 15 ], and therefore induces individuals to use a technology because it allows them to obtain better results in their work. Ease of use, on the other hand, measures “the degree to which a person believes that using a system would be free of effort” [ 15 ] and induces potential users to adopt a technology because its advantages come at a low expenditure of effort.

Institutional theory.

The Institutional Theory refers to a line of organizational research that recognizes the significant organizational effects associated with cultural and social forces. According to Scott [ 34 – 36 ], “Institutions are made up of cultural-cognitive, normative and regulative elements, which together with associated activities and resources offer stability and meaning to social life.” These three forces are present in fully developed institutional systems, with economists and political scientists emphasizing regulative factors, sociologists emphasizing normative factors, and anthropologists and organizational theorists emphasizing cultural-cognitive factors. According to this perspective, individuals are embedded in institutional pillars that limit the scope of their rational assessment and direct their engagement in specific behaviours [ 34 – 36 ]. Scott [ 34 – 36 ] defines the three institutional pillars as follows:

  • regulative pillars : which regard the existence of regulations, rules and processes whose breach is monitored and sanctioned;
  • normative pillars : which introduce a social dimension of appropriate behaviours in the organization;
  • cultural pillars : which emphasize the use of common schemas, frames, and other shared symbolic representations that create an attachment to the ‘appropriate’ behaviour.

Research framework

Consistent with our research questions, we combined the two theories described above to develop an original, comprehensive research framework in which individual and institutional determinants are interlinked to explore their potential interplay in explaining hospital professionals’ intention to use an EMR. In line with past research on user acceptance of new technologies [ 36 , 37 ], we considered age and job seniority as key control variables. Additionally, to narrow the knowledge gap about how hospital professionals belonging to different professions (e.g., physicians vs. nurses) or different specialities (e.g., cardiology vs. orthopaedics) might be interested in using an EMR, we included clinical speciality and profession as control variables. Fig 1 offers a synoptic view of our research framework, where the dependent variable (i.e., the intention to use an EMR) is explained by individual factors from TAM (i.e., perceived usefulness and perceived ease of use) as well as by institutional factors from Institutional Theory (i.e., regulative factors, which refer to the degree of adhesion to hospital managers’ goals, and normative factors, which capture peer influence among hospital colleagues). Control variables are also displayed.

According to the research questions and the research framework, the following research hypotheses (H) were stated: H1: Individual factors (perceived usefulness, perceived ease of use) directly affect the intention to use EMRs; H2: Organizational factors (normative and regulative factors) directly affect individual factors and the intention to use EMRs; H3: Some control variables (age, seniority, clinical specialities and different professions) directly affect individual factors and the intention to use EMRs.

Setting and research methodology

Given the explorative nature of this study, a single case study research design was adopted. The choice of a single case study offers the opportunity to eliminate potential confounding factors due to the heterogeneity, in terms of strategy, legacy, professionals’ behaviours and technology infrastructure, that different hospitals might show. We selected the Teaching Hospital Campus Bio-Medico (CBM) in Rome (Italy) as an adequate setting for investigating our research questions. This hospital is mid-size (around 300 beds), multi-disciplinary, teaching and private. Because it is a teaching hospital, there is more room for divergent goals between professionals and managers, which creates a suitable setting in which to investigate the interplay between individual and organizational factors. Because it is multi-disciplinary, there is room to study the potential conflict among professionals from different disciplines concerning the intention to use EMRs. Finally, being mid-size, CBM is a valid setting in which to observe the potential divergence between nurses and doctors in the intention to use EMRs. A quantitative study was performed using a survey administered to hospital professionals (physicians and nurses). The questionnaire was designed based on scales identified in the literature and reviewed in detail by the authors. Moreover, a pilot test of the questionnaire was carried out before the survey. The initial questionnaire comprised 20 items that were reviewed for face validity by a panel of four experts, consisting of one nurse and one physician, each with more than 9 years of work experience, and two engineers with expertise in Information Science. Panel members were asked to evaluate each statement for clarity, ease of use and appropriateness. Based on their comments and suggestions, five items were removed and the wording of several items was changed to increase clarity.

This 15-item questionnaire was tested for content validity by 10 experts not involved in the preceding phase, to assess its ability to measure the determinants of the intention to use EMRs in hospitals and to evaluate, for each item, its utility, consistency with the research objectives, ease of reply and other important aspects to take into account. Audio-recorded individual interviews using a semi-structured grid were carried out with the 10 experts, who included two nurses, three head nurses, two managers and three physicians. The interviews lasted 60 minutes on average and were conducted in a designated room by three researchers: one acted as the interviewer, while the other two helped with audio-recording and with filling out the grid for item evaluation. Based on the expert evaluation, three items were modified.

The questionnaire consists of two main sections: scales and constructs of the proposed model; and control variables and characteristics of respondents. Eleven items evaluated individual variables; in particular, the scale for the measurement of perceived usefulness was adapted from the studies of Venkatesh [ 38 , 39 ]. Organizational variables were explored through 4 items related to normative and regulative factors, with the scale adapted from the study of Scott [ 20 ]. The survey items are available in the Annex ( S1 Table ). Additional questions were designed to gather demographic and sample information. All questionnaire items related to the constructs of the proposed model were explored using a 7-point Likert scale, with 1 indicating “strongly disagree” and 7 “strongly agree”. The first reminder was sent one week after the deadline for completion. The second reminder was sent three days after the first, and the third three days after the second.

The statistical analysis was performed using Stata 14.1®. Internal consistency was evaluated through Cronbach’s Alpha coefficients. Path analysis was performed to test the proposed model, with a p-value of <0.05 considered significant. The association between profession (doctors vs. nurses) and the answers provided for each item was analyzed through Fisher’s test; a p-value of <0.05 was considered significant.
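To illustrate the kind of test described above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table, written in plain Python rather than Stata. It rests on several assumptions that are not in the study: the 7-point answers are collapsed into "agree" versus "not agree", the function name `fisher_exact_2x2` is hypothetical, and the counts in the example are invented for illustration only. The authors' analysis may well have used the full profession-by-answer table.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    The p-value is the total probability of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Invented counts (NOT study data): rows = nurses, physicians;
# columns = agree, not agree.
p = fisher_exact_2x2(12, 3, 5, 10)
print(f"two-sided p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

Collapsing a Likert item into two categories discards information, so this sketch is only meant to show how the exact p-value is formed from the table margins.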

The study was approved by the General Management and the Ethics Board of CBM. The link to the online questionnaire was sent by e-mail to 380 nurses and 250 physicians representative of different clinical areas. All questionnaires were filled out anonymously between February and September 2018. The final sample included 114 hospital professionals (response rate 19%), comprising 78 (68%) nurses and 36 (32%) physicians. There were 84 (74%) females and 30 (26%) males, aged 37.4 years on average (range 23–66, SD 9.6), with a mean work experience of 13.24 years (range 0.5–41, SD 8.73). The sample of respondents was compared, in terms of age, gender and clinical experience, with the whole population of doctors and nurses enrolled at CBM, confirming the absence of potential response biases related to the non-respondents.

Questionnaire’s internal consistency

The internal consistency of constructs was evaluated through Cronbach's Alpha coefficients; values greater than or equal to 0.7 were considered acceptable (α ≥ 0.9 excellent; 0.8 ≤ α < 0.9 good; 0.7 ≤ α < 0.8 acceptable; 0.6 ≤ α < 0.7 questionable; 0.5 ≤ α < 0.6 poor; α < 0.5 unacceptable) ( Table 1 ).
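As an illustration of the coefficient and the interpretation bands above, here is a minimal sketch in plain Python using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function names `cronbach_alpha` and `interpret` and the toy answers are assumptions made for this example; they are not taken from the study, which used Stata.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def interpret(alpha):
    """Map alpha to the bands quoted in the text."""
    bands = [(0.9, "excellent"), (0.8, "good"), (0.7, "acceptable"),
             (0.6, "questionable"), (0.5, "poor")]
    for threshold, label in bands:
        if alpha >= threshold:
            return label
    return "unacceptable"


# Invented 7-point Likert answers (NOT study data): 5 respondents x 3 items.
toy = [[6, 7, 6], [5, 5, 6], [3, 4, 3], [7, 7, 7], [2, 3, 2]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3), interpret(alpha))
```

Sample variances (n − 1 denominator) are used for both the items and the total score, which is the usual convention; any consistent choice gives the same α.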



Determinants of current behaviours

The data show that both physicians and nurses expect many benefits from the use of EMRs. In particular, they think EMRs will have a positive impact on relevant factors such as the quality, efficiency and effectiveness of care; handover communication among healthcare workers; teaching, tutoring and research activities; and control over their tasks. The data confirm that perceived usefulness (C = 0.33**) directly affects the intention to use EMRs. Concerning the organizational factors, the data indicate an interplay between them and the individual determinants: normative factors directly affect perceived usefulness (C = 0.30**), perceived ease of use (C = 0.26**) and intention to use EMRs (C = 0.33**). Regulative factors affect the intention to use EMRs with a negative sign (C = -0.21**). Control variables (i.e., age, seniority, clinical area and profession) have no impact on the other variables in our model. Fig 2 offers a graphical representation of our results.



Moreover, the findings show a significant association between profession (nurse or physician) and both the perceived ease of use and the intention to use EMRs. In particular, more nurses than physicians perceive EMRs as easy to use (p = 0.019 for the item “the EMR will be easy to use”) and state that they would like to use them (p = 0.01 for the item “if I had the opportunity I would use the EMR for most of my work’s processes”).

This study sought to clarify the relationship between organizational and individual determinants of nurses’ and physicians’ intention to use EMRs in a hospital setting. Previous studies [ 40 – 46 ] have focused mainly on either the barriers to or the facilitators of EMR implementation but, to the best of the authors’ knowledge, none has examined whether and how organizational and individual factors interact and jointly affect hospital professionals’ motivation to use EMRs. Our findings confirmed the positive role of perceived usefulness as the individual factor driving the intention to use EMRs and shed light on the significant positive role of normative factors (peer influence) [ 2 ], through both direct and indirect effects. In this view, hospital managers can leverage peer influence (for example, through innovation champions) to motivate and manage change and to create a virtuous circle inside the hospital that encourages the use of EMRs. The EMR implementation process should take into account that professionals need adequate time to re-establish control over their tasks and processes. The introduction of EMRs into daily clinical practice changes the status quo: on the one hand it opens many new opportunities; on the other it brings changes whose effects on hospital professionals can differ according to their characteristics, knowledge, skills and type of work. This is what generally happens when implementation is effective, whereas the consequences of poorly managed implementation can be very complex and require a greater expenditure of time, energy and money to bring processes back to their previous speed and functionality. In this sense, increasing users’ motivation in all phases of the project is essential for effective change management.
This study confirms the importance of involving front-line professionals as soon as the hospital decides to start the implementation phase, in order to increase their motivation to use EMRs. As a result of their involvement, professionals will better understand the rationale for this technological shift, and their perception of its usefulness will increase accordingly. Moreover, as reported by Gastaldi et al. [ 2 ], in the absence of coercive mechanisms, institutional pressures toward EMR use are primarily normative and/or mimetic.

In this study, the construct “Regulative factor” was derived from institutional theory and is intended to capture the pressure that a hospital professional might perceive from the goals set by hospital managers. This pressure is meant to be independent of any specific strategy or initiative and to reflect a hospital professional’s general willingness to align his or her behaviour with the goals set by hospital managers. An example item is: “I very much agree with most of the objectives of the management”. The regulative factor should be analyzed together with the construct “Normative factor”, which captures the perceived pressure from peers. Hospitals are regarded as professional bureaucracies in which professionals feel pressure from peers more than from top managers. Interestingly, the regulative factor negatively affects the intention to use: the greater the general agreement with managers’ goals, the lower the intention to use an EMR. This finding might appear counter-intuitive and contrary to what has been found in other studies [ 47 ]. It cannot be explained by a potential misalignment between the goals of hospital managers and those of physicians and nurses, with the former more focused on efficiency and the latter on the effectiveness of care delivery. Managers at CBM have proved to be committed to the quality of care and not to efficiency strategies that might reduce the effectiveness of care. This context is quite typical in Italy, where the tensions between “medicine” and “management” are less evident than in other countries, such as the US. We think that the negative impact of the regulative factors arises because hospital managers did not specify their goals for the digital transformation of care delivery in enough detail, thus negatively affecting hospital professionals’ perception of the usefulness of an EMR.
Because these goals were rather general (e.g., providing support to research activities and care delivery, promoting efficiency and process redesign), the linkage between the regulative factors and the perception of usefulness failed to materialize, and the linkage between the regulative factors and the intention to use EMRs became negative as hospital professionals lost the connection between EMR usage and managers’ goals. In this view, more contextualized goals about the usage of EMRs are expected to positively affect the intention to use them among those professionals who are more willing to adhere to managers’ goals. This interpretation should be tested and confirmed by further replication studies that capture in more detail the relationships between regulative factors and either perceived usefulness or the intention to use. For instance, it might be valuable to understand whether and how these relationships are affected by the co-development of hospital goals between managers and professionals, as well as by the specific content of those goals (financial vs. quality of care, operative vs. research).

This study offers original insights to further the ongoing debate about the digital transformation of hospitals, with a focus on EMRs. Our results show that there is an interplay between individual and organizational factors in shaping hospital professionals’ intention to use EMRs. The study showed that the main determinants of the intention to use EMRs are the normative ones (peer influence) and the individual ones (perceived usefulness).

From an academic viewpoint, the study offers an original perspective and a new theoretical framework, which combines organizational theories and technology acceptance models to explain hospital professionals’ acceptance of EMRs. In particular, the results confirm the importance of individual variables, not only as directly related to the acceptance of new technology, but also as important mediators between institutional variables and acceptance, thus highlighting and confirming the importance of the connections between organizational studies and information science.

Despite its original contributions, this study suffers from at least two limitations that should be addressed by future research. First, the research design is based on a single case study. Further research should consider a multi-centre design, which would allow our results to be generalized. Moreover, a multi-centre study would allow exploration of the role that hospital characteristics (in terms of strategy, legacy, etc.) might have in shaping both the organizational and the individual factors investigated in this study. Second, this study investigated the intention to use EMRs as the dependent variable. Further research should consider hospitals where EMRs are already mature technologies, which would allow investigation of actual use and of the factors that facilitate or inhibit the translation of the intention to use into actual use.

Supporting information

S1 Table. Questionnaire.


S2 Table. Perceived usefulness.


S3 Table. Perceived ease of use.


S4 Table. Intention to use.


S5 Table. Normative factors (Peer influence).


S6 Table. Regulative factors (Adhesion to the management objectives).



We want to thank Dr Federica Segato for her valuable comments in all phases of this study.

  • 1. Barlow J. Managing Innovation in Healthcare. World Scientific Publishing Europe. London; 2017.
  • 10. Scott W.R., & Davis G. Organizations and organizing: rational, natural and open systems perspectives. New Jersey: Prentice Hall; 2008.
  • 12. Scott W.R., Ruef P., Mendel P., Caronna C. Institutional change and healthcare organizations: from professional dominance to managed care. University of Chicago Press. Chicago; 2000.
  • 19. Lawrence T.B., Suddaby R. and Leca B. Institutional Work: Actors and Agency in Institutional Studies of Organizations. Cambridge University Press, Cambridge, UK; 2009.
  • 27. Leca, B., Battilana, J. and Boxenbaum, E. Agency and Institutions: A Review of Institutional Entrepreneurship, Working Paper 08–096, Harvard Business School; 2008.
  • 31. Rogers E. Diffusion of Innovations. 4th ed., Free Press, New York; 1995.
  • 32. Tornatzky L., Fleischer M. The Process of Technology Innovation, Lexington Books, Lexington, MA; 1990.
  • 33. https://www.osservatori.net/it_it/pubblicazioni/rapporti/sanita-digitale-italia-stato-dell-arte-e-trend .

The Impact of the Electronic Health Record on Moving New Evidence-Based Nursing Practices Forward


  • 1 The Ohio State University Wexner Medical Center, The Ohio State University College of Nursing, Columbus, OH, USA.
  • 2 Translational Implementation Science Core, Helene Fuld Health Trust National Institute for EBP in Nursing and Healthcare, Columbus, OH, USA.
  • 3 The Ohio State University Wexner Medical Center, Columbus, OH, USA.
  • 4 Helene Fuld Health Trust National Institute for EBP in Nursing and Healthcare, The Ohio State University, Columbus, OH, USA.
  • 5 Department of Critical Care Nursing, The Ohio State University Wexner Medical Center, Columbus, OH, USA.
  • PMID: 32233009
  • DOI: 10.1111/wvn.12435

Background: Anecdotal reports from across the country highlight the fact that nurses are facing major challenges in moving new evidence-based practice (EBP) initiatives into the electronic health record (EHR).

Purpose: The purpose of this study was to: (a) learn current processes for embedding EBP into EHRs, (b) uncover facilitators and barriers associated with rapid movement of new evidence-based nursing practices into the EHR and (c) identify strategies and processes that have been successfully implemented in healthcare organizations across the nation.

Methods: A qualitative study design was utilized. Purposive sampling was used to recruit nurses from across the country (N = 29). Nine focus group sessions were conducted. Semistructured interview questions were developed. Focus groups were conducted by video and audio conferencing. Using an inductive approach, each transcript was read and initial codes were generated resulting in major themes and subthemes.

Results: Five major themes were identified: (a) barriers to advancing EBP secondary to the EHR, (b) organizational structure and governing processes of the EHR, (c) current processes for prioritization of EHR changes, (d) impact on ability of clinicians to implement EBP and (e) wait times and delays.

Linking evidence to action: Delays in moving new EBP practice changes into the EHR are significant. These delays are sources of frustration and job dissatisfaction. Our results underscore the importance of a priori planning for anticipated changes and building expected delays into the timeline for EBP projects. Moreover, nurse executives must advocate for greater representation of nursing within informatics technology governance structures and additional resources to hire nurse informaticians.

Keywords: electronic health record; evidenced-based practice; hospital governance; nursing informatics.

© 2020 Sigma Theta Tau International.

  • Electronic Health Records / standards*
  • Electronic Health Records / trends
  • Evidence-Based Practice / methods*
  • Evidence-Based Practice / standards
  • Evidence-Based Practice / trends
  • Focus Groups / methods
  • Nursing Research / instrumentation*
  • Nursing Research / methods
  • Nursing Research / trends
  • Qualitative Research

Grants and funding

  • Helene Fuld Health Trust National Institute for EBP in Nursing and Healthcare

Electronic Health Record: A review

  • Open access
  • Published: 19 February 2024

Genomic data in the All of Us Research Program

The All of Us Research Program Genomics Investigators

Nature (2024)

  • Genetic variation
  • Genome-wide association studies

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics 1 , 2 , 3 , 4 . The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health 5 , 6 . Here we describe the programme’s genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.

Comprehensively identifying genetic variation and cataloguing its contribution to health and disease, in conjunction with environmental and lifestyle factors, is a central goal of human health research 1 , 2 . A key limitation in efforts to build this catalogue has been the historic under-representation of large subsets of individuals in biomedical research including individuals from diverse ancestries, individuals with disabilities and individuals from disadvantaged backgrounds 3 , 4 . The All of Us Research Program (All of Us) aims to address this gap by enrolling and collecting comprehensive health data on at least one million individuals who reflect the diversity across the USA 5 , 6 . An essential component of All of Us is the generation of whole-genome sequence (WGS) and genotyping data on one million participants. All of Us is committed to making this dataset broadly useful—not only by democratizing access to this dataset across the scientific community but also to return value to the participants themselves by returning individual DNA results, such as genetic ancestry, hereditary disease risk and pharmacogenetics according to clinical standards, to those who wish to receive these research results.

Here we describe the release of WGS data from 245,388 All of Us participants and demonstrate the impact of this high-quality data in genetic and health studies. We carried out a series of data harmonization and quality control (QC) procedures and conducted analyses characterizing the properties of the dataset including genetic ancestry and relatedness. We validated the data by replicating well-established genotype–phenotype associations including low-density lipoprotein cholesterol (LDL-C) and 117 additional diseases. These data are available through the All of Us Researcher Workbench, a cloud platform that embodies and enables programme priorities, facilitating equitable data and compute access while ensuring responsible conduct of research and protecting participant privacy through a passport data access model.

The All of Us Research Program

To accelerate health research, All of Us is committed to curating and releasing research data early and often 6 . Less than five years after national enrolment began in 2018, this fifth data release includes data from more than 413,000 All of Us participants. Summary data are made available through a public Data Browser, and individual-level participant data are made available to researchers through the Researcher Workbench (Fig. 1a and Data availability).

Figure 1

a, The All of Us Research Hub contains a publicly accessible Data Browser for exploration of summary phenotypic and genomic data. The Researcher Workbench is a secure cloud-based environment of participant-level data in a Controlled Tier that is widely accessible to researchers. b, All of Us participants have rich phenotype data from a combination of physical measurements, survey responses, EHRs, wearables and genomic data. Dots indicate the presence of the specific data type for the given number of participants. c, Overall summary of participants under-represented in biomedical research (UBR) with data available in the Controlled Tier. The All of Us logo in a is reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Participant data include a rich combination of phenotypic and genomic data (Fig. 1b ). Participants are asked to complete consent for research use of data, sharing of electronic health records (EHRs), donation of biospecimens (blood or saliva, and urine), in-person provision of physical measurements (height, weight and blood pressure) and surveys initially covering demographics, lifestyle and overall health 7 . Participants are also consented for recontact. EHR data, harmonized using the Observational Medical Outcomes Partnership Common Data Model 8 ( Methods ), are available for more than 287,000 participants (69.42%) from more than 50 health care provider organizations. The EHR dataset is longitudinal, with a quarter of participants having 10 years of EHR data (Extended Data Fig. 1 ). Data include 245,388 WGSs and genome-wide genotyping on 312,925 participants. Sequenced and genotyped individuals in this data release were not prioritized on the basis of any clinical or phenotypic feature. Notably, 99% of participants with WGS data also have survey data and physical measurements, and 84% also have EHR data. In this data release, 77% of individuals with genomic data identify with groups historically under-represented in biomedical research, including 46% who self-identify with a racial or ethnic minority group (Fig. 1c , Supplementary Table 1 and Supplementary Note ).

Scaling the All of Us infrastructure

The genomic dataset generated from All of Us participants is a resource for research and discovery and serves as the basis for return of individual health-related DNA results to participants. Consequently, the US Food and Drug Administration determined that All of Us met the criteria for a significant risk device study. As such, the entire All of Us genomics effort from sample acquisition to sequencing meets clinical laboratory standards 9 .

All of Us participants were recruited through a national network of partners, starting in 2018, as previously described 5 . Participants may enrol through All of Us-funded health care provider organizations or direct volunteer pathways, and all biospecimens, including blood and saliva, are sent to the central All of Us Biobank for processing and storage. Genomics data for this release were generated from blood-derived DNA. The programme began return of actionable genomic results in December 2022. As of April 2023, approximately 51,000 individuals were sent notifications asking whether they wanted to view their results, and approximately half have accepted. Return continues on an ongoing basis.

The All of Us Data and Research Center maintains all participant information and biospecimen ID linkage to ensure that participant confidentiality and coded identifiers (participant and aliquot level) are used to track each sample through the All of Us genomics workflow. This workflow facilitates weekly automated aliquot and plating requests to the Biobank, supplies relevant metadata for the sample shipments to the Genome Centers, and contains a feedback loop to inform action on samples that fail QC at any stage. Further, the consent status of each participant is checked before sample shipment to confirm that they are still active. Although all participants with genomic data are consented for the same general research use category, the programme accommodates different preferences for the return of genomic data to participants and only data for those individuals who have consented for return of individual health-related DNA results are distributed to the All of Us Clinical Validation Labs for further evaluation and health-related clinical reporting. All participants in All of Us that choose to get health-related DNA results have the option to schedule a genetic counselling appointment to discuss their results. Individuals with positive findings who choose to obtain results are required to schedule an appointment with a genetic counsellor to receive those findings.

Genome sequencing

To satisfy the requirements for clinical accuracy, precision and consistency across DNA sample extraction and sequencing, the All of Us Genome Centers and Biobank harmonized laboratory protocols, established standard QC methodologies and metrics, and conducted a series of validation experiments using previously characterized clinical samples and commercially available reference standards 9 . Briefly, PCR-free barcoded WGS libraries were constructed with the Illumina Kapa HyperPrep kit. Libraries were pooled and sequenced on the Illumina NovaSeq 6000 instrument. After demultiplexing, initial QC analysis is performed with the Illumina DRAGEN pipeline (Supplementary Table 2 ), leveraging lane, library, flow cell, barcode and sample level metrics as well as assessing contamination, mapping quality and concordance to genotyping array data independently processed from a different aliquot of DNA. The Genome Centers use these metrics to determine whether each sample meets programme specifications and then submit sequencing data to the Data and Research Center for further QC, joint calling and distribution to the research community ( Methods ).

This effort to harmonize sequencing methods, multi-level QC and use of identical data processing protocols mitigated the variability in sequencing location and protocols that often leads to batch effects in large genomic datasets 9 . As a result, the data are not only of clinical-grade quality, but also consistent in coverage (≥30× mean) and uniformity across Genome Centers (Supplementary Figs. 1 – 5 ).

Joint calling and variant discovery

We carried out joint calling across the entire All of Us WGS dataset (Extended Data Fig. 2 ). Joint calling leverages information across samples to prune artefact variants, which increases sensitivity, and enables flagging samples with potential issues that were missed during single-sample QC 10 (Supplementary Table 3 ). Scaling conventional approaches to whole-genome joint calling beyond 50,000 individuals is a notable computational challenge 11 , 12 . To address this, we developed a new cloud variant storage solution, the Genomic Variant Store (GVS), which is based on a schema designed for querying and rendering variants in which the variants are stored in GVS and rendered to an analysable variant file, as opposed to the variant file being the primary storage mechanism (Code availability). We carried out QC on the joint call set on the basis of the approach developed for gnomAD 3.1 (ref.  13 ). This included flagging samples with outlying values in eight metrics (Supplementary Table 4 , Supplementary Fig. 2 and Methods ).

To calculate the sensitivity and precision of the joint call dataset, we included four well-characterized samples. We sequenced the National Institute of Standards and Technology reference materials (DNA samples) from the Genome in a Bottle consortium 13 and carried out variant calling as described above. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations 14 . The overall sensitivity for single-nucleotide variants was over 98.7% and precision was more than 99.9%. For short insertions or deletions, the sensitivity was over 97% and precision was more than 99.6% (Supplementary Table 5 and Methods ).
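The sensitivity and precision figures above come from comparing called variants with a published truth set. A minimal sketch of that comparison follows, assuming each variant is identified by a `(chromosome, position, ref, alt)` tuple. The function name and the toy variants are hypothetical. Real benchmarking typically also restricts the comparison to high-confidence regions and normalizes variant representation first; this sketch omits both steps.

```python
def sensitivity_precision(called, truth):
    """Compare a set of called variants with a truth set.

    sensitivity = TP / (TP + FN): share of true variants that were called.
    precision   = TP / (TP + FP): share of called variants that are true.
    """
    true_pos = len(called & truth)
    false_pos = len(called - truth)
    false_neg = len(truth - called)
    if true_pos + false_neg == 0 or true_pos + false_pos == 0:
        raise ValueError("truth set and call set must both be non-empty")
    sensitivity = true_pos / (true_pos + false_neg)
    precision = true_pos / (true_pos + false_pos)
    return sensitivity, precision


# Invented toy variants (NOT programme data).
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr2", 50, "G", "A"), ("chr2", 75, "T", "TA")}
called = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
          ("chr2", 50, "G", "A"), ("chr3", 10, "A", "C")}
sens, prec = sensitivity_precision(called, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Exact tuple matching treats differently written but equivalent indels as mismatches, which is one reason dedicated comparison tools are normally used in practice.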

The joint call set included more than 1 billion genetic variants. We annotated the joint call dataset on the basis of functional annotation (for example, gene symbol and protein change) using Illumina Nirvana 15 . We defined coding variants as those inducing an amino acid change on a canonical ENSEMBL transcript and found 272,051,104 non-coding and 3,913,722 coding variants that have not been described previously in dbSNP 16 v153 (Extended Data Table 1 ). A total of 3,912,832 (99.98%) of the coding variants are rare (allelic frequency < 0.01) and the remaining 883 (0.02%) are common (allelic frequency > 0.01). Of the coding variants, 454 (0.01%) are common in one or more of the non-European computed ancestries in All of Us, rare among participants of European ancestry, and have an allelic number greater than 1,000 (Extended Data Table 2 and Extended Data Fig. 3 ). The distributions of pathogenic, or likely pathogenic, ClinVar variant counts per participant, stratified by computed ancestry and filtered to only those variants that are found in individuals with an allele count of <40, are shown in Extended Data Fig. 4 . The potential medical implications of these known and new variants with respect to variant pathogenicity by ancestry are highlighted in a companion paper 17 . In particular, we find that the European ancestry subset has the highest rate of pathogenic variation (2.1%), which was twice the rate of pathogenic variation in individuals of East Asian ancestry 17 . The lower frequency of variants in East Asian individuals may be partially explained by the fact that the sample size in that group is small and that there may be knowledge bias in the variant databases that reduces the number of findings in some of the less-studied ancestry groups.

Genetic ancestry and relatedness

Genetic ancestry inference confirmed that 51.1% of the All of Us WGS dataset is derived from individuals of non-European ancestry. Briefly, the ancestry categories are based on the same labels used in gnomAD 18 . We trained a classifier on a 16-dimensional principal component analysis (PCA) space of a diverse reference based on 3,202 samples and 151,159 autosomal single-nucleotide polymorphisms. We projected the All of Us samples into the PCA space of the training data, based on the same single-nucleotide polymorphisms from the WGS data, and generated categorical ancestry predictions from the trained classifier ( Methods ). Continuous genetic ancestry fractions for All of Us samples were inferred using the same PCA data, and participants’ patterns of ancestry and admixture were compared to their self-identified race and ethnicity (Fig. 2 and Methods ). Continuous ancestry inference carried out using genome-wide genotypes yields highly concordant estimates.
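The key step above is that new samples are projected into a PCA space fitted on the reference panel, rather than having PCA refitted on the new data. The sketch below illustrates only that projection plus a classification step, under loud assumptions. The reference mean and PC loadings are taken as already computed. A nearest-centroid rule stands in for the trained classifier, whose type is not stated in this section. All names and numbers are hypothetical toy values, with 3 SNPs and 2 components instead of 151,159 SNPs and 16 components.

```python
def project(genotypes, ref_mean, loadings):
    """Project one sample's genotypes onto reference principal components.

    The sample is centred with the REFERENCE mean so that it lands in the
    same coordinate system as the reference panel.
    """
    centered = [g - m for g, m in zip(genotypes, ref_mean)]
    return [sum(c * w for c, w in zip(centered, axis)) for axis in loadings]


def class_centroids(ref_coords, ref_labels):
    """Mean PC coordinates of each labelled reference group."""
    grouped = {}
    for coords, label in zip(ref_coords, ref_labels):
        grouped.setdefault(label, []).append(coords)
    return {label: [sum(dim) / len(rows) for dim in zip(*rows)]
            for label, rows in grouped.items()}


def predict(coords, centroids):
    """Assign the label of the nearest centroid (stand-in classifier)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(coords, centroids[label]))
    return min(centroids, key=sq_dist)


# Toy reference panel (invented): genotype dosages at 3 SNPs, 2 labels.
ref_mean = [1.0, 1.0, 1.0]
loadings = [[1, 0, 0], [0, 1, 0]]          # assumed precomputed PC axes
ref_genotypes = [[0, 1, 1], [0, 2, 1], [2, 1, 1], [2, 0, 1]]
ref_labels = ["group_A", "group_A", "group_B", "group_B"]

ref_coords = [project(g, ref_mean, loadings) for g in ref_genotypes]
centroids = class_centroids(ref_coords, ref_labels)
print(predict(project([0, 1, 2], ref_mean, loadings), centroids))
```

Keeping the projection separate from the classifier means the same reference coordinates can feed both the categorical predictions and the continuous ancestry estimates mentioned in the text.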

Figure 2

a, b, Uniform manifold approximation and projection (UMAP) representations of All of Us WGS PCA data with self-described race (a) and ethnicity (b) labels. c, Proportion of genetic ancestry per individual in six distinct and coherent ancestry groups defined by Human Genome Diversity Project and 1000 Genomes samples.

Kinship estimation confirmed that All of Us WGS data consist largely of unrelated individuals with about 85% (215,107) having no first- or second-degree relatives in the dataset (Supplementary Fig. 6 ). As many genomic analyses leverage unrelated individuals, we identified the smallest set of samples that are required to be removed from the remaining individuals that had first- or second-degree relatives and retained one individual from each kindred. This procedure yielded a maximal independent set of 231,442 individuals (about 94%) with genome sequence data in the current release ( Methods ).
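Selecting unrelated individuals can be viewed as a graph problem: samples are nodes, related pairs are edges, and the goal is to drop as few nodes as possible so that no edge remains. The sketch below uses a simple greedy heuristic (repeatedly drop the sample with the most remaining relatives). This is an assumption made for illustration: it is not guaranteed to find the smallest removal set and is not the programme's actual procedure. The function name and sample IDs are hypothetical.

```python
def prune_related(samples, related_pairs):
    """Return samples with no first- or second-degree relative left in the set.

    Greedy heuristic: while any related pair remains, remove the sample
    with the most remaining relatives (ties broken by sample ID).
    """
    neighbours = {s: set() for s in samples}
    for a, b in related_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    removed = set()
    while neighbours:
        worst = max(neighbours, key=lambda s: (len(neighbours[s]), s))
        if not neighbours[worst]:
            break  # no related pairs remain
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)
        removed.add(worst)
    return [s for s in samples if s not in removed]


# Invented example: b is related to both a and c; d and e are related.
kept = prune_related(["a", "b", "c", "d", "e"],
                     [("a", "b"), ("b", "c"), ("d", "e")])
print(kept)
```

Dropping the highest-degree sample first tends to keep more individuals than removing one member of each pair arbitrarily, for example keeping both `a` and `c` above by removing only `b`.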

Genetic determinants of LDL-C

As a measure of data quality and utility, we carried out a single-variant genome-wide association study (GWAS) for LDL-C, a trait with well-established genomic architecture ( Methods ). Of the 245,388 WGS participants, 91,749 had one or more LDL-C measurements. The All of Us LDL-C GWAS identified 20 well-established genome-wide significant loci, with minimal genomic inflation (Fig. 3 , Extended Data Table 3 and Supplementary Fig. 7 ). We compared the results to those of a recent multi-ethnic LDL-C GWAS in the National Heart, Lung, and Blood Institute (NHLBI) TOPMed study that included 66,329 ancestrally diverse (56% non-European ancestry) individuals 19 . We found a strong correlation between the effect estimates for NHLBI TOPMed genome-wide significant loci and those of All of Us ( R² = 0.98, P < 1.61 × 10⁻⁴⁵; Fig. 3 , inset). Notably, the per-locus effect sizes observed in All of Us are decreased compared to those in TOPMed, which is in part due to differences in the underlying statistical model, differences in the ancestral composition of these datasets and differences in laboratory value ascertainment between EHR-derived data and epidemiology studies. A companion manuscript extended this work to identify common and rare genetic associations for three diseases (atrial fibrillation, coronary artery disease and type 2 diabetes) and two quantitative traits (height and LDL-C) in the All of Us dataset and identified very high concordance with previous efforts across all of these diseases and traits 20 .

figure 3

Manhattan plot demonstrating robust replication of 20 well-established LDL-C genetic loci among 91,749 individuals with 1 or more LDL-C measurements. The red horizontal line denotes the genome-wide significance threshold of P = 5 × 10⁻⁸. Inset, effect estimate ( β ) comparison between NHLBI TOPMed LDL-C GWAS ( x axis) and All of Us LDL-C GWAS ( y axis) for the subset of 194 independent variants clumped (window 250 kb, r² 0.5) that reached genome-wide significance in NHLBI TOPMed.

Genotype-by-phenotype associations

As another measure of data quality and utility, we tested replication rates of previously reported phenotype–genotype associations in the five predicted genetic ancestry populations present in the Phenotype/Genotype Reference Map (PGRM): AFR, African ancestry; AMR, Latino/admixed American ancestry; EAS, East Asian ancestry; EUR, European ancestry; SAS, South Asian ancestry. The PGRM contains published associations in the GWAS catalogue in these ancestry populations that map to International Classification of Diseases-based phenotype codes 21 . This replication study specifically looked across 4,947 variants, calculating replication rates for powered associations in each ancestry population. The overall replication rates for associations powered at 80% were: 72.0% (18/25) in AFR, 100% (13/13) in AMR, 46.6% (7/15) in EAS, 74.9% (1,064/1,421) in EUR, and 100% (1/1) in SAS. With the exception of the EAS ancestry results, these powered replication rates are comparable to those of the published PGRM analysis, in which the replication rates of several single-site EHR-linked biobanks range from 76% to 85%. These results demonstrate the utility of the data and highlight opportunities for further work on understanding the specifics of the All of Us population and the potential contribution of gene–environment interactions to genotype–phenotype mapping. They also motivate the development of methods for multi-site EHR phenotype data extraction, harmonization and genetic association studies.

More broadly, the All of Us resource highlights the opportunities to identify genotype–phenotype associations that differ across diverse populations 22 . For example, the Duffy blood group locus ( ACKR1 ) is more prevalent in individuals of AFR ancestry and individuals of AMR ancestry than in individuals of EUR ancestry. Although the phenome-wide association study of this locus highlights the well-established association of the Duffy blood group with lower white blood cell counts both in individuals of AFR and AMR ancestry 23 , 24 , it also revealed genetic-ancestry-specific phenotype patterns, with minimal phenotypic associations in individuals of EAS ancestry and individuals of EUR ancestry (Fig. 4 and Extended Data Table 4 ). Conversely, rs9273363 in the HLA-DQB1 locus is associated with increased risk of type 1 diabetes 25 , 26 and diabetic complications across ancestries, but only associates with increased risk of coeliac disease in individuals of EUR ancestry (Extended Data Fig. 5 ). Similarly, the TCF7L2 locus 27 strongly associates with increased risk of type 2 diabetes and associated complications across several ancestries (Extended Data Fig. 6 ). Association testing results are available in Supplementary Dataset 1 .

figure 4

Results of genetic-ancestry-stratified phenome-wide association analysis among unrelated individuals highlighting ancestry-specific disease associations across the four most common genetic ancestries of participants. Bonferroni-adjusted phenome-wide significance threshold (<2.88 × 10⁻⁵) is plotted as a red horizontal line. AFR ( n = 34,037, minor allele fraction (MAF) 0.82); AMR ( n = 28,901, MAF 0.10); EAS ( n = 3,255, MAF 0.003); EUR ( n = 101,613, MAF 0.007).

The cloud-based Researcher Workbench

All of Us genomic data are available in a secure, access-controlled cloud-based analysis environment: the All of Us Researcher Workbench. Unlike traditional data access models that require per-project approval, access in the Researcher Workbench is governed by a data passport model based on a researcher’s authenticated identity, institutional affiliation, and completion of self-service training and compliance attestation 28 . After gaining access, a researcher may create a new workspace at any time to conduct a study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is regularly audited and made accessible publicly on the All of Us Research Projects Directory. This streamlined access model is guided by the principles that: participants are research partners and maintaining their privacy and data security is paramount; their data should be made as accessible as possible for authorized researchers; and we should continually seek to remove unnecessary barriers to accessing and using All of Us data.

For researchers at institutions with an existing institutional data use agreement, access can be gained as soon as they complete the required verification and compliance steps. As of August 2023, 556 institutions have agreements in place, allowing more than 5,000 approved researchers to actively work on more than 4,400 projects. The median time for a researcher from initial registration to completion of these requirements is 28.6 h (10th percentile: 48 min, 90th percentile: 14.9 days), a fraction of the weeks to months it can take to assemble a project-specific application and have it reviewed by an access board with conventional access models.

Given that the size of the project’s phenotypic and genomic dataset is expected to reach 4.75 PB in 2023, the use of a central data store and cloud analysis tools will save funders an estimated US$16.5 million per year when compared to the typical approach of allowing researchers to download genomic data. Storing one copy per institution of this data at 556 registered institutions would cost about US$1.16 billion per year. By contrast, storing a central cloud copy costs about US$1.14 million per year, a 99.9% saving. Importantly, cloud infrastructure also democratizes data access particularly for researchers who do not have high-performance local compute resources.
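The quoted saving can be checked directly from the two annual storage figures given above. The snippet below is only a small arithmetic sketch; the constant and function names are illustrative and do not come from the All of Us programme.

```python
# Annual storage costs quoted in the text (US$ per year).
DISTRIBUTED_COST = 1.16e9  # one copy held at each of 556 registered institutions
CENTRAL_COST = 1.14e6      # a single central cloud copy


def fractional_saving(distributed_cost: float, central_cost: float) -> float:
    """Fraction of the distributed cost avoided by keeping one central copy."""
    return 1.0 - central_cost / distributed_cost


saving = fractional_saving(DISTRIBUTED_COST, CENTRAL_COST)
print(f"Saving from central storage: {saving:.1%}")
```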

Here we present the All of Us Research Program’s approach to generating diverse clinical-grade genomic data at an unprecedented scale. We present the data release of about 245,000 genome sequences as part of a scalable framework that will grow to include genetic information and health data for one million or more people living across the USA. Our observations permit several conclusions.

First, the All of Us programme is making a notable contribution to improving the study of human biology through purposeful inclusion of under-represented individuals at scale 29 , 30 . Of the participants with genomic data in All of Us, 45.92% self-identified as a non-European race or ethnicity. This diversity enabled identification of more than 275 million new genetic variants across the dataset not previously captured by other large-scale genome aggregation efforts with diverse participants that have submitted variation to dbSNP v153, such as NHLBI TOPMed 31 freeze 8 (Extended Data Table 1 ). In contrast to gnomAD, All of Us permits individual-level genotype access with detailed phenotype data for all participants. Furthermore, unlike many genomics resources, All of Us is uniformly consented for general research use and enables researchers to go from initial account creation to individual-level data access in as little as a few hours. The All of Us cohort is significantly more diverse than those of other large contemporary research studies generating WGS data 32 , 33 . This enables a more equitable future for precision medicine (for example, through constructing polygenic risk scores that are appropriately calibrated to diverse populations 34 , 35 as the eMERGE programme has done leveraging All of Us data 36 , 37 ). Developing new tools and regulatory frameworks to enable analyses across multiple biobanks in the cloud to harness the unique strengths of each is an active area of investigation addressed in a companion paper to this work 38 .

Second, the All of Us Researcher Workbench embodies the programme’s design philosophy of open science, reproducible research, equitable access and transparency to researchers and to research participants 26 . Importantly, for research studies, no group of data users should have privileged access to All of Us resources based on anything other than data protection criteria. Although the All of Us Researcher Workbench initially targeted onboarding US academic, health care and non-profit organizations, it has recently expanded to international researchers. We anticipate further genomic and phenotypic data releases at regular intervals with data available to all researcher communities. We also anticipate additional derived data and functionality to be made available, such as reference data, structural variants and a service for array imputation using the All of Us genomic data.

Third, All of Us enables studying human biology at an unprecedented scale. The programmatic goal of sequencing one million or more genomes has required harnessing the output of multiple sequencing centres. Previous work has focused on achieving functional equivalence in data processing and joint calling pipelines 39 . To achieve clinical-grade data equivalence, All of Us required protocol equivalence at both sequencing production level and data processing across the sequencing centres. Furthermore, previous work has demonstrated the value of joint calling at scale 10 , 18 . The new GVS framework developed by the All of Us programme enables joint calling at extreme scales (Code availability). Finally, the provision of data access through cloud-native tools enables scalable and secure access and analysis to researchers while simultaneously enabling the trust of research participants and transparency underlying the All of Us data passport access model.

The clinical-grade sequencing carried out by All of Us enables not only research, but also the return of value to participants through clinically relevant genetic results and health-related traits to those who opt-in to receiving this information. In the years ahead, we anticipate that this partnership with All of Us participants will enable researchers to move beyond large-scale genomic discovery to understanding the consequences of implementing genomic medicine at scale.

The All of Us cohort

All of Us aims to engage a longitudinal cohort of one million or more US participants, with a focus on including populations that have historically been under-represented in biomedical research. Details of the All of Us cohort have been described previously 5 . Briefly, the primary objective is to build a robust research resource that can facilitate the exploration of biological, clinical, social and environmental determinants of health and disease. The programme will collect and curate health-related data and biospecimens, and these data and biospecimens will be made broadly available for research uses. Health data are obtained through the electronic medical record and through participant surveys. Survey templates can be found on our public website: https://www.researchallofus.org/data-tools/survey-explorer/ . Adults 18 years and older who have the capacity to consent and reside in the USA or a US territory at present are eligible. Informed consent for all participants is conducted in person or through an eConsent platform that includes primary consent, HIPAA Authorization for Research use of EHRs and other external health data, and Consent for Return of Genomic Results. The protocol was reviewed by the Institutional Review Board (IRB) of the All of Us Research Program. The All of Us IRB follows the regulations and guidance of the NIH Office for Human Research Protections for all studies, ensuring that the rights and welfare of research participants are overseen and protected uniformly.

Data accessibility through a ‘data passport’

Authorization for access to participant-level data in All of Us is based on a ‘data passport’ model, through which authorized researchers do not need IRB review for each research project. The data passport is required for gaining data access to the Researcher Workbench and for creating workspaces to carry out research projects using All of Us data. At present, data passports are authorized through a six-step process that includes affiliation with an institution that has signed a Data Use and Registration Agreement, account creation, identity verification, completion of ethics training, and attestation to a data user code of conduct. Results reported follow the All of Us Data and Statistics Dissemination Policy, which, to protect participant privacy, disallows disclosure of group counts under 20 without prior approval 40 .
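As an illustration of the small-count rule in the dissemination policy, the following minimal sketch masks any group count under 20 before a table is shared. The function name, the group labels and the "<20" display convention are assumptions made for this example; they are not part of the Researcher Workbench tooling.

```python
MIN_REPORTABLE_COUNT = 20  # group counts under this value may not be disclosed


def suppress_small_counts(counts: dict, minimum: int = MIN_REPORTABLE_COUNT) -> dict:
    """Return display-ready counts, masking every group smaller than `minimum`."""
    masked = {}
    for group, n in counts.items():
        # Keep counts at or above the minimum; replace the rest with a marker.
        masked[group] = str(n) if n >= minimum else f"<{minimum}"
    return masked


# Hypothetical group counts, for illustration only.
example = suppress_small_counts({"group_a": 412, "group_b": 19, "group_c": 20})
print(example)
```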

At present, All of Us gathers EHR data from about 50 health care organizations that are funded to recruit and enrol participants as well as transfer EHR data for those participants who have consented to provide them. Data stewards at each provider organization harmonize their local data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, and then submit it to the All of Us Data and Research Center (DRC) so that it can be linked with other participant data and further curated for research use. OMOP is a common data model standardizing health information from disparate EHRs to common vocabularies and organized into tables according to data domains. EHR data are updated from the recruitment sites and sent to the DRC quarterly. Updated data releases to the research community occur approximately once a year. Supplementary Table 6 outlines the OMOP concepts collected by the DRC quarterly from the recruitment sites.

Biospecimen collection and processing

Participants who consented to participate in All of Us donated fresh whole blood (4 ml EDTA and 10 ml EDTA) as a primary source of DNA. The All of Us Biobank managed by the Mayo Clinic extracted DNA from 4 ml EDTA whole blood, and DNA was stored at −80 °C at an average concentration of 150 ng µl⁻¹. The buffy coat isolated from 10 ml EDTA whole blood has been used for extracting DNA in the case of initial extraction failure or absence of 4 ml EDTA whole blood. The Biobank plated 2.4 µg DNA with a concentration of 60 ng µl⁻¹ in duplicate for array and WGS samples. The samples are distributed to All of Us Genome Centers weekly, and a negative (empty well) control and National Institute of Standards and Technology controls are incorporated every two months for QC purposes.

Genome Center sample receipt, accession and QC

On receipt of DNA sample shipments, the All of Us Genome Centers carry out an inspection of the packaging and sample containers to ensure that sample integrity has not been compromised during transport and to verify that the sample containers correspond to the shipping manifest. QC of the submitted samples also includes DNA quantification, using routine procedures to confirm volume and concentration (Supplementary Table 7 ). Any issues or discrepancies are recorded, and affected samples are put on hold until resolved. Samples that meet quality thresholds are accessioned in the Laboratory Information Management System, and sample aliquots are prepared for library construction processing (for example, normalized with respect to concentration and volume).

WGS library construction, sequencing and primary data QC

The DNA sample is first sheared using a Covaris sonicator and is then size-selected using AMPure XP beads to restrict the range of library insert sizes. Using the PCR Free Kapa HyperPrep library construction kit, enzymatic steps are completed to repair the jagged ends of DNA fragments, add proper A-base segments, and ligate indexed adapter barcode sequences onto samples. Excess adaptors are removed using AMPure XP beads for a final clean-up. Libraries are quantified using quantitative PCR with the Illumina Kapa DNA Quantification Kit and then normalized and pooled for sequencing (Supplementary Table 7 ).

Pooled libraries are loaded on the Illumina NovaSeq 6000 instrument. The data from the initial sequencing run are used to QC individual libraries and to remove non-conforming samples from the pipeline. The data are also used to calibrate the pooling volume of each individual library and re-pool the libraries for additional NovaSeq sequencing to reach an average coverage of 30×.

After demultiplexing, WGS analysis occurs on the Illumina DRAGEN platform. The DRAGEN pipeline consists of highly optimized algorithms for mapping, aligning, sorting, duplicate marking and haplotype variant calling and makes use of platform features such as compression and BCL conversion. Alignment uses the GRCh38dh reference genome. QC data are collected at every stage of the analysis protocol, providing high-resolution metrics required to ensure data consistency for large-scale multiplexing. The DRAGEN pipeline produces a large number of metrics that cover lane, library, flow cell, barcode and sample-level metrics for all runs as well as assessing contamination and mapping quality. The All of Us Genome Centers use these metrics to determine pass or fail for each sample before submitting the CRAM files to the All of Us DRC. For mapping and variant calling, all Genome Centers have harmonized on a set of DRAGEN parameters, which ensures consistency in processing (Supplementary Table 2 ).

Every step through the WGS procedure is rigorously controlled by predefined QC measures. Various control mechanisms and acceptance criteria were established during WGS assay validation. Specific metrics for reviewing and releasing genome data are: mean coverage (threshold of ≥30×), genome coverage (threshold of ≥90% at 20×), coverage of hereditary disease risk genes (threshold of ≥95% at 20×), aligned Q30 bases (threshold of ≥8 × 10¹⁰), contamination (threshold of ≤1%) and concordance to independently processed array data.
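The numeric release thresholds listed above can be expressed as a simple per-sample check. This is a sketch only: the class and field names are hypothetical, and the array-concordance criterion is left out because the text does not give a numeric threshold for it.

```python
from dataclasses import dataclass


@dataclass
class WgsMetrics:
    mean_coverage: float      # mean depth, in x
    genome_cov_20x: float     # fraction of the genome covered at >=20x
    hdr_gene_cov_20x: float   # fraction of hereditary disease risk genes at >=20x
    aligned_q30_bases: float  # number of aligned bases with quality >= Q30
    contamination: float      # estimated contamination, as a fraction


def failed_checks(m: WgsMetrics) -> list:
    """Return the names of the release criteria that the sample does not meet."""
    checks = {
        "mean_coverage": m.mean_coverage >= 30,
        "genome_coverage": m.genome_cov_20x >= 0.90,
        "hdr_gene_coverage": m.hdr_gene_cov_20x >= 0.95,
        "aligned_q30_bases": m.aligned_q30_bases >= 8e10,
        "contamination": m.contamination <= 0.01,
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up metrics for two samples, for illustration only.
good = WgsMetrics(32.1, 0.93, 0.97, 9.1e10, 0.004)
bad = WgsMetrics(28.0, 0.93, 0.97, 9.1e10, 0.02)
print(failed_checks(good), failed_checks(bad))
```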

Array genotyping

Samples are processed for genotyping at three All of Us Genome Centers (Broad, Johns Hopkins University and University of Washington). DNA samples are received from the Biobank and the process is facilitated by the All of Us genomics workflow described above. All three centres used an identical array product, scanners, resource files and genotype calling software for array processing to reduce batch effects. Each centre has its own Laboratory Information Management System that manages workflow control, sample and reagent tracking, and centre-specific liquid handling robotics.

Samples are processed using the Illumina Global Diversity Array (GDA) with Illumina Infinium LCG chemistry using the automated protocol and scanned on Illumina iSCANs with Automated Array Loaders. Illumina IAAP software converts raw data (IDAT files; 2 per sample) into a single GTC file per sample using the BPM file (defines strand, probe sequences and illumicode address) and the EGT file (defines the relationship between intensities and genotype calls). Files used for this data release are: GDA-8v1-0_A5.bpm, GDA-8v1-0_A1_ClusterFile.egt, gentrain v3, reference hg19 and gencall cutoff 0.15. The GDA array assays a total of 1,914,935 variant positions including 1,790,654 single-nucleotide variants, 44,172 indels, 9,935 intensity-only probes for CNV calling, and 70,174 duplicates (same position, different probes). Picard GtcToVcf is used to convert the GTC files to VCF format. Resulting VCF and IDAT files are submitted to the DRC for ingestion and further processing. The VCF file contains assay name, chromosome, position, genotype calls, quality score, raw and normalized intensities, B allele frequency and log R ratio values. Each genome centre runs the GDA array under Clinical Laboratory Improvement Amendments-compliant protocols. The GTC files are parsed and metrics are uploaded to in-house Laboratory Information Management Systems for QC review.

At batch level (each set of 96-well plates run together in the laboratory at one time), each genome centre includes positive control samples that are required to have >98% call rate and >99% concordance to existing data to approve release of the batch of data. At the sample level, the call rate and sex are the key QC determinants 41 . Contamination is also measured using BAFRegress 42 and reported out as metadata. Any sample with a call rate below 98% is repeated one time in the laboratory. Genotyped sex is determined by plotting normalized x versus normalized y intensity values for a batch of samples. Any sample discordant with ‘sex at birth’ reported by the All of Us participant is flagged for further detailed review and repeated one time in the laboratory. If several sex-discordant samples are clustered on an array or on a 96-well plate, the entire array or plate will have data production repeated. Samples identified with sex chromosome aneuploidies are also reported back as metadata (XXX, XXY, XYY and so on). A final processing status of ‘pass’, ‘fail’ or ‘abandon’ is determined before release of data to the All of Us DRC. An array sample will pass if the call rate is >98% and the genotyped sex and sex at birth are concordant (or the sex at birth is not applicable). An array sample will fail if the genotyped sex and the sex at birth are discordant. An array sample will have the status of abandon if the call rate is <98% after at least two attempts at the genome centre.
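The sample-level decision rules in this paragraph can be summarized as a small function. This is a simplified sketch under stated assumptions: the function and argument names are hypothetical, "repeat" is an intermediate label introduced here for a low-call-rate sample that has been attempted only once, a call rate of exactly 98% is treated as not passing (the text does not define that boundary), and the single laboratory repeat for sex-discordant samples and the plate-level rerun are not modelled.

```python
from typing import Optional

CALL_RATE_THRESHOLD = 0.98


def array_status(call_rate: float, attempts: int,
                 genotyped_sex: str, sex_at_birth: Optional[str]) -> str:
    """Classify one array sample; sex_at_birth=None means 'not applicable'."""
    if call_rate <= CALL_RATE_THRESHOLD:
        # Low call rate: abandon after at least two attempts, otherwise rerun.
        return "abandon" if attempts >= 2 else "repeat"
    if sex_at_birth is not None and genotyped_sex != sex_at_birth:
        # Call rate is acceptable but genotyped sex disagrees with sex at birth.
        return "fail"
    return "pass"


print(array_status(0.995, 1, "F", "F"))  # pass
print(array_status(0.970, 2, "F", "F"))  # abandon
```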

Data from the arrays are used for participant return of genetic ancestry and non-health-related traits for those who consent, and they are also used to facilitate additional QC of the matched WGS data. Contamination is assessed in the array data to determine whether DNA re-extraction is required before WGS. Re-extraction is prompted by level of contamination combined with consent status for return of results. The arrays are also used to confirm sample identity between the WGS data and the matched array data by assessing concordance at 100 unique sites. To establish concordance, a fingerprint file of these 100 sites is provided to the Genome Centers to assess concordance with the same sites in the WGS data before CRAM submission.

Genomic data curation

As seen in Extended Data Fig. 2 , we generate a joint call set for all WGS samples and make these data available in their entirety and by sample subsets to researchers. A breakdown of the frequencies, stratified by computed ancestries for which we had more than 10,000 participants can be found in Extended Data Fig. 3 . The joint call set process allows us to leverage information across samples to improve QC and increase accuracy.

Single-sample QC

If a sample fails single-sample QC, it is excluded from the release and is not reported in this document. These tests detect sample swaps, cross-individual contamination and sample preparation errors. In some cases, we carry out these tests twice (at both the Genome Center and the DRC), for two reasons: to confirm internal consistency between sites; and to mark samples as passing (or failing) QC on the basis of the research pipeline criteria. The single-sample QC process accepts a higher contamination rate than the clinical pipeline (0.03 for the research pipeline versus 0.01 for the clinical pipeline), but otherwise uses identical thresholds. The list of specific QC processes, passing criteria, error modes addressed and an overview of the results can be found in Supplementary Table 3 .

Joint call set QC

During joint calling, we carry out additional QC steps using information that is available across samples including hard thresholds, population outliers, allele-specific filters, and sensitivity and precision evaluation. Supplementary Table 4 summarizes both the steps that we took and the results obtained for the WGS data. More detailed information about the methods and specific parameters can be found in the All of Us Genomic Research Data Quality Report 36 .

Batch effect analysis

We analysed cross-sequencing centre batch effects in the joint call set. To quantify the batch effect, we calculated Cohen’s d (ref.  43 ) for four metrics (insertion/deletion ratio, single-nucleotide polymorphism count, indel count and single-nucleotide polymorphism transition/transversion ratio) across the three genome sequencing centres (Baylor College of Medicine, Broad Institute and University of Washington), stratified by computed ancestry and seven regions of the genome (whole genome, high-confidence calling, repetitive, GC content of >0.85, GC content of <0.15, low mappability, the ACMG59 genes and regions of large duplications (>1 kb)). Using random batches as a control set, all comparisons had a Cohen’s d of <0.35. Here we report any Cohen’s d results >0.5, which we chose before this analysis and is conventionally the threshold of a medium effect size 44 .
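For readers unfamiliar with the statistic, the sketch below computes Cohen's d for one metric between two centres and applies the 0.5 reporting threshold. It assumes the common pooled-standard-deviation form of Cohen's d; the text does not state which variant was used, and the function names and example values are illustrative.

```python
from math import sqrt
from statistics import mean, variance

REPORTING_THRESHOLD = 0.5  # conventional cut-off for a medium effect size


def cohens_d(a: list, b: list) -> float:
    """Cohen's d using the pooled sample standard deviation of the two groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def is_reportable(d: float, threshold: float = REPORTING_THRESHOLD) -> bool:
    """True when the magnitude of the effect exceeds the reporting threshold."""
    return abs(d) > threshold


# Made-up per-sample values of one metric at two centres.
centre_a = [1.0, 2.0, 3.0]
centre_b = [2.0, 3.0, 4.0]
d = cohens_d(centre_a, centre_b)
print(d, is_reportable(d))
```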

We found that there was an effect size in indel counts (Cohen’s d of 0.53) in the entire genome, between Broad Institute and University of Washington, but this was being driven by repetitive and low-mappability regions. We found no batch effects with Cohen’s d of >0.5 in the ratio metrics or in any metrics in the high-confidence calling, low or high GC content, or ACMG59 regions. A complete list of the batch effects with Cohen’s d of >0.5 is found in Supplementary Table 8 .

Sensitivity and precision evaluation

To determine sensitivity and precision, we included four well-characterized control samples: the National Institute of Standards and Technology Genome in a Bottle samples HG-001, HG-003, HG-004 and HG-005. The samples were sequenced with the same protocol as All of Us. Of note, these samples were not included in data released to researchers. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations. We use the high-confidence calling region, defined by Genome in a Bottle v4.2.1, as the source of ground truth. To be called a true positive, a variant must match the chromosome, position, reference allele, alternate allele and zygosity. In cases of sites with multiple alternative alleles, each alternative allele is considered separately. Sensitivity and precision results are reported in Supplementary Table 5 .
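The matching rule described above amounts to exact comparison of five-field variant records. The following sketch assumes that both call sets have already been restricted to the high-confidence region and that multi-allelic sites have been split into one record per alternative allele; the function name and example records are hypothetical.

```python
def sensitivity_precision(truth: set, calls: set) -> tuple:
    """Compare (chrom, pos, ref, alt, zygosity) records by exact match."""
    true_pos = len(truth & calls)    # present in both sets
    false_neg = len(truth - calls)   # in the truth set but not called
    false_pos = len(calls - truth)   # called but absent from the truth set
    sensitivity = true_pos / (true_pos + false_neg) if truth else float("nan")
    precision = true_pos / (true_pos + false_pos) if calls else float("nan")
    return sensitivity, precision


# Made-up records; the second call has the wrong zygosity, so it counts as
# both a false positive and a false negative.
truth = {("chr1", 100, "A", "G", "het"),
         ("chr1", 200, "C", "T", "hom"),
         ("chr2", 50, "G", "A", "het")}
calls = {("chr1", 100, "A", "G", "het"),
         ("chr1", 200, "C", "T", "het"),
         ("chr2", 50, "G", "A", "het")}
sens, prec = sensitivity_precision(truth, calls)
print(sens, prec)
```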

Genetic ancestry inference

We computed categorical ancestry for all WGS samples in All of Us and made these available to researchers. These predictions are also the basis for population allele frequency calculations in the Genomic Variants section of the public Data Browser. We used the high-quality set of sites to determine an ancestry label for each sample. The ancestry categories are based on the same labels used in gnomAD 18 , the Human Genome Diversity Project (HGDP) 45 and 1000 Genomes 1 : African (AFR); Latino/admixed American (AMR); East Asian (EAS); Middle Eastern (MID); European (EUR), composed of Finnish (FIN) and Non-Finnish European (NFE); Other (OTH), not belonging to one of the other ancestries or is an admixture; South Asian (SAS).

We trained a random forest classifier 46 on a training set of the HGDP and 1000 Genomes samples variants on the autosome, obtained from gnomAD 11 . We generated the first 16 principal components (PCs) of the training sample genotypes (using the hwe_normalized_pca in Hail) at the high-quality variant sites for use as the feature vector for each training sample. We used the truth labels from the sample metadata, which can be found alongside the VCFs. Note that we do not train the classifier on the samples labelled as Other. We use the label probabilities (‘confidence’) of the classifier on the other ancestries to determine ancestry of Other.
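The use of label probabilities to assign the Other category can be sketched as follows. The cut-off of 0.75 is a placeholder chosen for illustration, since the text does not state the value used, and the function name is hypothetical. The input is assumed to be the per-class probabilities returned by a classifier trained without the Other label.

```python
OTHER_THRESHOLD = 0.75  # placeholder value; the cut-off is not given in the text


def assign_ancestry(probabilities: dict, threshold: float = OTHER_THRESHOLD) -> str:
    """Return the most probable label, or 'OTH' when no label is confident enough."""
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return label if confidence >= threshold else "OTH"


print(assign_ancestry({"AFR": 0.90, "EUR": 0.10}))
print(assign_ancestry({"AFR": 0.50, "AMR": 0.30, "EUR": 0.20}))
```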

To determine the ancestry of All of Us samples, we project the All of Us samples into the PCA space of the training data and apply the classifier. As a proxy for the accuracy of our All of Us predictions, we look at the concordance between the survey results and the predicted ancestry. The concordance between self-reported ethnicity and the ancestry predictions was 87.7%.

PC data from All of Us samples and the HGDP and 1000 Genomes samples were used to compute individual participant genetic ancestry fractions for All of Us samples using the Rye program. Rye uses PC data to carry out rapid and accurate genetic ancestry inference on biobank-scale datasets 47 . HGDP and 1000 Genomes reference samples were used to define a set of six distinct and coherent ancestry groups—African, East Asian, European, Middle Eastern, Latino/admixed American and South Asian—corresponding to participant self-identified race and ethnicity groups. Rye was run on the first 16 PCs, using the defined reference ancestry groups to assign ancestry group fractions to individual All of Us participant samples.


We calculated the kinship score using the Hail pc_relate function and reported any pairs with a kinship score above 0.1. The kinship score is half of the fraction of the genetic material shared (ranges from 0.0 to 0.5). We determined the maximal independent set 41 for related samples. Using a kinship score threshold of 0.1, we identified a maximally unrelated set of 231,442 samples (94%).
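To make the idea of a maximal independent set concrete, the sketch below treats samples as nodes and related pairs (kinship above 0.1) as edges, then greedily keeps samples that have no relative already kept. This is a pure-Python stand-in written for illustration, not the Hail routine used in the analysis; the function name and sample identifiers are hypothetical.

```python
KINSHIP_THRESHOLD = 0.1


def maximal_unrelated_set(samples: list, scored_pairs: list,
                          threshold: float = KINSHIP_THRESHOLD) -> set:
    """Greedy maximal independent set over pairs with kinship above `threshold`."""
    relatives = {s: set() for s in samples}
    for a, b, kinship in scored_pairs:
        if kinship > threshold:
            relatives[a].add(b)
            relatives[b].add(a)
    kept = set()
    # Visit samples with the fewest relatives first so that more are retained.
    for s in sorted(samples, key=lambda name: (len(relatives[name]), name)):
        if relatives[s].isdisjoint(kept):
            kept.add(s)
    return kept


# Made-up kinship scores: s1-s2 and s2-s3 are related, as are s4-s5.
samples = ["s1", "s2", "s3", "s4", "s5"]
pairs = [("s1", "s2", 0.25), ("s2", "s3", 0.25),
         ("s4", "s5", 0.12), ("s1", "s5", 0.02)]
unrelated = maximal_unrelated_set(samples, pairs)
print(sorted(unrelated))
```

Every sample left out of the result has at least one relative in it, which is what makes the set maximal; a greedy pass of this kind does not guarantee the largest possible set.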

LDL-C common variant GWAS

The phenotypic data were extracted from the Curated Data Repository (CDR, Controlled Tier Dataset v7) in the All of Us Researcher Workbench. The All of Us Cohort Builder and Dataset Builder were used to extract all LDL cholesterol measurements from the Lab and Measurements criteria in EHR data for all participants who have WGS data. The most recent measurements were selected as the phenotype and adjusted for statin use 19 , age and sex. A rank-based inverse normal transformation was applied to this continuous trait to increase power and reduce type I error. Analysis was carried out on the Hail MatrixTable representation of the All of Us WGS joint-called data after removing monomorphic variants, variants with a call rate of <95% and variants with extreme Hardy–Weinberg equilibrium values ( P < 10⁻¹⁵). A linear regression was carried out with REGENIE 48 on variants with a minor allele frequency >5%, further adjusting for relatedness and for the first five ancestry PCs. The final analysis included 34,924 participants and 8,589,520 variants.
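A rank-based inverse normal transformation replaces each value with the standard normal quantile of its rank. The sketch below assumes the Blom offset (0.375) and average ranks for ties; the text does not specify either choice, and the function name and example values are illustrative.

```python
from statistics import NormalDist


def rank_inverse_normal(values: list, offset: float = 0.375) -> list:
    """Map values to normal quantiles of their ranks (Blom offset by default)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based rank shared by the tied values
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / (n - 2 * offset + 1))
            for r in ranks]


# Made-up LDL-C values in mg/dl.
z = rank_inverse_normal([130.0, 95.0, 160.0])
print(z)
```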

Genotype-by-phenotype replication

We tested replication rates of known phenotype–genotype associations in three of the four largest populations: EUR, AFR and EAS. The AMR population was not included because it has no registered GWAS. This method is a conceptual extension of the original GWAS × phenome-wide association study, which replicated 66% of powered associations in a single EHR-linked biobank 49 . The PGRM is an expansion of this work by Bastarache et al., based on associations in the GWAS catalogue 50 in June 2020 (ref. 51 ). After directly matching the Experimental Factor Ontology terms to phecodes, the authors identified 8,085 unique loci and 170 unique phecodes that compose the PGRM. They showed replication rates in several EHR-linked biobanks ranging from 76% to 85%. For this analysis, we used the EUR- and AFR-based maps, considering only catalogue associations that were genome-wide significant ( P < 5 × 10⁻⁸).

The main tools used were the Python package Hail for data extraction, plink for genomic associations, and the R packages PheWAS and pgrm for further analysis and visualization. The phenotypes, participant-reported sex at birth, and year of birth were extracted from the All of Us CDR (Controlled Tier Dataset v7). These phenotypes were then loaded into a plink-compatible format using the PheWAS package, and related samples were removed by sub-setting to the maximally unrelated dataset ( n  = 231,442). Only samples with EHR data were kept, filtered by selected loci, annotated with demographic and phenotypic information extracted from the CDR and ancestry prediction information provided by All of Us, ultimately resulting in 181,345 participants for downstream analysis. The variants in the PGRM were filtered by a minimum population-specific allele frequency of >1% or population-specific allele count of >100, leaving 4,986 variants. Results for which there were at least 20 cases in the ancestry group were included. Then, a series of Firth logistic regression tests with phecodes as the outcome and variants as the predictor were carried out, adjusting for age, sex (for non-sex-specific phenotypes) and the first three genomic PC features as covariates. The PGRM was annotated with power calculations based on the case counts and reported allele frequencies. Power of 80% or greater was considered powered for this analysis.
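The inclusion thresholds described above can be expressed as a few simple predicates. This is only a sketch of the stated cut-offs (allele frequency >1% or allele count >100, at least 20 cases, power of 80% or greater); the `Association` record, its field names and the example values are hypothetical and do not reflect the actual PGRM or PheWAS package data structures.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """Hypothetical record for one variant-phecode pair in one ancestry group."""
    variant_id: str
    phecode: str
    pop_allele_freq: float  # population-specific allele frequency
    pop_allele_count: int   # population-specific allele count
    n_cases: int            # cases for the phecode in the ancestry group
    power: float            # pre-computed power, between 0 and 1


def passes_variant_filter(a):
    # Keep variants with population-specific AF > 1% or AC > 100.
    return a.pop_allele_freq > 0.01 or a.pop_allele_count > 100


def is_testable(a):
    # Results are included only when there are at least 20 cases.
    return passes_variant_filter(a) and a.n_cases >= 20


def is_powered(a):
    # Power of 80% or greater counts as powered.
    return a.power >= 0.80


# Made-up example records, not real results.
examples = [
    Association("var_A", "250.2", 0.30, 5000, 1200, 0.99),
    Association("var_B", "250.1", 0.005, 150, 40, 0.35),
    Association("var_C", "714.1", 0.004, 60, 300, 0.90),
    Association("var_D", "282.5", 0.12, 900, 12, 0.85),
]
testable = [a.variant_id for a in examples if is_testable(a)]
powered = [a.variant_id for a in examples if is_testable(a) and is_powered(a)]
```

Here `var_B` is retained through the allele-count criterion even though its frequency is below 1%, `var_C` fails both variant criteria, and `var_D` is dropped for having fewer than 20 cases.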

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The All of Us Research Hub has a tiered data access data passport model with three data access tiers. The Public Tier dataset contains only aggregate data with identifiers removed. These data are available to the public through Data Snapshots ( https://www.researchallofus.org/data-tools/data-snapshots/ ) and the public Data Browser ( https://databrowser.researchallofus.org/ ). The Registered Tier curated dataset contains individual-level data, available only to approved researchers on the Researcher Workbench. At present, the Registered Tier includes data from EHRs, wearables and surveys, as well as physical measurements taken at the time of participant enrolment. The Controlled Tier dataset contains all data in the Registered Tier and additionally genomic data in the form of WGS and genotyping arrays, previously suppressed demographic data fields from EHRs and surveys, and unshifted dates of events. At present, Registered Tier and Controlled Tier data are available to researchers at academic institutions, non-profit institutions, and both non-profit and for-profit health care institutions. Work is underway to begin extending access to additional audiences, including industry-affiliated researchers. Researchers have the option to register for Registered Tier and/or Controlled Tier access by completing the All of Us Researcher Workbench access process, which includes identity verification and All of Us-specific training in research involving human participants ( https://www.researchallofus.org/register/ ). Researchers may create a new workspace at any time to conduct any research study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is made accessible publicly on the All of Us Research Projects Directory at https://allofus.nih.gov/protecting-data-and-privacy/research-projects-all-us-data .

Code availability

The GVS code is available at https://github.com/broadinstitute/gatk/tree/ah_var_store/scripts/variantstore . The LDL GWAS pipeline is available as a demonstration project in the Featured Workspace Library on the Researcher Workbench ( https://workbench.researchallofus.org/workspaces/aou-rw-5981f9dc/aouldlgwasregeniedsubctv6duplicate/notebooks ).

References

The 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526 , 68–74 (2015).

Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577 , 179–189 (2020).

Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570 , 514–518 (2019).

Lewis, A. C. F. et al. Getting genetic ancestry right for science and society. Science 376 , 250–252 (2022).

All of Us Program Investigators. The “All of Us” Research Program. N. Engl. J. Med. 381 , 668–676 (2019).

Ramirez, A. H., Gebo, K. A. & Harris, P. A. Progress with the All of Us Research Program: opening access for researchers. JAMA 325 , 2441–2442 (2021).

Ramirez, A. H. et al. The All of Us Research Program: data quality, utility, and diversity. Patterns 3 , 100570 (2022).

Overhage, J. M., Ryan, P. B., Reich, C. G., Hartzema, A. G. & Stang, P. E. Validation of a common data model for active safety surveillance research. J. Am. Med. Inform. Assoc. 19 , 54–60 (2012).

Venner, E. et al. Whole-genome sequencing as an investigational device for return of hereditary disease risk and pharmacogenomic results as part of the All of Us Research Program. Genome Med. 14 , 34 (2022).

Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536 , 285–291 (2016).

Tiao, G. & Goodrich, J. gnomAD v3.1 New Content, Methods, Annotations, and Data Availability ; https://gnomad.broadinstitute.org/news/2020-10-gnomad-v3-1-new-content-methods-annotations-and-data-availability/ .

Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625 , 92–100 (2024).

Zook, J. M. et al. An open resource for accurately benchmarking small variant and reference calls. Nat. Biotechnol. 37 , 561–566 (2019).

Krusche, P. et al. Best practices for benchmarking germline small-variant calls in human genomes. Nat. Biotechnol. 37 , 555–560 (2019).

Stromberg, M. et al. Nirvana: clinical grade variant annotator. In Proc. 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics 596 (Association for Computing Machinery, 2017).

Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29 , 308–311 (2001).

Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. https://doi.org/10.1038/s42003-023-05708-y (2024).

Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581 , 434–443 (2020).

Selvaraj, M. S. et al. Whole genome sequence analysis of blood lipid levels in >66,000 individuals. Nat. Commun. 13 , 5995 (2022).

Wang, X. et al. Common and rare variants associated with cardiometabolic traits across 98,622 whole-genome sequences in the All of Us research program. J. Hum. Genet. 68 , 565–570 (2023).

Bastarache, L. et al. The phenotype-genotype reference map: improving biobank data science through replication. Am. J. Hum. Genet. 110 , 1522–1533 (2023).

Bianchi, D. W. et al. The All of Us Research Program is an opportunity to enhance the diversity of US biomedical research. Nat. Med. https://doi.org/10.1038/s41591-023-02744-3 (2024).

Van Driest, S. L. et al. Association between a common, benign genotype and unnecessary bone marrow biopsies among African American patients. JAMA Intern. Med. 181 , 1100–1105 (2021).

Chen, M.-H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182 , 1198–1213 (2020).

Chiou, J. et al. Interpreting type 1 diabetes risk with genetics and single-cell epigenomics. Nature 594 , 398–402 (2021).

Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47 , 898–905 (2015).

Grant, S. F. A. et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat. Genet. 38 , 320–323 (2006).

All of Us Research Program. Framework for Access to All of Us Data Resources v1.1 (2021); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/data&tools/data-access-use/AoU_Data_Access_Framework_508.pdf .

Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177 , 58–69 (2019).

Mapes, B. M. et al. Diversity and inclusion for the All of Us research program: A scoping review. PLoS ONE 15 , e0234962 (2020).

Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590 , 290–299 (2021).

Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562 , 203–209 (2018).

Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607 , 732–740 (2022).

Kurniansyah, N. et al. Evaluating the use of blood pressure polygenic risk scores across race/ethnic background groups. Nat. Commun. 14 , 3202 (2023).

Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55 , 549–558 (2023).

Linder, J. E. et al. Returning integrated genomic risk and clinical recommendations: the eMERGE study. Genet. Med. 25 , 100006 (2023).

Lennon, N. J. et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat. Med. https://doi.org/10.1038/s41591-024-02796-z (2024).

Deflaux, N. et al. Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis. Nat. Commun. 14 , 5419 (2023).

Regier, A. A. et al. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun. 9 , 4038 (2018).

All of Us Research Program. Data and Statistics Dissemination Policy (2020); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/2020/05/AoU_Policy_Data_and_Statistics_Dissemination_508.pdf .

Laurie, C. C. et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet. Epidemiol. 34 , 591–602 (2010).

Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91 , 839–848 (2012).

Cohen, J. Statistical Power Analysis for the Behavioral Sciences (Routledge, 2013).

Andrade, C. Mean difference, standardized mean difference (SMD), and their use in meta-analysis. J. Clin. Psychiatry 81 , 20f13681 (2020).

Cavalli-Sforza, L. L. The Human Genome Diversity Project: past, present and future. Nat. Rev. Genet. 6 , 333–340 (2005).

Ho, T. K. Random decision forests. In Proc. 3rd International Conference on Document Analysis and Recognition (IEEE Computer Society Press, 2002).

Conley, A. B. et al. Rye: genetic ancestry inference at biobank scale. Nucleic Acids Res. 51 , e44 (2023).

Mbatchou, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 53 , 1097–1103 (2021).

Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31 , 1102–1111 (2013).

Buniello, A. et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47 , D1005–D1012 (2019).

Bastarache, L. et al. The phenotype-genotype reference map: improving biobank data science through replication. Am. J. Hum. Genet. 110 , 1522–1533 (2023).


Acknowledgements

The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers (OT2 OD026549; OT2 OD026554; OT2 OD026557; OT2 OD026556; OT2 OD026550; OT2 OD026552; OT2 OD026553; OT2 OD026548; OT2 OD026551; OT2 OD026555); Interagency agreement AOD 16037; Federally Qualified Health Centers HHSN 263201600085U; Data and Research Center: U2C OD023196; Genome Centers (OT2 OD002748; OT2 OD002750; OT2 OD002751); Biobank: U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: U24 OD023163; Communications and Engagement: OT2 OD023205; OT2 OD023206; and Community Partners (OT2 OD025277; OT2 OD025315; OT2 OD025337; OT2 OD025276). In addition, the All of Us Research Program would not be possible without the partnership of its participants. All of Us and the All of Us logo are service marks of the US Department of Health and Human Services. E.E.E. is an investigator of the Howard Hughes Medical Institute. We acknowledge the foundational contributions of our friend and colleague, the late Deborah A. Nickerson. Debbie’s years of insightful contributions throughout the formation of the All of Us genomics programme are permanently imprinted, and she shares credit for all of the successes of this programme.

Author information

Authors and affiliations.

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Alexander G. Bick & Henry R. Condon

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA

Ginger A. Metcalf, Eric Boerwinkle, Richard A. Gibbs, Donna M. Muzny, Eric Venner, Kimberly Walker, Jianhong Hu, Harsha Doddapaneni, Christie L. Kovar, Mullai Murugan, Shannon Dugan & Ziad Khan

Vanderbilt Institute of Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA

Kelsey R. Mayo, Jodell E. Linder, Melissa Basford, Ashley Able, Ashley E. Green, Robert J. Carroll, Jennifer Zhang & Yuanyuan Wang

Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA

Lee Lichtenstein, Anthony Philippakis, Sophie Schwartz, M. Morgan T. Aster, Kristian Cibulskis, Andrea Haessly, Rebecca Asch, Aurora Cremer, Kylee Degatano, Akum Shergill, Laura D. Gauthier, Samuel K. Lee, Aaron Hatcher, George B. Grant, Genevieve R. Brandt, Miguel Covarrubias, Eric Banks & Wail Baalawi

Verily, South San Francisco, CA, USA

Shimon Rura, David Glazer, Moira K. Dillon & C. H. Albach

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA

Robert J. Carroll, Paul A. Harris & Dan M. Roden

All of Us Research Program, National Institutes of Health, Bethesda, MD, USA

Anjene Musick, Andrea H. Ramirez, Sokny Lim, Siddhartha Nambiar, Bradley Ozenberger, Anastasia L. Wise, Chris Lunt, Geoffrey S. Ginsburg & Joshua C. Denny

School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA

I. King Jordan, Shashwat Deepali Nagar & Shivam Sharma

Neuroscience Institute, Institute of Translational Genomic Medicine, Morehouse School of Medicine, Atlanta, GA, USA

Robert Meller

Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA

Mine S. Cicek & Stephen N. Thibodeau

Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA

Kimberly F. Doheny, Michelle Z. Mawhinney, Sean M. L. Griffith, Elvin Hsu, Hua Ling & Marcia K. Adams

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA

Evan E. Eichler, Joshua D. Smith, Christian D. Frazar, Colleen P. Davis, Karynne E. Patterson, Marsha M. Wheeler, Sean McGee, Mitzi L. Murray, Valeria Vasta, Dru Leistritz, Matthew A. Richardson, Aparna Radhakrishnan & Brenna W. Ehmen

Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA

Evan E. Eichler

Broad Institute of MIT and Harvard, Cambridge, MA, USA

Stacey Gabriel, Heidi L. Rehm, Niall J. Lennon, Christina Austin-Tse, Eric Banks, Michael Gatzen, Namrata Gupta, Katie Larsson, Sheli McDonough, Steven M. Harrison, Christopher Kachulis, Matthew S. Lebo, Seung Hoan Choi & Xin Wang

Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA

Gail P. Jarvik & Elisabeth A. Rosenthal

Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Dan M. Roden

Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA

Center for Individualized Medicine, Biorepository Program, Mayo Clinic, Rochester, MN, USA

Stephen N. Thibodeau, Ashley L. Blegen, Samantha J. Wirkus, Victoria A. Wagner, Jeffrey G. Meyer & Mine S. Cicek

Color Health, Burlingame, CA, USA

Scott Topper, Cynthia L. Neben, Marcie Steeves & Alicia Y. Zhou

School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA

Eric Boerwinkle

Laboratory for Molecular Medicine, Massachusetts General Brigham Personalized Medicine, Cambridge, MA, USA

Christina Austin-Tse, Emma Henricks & Matthew S. Lebo

Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA, USA

Christina M. Lockwood, Brian H. Shirts, Colin C. Pritchard, Jillian G. Buchan & Niklas Krumm

Manuscript Writing Group

  • Alexander G. Bick
  • Ginger A. Metcalf
  • Kelsey R. Mayo
  • Lee Lichtenstein
  • Shimon Rura
  • Robert J. Carroll
  • Anjene Musick
  • Jodell E. Linder
  • I. King Jordan
  • Shashwat Deepali Nagar
  • Shivam Sharma
  • Robert Meller

All of Us Research Program Genomics Principal Investigators

  • Melissa Basford
  • Eric Boerwinkle
  • Mine S. Cicek
  • Kimberly F. Doheny
  • Evan E. Eichler
  • Stacey Gabriel
  • Richard A. Gibbs
  • David Glazer
  • Paul A. Harris
  • Gail P. Jarvik
  • Anthony Philippakis
  • Heidi L. Rehm
  • Dan M. Roden
  • Stephen N. Thibodeau
  • Scott Topper

Biobank, Mayo

  • Ashley L. Blegen
  • Samantha J. Wirkus
  • Victoria A. Wagner
  • Jeffrey G. Meyer
  • Stephen N. Thibodeau

Genome Center: Baylor-Hopkins Clinical Genome Center

  • Donna M. Muzny
  • Eric Venner
  • Michelle Z. Mawhinney
  • Sean M. L. Griffith
  • Elvin Hsu
  • Marcia K. Adams
  • Kimberly Walker
  • Jianhong Hu
  • Harsha Doddapaneni
  • Christie L. Kovar
  • Mullai Murugan
  • Shannon Dugan
  • Ziad Khan
  • Richard A. Gibbs

Genome Center: Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine

  • Niall J. Lennon
  • Christina Austin-Tse
  • Eric Banks
  • Michael Gatzen
  • Namrata Gupta
  • Emma Henricks
  • Katie Larsson
  • Sheli McDonough
  • Steven M. Harrison
  • Christopher Kachulis
  • Matthew S. Lebo
  • Cynthia L. Neben
  • Marcie Steeves
  • Alicia Y. Zhou
  • Scott Topper
  • Stacey Gabriel

Genome Center: University of Washington

  • Gail P. Jarvik
  • Joshua D. Smith
  • Christian D. Frazar
  • Colleen P. Davis
  • Karynne E. Patterson
  • Marsha M. Wheeler
  • Sean McGee
  • Christina M. Lockwood
  • Brian H. Shirts
  • Colin C. Pritchard
  • Mitzi L. Murray
  • Valeria Vasta
  • Dru Leistritz
  • Matthew A. Richardson
  • Jillian G. Buchan
  • Aparna Radhakrishnan
  • Niklas Krumm
  • Brenna W. Ehmen

Data and Research Center

  • Lee Lichtenstein
  • Sophie Schwartz
  • M. Morgan T. Aster
  • Kristian Cibulskis
  • Andrea Haessly
  • Rebecca Asch
  • Aurora Cremer
  • Kylee Degatano
  • Akum Shergill
  • Laura D. Gauthier
  • Samuel K. Lee
  • Aaron Hatcher
  • George B. Grant
  • Genevieve R. Brandt
  • Miguel Covarrubias
  • Melissa Basford
  • Alexander G. Bick
  • Ashley Able
  • Ashley E. Green
  • Jennifer Zhang
  • Henry R. Condon
  • Yuanyuan Wang
  • Moira K. Dillon
  • C. H. Albach
  • Wail Baalawi
  • Dan M. Roden

All of Us Research Demonstration Project Teams

  • Seung Hoan Choi
  • Elisabeth A. Rosenthal

NIH All of Us Research Program Staff

  • Andrea H. Ramirez
  • Sokny Lim
  • Siddhartha Nambiar
  • Bradley Ozenberger
  • Anastasia L. Wise
  • Chris Lunt
  • Geoffrey S. Ginsburg
  • Joshua C. Denny


Contributions

The All of Us Biobank (Mayo Clinic) collected, stored and plated participant biospecimens. The All of Us Genome Centers (Baylor-Hopkins Clinical Genome Center; Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine; and University of Washington School of Medicine) generated and QCed the whole-genomic data. The All of Us Data and Research Center (Vanderbilt University Medical Center, Broad Institute of MIT and Harvard, and Verily) generated the WGS joint call set, carried out quality assurance and QC analyses and developed the Researcher Workbench. All of Us Research Demonstration Project Teams contributed analyses. The other All of Us Genomics Investigators and NIH All of Us Research Program Staff provided crucial programmatic support. Members of the manuscript writing group (A.G.B., G.A.M., K.R.M., L.L., S.R., R.J.C. and A.M.) wrote the first draft of this manuscript, which was revised with contributions and feedback from all authors.

Corresponding author

Correspondence to Alexander G. Bick .

Ethics declarations

Competing interests.

D.M.M., G.A.M., E.V., K.W., J.H., H.D., C.L.K., M.M., S.D., Z.K., E. Boerwinkle and R.A.G. declare that Baylor Genetics is a Baylor College of Medicine affiliate that derives revenue from genetic testing. E.V. is affiliated with Codified Genomics, a provider of genetic interpretation. E.E.E. is a scientific advisory board member of Variant Bio, Inc. A.G.B. is a scientific advisory board member of TenSixteen Bio. The remaining authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Timothy Frayling and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Historic availability of EHR records in All of Us v7 Controlled Tier Curated Data Repository (n = 413,457).

For better visibility, the plot shows growth starting in 2010.

Extended Data Fig. 2 Overview of the Genomic Data Curation Pipeline for WGS samples.

The Data and Research Center (DRC) performs additional single sample quality control (QC) on the data as it arrives from the Genome Centers. The variants from samples that pass this QC are loaded into the Genomic Variant Store (GVS), where we jointly call the variants and apply additional QC. We apply a joint call set QC process, which is stored with the call set. The entire joint call set is rendered as a Hail Variant Dataset (VDS), which can be accessed from the analysis notebooks in the Researcher Workbench. Subsections of the genome are extracted from the VDS and rendered in different formats with all participants. Auxiliary data can also be accessed through the Researcher Workbench. This includes variant functional annotations, joint call set QC results, predicted ancestry, and relatedness. Auxiliary data are derived from GVS (arrow not shown) and the VDS. The Cohort Builder directly queries GVS when researchers request genomic data for subsets of samples. Aligned reads, as cram files, are available in the Researcher Workbench (not shown). The graphics of the dish, gene and computer and the All of Us logo are reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Extended Data Fig. 3 Proportion of allelic frequencies (AF), stratified by computed ancestry with over 10,000 participants.

Bar counts are not cumulative (e.g., “pop AF < 0.01” does not include “pop AF < 0.001”).

Extended Data Fig. 4 Distribution of pathogenic and likely pathogenic ClinVar variants.

Stratified by ancestry and filtered to only those variants with an allele count (AC) < 40 in the 245,388 short-read WGS samples.

Extended Data Fig. 5 Ancestry specific HLA-DQB1 ( rs9273363 ) locus associations in 231,442 unrelated individuals.

Phenome-wide (PheWAS) associations highlight ancestry specific consequences across ancestries.

Extended Data Fig. 6 Ancestry specific TCF7L2 ( rs7903146 ) locus associations in 231,442 unrelated individuals.

Phenome-wide (PheWAS) associations highlight diabetic consequences across ancestries.

Supplementary information

Supplementary information.

Supplementary Figs. 1–7, Tables 1–8 and Note.

Reporting Summary

Supplementary dataset 1.

Associations of ACKR1, HLA-DQB1 and TCF7L2 loci with all Phecodes stratified by genetic ancestry.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

The All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature (2024). https://doi.org/10.1038/s41586-023-06957-x

Received : 22 July 2022

Accepted : 08 December 2023

Published : 19 February 2024

DOI : https://doi.org/10.1038/s41586-023-06957-x

Research on application framework of electronic document business based on big data technology

  • Published: 24 February 2024

  • Rui Guo
  • Yuansu Zhao

With the rapid development of big data technology, electronic documents are increasingly widely used. To address the large data volume, complex formats and non-standard information of electronic documents, this paper uses data mining to build associations between documents. First, building on dynamic incremental association rules, the association rules are improved according to the characteristics of electronic document business. Second, the advantages of Boolean matrix operations, the original data association rules and frequent 2-itemsets are used to improve the system's ability to acquire frequent items of document features. The experimental results show that the algorithm in this paper has higher computational efficiency during incremental additions and deletions of files, and is suitable for electronic file processing in big data settings.
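To make the idea of mining frequent 2-itemsets with Boolean matrix operations concrete, here is a minimal generic sketch. It is not the improved incremental algorithm from the paper, whose details are not given here; the function name `frequent_pairs`, the document feature labels and the support threshold are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose co-occurrence count is at least min_support."""
    items = sorted({item for t in transactions for item in t})
    # Boolean matrix stored column-wise: one True/False entry per transaction.
    columns = {item: [item in t for t in transactions] for item in items}
    result = {}
    for a, b in combinations(items, 2):
        # Support of the pair is the number of rows where both columns are True,
        # i.e. the sum of the element-wise AND of the two Boolean columns.
        support = sum(x and y for x, y in zip(columns[a], columns[b]))
        if support >= min_support:
            result[(a, b)] = support
    return result


# Each set lists hypothetical feature tags attached to one electronic document.
docs = [
    {"contract", "invoice", "receipt"},
    {"contract", "invoice"},
    {"invoice", "receipt"},
    {"contract", "memo"},
]
pairs = frequent_pairs(docs, min_support=2)
```

Because support is computed from Boolean columns, adding or deleting a document only appends or removes one row, which hints at why a matrix representation is convenient for incremental updates.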

Data availability

All datasets generated for this study are included within the article.


The study was supported by “Research on the Application of Blockchain Technology in Electronic Documents (Grant No. BGY2023KY-20)”.

Author information

Authors and affiliations.

Beijing Polytechnic College, Beijing, 100043, China

Rui Guo & Yuansu Zhao

Corresponding author

Correspondence to Rui Guo .

Ethics declarations

Conflict of interest.

The authors declare that they have no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Guo, R., Zhao, Y. Research on application framework of electronic document business based on big data technology. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18635-9

Received: 12 July 2023

Revised: 20 October 2023

Accepted: 12 February 2024

Published: 24 February 2024

DOI: https://doi.org/10.1007/s11042-024-18635-9

  • Association rules
  • Electronic documents
  • Application framework
  • Data mining

Computer Science > Artificial Intelligence

Title: An Interactive Agent Foundation Model

Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction, enabling a versatile and adaptable AI framework. We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare. Our model demonstrates its ability to generate meaningful and contextually relevant outputs in each area. The strength of our approach lies in its generality, leveraging a variety of data sources such as robotics sequences, gameplay data, large-scale video datasets, and textual information for effective multimodal and multi-task learning. Our approach provides a promising avenue for developing generalist, action-taking, multimodal systems.

Records Related to Unidentified Anomalous Phenomena (UAPs) at the National Archives

The National Archives and Records Administration (NARA) has established an “Unidentified Anomalous Phenomena Records Collection,” per sections 1841–1843 of the 2024 National Defense Authorization Act (Public Law 118-31).

Please explore the links below to find out more about records related to unidentified anomalous phenomena (UAPs)/unidentified flying objects (UFOs) in NARA’s holdings. All links to items in the National Archives Catalog are downloadable and can be republished with attribution to the National Archives and Records Administration.

Still Pictures and Photographs UAP Related Records

RG 255: Records of the National Aeronautics and Space Administration

  • Items from the series “Photographs Relating to Agency Activities, Facilities and Personnel, 1960–1991” (National Archives Identifier: 5956182 , Local Identifier: 255-GS)

RG 342: Records of U.S. Air Force Commands, Activities, and Organizations, 1900–2003

  • Items include 342-AF-63708AC, 342-AF-163969AC, 342-AF-34920AC, 342-AF-34923 AC, 342-AF-34919AC, 342-AF-163969AC, and 342-AF-34919AC.  A finding aid for these items is available in the Still Picture Research Room.
  • Items from the series “Black and White and Color Photographs of U.S. Air Force Activities, Facilities, and Personnel, Domestic and Foreign” (National Archives Identifier: 542326 , Local Identifier: 342-B)

RG 341: Records of Headquarters U.S. Air Force (Air Staff)

  • “Project “BLUE BOOK”, 1954–1966.” (National Archives Identifier: 542184 , Local Identifier: 341-PBB)

Moving Images and Sound UAP Related Records

RG 111: Records of the Office of the Chief Signal Officer

  • MAJ. GEN. JOHN A. SAMFORD'S STATEMENT ON "FLYING SAUCERS", PENTAGON, WASHINGTON, D.C (National Archives Identifier: 25738 , Local Identifier: 111-LC-30875)

RG 255: Records of the National Aeronautics and Space Administration, 1903–2006

  • Walter Cronkite and Gordon Cooper on UFOs (National Archives Identifier: 86027191 , Local Identifier: 255-PAOa-807-AAE).
  • An Executive Summary of the Greatest Secret of the 20th Century (National Archives Identifier: 5833930 , Local Identifier: 255-GOLDIN-233).  

RG 263: Records of the Central Intelligence Agency, 1894–2002

  • Unidentified Flying Objects, 1956 (National Archives Identifier: 617148 , Local Identifier: 263-95). This film is edited, with sound. 
  • Unidentified Flying Objects, 1956 (National Archives Identifier: 5954651 and 617916 , Local Identifier: 263-124). 

RG 306: Records of the U.S. Information Agency, 1900–2003

  • Doctor Edward Condon, University of Colorado Physicist Studying Unidentified Flying Objects (National Archives Identifier: 127614 , Local Identifier: 306-EN-S-T-2808). 
  • Alderman Interview with Doctor Page on Unidentified Flying Objects (National Archives Identifier: 130003 , Local Identifier: 306-EN-W-T-8990)
  • Foreign Press Center Briefing with B. Maccabee, L. Koss, J. Shandera, and B. Hopkins (National Archives Identifier: 56103 , Local Identifier: 306-FP-17)

RG 330: Records of the Office of the Secretary of Defense

  • The Case of the Flying Saucer (National Archives Identifier: 2386432 , Local Identifier: 330a.85)
  • Unidentified Flying Object (UFO) Sighting (National Archives Identifier: 614788 , Local Identifier: 330-DVIC-653)

RG 341: Records of Headquarters U.S. Air Force (Air Staff) 

  • “Project Blue Book Motion Picture Films, 1950-1966” (National Archives Identifier: 61934 , Local Identifier: 341-PBB)
  • “Sound Recordings Relating to Project Blue Book Unidentified Flying Object (UFO) Investigations, 1953-1967” (National Archives Identifier: 1142703 , Local Identifier: 341-PBBa)
  • “Moving Images Relating to “The Roswell Reports” Source Data Research Files, 1946-1996” (National Archives Identifier: 566658 , Local Identifier: 341-ROSWELL)
  • “Sound Recordings Relating to “The Roswell Reports”, 1991-1996” (National Archives Identifier: 566843 , Local Identifier: 341-ROSWELLa)

RG 342: Records of U.S. Air Force Commands, Activities, and Organizations

  • DFD Avrocar I Progress Report, February 1, 1958 – May 1959 (National Archives Identifier: 68170 , Local Identifier: 342-USAF-29668).
  • Disc Flight Development, Avrocar I Progress Report, May 2, 1959–April 12, 1960 (National Archives Identifier: 68175 , Local Identifier: 342-USAF-29673). 
  • Avrocar Continuation Test Program and Terrain Test Program, June 1, 1960–June 14, 1961 (National Archives Identifier: 68405 , Local Identifier: 342-USAF-31135). 
  • Friend, Foe, or Fantasy, 1966 (National Archives Identifier: 69861 , Local Identifier: 342-USAF-41040). 
  • UFO Interview, 1966 (National Archives Identifier: 70511 , Local Identifier: 342-USAF-42990).
  • USAF UFO sightings, California, 1952–1975 (National Archives Identifier: 72035 , Local Identifier: 342-USAF-49377).

RG 517: Records of the U.S. Agency for Global Media

  • UFO Sighting Over Alaska, January 13, 1987 (National Archives Identifier: 262327376,   Local Identifier: 517-VOAa-87-306.)
  • Science World 1030, 2002 (National Archives Identifier: 77179268 , Local Identifier: 517-BBG-50046)

Donated Collections:

  • Unidentified Flying Objects (UFOs): Fact or Fiction, November 1974 (National Archives Identifier: 2838871 , Local Identifier: 200.1572)
  • Paramount News [Mar. 7] (1951) Vol. 10. No. 52 (National Archives Identifier: 99581 ,  Local Identifier: PARA-PN-10.52)
  • Paramount News [July 30] (1952) Vol. 11, No. 100 (National Archives Identifier: 99731 , Local Identifier: PARA-PN-11.100)
  • Universal Newsreel Volume 22, Release 276, August 22, 1949 (National Archives Identifier: 234273290 , Local Identifier: UN-UN-22-276)
  • Universal Newsreel Volume 25, Release 586, August 11, 1952 (National Archives Identifier: 234273597 , Local Identifier: UN-UN-25-586)

Textual Records and Microfilm UAP Related Records

RG 64: Records of the National Archives and Records Administration  

  • Project Blue Book: UFO Sightings  (National Archives Identifier: 40027753 )

RG 181: Records of Navy Installations Command, Navy Regions, Naval Districts, and Shore Establishments

  • Collection of A8-2 Information, 1959 (National Archives Identifier: 291645977 )

RG 237: Records of The Federal Aviation Administration

  • Information Releases Relating to Unidentified Flying Object, 1986 (FAA—Japan Airlines Flight 1628) (National Archives Identifier: 733667 )
  • Gemini VII Air-to-Ground Transcript Volume I (National Archives Identifier: 5011500 )
  • Records of Investigations of Unidentified Flying Objects (UFOs) Relating to the Office of Special Investigations, 1948–1968 (National Archives Identifier: 45484701 )
  • Project Blue Book Administrative Files, 1947–1969 (National Archives Identifier: 595175 )
  • Copies of the Case Files of the 4602D Air Intelligence Service Squadron on Sightings of Unidentified Flying Objects (UFOs), 1954–1956 (National Archives Identifier: 23857158 )
  • Case Files of the 4602 D Intelligence Service Squadron on Sightings of Unidentified Flying Objects (UFOs) (National Archives Identifier: 23857157 )
  • Roswell Report Source Files, 1987–1996 (National Archives Identifier: 17618564 )
  • Air Intelligence Reports, 1948–1953 (National Archives Identifier: 23857122 )
  • Project Blue Book Artifacts, 1952–1969 (National Archives Identifier: 23857160 )
  • Sanitized Version of Project Blue Book Case Files on Sightings of Unidentified Flying Objects, 1947–1969 (National Archives Identifier: 597821 )
  • Case files on Sightings of Unidentified Flying Objects (UFOs), 1953-1960 (National Archives Identifier: 23857159 )
  • Project Blue Book Case Files on Sightings of Unidentified Flying Objects (UFOs), June 1947–December 1969 (National Archives Identifier: 595466 )
  • Miscellaneous Case Files On Sightings Of Unidentified Flying Objects (UFOs), 1953–1960 (National Archives Identifier: 23857159 )

RG 342: Records of the U. S. Air Force Commands, Activities, and Organizations  

  • AFR 80-17/OCAMA-TAFB Sup Unidentified Flying Objects (UFO) (National Archives Identifier: 37294296 )
  • Obsolete During 1969: 4600 Air Base Wing Supplement 1 to Air Force Regulation 80-17, Unidentified Flying Objects (UFO), 10 January 1967; Superseded, 15 April 1969 (National Archives Identifier: 68875395 )
  • REL-2-4-1 UFOs 1965 (National Archives Identifier: 311003081 )
  • File 5: 2, Community Relations, 1970 (National Archives Identifier: 47323287 )
  • 471.6 Guided Missiles, 1 January 1952 (National Archives Identifier: 333334712 )
  • 471.6 Guided Missiles, 1 July 1952 (National Archives Identifier: 333334717 )

National Archives Blog Posts and Articles

  • Project BLUE BOOK - Unidentified Flying Objects (Updated 2020)
  • National Archives News: Public Interest in UFOs Persists 50 Years After Project Blue Book Termination (2019)
  • Featured Document Display: 50 Years Ago: Government Stops Investigating UFOs (2019)
  • Pieces of History: Saucers Over Washington: the History of Project Blue Book (2019)
  • Pieces of History: INVASION! (of privacy) (2018)
  • Pieces of History: UFOs: Natural Explanations (2018)
  • Pieces of History: UFOs: Man-Made, Made Up, and Unknown (2018)
  • National Archives News: Do Records Show Proof of UFOs? (2018)
  • The Unwritten Record: The Roswell Reports: What crashed in the desert? (2014)
  • The Unwritten Record: Avrocar: The U.S. Military’s Flying Saucer (2014)
  • The NDC Blog: What on Earth Is It? (2014)
  • Pieces of History: Flying Saucers, Popular Mechanics, and the National Archives (2013)
  • The Unwritten Record: Project Blue Book: Home Movies in UFO Reports (2013)
  • The Unwritten Record: Project Blue Book: Spotting UFOs in the Film Record (2013)
  • [VIDEO]: UFO Project Blue Book at National Archives Museum

Perspect Clin Res, v.6(2); Apr-Jun 2015

Ethical issues in electronic health records: A general overview

Fouzia F. Ozair

Department of Health Services, Jawahar Lal Nehru University, New Delhi, India

Nayer Jamshed

1 Department of Emergency Medicine, All India Institute of Medical Sciences, New Delhi, India

Amit Sharma

2 Department of Forensic Medicine, Hamdard Institute of Medical Sciences and Research, New Delhi, India

Praveen Aggarwal

The electronic health record (EHR) is increasingly being implemented in many developing countries. It is the need of the hour because it improves the quality of health care and is cost-effective. However, technology can introduce hazards, so the safety of information in the system is a real challenge. Recent news of security breaches has raised questions about these systems. Despite the growing usefulness of EHRs and increasing enthusiasm for their adoption, little attention is being paid to the ethical issues that might arise. Securing the EHR with an encrypted password is one possible option. The purpose of this article is to discuss the various ethical issues arising in the use of EHRs and their possible solutions.


An electronic health record (EHR) is a record of a patient's medical details (including history, physical examination, investigations and treatment) in digital format. Physicians and hospitals are implementing EHRs because they offer several advantages over paper records. They increase access to health care, improve the quality of care and decrease costs. However, ethical issues related to EHRs confront health personnel. When patients' health data are shared or linked without their knowledge, autonomy is jeopardized. Patients may conceal information because they lack confidence in the security of the system holding their data, and as a consequence their treatment may be compromised. There is also the risk that thousands of patients' health data will be revealed through mistakes or theft. Leaders, health personnel and policy makers should discuss the ethical implications of EHRs and formulate policies in this regard. The electronic medical record (EMR) is the tool that promises to provide the platform from which new functionality and new services can be provided for patients.


In the past, a medical record was information documented on paper for research, clinical, administrative and financial purposes. Its major drawback was accessibility: it was available to only one user at a time. Because it was updated manually, its completion was delayed anywhere from 1 to 6 months or more.[ 1 ]

The purpose of documentation through electronic media remains the same today: to support patient care. EHRs have several advantages over paper records. Legible records reduce many problems of wrong prescriptions, doses and procedures.[ 2 ] Moreover, adverse drug reactions can be reduced substantially when EHRs are connected to drug banks and pharmacies, because the system can refuse a prescription or order for a drug to which a given patient has a known adverse reaction.[ 2 ] Easy accessibility from anywhere at any time is also beneficial.[ 3 ] EHRs require less storage space and can be stored indefinitely. They reduce the number of lost records, help research activities, allow for a complete set of backup records at low cost, speed data transfer and are cost-effective.[ 4 , 5 ] Hence, EHRs have been shown to improve patient compliance, facilitate quality assurance and reduce medical errors.[ 6 ]
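
The prescription check described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the `PatientRecord` class, the `place_order` function and the patient and drug names are hypothetical, and a real EHR would query a drug database rather than compare lowercase strings.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified patient record (illustration only)."""
    patient_id: str
    known_adverse_reactions: set[str] = field(default_factory=set)


class AdverseReactionError(Exception):
    """Raised when an order conflicts with a documented adverse reaction."""


def place_order(record: PatientRecord, drug: str) -> str:
    """Accept an order only if the drug is not on the patient's reaction list."""
    normalized = drug.strip().lower()
    if normalized in record.known_adverse_reactions:
        raise AdverseReactionError(
            f"{normalized} blocked: documented adverse reaction for {record.patient_id}"
        )
    return f"order accepted: {normalized}"


# Assumed example data: one patient with a documented penicillin reaction.
record = PatientRecord("P-001", {"penicillin"})
print(place_order(record, "Ibuprofen"))

try:
    place_order(record, "Penicillin")
except AdverseReactionError as err:
    print(err)
```

Raising an exception, rather than returning a warning string, reflects the "not permitting" wording of the text: the order cannot proceed unless the caller handles the conflict explicitly.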

The Office of the National Coordinator for Health Information Technology (IT) refers to the health record as “not just a collection of data that you are guarding, it is life.”[ 7 ] The patient owns the information in the record. The physician and the organization are the owners of the physical medical record.[ 8 ] There are four major ethical priorities for EHRs: privacy and confidentiality, security breaches, system implementation, and data inaccuracies.


Samuel D. Warren and Louis Brandeis define privacy as the right “to be let alone.”[ 9 ] Richard Rognehaugh defines it as the right of individuals to keep information about themselves from being disclosed to others; the claim of individuals to be let alone, from surveillance or interference from other individuals, organizations or the government.[ 10 ] A patient's information should be released to others only with the patient's permission or as allowed by law. When a patient is unable to give permission because of age or mental incapacity, decisions about information sharing should be made by the patient's legal representative or legal guardian. Information shared as a result of clinical interaction is considered confidential and must be protected.[ 11 ] Information from which the identity of the patient cannot be ascertained, for example the number of patients with breast carcinoma in a government hospital, is not in this category.[ 12 ]

Health care institutions, insurance companies and others will require access to the data if EHRs are to function as designed. The key to preserving confidentiality is to allow only authorized individuals to have access to information. This begins with authorizing users. A user's access is based on preestablished role-based privileges. The administrator identifies the user, determines the level of information to be shared and assigns usernames and passwords. Users should be aware that they will be accountable for the use and misuse of the information they view. They should have access only to the information they need to carry out their responsibilities. Hence, assigning user privileges is a major aspect of medical record security.[ 13 ]
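
Role-based privileges of this kind can be sketched as a lookup from role to permitted record sections. The roles, section names and sample record below are assumptions made for illustration; they are not taken from any particular EHR product or from the cited sources.

```python
from enum import Enum


class Role(Enum):
    PHYSICIAN = "physician"
    BILLING_CLERK = "billing_clerk"
    RECEPTIONIST = "receptionist"


# Hypothetical mapping from role to the record sections that role may view.
ROLE_PRIVILEGES = {
    Role.PHYSICIAN: {"demographics", "history", "investigations", "treatment"},
    Role.BILLING_CLERK: {"demographics", "billing"},
    Role.RECEPTIONIST: {"demographics"},
}


def can_view(role: Role, section: str) -> bool:
    """Return True only if the role's preestablished privileges include the section."""
    return section in ROLE_PRIVILEGES.get(role, set())


def visible_sections(role: Role, record: dict) -> dict:
    """Filter a record down to the sections the role is allowed to see."""
    return {name: value for name, value in record.items() if can_view(role, name)}


# Assumed sample record, for demonstration only.
sample_record = {
    "demographics": {"name": "A. Patient"},
    "history": "hypertension",
    "billing": {"balance": 0},
}
print(sorted(visible_sections(Role.BILLING_CLERK, sample_record)))
```

Defaulting to an empty privilege set for an unknown role means access is denied unless it has been explicitly granted, which matches the need-to-know principle described above.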

Although controlling access to health information is important, it is not sufficient to protect confidentiality. Additional steps, such as strong privacy and security policies, are essential to secure patients' information.


Security breaches threaten patient privacy when confidential health information is made available to others without the individual's consent or authorization. Two recent incidents at Howard University Hospital, Washington, showed that inadequate data security can affect a large number of people. On May 14, 2013, federal prosecutors charged one of the hospital's medical technicians with violating the Health Insurance Portability and Accountability Act (HIPAA). Prosecutors said that over a 17-month period, Laurie Napper used her position at the hospital to gain access to patients' names, addresses and Medicare numbers in order to sell their information. At a plea hearing set for June 12, 2013, she was found guilty, sentenced to 6 months in a halfway house and fined $2,100. A few weeks earlier, the same hospital informed more than 34,000 patients that their medical data had been compromised. A contractor working with the hospital had downloaded the patients' files onto a personal laptop, which was stolen from his car. The data were password protected but unencrypted, which means anyone who guessed the password could have accessed the patient files without a randomly generated key. By encryption, we mean encoding information in such a way that only authorized parties can read it. It is usually done with the help of an encryption key, which specifies how the information should be decoded. According to a hospital press release, the files included names, addresses and Social Security numbers and, in a few cases, “diagnosis related information”. More recently, a hospital chain named Prime Healthcare Services Inc. agreed to pay $275,000 to settle a federal investigation into an alleged violation of patient privacy. Keeping records secure is a challenge that doctors, public health officials and federal regulators are just beginning to understand.
Cloud storage, password protection, and encryption are all measures health care providers can take to make portable EHRs more secure. One survey found that 73% of physicians text other physicians about work.[ 14 ] Mobile devices are meant for individual use and are not designed for centralized management by an IT department.[ 15 ] They can easily be misplaced, damaged, or stolen, so emphasis must be laid on encrypting mobile devices that are used to transmit confidential information. Two-factor authentication with security tokens and passwords also helps secure EHRs.

Security measures such as firewalls, antivirus software, and intrusion detection software must be included to protect data integrity. Specific policies and procedures serve to maintain patient privacy and confidentiality. For example, employees must not share their ID with anyone, always log off when leaving a terminal and use their own ID to access patient digital records. A security officer must be designated by the organization to work with a team of health IT experts.

Random audits should be conducted regularly to ensure compliance with hospital policy. All system activity can be tracked by audit trails. These include detailed listings of content, duration and user, the date and time of entries, and logs of all modifications to EHRs.[ 16 ] When there is inappropriate access to a medical record, the system can yield the name of the individual gaining access, the time and date, the screens accessed and the duration of the review. This information is useful when determining whether the access was the result of an error or an intentional, unauthorized view. The HIPAA Security Rule requires organizations to maintain audit trails, requiring that they document information systems activity[ 17 ] and have the hardware, software, and procedures to record and examine activity in systems that contain health information.[ 18 ]
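
An audit trail of this kind can be sketched as an append-only log of access events. The sketch below is illustrative only: the `AuditEntry` fields mirror the items listed in the paragraph (user, screen, date and time, duration), while the class names, user names and the idea of a "care team" allow-list are assumptions. A production system would also need tamper-resistant storage, which this in-memory list does not provide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One access event; the field names are illustrative assumptions."""
    user: str
    patient_id: str
    screen: str
    accessed_at: datetime
    duration: timedelta


class AuditTrail:
    """Append-only, in-memory log of record accesses."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, entry: AuditEntry) -> None:
        self._entries.append(entry)

    def accesses_for(self, patient_id: str) -> list[AuditEntry]:
        """Every logged access to one patient's record, in logging order."""
        return [e for e in self._entries if e.patient_id == patient_id]

    def outside_care_team(self, patient_id: str, care_team: set[str]) -> list[AuditEntry]:
        """Accesses by users not on the (hypothetical) care-team list, for review."""
        return [e for e in self.accesses_for(patient_id) if e.user not in care_team]


trail = AuditTrail()
start = datetime(2015, 4, 1, 9, 30, tzinfo=timezone.utc)
trail.record(AuditEntry("dr_rao", "P-001", "medications", start, timedelta(minutes=4)))
trail.record(
    AuditEntry("tech_17", "P-001", "demographics", start + timedelta(hours=2), timedelta(seconds=40))
)

for entry in trail.outside_care_team("P-001", care_team={"dr_rao"}):
    print(entry.user, entry.screen, entry.accessed_at.isoformat(), entry.duration)
```

Flagged entries are only candidates for review; as the text notes, the logged details help decide whether an access was an error or an intentional, unauthorized view.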

Outside vendors create special privacy issues. Employee-only access to the EMR requires any external vendor to access and navigate the record under the authorization and oversight of an employee.


Health care organizations encounter major challenges in the course of EHR implementation. These challenges result in wasted resources, frustrated providers, loss of confidence by patients and patient safety issues. The development, implementation, and maintenance of EHRs require adequate funds and the involvement of many individuals, including clinicians, information technologists, educators, and consultants.[ 19 ]

Hospitals and health care institutions are making improvements without significant clinician engagement. Many EHR implementation projects fail because they underestimate the importance of having one or more clinicians serve as opinion leaders for providers in the clinic. These clinicians must guide colleagues in understanding their roles in the implementation and enlist their involvement in tasks such as EHR selection, workflow design, and quality improvement.[ 20 ]

Clinical personnel often have little knowledge of the clinic's workflow and the roles others play in care delivery. This blind spot results in inadequate planning for successful implementation. Without identifying a standardized best practice method to do the work, every user is left to struggle. Clinics should map and standardize their workflows before EHR selection.

When any two systems are integrated, an interface is created. By the user interface, we mean the interface between the user and the computer system. These interfaces are critical to the overall success of the implementation process. Interface issues are the greatest system risk because failures can be invisible initially. Lack of systematic consideration of users and tasks often results in a poor user interface. Poorly designed user interfaces account for unintended adverse consequences, leading to decreased time efficiency, poor quality of care and increased threats to patient safety. They fail to deliver the much-needed quality of care, which leads to user dissatisfaction. A user interface fault that starts small can grow over time and lead to abandonment of the EHR. Routine maintenance and testing of these interfaces is essential in controlling this major risk. Practice disruption during EHR implementation can negatively affect the quality of care or endanger patient safety, and can cause financial loss.[ 21 ]


Integrity assures that the data are accurate and have not been changed. EHRs serve as a way to improve patient safety by reducing health care errors, to reduce health disparities and to improve the health of the public.[ 22 ] However, concerns have been raised about the accuracy and reliability of data entered into the electronic record.
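
One common way to detect that stored data have been changed is to keep a keyed digest alongside each record and recompute it on read. The sketch below is an illustration, not a method described in the source: it uses HMAC-SHA256 from the Python standard library, and the record fields and demonstration key are assumptions. A digest shows only that data were altered after it was computed; it cannot show that the original entry was accurate.

```python
import hashlib
import hmac
import json


def record_digest(record: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 digest over a canonical JSON form of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def is_unchanged(record: dict, key: bytes, stored_digest: str) -> bool:
    """Recompute the digest and compare it to the stored one in constant time."""
    return hmac.compare_digest(record_digest(record, key), stored_digest)


# Demonstration key only; a real key would come from a secure key store.
demo_key = b"demo-key-not-for-production"
entry = {"patient_id": "P-001", "drug": "amoxicillin", "dose_mg": 500}
stored = record_digest(entry, demo_key)

print(is_unchanged(entry, demo_key, stored))

tampered = dict(entry, dose_mg=5000)
print(is_unchanged(tampered, demo_key, stored))
```

Sorting keys before serializing makes the digest independent of field order, so only a change in content, not a change in layout, causes verification to fail.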

Inaccurate representation of the patient's current condition and treatment can result from improper use of options such as “cut and paste”. This practice is unacceptable because it increases the risk for patients and the liability for clinicians and organizations.[ 16 , 23 ] Other features that can compromise data integrity are the drop-down menu and the disposition of relevant information in the trash. Drop-down menus limit the choices available to the clinician, who in a hurry may choose the wrong one, leading to major errors. Clinicians and vendors have been working to resolve software problems to make EHRs both user-friendly and accurate.[ 23 ]

Loss or destruction of data can occur during data transfer; this raises concerns about the accuracy of the database, as patient care decisions are based on it.[ 24 ] A growing problem is medical identity theft, which results in inaccurate information being entered into the victim's record. The person's insurance company is billed for medical services not provided to the actual policy holder, and the patient's future treatment is guided by misinformation that neither the patient nor the provider immediately recognizes.


India provides quality health care of international standards at a relatively low cost and has attracted patients from across the globe. It is now one of the favorite destinations for health care services. Considering the rapid pace of growth of the health care sector in India, the Government of India issued definitive guidelines for EHR standards in April 2013. The guidelines were based on the recommendations of the EMR standards committee, which was constituted by an order of the Ministry of Health and Family Welfare and coordinated by the Federation of Indian Chambers of Commerce and Industry on its behalf. The guidelines recommend a set of standards to be followed by different health care service providers in India so that medical data become portable and easily transferable.[ 25 ] With a population of 1.27 billion people and only 160 million internet users, maintaining EHRs in India is a daunting task, but with the interest and support of the Government of India in its implementation, it will be a success soon.

Regardless of one's role, everyone will need the assistance of the computer. Creating a useful EHR system will require the expertise of physicians, technology professionals, ethicists, administrative personnel, and patients. Although EMRs offer many significant benefits, the future of health care demands that their risks be recognized and properly managed or overcome. Multiple strategies are available to reduce risks and overcome barriers in the implementation of digital health records. Leadership, teamwork, flexibility, and adaptability are keys to finding solutions. EMR capacities must be maximized in order to improve the quality, safety, efficiency, and effectiveness of health care and health care delivery systems.

Source of Support: Nil.

Conflict of Interest: None declared.


  1. Effects of Electronic Health Record Implementation and Barriers to Adoption and Use: A Scoping Review and Qualitative Analysis of the Content

    "Information relevant to the wellness, health and healthcare of an individual, in computer-processable form and represented according to a standardized information model, or the longitudinal electronic record of an individual that contains or virtually interlines to data in multiple EMRs and EPRs, which is to be shared and/or interoperable acros...

  2. A Qualitative Analysis of the Impact of Electronic Health Records (EHR

    The implementation of Electronic Health Record (EHR) systems is meant to assist providers' evidence-based decision-making 1 and streamline providers' workflow via efficient coordination for patient care. 2 Extant literature has highlighted the benefits of implementing EHR, including improved patient outcomes, enhanced patient safety measures, as...

  3. Electronic Health Record Implementation: A Review of Resources and

    The content published in Cureus is the result of clinical experience and/or research by independent individuals or organizations. Cureus is not responsible for the scientific accuracy or reliability of data or conclusions published herein. ... Electronic health record EHR implementation: easy transition from paper to electronic health records ...

  4. Electronic Health Records: Then, Now, and in the Future

    Literature search based on "Electronic Health Record", "Medical Record", and "Medical Chart" using Medline, Google, Wikipedia Medical, and Cochrane Libraries resulted in an initial review of 2,356 abstracts and other information in papers and books.
