
How to Write and Publish Clinical Research in Medical School


From working hard on the USMLE® exams to holding leadership positions in a specialty’s academic society, there are many ways medical students can work towards matching into the residency of their choice. One such activity that looks great on residency applications is finding clinical research opportunities in medical school to write and publish papers. No one knows this better than Dr. Eve Bowers. 

An Otolaryngology resident at the University of Miami/Jackson Memorial Hospital, Eve became an expert in writing, submitting, and publishing manuscripts during her final years in medical school. Check out Eve’s blog post below to get valuable insights on how to get published in medical school. 

As medical students, we’re told that research is important and that publications are “good”, and even “necessary to match” into residency, but we often aren’t given the tools we need to turn ideas into manuscripts. This is especially true given our rigorous schedules.

When I looked through my CV, I saw I had a few abstracts and presentations, but no manuscripts. I wanted to write, but publishing seemed like just checking another resume box. On top of that, I didn’t know where to begin. 

My writing journey started with a case report I nervously picked up during my surgery clerkship. Then, over ten months of typing, editing, and sending unanswered emails, I went from writing zero to ten manuscripts. The process was sometimes painful but mostly gratifying (yes, research can be gratifying), and you can do it, too.

To make finding, starting, and publishing high-quality research articles a little bit easier and a lot more enjoyable, check out my five tips for publishing clinical research in medical school.

1. Build your network to find publication opportunities in medical school

When looking for projects, finding great mentors is often more useful than finding the perfect project. This is especially true when starting out. Use your time on clerkships to identify attending and resident mentors who you trust to support your budding author ambitions.

At this stage, residents especially are your friends. When you demonstrate follow-through and receptiveness to feedback, you will be given more research opportunities. Don’t be shy about asking mentors for tasks if you can juggle multiple projects, but don’t bite off more than you can chew. It’s important to communicate honestly and be transparent about the amount of time you have.

2. Kickstarting your research during medical school: start small 

If you have no research experience, start with a case report. Volunteer to write an article about an interesting case you saw in the operating room or clinic. It’s much easier and more rewarding to write about patients you have experience with, and case reports are a great way to demonstrate your writing ability to more senior authors.

Pro tip: Try to figure out as much as you can independently by using published reports as blueprints before asking for help. Nevertheless, don’t be afraid to seek guidance when you need it! If you approach a mentor with a problem, come prepared with 2-3 realistic solutions or examples of how you tried to figure it out on your own.

3. Know the criteria for writing a clinical research paper 

Before you begin, ask your mentor where they would like to submit the completed work. Each journal has specific standards, styles, and submission criteria. For guidance, look to papers previously published in that journal. 

As far as annotations and citations are concerned, download and learn how to use EndNote or Zotero right now! You’ll save days of work formatting your references.

Additionally, consider creating folders and spreadsheets to keep track of projects. Set goals and timelines for yourself from the beginning, and block off dedicated time to conduct a literature review, analyze data, and write.

Pro tip: If you are the first author and overseeing a large team, improve communication and efficiency by making everyone’s roles and expectations very clear to the group via email.

4. Follow up with your mentor

Sometimes you’ll send your mentor a draft, but she won’t get back to you with edits and feedback in a reasonable timeframe. Surprisingly, many projects do not get past this point because of insufficient persistence. Here’s what to do if this happens:

  • Politely nudge your mentor with follow-up emails and schedule a meeting to discuss in person or via Zoom.
  • Set deadlines and give specific reasons why the paper needs to be submitted. Some reasons could include, “I need this submission for my residency application” or “this is a requirement for my school.”
  • Ask your co-author resident and/or fellow to advocate for edits and submission.

Whatever happens, don’t give up at this point. You’ve put in the work, and persistence makes or breaks a successful student-author.

5. Write about the medical topics that you love

Writing is fun when you focus on subjects you’re really passionate about. You also don’t have to stay within your institution: feel free to branch out if you come across an interesting research opportunity at a different program. A little cold email can go a long way!

If your goal is quantity, you can increase output by asking around about “productive” research mentors and sticking to topics related to clinical practice or medical education. However, my advice is to never let relatively quick publication opportunities compromise the quality of your work. Remember — every paper you write gets easier and more enjoyable, and your work will be truly important to advancing the field you care about. Good luck!


About the Author: Eve is an Otolaryngology Resident at the University of Miami/Jackson Memorial Hospital. She attended medical school at the University of Pittsburgh School of Medicine and undergrad at the University of Pennsylvania. She is passionate about medical education, mentorship, and increasing minority and female leadership in surgical fields. For more tips and tricks, follow her on Twitter and Instagram!

For more information on residency applications, check out the AMBOSS Residency Applications Clerkship Survival Guide. 



Basic Methods Handbook for Clinical Orthopaedic Research, pp 235–242

How to Write a Clinical Paper

  • Brendan Coleman
  • First Online: 02 February 2019


Publishing the results of your research is a fulfilling experience, both personally and in helping to advance orthopaedic knowledge. This chapter discusses the process from completing the study to formulating a final manuscript that is ready for submission, providing a structure to follow in preparing the manuscript and tips on improving your writing style.

Research is essential to advancing the knowledge of the orthopaedic community, and publishing the results is an important part of that advancement. Writing and publishing research can be intimidating when you first begin. There are different forms of publication, from case studies or series to controlled trials and systematic reviews or meta-analyses. Each requires a different approach to research, but the fundamentals of preparing the manuscript are similar. Following the prescribed structure allows for improved writing and gives the reader the information they want.

Once writing of the manuscript is complete, submission to a journal needs to be managed. Most articles will not be accepted immediately by the journal. Formal peer review will provide feedback that needs to be addressed in a timely manner before resubmission.




Coleman, B. (2019). How to Write a Clinical Paper. In: Musahl, V., et al. Basic Methods Handbook for Clinical Orthopaedic Research. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58254-1_25


  • Systematic review
  • Open access
  • Published: 19 February 2024

‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice

  • Annette Boaz (ORCID: orcid.org/0000-0003-0557-1294)
  • Juan Baeza
  • Alec Fraser (ORCID: orcid.org/0000-0003-1121-1551)
  • Erik Persson

Implementation Science, volume 19, Article number: 15 (2024)


The gap between research findings and clinical practice is well documented and a range of strategies have been developed to support the implementation of research into clinical practice. The objective of this study was to update and extend two previous reviews of systematic reviews of strategies designed to implement research evidence into clinical practice.

We developed a comprehensive systematic literature search strategy based on the terms used in the previous reviews to identify studies that looked explicitly at interventions designed to turn research evidence into practice. The search was performed in June 2022 in four electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched from January 2010 up to June 2022 and applied no language restrictions. Two independent reviewers appraised the quality of included studies using a quality assessment checklist. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Data were synthesised using descriptive and narrative techniques to identify themes and patterns linked to intervention strategies, targeted behaviours, study settings and study outcomes.

We identified 32 reviews conducted between 2010 and 2022. The reviews are mainly of multi-faceted interventions (n = 20), although there are reviews focusing on single strategies (ICT, educational, reminders, local opinion leaders, audit and feedback, social media and toolkits). The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Furthermore, a lot of nuance lies behind these headline findings, and this is increasingly commented upon in the reviews themselves.

Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been identified. We need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of research perspectives (including social science) in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed.


Contribution to the literature

  • Considerable time and money is invested in implementing and evaluating strategies to increase the implementation of research into clinical practice.
  • The growing body of evidence is not providing the anticipated clear lessons to support improved implementation.
  • Instead, what is needed is to better understand and build more situated, relational and organisational capability to support the use of research in clinical practice.
  • This would involve a more central role in implementation science for a wider range of perspectives, especially from the social, economic, political and behavioural sciences, and for greater use of different types of synthesis, such as realist synthesis.

Introduction

The gap between research findings and clinical practice is well documented and a range of interventions has been developed to increase the implementation of research into clinical practice [ 1 , 2 ]. In recent years researchers have worked to improve the consistency in the ways in which these interventions (often called strategies) are described to support their evaluation. One notable development has been the emergence of Implementation Science as a field focusing explicitly on “the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice” ([ 3 ] p. 1). The work of implementation science focuses on closing, or at least narrowing, the gap between research and practice. One contribution has been to map existing interventions, identifying 73 discrete strategies to support research implementation [ 4 ] which have been grouped into 9 clusters [ 5 ]. The authors note that they have not considered the evidence of effectiveness of the individual strategies and that a next step is to understand better which strategies perform best in which combinations and for what purposes [ 4 ]. Other authors have noted that there is also scope to learn more from other related fields of study such as policy implementation [ 6 ] and to draw on methods designed to support the evaluation of complex interventions [ 7 ].

The increase in activity designed to support the implementation of research into practice and improvements in reporting provided the impetus for an update of a review of systematic reviews of the effectiveness of interventions designed to support the use of research in clinical practice [ 8 ] which was itself an update of the review conducted by Grimshaw and colleagues in 2001. The 2001 review [ 9 ] identified 41 reviews considering a range of strategies, from educational interventions, audit and feedback and computerised decision support to financial incentives and combined interventions. The authors concluded that all the interventions had the potential to promote the uptake of evidence in practice, although no one intervention seemed to be more effective than the others in all settings. They concluded that combined interventions were more likely to be effective than single interventions. The 2011 review identified a further 13 systematic reviews containing 313 discrete primary studies. Consistent with the previous review, four main strategy types were identified: audit and feedback; computerised decision support; opinion leaders; and multi-faceted interventions (MFIs). Nine of the reviews reported on MFIs. The review highlighted the small effects of single interventions such as audit and feedback, computerised decision support and opinion leaders. MFIs claimed an improvement in effectiveness over single interventions, although effect sizes remained small to moderate and this improvement in effectiveness relating to MFIs has been questioned in a subsequent review [ 10 ]. In updating the review, we anticipated a larger pool of reviews and an opportunity to consolidate learning from more recent systematic reviews of interventions.

This review updates and extends our previous review of systematic reviews of interventions designed to implement research evidence into clinical practice. To identify potentially relevant peer-reviewed research papers, we developed a comprehensive systematic literature search strategy based on the terms used in the Grimshaw et al. [ 9 ] and Boaz, Baeza and Fraser [ 8 ] overview articles. To ensure optimal retrieval, our search strategy was refined with support from an expert university librarian, considering the ongoing improvements in the development of search filters for systematic reviews since our first review [ 11 ]. We also wanted to include technology-related terms (e.g. apps, algorithms, machine learning, artificial intelligence) to find studies that explored interventions based on the use of technological innovations as mechanistic tools for increasing the use of evidence into practice (see Additional file 1 : Appendix A for full search strategy).

The search was performed in June 2022 in the following electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched for articles published since the 2011 review. We searched from January 2010 up to June 2022 and applied no language restrictions. Reference lists of relevant papers were also examined.

We uploaded the results using EPPI-Reviewer, a web-based tool that facilitated semi-automation of the screening process and removal of duplicate studies. We made particular use of a priority screening function to reduce screening workload and avoid ‘data deluge’ [ 12 ]. Through machine learning, one reviewer screened a smaller number of records ( n  = 1200) to train the software to predict whether a given record was more likely to be relevant or irrelevant, thus pulling the relevant studies towards the beginning of the screening process. This automation did not replace manual work but helped the reviewer to identify eligible studies more quickly. During the selection process, we included studies that looked explicitly at interventions designed to turn research evidence into practice. Studies were included if they met the following pre-determined inclusion criteria:

  • The study was a systematic review
  • Search terms were included
  • Focused on the implementation of research evidence into practice
  • The methodological quality of the included studies was assessed as part of the review
  • Study populations included healthcare providers and patients

The EPOC taxonomy [ 13 ] was used to categorise the strategies. The EPOC taxonomy has four domains: delivery arrangements, financial arrangements, governance arrangements and implementation strategies. The implementation strategies domain includes 20 strategies targeted at healthcare workers. Numerous EPOC strategies were assessed in the review including educational strategies, local opinion leaders, reminders, ICT-focused approaches and audit and feedback. Some strategies that did not fit easily within the EPOC categories were also included. These were social media strategies and toolkits, and multi-faceted interventions (MFIs) (see Table  2 ). Some systematic reviews included comparisons of different interventions while other reviews compared one type of intervention against a control group. Outcomes related to improvements in health care processes or patient well-being. Numerous individual study types (RCT, CCT, BA, ITS) were included within the systematic reviews.

We excluded papers that:

  • Focused on changing patient rather than provider behaviour
  • Had no demonstrable outcomes
  • Made unclear or no reference to research evidence

The last of these criteria was sometimes difficult to judge, and there was considerable discussion amongst the research team as to whether the link between research evidence and practice was sufficiently explicit in the interventions analysed. As we discussed in the previous review [ 8 ], in the field of healthcare the principle of evidence-based practice is widely acknowledged and tools to change behaviour such as guidelines are often seen to be an implicit codification of evidence, despite the fact that this is not always the case.

Reviewers employed a two-stage process to select papers for inclusion. First, all titles and abstracts were screened by one reviewer to determine whether the study met the inclusion criteria. Two papers [ 14 , 15 ] were identified that fell just before the 2010 cut-off. As they were not identified in the searches for the first review [ 8 ], they were included and progressed to assessment. Each paper was rated as include, exclude or maybe. The full texts of 111 relevant papers were assessed independently by at least two authors. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Thirty-two papers met the inclusion criteria and proceeded to data extraction. The study selection procedure is documented in a PRISMA literature flow diagram (see Fig.  1 ). We were able to include French, Spanish and Portuguese papers in the selection reflecting the language skills in the study team, but none of the papers identified met the inclusion criteria. Other non-English language papers were excluded.
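
The priority screening function described above follows a standard active-learning pattern: a model is trained on the records already screened by hand and then ranks the remaining records so that those most likely to be relevant surface first. The sketch below is a minimal illustration of that pattern only, assuming Python with scikit-learn and using invented placeholder records; it is not the model used inside EPPI-Reviewer.

```python
# Minimal sketch of machine-learning-assisted priority screening.
# Assumes scikit-learn; the titles and labels are invented placeholders,
# not data from the review and not the EPPI-Reviewer implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Records already screened by hand: 1 = relevant to the review, 0 = irrelevant.
screened_texts = [
    "systematic review of audit and feedback to change clinician behaviour",
    "randomised trial of a new antihypertensive drug",
    "review of educational strategies to implement evidence based practice",
    "case report of a rare dermatological condition",
]
screened_labels = [1, 0, 1, 0]

# Records still waiting in the screening queue.
unscreened_texts = [
    "overview of reminders and decision support to increase guideline uptake",
    "cohort study of dietary patterns and cardiovascular outcomes",
]

# Represent titles/abstracts as TF-IDF features and fit a simple classifier.
vectoriser = TfidfVectorizer(stop_words="english")
model = LogisticRegression()
model.fit(vectoriser.fit_transform(screened_texts), screened_labels)

# Score the unscreened records and sort so likely-relevant ones come first.
scores = model.predict_proba(vectoriser.transform(unscreened_texts))[:, 1]
for score, title in sorted(zip(scores, unscreened_texts), reverse=True):
    print(f"{score:.2f}  {title}")
```

In a real workflow the model is refitted as each new batch of manual decisions accumulates, which is what pulls the likely-relevant studies towards the front of the queue; the manual include or exclude judgement is still made for every record.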

Fig. 1 PRISMA flow diagram. Source: authors

One reviewer extracted data on strategy type, number of included studies, locale, target population, effectiveness and scope of impact from the included studies. Two reviewers then independently read each paper and noted key findings and broad themes of interest which were then discussed amongst the wider authorial team. Two independent reviewers appraised the quality of included studies using a Quality Assessment Checklist based on Oxman and Guyatt [ 16 ] and Francke et al. [ 17 ]. Each study was given a quality score ranging from 1 (extensive flaws) to 7 (minimal flaws) (see Additional file 2 : Appendix B). All disagreements were resolved through discussion. Studies were not excluded in this updated overview based on methodological quality as we aimed to reflect the full extent of current research into this topic.

The extracted data were synthesised using descriptive and narrative techniques to identify themes and patterns in the data linked to intervention strategies, targeted behaviours, study settings and study outcomes.

Thirty-two studies were included in the systematic review. Table 1 provides a detailed overview of the included systematic reviews comprising reference, strategy type, quality score, number of included studies, locale, target population, effectiveness and scope of impact (see Table 1 at the end of the manuscript). Overall, the quality of the studies was high. Twenty-three studies scored 7, six studies scored 6, one study scored 5, one study scored 4 and one study scored 3. The primary focus of the review was on reviews of effectiveness studies, but a small number of reviews did include data from a wider range of methods including qualitative studies which added to the analysis in the papers [ 18 , 19 , 20 , 21 ]. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. In this section, we discuss the different EPOC-defined implementation strategies in turn. Interestingly, we found only two ‘new’ approaches in this review that did not fit into the existing EPOC approaches. These are a review focused on the use of social media and a review considering toolkits. In addition to single interventions, we also discuss multi-faceted interventions. These were the most common intervention approach overall. A summary is provided in Table  2 .

Educational strategies

The overview identified three systematic reviews focusing on educational strategies. Grudniewicz et al. [ 22 ] explored the effectiveness of printed educational materials on primary care physician knowledge, behaviour and patient outcomes and concluded they were not effective in any of these aspects. Koota, Kääriäinen and Melender [ 23 ] focused on educational interventions promoting evidence-based practice among emergency room/accident and emergency nurses and found that interventions involving face-to-face contact led to significant or highly significant effects on patient benefits and emergency nurses’ knowledge, skills and behaviour. Interventions using written self-directed learning materials also led to significant improvements in nurses’ knowledge of evidence-based practice. Although the quality of the studies was high, the review primarily included small studies with low response rates, and many of them relied on self-assessed outcomes; consequently, the strength of the evidence for these outcomes is modest. Wu et al. [ 20 ] questioned if educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes. Although based on evaluation projects and qualitative data, their results also suggest that positive changes on patient outcomes can be made following the implementation of specific evidence-based approaches (or projects). The differing positive outcomes for educational strategies aimed at nurses might indicate that the target audience is important.

Local opinion leaders

Flodgren et al. [ 24 ] was the only systematic review focusing solely on opinion leaders. The review found that local opinion leaders alone, or in combination with other interventions, can be effective in promoting evidence‐based practice, but this varies both within and between studies and the effect on patient outcomes is uncertain. The review found that, overall, any intervention involving opinion leaders probably improves healthcare professionals’ compliance with evidence-based practice but varies within and across studies. However, how opinion leaders had an impact could not be determined because insufficient details were provided, illustrating that reporting specific details in published studies is important if diffusion of effective methods of increasing evidence-based practice is to be spread across a system. The usefulness of this review is questionable because it cannot provide evidence of what is an effective opinion leader, whether teams of opinion leaders or a single opinion leader are most effective, or the most effective methods used by opinion leaders.

Reminders

Pantoja et al. [ 26 ] was the only systematic review focusing solely on manually generated reminders delivered on paper included in the overview. The review explored how these affected professional practice and patient outcomes. The review concluded that manually generated reminders delivered on paper as a single intervention probably led to small to moderate increases in adherence to clinical recommendations, and they could be used as a single quality improvement intervention. However, the authors indicated that this intervention would make little or no difference to patient outcomes. The authors state that such a low-tech intervention may be useful in low- and middle-income countries where paper records are more likely to be the norm.

ICT-focused approaches

The three ICT-focused reviews [ 14 , 27 , 28 ] showed mixed results. Jamal, McKenzie and Clark [ 14 ] explored the impact of health information technology on the quality of medical and health care. They examined the impact of electronic health records, computerised provider order entry, and decision support systems. This showed a positive improvement in adherence to evidence-based guidelines but not to patient outcomes. The number of studies included in the review was low, and so a conclusive recommendation could not be reached based on this review. Similarly, Brown et al. [ 28 ] found that technology-enabled knowledge translation interventions may improve knowledge of health professionals, but all eight studies raised concerns of bias. The De Angelis et al. [ 27 ] review was more promising, reporting that ICT can be a good way of disseminating clinical practice guidelines but concluding that it is unclear which type of ICT method is the most effective.

Audit and feedback

Sykes, McAnuff and Kolehmainen [ 29 ] examined whether audit and feedback were effective in dementia care and concluded that it remains unclear which ingredients of audit and feedback are successful as the reviewed papers illustrated large variations in the effectiveness of interventions using audit and feedback.

Non-EPOC listed strategies: social media, toolkits

There were two new (non-EPOC listed) intervention types identified in this review compared to the 2011 review — fewer than anticipated. We categorised a third, ‘care bundles’ [ 36 ], as a multi-faceted intervention due to its description in practice, and a fourth, ‘Technology Enhanced Knowledge Transfer’ [ 28 ], as an ICT-focused approach. The first new strategy was identified in Bhatt et al.’s [ 30 ] systematic review of the use of social media for the dissemination of clinical practice guidelines. They reported that the use of social media resulted in a significant improvement in knowledge and compliance with evidence-based guidelines compared with more traditional methods. They noted that a wide selection of different healthcare professionals and patients engaged with this type of social media and its global reach may be significant for low- and middle-income countries. This review was also noteworthy for developing a simple stepwise method for using social media for the dissemination of clinical practice guidelines. However, it is debatable whether social media can be classified as an intervention or just a different way of delivering an intervention. For example, the review discussed involving opinion leaders and patient advocates through social media. However, this was a small review that included only five studies, so further research in this new area is needed. Yamada et al. [ 31 ] draw on 39 studies to explore the application of toolkits, 18 of which had toolkits embedded within larger KT interventions, and 21 of which evaluated toolkits as standalone interventions. The individual component strategies of the toolkits were highly variable, though the authors suggest that they align most closely with educational strategies. The authors conclude that toolkits as either standalone strategies or as part of MFIs hold some promise for facilitating evidence use in practice but caution that the quality of many of the primary studies included is considered weak, limiting these findings.

Multi-faceted interventions

The majority of the systematic reviews ( n  = 20) reported on more than one intervention type. Some of these systematic reviews focus exclusively on multi-faceted interventions, whilst others compare different single or combined interventions aimed at achieving similar outcomes in particular settings. While these two approaches are often described in a similar way, they are actually quite distinct from each other as the former report how multiple strategies may be strategically combined in pursuance of an agreed goal, whilst the latter report how different strategies may be incidentally used in sometimes contrasting settings in the pursuance of similar goals. Ariyo et al. [ 35 ] helpfully summarise five key elements often found in effective MFI strategies in LMICs — but which may also be transferrable to HICs. First, effective MFIs encourage a multi-disciplinary approach acknowledging the roles played by different professional groups to collectively incorporate evidence-informed practice. Second, they utilise leadership drawing on a wide set of clinical and non-clinical actors including managers and even government officials. Third, multiple types of educational practices are utilised — including input from patients as stakeholders in some cases. Fourth, protocols, checklists and bundles are used — most effectively when local ownership is encouraged. Finally, most MFIs included an emphasis on monitoring and evaluation [ 35 ]. In contrast, other studies offer little information about the nature of the different MFI components of included studies which makes it difficult to extrapolate much learning from them in relation to why or how MFIs might affect practice (e.g. [ 28 , 38 ]). Ultimately, context matters, which some review authors argue makes it difficult to say with real certainty whether single or MFI strategies are superior (e.g. [ 21 , 27 ]). Taking all the systematic reviews together we may conclude that MFIs appear to be more likely to generate positive results than single interventions (e.g. [ 34 , 45 ]) though other reviews should make us cautious (e.g. [ 32 , 43 ]).

While multi-faceted interventions still seem to be more effective than single-strategy interventions, there were important distinctions between how the results of reviews of MFIs are interpreted in this review as compared to the previous reviews [ 8 , 9 ], reflecting greater nuance and debate in the literature. This was particularly noticeable where the effectiveness of MFIs was compared to single strategies, reflecting developments widely discussed in previous studies [ 10 ]. We found that most systematic reviews are bounded by their clinical, professional, spatial, system, or setting criteria and often seek to draw out implications for the implementation of evidence in their areas of specific interest (such as nursing or acute care). Frequently this means combining all relevant studies to explore the respective foci of each systematic review. Therefore, most reviews we categorised as MFIs actually include highly variable numbers and combinations of intervention strategies and highly heterogeneous original study designs. This makes statistical analyses of the type used by Squires et al. [ 10 ] on the three reviews in their paper not possible. Further, it also makes extrapolating findings and commenting on broad themes complex and difficult. This may suggest that future research should shift its focus from merely examining ‘what works’ to ‘what works where and what works for whom’ — perhaps pointing to the value of realist approaches to these complex review topics [ 48 , 49 ] and other more theory-informed approaches [ 50 ].

Some reviews have a relatively small number of studies (i.e. fewer than 10) and the authors are often understandably reluctant to engage with wider debates about the implications of their findings. Other larger studies do engage in deeper discussions about internal comparisons of findings across included studies and also contextualise these in wider debates. Some of the most informative studies (e.g. [ 35 , 40 ]) move beyond EPOC categories and contextualise MFIs within wider systems thinking and implementation theory. This distinction between MFIs and single interventions can actually be very useful as it offers lessons about the contexts in which individual interventions might have bounded effectiveness (i.e. educational interventions for individual change). Taken as a whole, this may also then help in terms of how and when to conjoin single interventions into effective MFIs.

In the two previous reviews, a consistent finding was that MFIs were more effective than single interventions [ 8 , 9 ]. However, like Squires et al. [ 10 ], this overview is more equivocal on this important issue. There are four points which may help account for the differences in findings in this regard. Firstly, the diversity of the systematic reviews in terms of clinical topic or setting is an important factor. Secondly, there is heterogeneity of the studies within the included systematic reviews themselves. Thirdly, there is a lack of consistency with regard to the definition of, and strategies included within, MFIs. Finally, there are epistemological differences across the papers and the reviews. This means that the results that are presented depend on the methods used to measure, report, and synthesise them. For instance, some reviews highlight that education strategies can be useful to improve provider understanding — but without wider organisational or system-level change, they may struggle to deliver sustained transformation [ 19 , 44 ].

It is also worth highlighting the importance of the theory of change underlying the different interventions. Where authors of the systematic reviews draw on theory, there is space to discuss/explain findings. We note a distinction between theoretical and atheoretical systematic review discussion sections. Atheoretical reviews tend to present acontextual findings (for instance, one study found very positive results for one intervention, and this gets highlighted in the abstract) whilst theoretically informed reviews attempt to contextualise and explain patterns within the included studies. Theory-informed systematic reviews seem more likely to offer more profound and useful insights (see [ 19 , 35 , 40 , 43 , 45 ]). We find that the most insightful systematic reviews of MFIs engage in theoretical generalisation — they attempt to go beyond the data of individual studies and discuss the wider implications of the findings of the studies within their reviews drawing on implementation theory. At the same time, they highlight the active role of context and the wider relational and system-wide issues linked to implementation. It is these types of investigations that can help providers further develop evidence-based practice.

This overview has identified a small, but insightful set of papers that interrogate and help theorise why, how, for whom, and in which circumstances it might be the case that MFIs are superior (see [ 19 , 35 , 40 ] once more). At the level of this overview — and in most of the systematic reviews included — it appears to be the case that MFIs struggle with the question of attribution. In addition, there are other important elements that are often unmeasured, or unreported (e.g. costs of the intervention — see [ 40 ]). Finally, the stronger systematic reviews [ 19 , 35 , 40 , 43 , 45 ] engage with systems issues, human agency and context [ 18 ] in a way that was not evident in the systematic reviews identified in the previous reviews [ 8 , 9 ]. The earlier reviews lacked any theory of change that might explain why MFIs might be more effective than single ones — whereas now some systematic reviews do this, which enables them to conclude that sometimes single interventions can still be more effective.

As Nilsen et al. ([ 6 ] p. 7) note ‘Study findings concerning the effectiveness of various approaches are continuously synthesized and assembled in systematic reviews’. We may have gone as far as we can in understanding the implementation of evidence through systematic reviews of single and multi-faceted interventions and the next step would be to conduct more research exploring the complex and situated nature of evidence used in clinical practice and by particular professional groups. This would further build on the nuanced discussion and conclusion sections in a subset of the papers we reviewed. This might also support the field to move away from isolating individual implementation strategies [ 6 ] to explore the complex processes involving a range of actors with differing capacities [ 51 ] working in diverse organisational cultures. Taxonomies of implementation strategies do not fully account for the complex process of implementation, which involves a range of different actors with different capacities and skills across multiple system levels. There is plenty of work to build on, particularly in the social sciences, which currently sits at the margins of debates about evidence implementation (see for example, Normalisation Process Theory [ 52 ]).

There are several changes that we have identified in this overview of systematic reviews in comparison to the review we published in 2011 [ 8 ]. A consistent and welcome finding is that the overall quality of the systematic reviews themselves appears to have improved between the two reviews, although this is not reflected upon in the papers. This is exhibited through better, clearer reporting mechanisms in relation to the mechanics of the reviews, alongside a greater attention to, and deeper description of, how potential biases in included papers are discussed. Additionally, there is an increased, but still limited, inclusion of original studies conducted in low- and middle-income countries as opposed to just high-income countries. Importantly, we found that many of these systematic reviews are attuned to, and comment upon the contextual distinctions of pursuing evidence-informed interventions in health care settings in different economic settings. Furthermore, systematic reviews included in this updated article cover a wider set of clinical specialities (both within and beyond hospital settings) and have a focus on a wider set of healthcare professions — discussing both similarities, differences and inter-professional challenges faced therein, compared to the earlier reviews. These wider ranges of studies highlight that a particular intervention or group of interventions may work well for one professional group but be ineffective for another. This diversity of study settings allows us to consider the important role context (in its many forms) plays on implementing evidence into practice. Examining the complex and varied context of health care will help us address what Nilsen et al. ([ 6 ] p. 1) described as, ‘society’s health problems [that] require research-based knowledge acted on by healthcare practitioners together with implementation of political measures from governmental agencies’. This will help us shift implementation science to move, ‘beyond a success or failure perspective towards improved analysis of variables that could explain the impact of the implementation process’ ([ 6 ] p. 2).

This review brings together 32 papers considering individual and multi-faceted interventions designed to support the use of evidence in clinical practice. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been conducted. As a whole, this substantial body of knowledge struggles to tell us more about the use of individual interventions and MFIs than: ‘it depends’. To really move forwards in addressing the gap between research evidence and practice, we may need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of perspectives, especially from the social, economic, political and behavioural sciences in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed. Harvey et al. [ 53 ] suggest that when context is likely to be critical to implementation success there are a range of primary research approaches (participatory research, realist evaluation, developmental evaluation, ethnography, quality/rapid cycle improvement) that are likely to be appropriate and insightful. While these approaches often form part of implementation studies in the form of process evaluations, they are usually relatively small scale in relation to implementation research as a whole. As a result, the findings often do not make it into the subsequent systematic reviews. This review provides further evidence that we need to bring qualitative approaches in from the periphery to play a central role in many implementation studies and subsequent evidence syntheses. It would be helpful for systematic reviews, at the very least, to include more detail about the interventions and their implementation in terms of how and why they worked.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BA: Before and after study
CCT: Controlled clinical trial
EPOC: Effective Practice and Organisation of Care
HICs: High-income countries
ICT: Information and Communications Technology
ITS: Interrupted time series
KT: Knowledge translation
LMICs: Low- and middle-income countries
RCT: Randomised controlled trial

Grol R, Grimshaw J. From best evidence to best practice: effective implementation of change in patients’ care. Lancet. 2003;362:1225–30. https://doi.org/10.1016/S0140-6736(03)14546-1 .

Green LA, Seifert CM. Translation of research into practice: why we can’t “just do it.” J Am Board Fam Pract. 2005;18:541–5. https://doi.org/10.3122/jabfm.18.6.541 .

Eccles MP, Mittman BS. Welcome to Implementation Science. Implement Sci. 2006;1:1–3. https://doi.org/10.1186/1748-5908-1-1 .

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:2–14. https://doi.org/10.1186/s13012-015-0209-1 .

Waltz TJ, Powell BJ, Matthieu MM, Damschroder LJ, et al. Use of concept mapping to characterize relationships among implementation strategies and assess their feasibility and importance: results from the Expert Recommendations for Implementing Change (ERIC) study. Implement Sci. 2015;10:1–8. https://doi.org/10.1186/s13012-015-0295-0 .

Nilsen P, Ståhl C, Roback K, et al. Never the twain shall meet? A comparison of implementation science and policy implementation research. Implement Sci. 2013;8:2–12. https://doi.org/10.1186/1748-5908-8-63.

Rycroft-Malone J, Seers K, Eldh AC, et al. A realist process evaluation within the Facilitating Implementation of Research Evidence (FIRE) cluster randomised controlled international trial: an exemplar. Implement Sci. 2018;13:1–15. https://doi.org/10.1186/s13012-018-0811-0.

Boaz A, Baeza J, Fraser A, European Implementation Score Collaborative Group (EIS). Effective implementation of research into practice: an overview of systematic reviews of the health literature. BMC Res Notes. 2011;4:212. https://doi.org/10.1186/1756-0500-4-212 .

Grimshaw JM, Shirran L, Thomas R, Mowatt G, Fraser C, Bero L, et al. Changing provider behavior – an overview of systematic reviews of interventions. Med Care. 2001;39(8 Suppl 2):II2–45.

Squires JE, Sullivan K, Eccles MP, et al. Are multifaceted interventions more effective than single-component interventions in changing health-care professionals’ behaviours? An overview of systematic reviews. Implement Sci. 2014;9:1–22. https://doi.org/10.1186/s13012-014-0152-6 .

Salvador-Oliván JA, Marco-Cuenca G, Arquero-Avilés R. Development of an efficient search filter to retrieve systematic reviews from PubMed. J Med Libr Assoc. 2021;109:561–74. https://doi.org/10.5195/jmla.2021.1223 .

Thomas JM. Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation? OA Evid Based Med. 2013;1:1–6.

Effective Practice and Organisation of Care (EPOC). The EPOC taxonomy of health systems interventions. EPOC Resources for review authors. Oslo: Norwegian Knowledge Centre for the Health Services; 2016. epoc.cochrane.org/epoc-taxonomy . Accessed 9 Oct 2023.

Jamal A, McKenzie K, Clark M. The impact of health information technology on the quality of medical and health care: a systematic review. Health Inf Manag. 2009;38:26–37. https://doi.org/10.1177/183335830903800305 .

Menon A, Korner-Bitensky N, Kastner M, et al. Strategies for rehabilitation professionals to move evidence-based knowledge into practice: a systematic review. J Rehabil Med. 2009;41:1024–32. https://doi.org/10.2340/16501977-0451 .

Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol. 1991;44:1271–8. https://doi.org/10.1016/0895-4356(91)90160-b .

Francke AL, Smit MC, de Veer AJ, et al. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med Inform Decis Mak. 2008;8:1–11. https://doi.org/10.1186/1472-6947-8-38 .

Jones CA, Roop SC, Pohar SL, et al. Translating knowledge in rehabilitation: systematic review. Phys Ther. 2015;95:663–77. https://doi.org/10.2522/ptj.20130512 .

Scott D, Albrecht L, O’Leary K, Ball GDC, et al. Systematic review of knowledge translation strategies in the allied health professions. Implement Sci. 2012;7:1–17. https://doi.org/10.1186/1748-5908-7-70 .

Wu Y, Brettle A, Zhou C, Ou J, et al. Do educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes? A systematic review. Nurse Educ Today. 2018;70:109–14. https://doi.org/10.1016/j.nedt.2018.08.026 .

Yost J, Ganann R, Thompson D, Aloweni F, et al. The effectiveness of knowledge translation interventions for promoting evidence-informed decision-making among nurses in tertiary care: a systematic review and meta-analysis. Implement Sci. 2015;10:1–15. https://doi.org/10.1186/s13012-015-0286-1 .

Grudniewicz A, Kealy R, Rodseth RN, Hamid J, et al. What is the effectiveness of printed educational materials on primary care physician knowledge, behaviour, and patient outcomes: a systematic review and meta-analyses. Implement Sci. 2015;10:2–12. https://doi.org/10.1186/s13012-015-0347-5 .

Koota E, Kääriäinen M, Melender HL. Educational interventions promoting evidence-based practice among emergency nurses: a systematic review. Int Emerg Nurs. 2018;41:51–8. https://doi.org/10.1016/j.ienj.2018.06.004 .

Flodgren G, O’Brien MA, Parmelli E, et al. Local opinion leaders: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD000125.pub5 .

Arditi C, Rège-Walther M, Durieux P, et al. Computer-generated reminders delivered on paper to healthcare professionals: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2017. https://doi.org/10.1002/14651858.CD001175.pub4 .

Pantoja T, Grimshaw JM, Colomer N, et al. Manually-generated reminders delivered on paper: effects on professional practice and patient outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD001174.pub4 .

De Angelis G, Davies B, King J, McEwan J, et al. Information and communication technologies for the dissemination of clinical practice guidelines to health professionals: a systematic review. JMIR Med Educ. 2016;2:e16. https://doi.org/10.2196/mededu.6288 .

Brown A, Barnes C, Byaruhanga J, McLaughlin M, et al. Effectiveness of technology-enabled knowledge translation strategies in improving the use of research in public health: systematic review. J Med Internet Res. 2020;22:e17274. https://doi.org/10.2196/17274 .

Sykes MJ, McAnuff J, Kolehmainen N. When is audit and feedback effective in dementia care? A systematic review. Int J Nurs Stud. 2018;79:27–35. https://doi.org/10.1016/j.ijnurstu.2017.10.013 .

Bhatt NR, Czarniecki SW, Borgmann H, et al. A systematic review of the use of social media for dissemination of clinical practice guidelines. Eur Urol Focus. 2021;7:1195–204. https://doi.org/10.1016/j.euf.2020.10.008 .

Yamada J, Shorkey A, Barwick M, Widger K, et al. The effectiveness of toolkits as knowledge translation strategies for integrating evidence into clinical care: a systematic review. BMJ Open. 2015;5:e006808. https://doi.org/10.1136/bmjopen-2014-006808 .

Afari-Asiedu S, Abdulai MA, Tostmann A, et al. Interventions to improve dispensing of antibiotics at the community level in low and middle income countries: a systematic review. J Glob Antimicrob Resist. 2022;29:259–74. https://doi.org/10.1016/j.jgar.2022.03.009 .

Boonacker CW, Hoes AW, Dikhoff MJ, Schilder AG, et al. Interventions in health care professionals to improve treatment in children with upper respiratory tract infections. Int J Pediatr Otorhinolaryngol. 2010;74:1113–21. https://doi.org/10.1016/j.ijporl.2010.07.008 .

Al Zoubi FM, Menon A, Mayo NE, et al. The effectiveness of interventions designed to increase the uptake of clinical practice guidelines and best practices among musculoskeletal professionals: a systematic review. BMC Health Serv Res. 2018;18:2–11. https://doi.org/10.1186/s12913-018-3253-0 .

Ariyo P, Zayed B, Riese V, Anton B, et al. Implementation strategies to reduce surgical site infections: a systematic review. Infect Control Hosp Epidemiol. 2019;3:287–300. https://doi.org/10.1017/ice.2018.355 .

Borgert MJ, Goossens A, Dongelmans DA. What are effective strategies for the implementation of care bundles on ICUs: a systematic review. Implement Sci. 2015;10:1–11. https://doi.org/10.1186/s13012-015-0306-1 .

Cahill LS, Carey LM, Lannin NA, et al. Implementation interventions to promote the uptake of evidence-based practices in stroke rehabilitation. Cochrane Database Syst Rev. 2020. https://doi.org/10.1002/14651858.CD012575.pub2 .

Pedersen ER, Rubenstein L, Kandrack R, Danz M, et al. Elusive search for effective provider interventions: a systematic review of provider interventions to increase adherence to evidence-based treatment for depression. Implement Sci. 2018;13:1–30. https://doi.org/10.1186/s13012-018-0788-8 .

Jenkins HJ, Hancock MJ, French SD, Maher CG, et al. Effectiveness of interventions designed to reduce the use of imaging for low-back pain: a systematic review. CMAJ. 2015;187:401–8. https://doi.org/10.1503/cmaj.141183 .

Bennett S, Laver K, MacAndrew M, Beattie E, et al. Implementation of evidence-based, non-pharmacological interventions addressing behavior and psychological symptoms of dementia: a systematic review focused on implementation strategies. Int Psychogeriatr. 2021;33:947–75. https://doi.org/10.1017/S1041610220001702 .

Noonan VK, Wolfe DL, Thorogood NP, et al. Knowledge translation and implementation in spinal cord injury: a systematic review. Spinal Cord. 2014;52:578–87. https://doi.org/10.1038/sc.2014.62 .

Albrecht L, Archibald M, Snelgrove-Clarke E, et al. Systematic review of knowledge translation strategies to promote research uptake in child health settings. J Pediatr Nurs. 2016;31:235–54. https://doi.org/10.1016/j.pedn.2015.12.002 .

Campbell A, Louie-Poon S, Slater L, et al. Knowledge translation strategies used by healthcare professionals in child health settings: an updated systematic review. J Pediatr Nurs. 2019;47:114–20. https://doi.org/10.1016/j.pedn.2019.04.026 .

Bird ML, Miller T, Connell LA, et al. Moving stroke rehabilitation evidence into practice: a systematic review of randomized controlled trials. Clin Rehabil. 2019;33:1586–95. https://doi.org/10.1177/0269215519847253 .

Goorts K, Dizon J, Milanese S. The effectiveness of implementation strategies for promoting evidence informed interventions in allied healthcare: a systematic review. BMC Health Serv Res. 2021;21:1–11. https://doi.org/10.1186/s12913-021-06190-0 .

Zadro JR, O’Keeffe M, Allison JL, Lembke KA, et al. Effectiveness of implementation strategies to improve adherence of physical therapist treatment choices to clinical practice guidelines for musculoskeletal conditions: systematic review. Phys Ther. 2020;100:1516–41. https://doi.org/10.1093/ptj/pzaa101 .

Van der Veer SN, Jager KJ, Nache AM, et al. Translating knowledge on best practice into improving quality of RRT care: a systematic review of implementation strategies. Kidney Int. 2011;80:1021–34. https://doi.org/10.1038/ki.2011.222 .

Pawson R, Greenhalgh T, Harvey G, et al. Realist review – a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy. 2005;10(Suppl 1):21–34. https://doi.org/10.1258/1355819054308530.

Rycroft-Malone J, McCormack B, Hutchinson AM, et al. Realist synthesis: illustrating the method for implementation research. Implement Sci. 2012;7:1–10. https://doi.org/10.1186/1748-5908-7-33.

Johnson MJ, May CR. Promoting professional behaviour change in healthcare: what interventions work, and why? A theory-led overview of systematic reviews. BMJ Open. 2015;5:e008592. https://doi.org/10.1136/bmjopen-2015-008592 .

Metz A, Jensen T, Farley A, Boaz A, et al. Is implementation research out of step with implementation practice? Pathways to effective implementation support over the last decade. Implement Res Pract. 2022;3:1–11. https://doi.org/10.1177/26334895221105585 .

May CR, Finch TL, Cornford J, Exley C, et al. Integrating telecare for chronic disease management in the community: What needs to be done? BMC Health Serv Res. 2011;11:1–11. https://doi.org/10.1186/1472-6963-11-131 .

Harvey G, Rycroft-Malone J, Seers K, Wilson P, et al. Connecting the science and practice of implementation – applying the lens of context to inform study design in implementation research. Front Health Serv. 2023;3:1–15. https://doi.org/10.3389/frhs.2023.1162762 .

Acknowledgements

The authors would like to thank Professor Kathryn Oliver for her support in planning the review, Professor Steve Hanney for reading and commenting on the final manuscript, and the staff at the LSHTM library for their support in planning and conducting the literature search.

This study was supported by LSHTM’s Research England QR strategic priorities funding allocation and the National Institute for Health and Care Research (NIHR) Applied Research Collaboration South London (NIHR ARC South London) at King’s College Hospital NHS Foundation Trust. Grant number NIHR200152. The views expressed are those of the author(s) and not necessarily those of the NIHR, the Department of Health and Social Care or Research England.

Author information

Authors and Affiliations

Health and Social Care Workforce Research Unit, The Policy Institute, King’s College London, Virginia Woolf Building, 22 Kingsway, London, WC2B 6LE, UK

Annette Boaz

King’s Business School, King’s College London, 30 Aldwych, London, WC2B 4BG, UK

Juan Baeza & Alec Fraser

Federal University of Santa Catarina (UFSC), Campus Universitário Reitor João Davi Ferreira Lima, Florianópolis, SC, 88.040-900, Brazil

Erik Persson

Contributions

AB led the conceptual development and structure of the manuscript. EP conducted the searches and data extraction. All authors contributed to screening and quality appraisal. EP and AF wrote the first draft of the methods section. AB, JB and AF performed result synthesis and contributed to the analyses. AB wrote the first draft of the manuscript and incorporated feedback and revisions from all other authors. All authors revised and approved the final manuscript.

Corresponding author

Correspondence to Annette Boaz .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Appendix A.

Additional file 2: Appendix B.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Boaz, A., Baeza, J., Fraser, A. et al. ‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice. Implementation Sci 19, 15 (2024). https://doi.org/10.1186/s13012-024-01337-z

Received: 01 November 2023

Accepted: 05 January 2024

Published: 19 February 2024

DOI: https://doi.org/10.1186/s13012-024-01337-z

  • Implementation
  • Interventions
  • Clinical practice
  • Research evidence
  • Multi-faceted

  • Open access
  • Published: 19 February 2024

Genomic data in the All of Us Research Program

The All of Us Research Program Genomics Investigators

Nature (2024)

  • Genetic variation
  • Genome-wide association studies

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics 1 , 2 , 3 , 4 . The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health 5 , 6 . Here we describe the programme’s genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.

Comprehensively identifying genetic variation and cataloguing its contribution to health and disease, in conjunction with environmental and lifestyle factors, is a central goal of human health research 1 , 2 . A key limitation in efforts to build this catalogue has been the historic under-representation of large subsets of individuals in biomedical research including individuals from diverse ancestries, individuals with disabilities and individuals from disadvantaged backgrounds 3 , 4 . The All of Us Research Program (All of Us) aims to address this gap by enrolling and collecting comprehensive health data on at least one million individuals who reflect the diversity across the USA 5 , 6 . An essential component of All of Us is the generation of whole-genome sequence (WGS) and genotyping data on one million participants. All of Us is committed to making this dataset broadly useful—not only by democratizing access to this dataset across the scientific community but also to return value to the participants themselves by returning individual DNA results, such as genetic ancestry, hereditary disease risk and pharmacogenetics according to clinical standards, to those who wish to receive these research results.

Here we describe the release of WGS data from 245,388 All of Us participants and demonstrate the impact of this high-quality data in genetic and health studies. We carried out a series of data harmonization and quality control (QC) procedures and conducted analyses characterizing the properties of the dataset including genetic ancestry and relatedness. We validated the data by replicating well-established genotype–phenotype associations including low-density lipoprotein cholesterol (LDL-C) and 117 additional diseases. These data are available through the All of Us Researcher Workbench, a cloud platform that embodies and enables programme priorities, facilitating equitable data and compute access while ensuring responsible conduct of research and protecting participant privacy through a passport data access model.

The All of Us Research Program

To accelerate health research, All of Us is committed to curating and releasing research data early and often 6 . Less than five years after national enrolment began in 2018, this fifth data release includes data from more than 413,000 All of Us participants. Summary data are made available through a public Data Browser, and individual-level participant data are made available to researchers through the Researcher Workbench (Fig. 1a and Data availability).

figure 1

a , The All of Us Research Hub contains a publicly accessible Data Browser for exploration of summary phenotypic and genomic data. The Researcher Workbench is a secure cloud-based environment of participant-level data in a Controlled Tier that is widely accessible to researchers. b , All of Us participants have rich phenotype data from a combination of physical measurements, survey responses, EHRs, wearables and genomic data. Dots indicate the presence of the specific data type for the given number of participants. c , Overall summary of participants under-represented in biomedical research (UBR) with data available in the Controlled Tier. The All of Us logo in a is reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Participant data include a rich combination of phenotypic and genomic data (Fig. 1b ). Participants are asked to complete consent for research use of data, sharing of electronic health records (EHRs), donation of biospecimens (blood or saliva, and urine), in-person provision of physical measurements (height, weight and blood pressure) and surveys initially covering demographics, lifestyle and overall health 7 . Participants are also consented for recontact. EHR data, harmonized using the Observational Medical Outcomes Partnership Common Data Model 8 ( Methods ), are available for more than 287,000 participants (69.42%) from more than 50 health care provider organizations. The EHR dataset is longitudinal, with a quarter of participants having 10 years of EHR data (Extended Data Fig. 1 ). Data include 245,388 WGSs and genome-wide genotyping on 312,925 participants. Sequenced and genotyped individuals in this data release were not prioritized on the basis of any clinical or phenotypic feature. Notably, 99% of participants with WGS data also have survey data and physical measurements, and 84% also have EHR data. In this data release, 77% of individuals with genomic data identify with groups historically under-represented in biomedical research, including 46% who self-identify with a racial or ethnic minority group (Fig. 1c , Supplementary Table 1 and Supplementary Note ).

Scaling the All of Us infrastructure

The genomic dataset generated from All of Us participants is a resource for research and discovery and serves as the basis for return of individual health-related DNA results to participants. Consequently, the US Food and Drug Administration determined that All of Us met the criteria for a significant risk device study. As such, the entire All of Us genomics effort from sample acquisition to sequencing meets clinical laboratory standards 9 .

All of Us participants were recruited through a national network of partners, starting in 2018, as previously described 5 . Participants may enrol through All of Us-funded health care provider organizations or through direct volunteer pathways, and all biospecimens, including blood and saliva, are sent to the central All of Us Biobank for processing and storage. Genomics data for this release were generated from blood-derived DNA. The programme began return of actionable genomic results in December 2022. As of April 2023, approximately 51,000 individuals were sent notifications asking whether they wanted to view their results, and approximately half have accepted. Return continues on an ongoing basis.

The All of Us Data and Research Center maintains all participant information and biospecimen ID linkage to ensure that participant confidentiality and coded identifiers (participant and aliquot level) are used to track each sample through the All of Us genomics workflow. This workflow facilitates weekly automated aliquot and plating requests to the Biobank, supplies relevant metadata for the sample shipments to the Genome Centers, and contains a feedback loop to inform action on samples that fail QC at any stage. Further, the consent status of each participant is checked before sample shipment to confirm that they are still active. Although all participants with genomic data are consented for the same general research use category, the programme accommodates different preferences for the return of genomic data to participants and only data for those individuals who have consented for return of individual health-related DNA results are distributed to the All of Us Clinical Validation Labs for further evaluation and health-related clinical reporting. All participants in All of Us that choose to get health-related DNA results have the option to schedule a genetic counselling appointment to discuss their results. Individuals with positive findings who choose to obtain results are required to schedule an appointment with a genetic counsellor to receive those findings.

Genome sequencing

To satisfy the requirements for clinical accuracy, precision and consistency across DNA sample extraction and sequencing, the All of Us Genome Centers and Biobank harmonized laboratory protocols, established standard QC methodologies and metrics, and conducted a series of validation experiments using previously characterized clinical samples and commercially available reference standards 9 . Briefly, PCR-free barcoded WGS libraries were constructed with the Illumina Kapa HyperPrep kit. Libraries were pooled and sequenced on the Illumina NovaSeq 6000 instrument. After demultiplexing, initial QC analysis is performed with the Illumina DRAGEN pipeline (Supplementary Table 2 ) leveraging lane, library, flow cell, barcode and sample level metrics as well as assessing contamination, mapping quality and concordance to genotyping array data independently processed from a different aliquot of DNA. The Genome Centers use these metrics to determine whether each sample meets programme specifications and then submits sequencing data to the Data and Research Center for further QC, joint calling and distribution to the research community ( Methods ).

This effort to harmonize sequencing methods, multi-level QC and use of identical data processing protocols mitigated the variability in sequencing location and protocols that often leads to batch effects in large genomic datasets 9 . As a result, the data are not only of clinical-grade quality, but also consistent in coverage (≥30× mean) and uniformity across Genome Centers (Supplementary Figs. 1 – 5 ).

Joint calling and variant discovery

We carried out joint calling across the entire All of Us WGS dataset (Extended Data Fig. 2 ). Joint calling leverages information across samples to prune artefact variants, which increases sensitivity, and enables flagging samples with potential issues that were missed during single-sample QC 10 (Supplementary Table 3 ). Scaling conventional approaches to whole-genome joint calling beyond 50,000 individuals is a notable computational challenge 11 , 12 . To address this, we developed a new cloud variant storage solution, the Genomic Variant Store (GVS), which is based on a schema designed for querying and rendering variants in which the variants are stored in GVS and rendered to an analysable variant file, as opposed to the variant file being the primary storage mechanism (Code availability). We carried out QC on the joint call set on the basis of the approach developed for gnomAD 3.1 (ref.  13 ). This included flagging samples with outlying values in eight metrics (Supplementary Table 4 , Supplementary Fig. 2 and Methods ).
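
The GVS schema itself is not reproduced in this article, but the core idea — variants live in a queryable store and are rendered to an analysable file on demand, rather than the variant file being the primary storage — can be illustrated with a small sketch. The following Python example uses an in-memory SQLite table; the table layout, column names and toy records are assumptions for illustration only and do not describe the actual GVS implementation.

```python
# Minimal sketch of a "store variants, render on demand" pattern, loosely
# analogous to the Genomic Variant Store idea described above. The schema,
# column names and records are invented for illustration; the real GVS is a
# cloud-scale system with a very different design.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE variant_calls (
        sample_id TEXT, chrom TEXT, pos INTEGER,
        ref TEXT, alt TEXT, genotype TEXT
    )
""")
conn.executemany(
    "INSERT INTO variant_calls VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("S1", "chr1", 12345, "A", "G", "0/1"),
        ("S2", "chr1", 12345, "A", "G", "1/1"),
        ("S2", "chr2", 67890, "C", "T", "0/1"),
    ],
)

def render_region(chrom, start, end):
    """Render a VCF-like, analysable view of one region from the store."""
    rows = conn.execute(
        "SELECT chrom, pos, ref, alt, sample_id, genotype "
        "FROM variant_calls WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos",
        (chrom, start, end),
    ).fetchall()
    for c, p, ref, alt, sample, gt in rows:
        print(f"{c}\t{p}\t{ref}\t{alt}\t{sample}:{gt}")

render_region("chr1", 1, 50_000)
```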

To calculate the sensitivity and precision of the joint call dataset, we included four well-characterized samples. We sequenced the National Institute of Standards and Technology reference materials (DNA samples) from the Genome in a Bottle consortium 13 and carried out variant calling as described above. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations 14 . The overall sensitivity for single-nucleotide variants was over 98.7% and precision was more than 99.9%. For short insertions or deletions, the sensitivity was over 97% and precision was more than 99.6% (Supplementary Table 5 and Methods ).

The joint call set included more than 1 billion genetic variants. We annotated the joint call dataset on the basis of functional annotation (for example, gene symbol and protein change) using Illumina Nirvana 15 . We defined coding variants as those inducing an amino acid change on a canonical ENSEMBL transcript and found 272,051,104 non-coding and 3,913,722 coding variants that have not been described previously in dbSNP 16 v153 (Extended Data Table 1 ). A total of 3,912,832 (99.98%) of the coding variants are rare (allelic frequency < 0.01) and the remaining 883 (0.02%) are common (allelic frequency > 0.01). Of the coding variants, 454 (0.01%) are common in one or more of the non-European computed ancestries in All of Us, rare among participants of European ancestry, and have an allelic number greater than 1,000 (Extended Data Table 2 and Extended Data Fig. 3 ). The distributions of pathogenic, or likely pathogenic, ClinVar variant counts per participant, stratified by computed ancestry and filtered to only those variants that are found in individuals with an allele count of <40, are shown in Extended Data Fig. 4 . The potential medical implications of these known and new variants with respect to variant pathogenicity by ancestry are highlighted in a companion paper 17 . In particular, we find that the European ancestry subset has the highest rate of pathogenic variation (2.1%), which was twice the rate of pathogenic variation in individuals of East Asian ancestry 17 . The lower frequency of variants in East Asian individuals may be partially explained by the fact that the sample size in that group is small, and there may be knowledge bias in the variant databases that reduces the number of findings in some of the less-studied ancestry groups.

Genetic ancestry and relatedness

Genetic ancestry inference confirmed that 51.1% of the All of Us WGS dataset is derived from individuals of non-European ancestry. Briefly, the ancestry categories are based on the same labels used in gnomAD 18 . We trained a classifier on a 16-dimensional principal component analysis (PCA) space of a diverse reference based on 3,202 samples and 151,159 autosomal single-nucleotide polymorphisms. We projected the All of Us samples into the PCA space of the training data, based on the same single-nucleotide polymorphisms from the WGS data, and generated categorical ancestry predictions from the trained classifier ( Methods ). Continuous genetic ancestry fractions for All of Us samples were inferred using the same PCA data, and participants’ patterns of ancestry and admixture were compared to their self-identified race and ethnicity (Fig. 2 and Methods ). Continuous ancestry inference carried out using genome-wide genotypes yields highly concordant estimates.
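
As a rough illustration of the project-then-classify approach described above (a classifier trained in a 16-dimensional PCA space of a labelled reference panel, with study samples projected into the same space), the sketch below uses scikit-learn with toy genotype matrices. The toy data, the choice of a random-forest classifier and the reading of class probabilities are assumptions made for the example; the programme's actual reference panel, classifier and continuous-ancestry method are described in the Methods and cited references.

```python
# Illustrative "project into reference PCA space, then classify" sketch.
# Toy data only; the random forest and the use of class probabilities as a
# stand-in for continuous fractions are assumptions for illustration.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy reference panel: 300 samples x 500 SNP genotypes (0/1/2) with labels.
ref_genotypes = rng.integers(0, 3, size=(300, 500)).astype(float)
ref_labels = rng.choice(["AFR", "AMR", "EAS", "EUR", "SAS"], size=300)

# Fit PCA on the reference and train a classifier in the 16-dimensional PC space.
pca = PCA(n_components=16).fit(ref_genotypes)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(pca.transform(ref_genotypes), ref_labels)

# Project new study samples into the same PC space and predict ancestry labels.
study_genotypes = rng.integers(0, 3, size=(5, 500)).astype(float)
study_pcs = pca.transform(study_genotypes)
print(clf.predict(study_pcs))        # categorical ancestry predictions
print(clf.predict_proba(study_pcs))  # per-class probabilities (illustrative only)
```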

figure 2

a , b , Uniform manifold approximation and projection (UMAP) representations of All of Us WGS PCA data with self-described race ( a ) and ethnicity ( b ) labels. c , Proportion of genetic ancestry per individual in six distinct and coherent ancestry groups defined by Human Genome Diversity Project and 1000 Genomes samples.

Kinship estimation confirmed that All of Us WGS data consist largely of unrelated individuals, with about 85% (215,107) having no first- or second-degree relatives in the dataset (Supplementary Fig. 6 ). As many genomic analyses rely on unrelated individuals, we identified the smallest set of samples that needed to be removed so that no first- or second-degree relative pairs remained, retaining one individual from each kindred. This procedure yielded a maximal independent set of 231,442 individuals (about 94%) with genome sequence data in the current release ( Methods ).
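
The selection of an unrelated subset can be pictured as a graph problem: individuals are nodes, first- or second-degree relative pairs are edges, and the retained cohort is an independent set of that graph. The sketch below uses networkx on toy kinship pairs; the specific procedure the programme used to obtain its maximal independent set is described in the Methods, and this example is only a simplified stand-in.

```python
# Sketch: keep at most one individual per kindred by taking a maximal
# independent set of the relatedness graph. Kinship pairs are toy data.
import networkx as nx

all_samples = ["S1", "S2", "S3", "S4", "S5", "S6"]
related_pairs = [("S1", "S2"), ("S2", "S3"), ("S4", "S5")]  # 1st/2nd-degree pairs

G = nx.Graph()
G.add_nodes_from(all_samples)      # unrelated singletons remain as isolated nodes
G.add_edges_from(related_pairs)

unrelated_subset = nx.maximal_independent_set(G, seed=0)
print(sorted(unrelated_subset))    # no two retained samples are directly related
```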

Genetic determinants of LDL-C

As a measure of data quality and utility, we carried out a single-variant genome-wide association study (GWAS) for LDL-C, a trait with well-established genomic architecture ( Methods ). Of the 245,388 WGS participants, 91,749 had one or more LDL-C measurements. The All of Us LDL-C GWAS identified 20 well-established genome-wide significant loci, with minimal genomic inflation (Fig. 3 , Extended Data Table 3 and Supplementary Fig. 7 ). We compared the results to those of a recent multi-ethnic LDL-C GWAS in the National Heart, Lung, and Blood Institute (NHLBI) TOPMed study that included 66,329 ancestrally diverse (56% non-European ancestry) individuals 19 . We found a strong correlation between the effect estimates for NHLBI TOPMed genome-wide significant loci and those of All of Us (R² = 0.98, P < 1.61 × 10⁻⁴⁵; Fig. 3, inset). Notably, the per-locus effect sizes observed in All of Us are decreased compared to those in TOPMed, which is in part due to differences in the underlying statistical model, differences in the ancestral composition of these datasets and differences in laboratory value ascertainment between EHR-derived data and epidemiology studies. A companion manuscript extended this work to identify common and rare genetic associations for three diseases (atrial fibrillation, coronary artery disease and type 2 diabetes) and two quantitative traits (height and LDL-C) in the All of Us dataset and identified very high concordance with previous efforts across all of these diseases and traits 20 .
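
The comparison of effect estimates between the two studies amounts to regressing one set of per-variant betas on the other and reporting R². A small sketch with invented numbers (not All of Us or TOPMed estimates) is shown below, assuming SciPy is available.

```python
# Toy effect-size comparison between two GWAS: regress study B betas on
# study A betas for shared genome-wide-significant variants and report R^2.
# The beta values are invented for illustration.
import numpy as np
from scipy import stats

beta_study_a = np.array([0.30, -0.12, 0.45, 0.08, -0.25, 0.51])
beta_study_b = np.array([0.28, -0.10, 0.42, 0.07, -0.22, 0.47])

slope, intercept, r, p, se = stats.linregress(beta_study_a, beta_study_b)
print(f"R^2 = {r**2:.3f}, P = {p:.2e}, slope = {slope:.2f}")
```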

figure 3

Manhattan plot demonstrating robust replication of 20 well-established LDL-C genetic loci among 91,749 individuals with 1 or more LDL-C measurements. The red horizontal line denotes the genome-wide significance threshold of P = 5 × 10⁻⁸. Inset, effect estimate (β) comparison between NHLBI TOPMed LDL-C GWAS (x axis) and All of Us LDL-C GWAS (y axis) for the subset of 194 independent variants clumped (250 kb window, r² 0.5) that reached genome-wide significance in NHLBI TOPMed.

Genotype-by-phenotype associations

As another measure of data quality and utility, we tested replication rates of previously reported phenotype–genotype associations in the five predicted genetic ancestry populations present in the Phenotype/Genotype Reference Map (PGRM): AFR, African ancestry; AMR, Latino/admixed American ancestry; EAS, East Asian ancestry; EUR, European ancestry; SAS, South Asian ancestry. The PGRM contains published associations in the GWAS catalogue in these ancestry populations that map to International Classification of Diseases-based phenotype codes 21 . This replication study specifically looked across 4,947 variants, calculating replication rates for powered associations in each ancestry population. The overall replication rates for associations powered at 80% were: 72.0% (18/25) in AFR, 100% (13/13) in AMR, 46.6% (7/15) in EAS, 74.9% (1,064/1,421) in EUR, and 100% (1/1) in SAS. With the exception of the EAS ancestry results, these powered replication rates are comparable to those of the published PGRM analysis, where the replication rates of several single-site EHR-linked biobanks range from 76% to 85%. These results demonstrate the utility of the data and also highlight opportunities for further work on understanding the specifics of the All of Us population and the potential contribution of gene–environment interactions to genotype–phenotype mapping. They also motivate the development of methods for multi-site EHR phenotype data extraction, harmonization and genetic association studies.
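
The per-ancestry replication rates quoted above are simply the number of replicated associations divided by the number of adequately powered associations in each group, as the short check below shows (counts taken from the text; the power calculation itself is not reproduced here).

```python
# Replication rate among associations powered at >= 80%, per ancestry group.
# Counts (replicated, powered) are those quoted in the text.
powered_results = {
    "AFR": (18, 25),
    "AMR": (13, 13),
    "EAS": (7, 15),
    "EUR": (1064, 1421),
    "SAS": (1, 1),
}
for ancestry, (replicated, powered) in powered_results.items():
    print(f"{ancestry}: {replicated}/{powered} = {100 * replicated / powered:.1f}%")
```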

More broadly, the All of Us resource highlights the opportunities to identify genotype–phenotype associations that differ across diverse populations 22 . For example, the Duffy blood group locus ( ACKR1 ) is more prevalent in individuals of AFR ancestry and individuals of AMR ancestry than in individuals of EUR ancestry. Although the phenome-wide association study of this locus highlights the well-established association of the Duffy blood group with lower white blood cell counts both in individuals of AFR and AMR ancestry 23 , 24 , it also revealed genetic-ancestry-specific phenotype patterns, with minimal phenotypic associations in individuals of EAS ancestry and individuals of EUR ancestry (Fig. 4 and Extended Data Table 4 ). Conversely, rs9273363 in the HLA-DQB1 locus is associated with increased risk of type 1 diabetes 25 , 26 and diabetic complications across ancestries, but only associates with increased risk of coeliac disease in individuals of EUR ancestry (Extended Data Fig. 5 ). Similarly, the TCF7L2 locus 27 strongly associates with increased risk of type 2 diabetes and associated complications across several ancestries (Extended Data Fig. 6 ). Association testing results are available in Supplementary Dataset 1 .

figure 4

Results of genetic-ancestry-stratified phenome-wide association analysis among unrelated individuals highlighting ancestry-specific disease associations across the four most common genetic ancestries of participants. The Bonferroni-adjusted phenome-wide significance threshold (P < 2.88 × 10⁻⁵) is plotted as a red horizontal line. AFR (n = 34,037, minor allele fraction (MAF) 0.82); AMR (n = 28,901, MAF 0.10); EAS (n = 32,55, MAF 0.003); EUR (n = 101,613, MAF 0.007).

The cloud-based Researcher Workbench

All of Us genomic data are available in a secure, access-controlled cloud-based analysis environment: the All of Us Researcher Workbench. Unlike traditional data access models that require per-project approval, access in the Researcher Workbench is governed by a data passport model based on a researcher’s authenticated identity, institutional affiliation, and completion of self-service training and compliance attestation 28 . After gaining access, a researcher may create a new workspace at any time to conduct a study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is regularly audited and made accessible publicly on the All of Us Research Projects Directory. This streamlined access model is guided by the principles that: participants are research partners and maintaining their privacy and data security is paramount; their data should be made as accessible as possible for authorized researchers; and we should continually seek to remove unnecessary barriers to accessing and using All of Us data.

For researchers at institutions with an existing institutional data use agreement, access can be gained as soon as they complete the required verification and compliance steps. As of August 2023, 556 institutions have agreements in place, allowing more than 5,000 approved researchers to actively work on more than 4,400 projects. The median time for a researcher from initial registration to completion of these requirements is 28.6 h (10th percentile: 48 min, 90th percentile: 14.9 days), a fraction of the weeks to months it can take to assemble a project-specific application and have it reviewed by an access board with conventional access models.

Given that the size of the project’s phenotypic and genomic dataset is expected to reach 4.75 PB in 2023, the use of a central data store and cloud analysis tools will save funders an estimated US$16.5 million per year when compared to the typical approach of allowing researchers to download genomic data. Storing one copy per institution of these data at 556 registered institutions would cost about US$1.16 billion per year. By contrast, storing a central cloud copy costs about US$1.14 million per year, a 99.9% saving. Importantly, cloud infrastructure also democratizes data access, particularly for researchers who do not have high-performance local compute resources.
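
A back-of-envelope check of the storage comparison, using only the figures quoted above, is shown below; the per-institution cost is derived from the quoted totals and is not an independently sourced number.

```python
# Rough check of the storage cost comparison quoted in the text.
central_cloud_copy_per_year = 1.14e6      # US$ per year, one central cloud copy
per_institution_copies_per_year = 1.16e9  # US$ per year, one copy at each of 556 institutions
n_institutions = 556

implied_cost_per_institution = per_institution_copies_per_year / n_institutions
saving = 1 - central_cloud_copy_per_year / per_institution_copies_per_year
print(f"~US${implied_cost_per_institution / 1e6:.2f} million per institution per year")
print(f"central copy saving relative to per-institution copies: {saving:.1%}")
```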

Here we present the All of Us Research Program’s approach to generating diverse clinical-grade genomic data at an unprecedented scale. We present the data release of about 245,000 genome sequences as part of a scalable framework that will grow to include genetic information and health data for one million or more people living across the USA. Our observations permit several conclusions.

First, the All of Us programme is making a notable contribution to improving the study of human biology through purposeful inclusion of under-represented individuals at scale 29 , 30 . Of the participants with genomic data in All of Us, 45.92% self-identified as a non-European race or ethnicity. This diversity enabled identification of more than 275 million new genetic variants across the dataset not previously captured by other large-scale genome aggregation efforts with diverse participants that have submitted variation to dbSNP v153, such as NHLBI TOPMed 31 freeze 8 (Extended Data Table 1 ). In contrast to gnomAD, All of Us permits individual-level genotype access with detailed phenotype data for all participants. Furthermore, unlike many genomics resources, All of Us is uniformly consented for general research use and enables researchers to go from initial account creation to individual-level data access in as little as a few hours. The All of Us cohort is significantly more diverse than those of other large contemporary research studies generating WGS data 32 , 33 . This enables a more equitable future for precision medicine (for example, through constructing polygenic risk scores that are appropriately calibrated to diverse populations 34 , 35 as the eMERGE programme has done leveraging All of Us data 36 , 37 ). Developing new tools and regulatory frameworks to enable analyses across multiple biobanks in the cloud to harness the unique strengths of each is an active area of investigation addressed in a companion paper to this work 38 .

Second, the All of Us Researcher Workbench embodies the programme’s design philosophy of open science, reproducible research, equitable access and transparency to researchers and to research participants 26 . Importantly, for research studies, no group of data users should have privileged access to All of Us resources based on anything other than data protection criteria. Although the All of Us Researcher Workbench initially targeted onboarding US academic, health care and non-profit organizations, it has recently expanded to international researchers. We anticipate further genomic and phenotypic data releases at regular intervals with data available to all researcher communities. We also anticipate additional derived data and functionality to be made available, such as reference data, structural variants and a service for array imputation using the All of Us genomic data.

Third, All of Us enables studying human biology at an unprecedented scale. The programmatic goal of sequencing one million or more genomes has required harnessing the output of multiple sequencing centres. Previous work has focused on achieving functional equivalence in data processing and joint calling pipelines 39 . To achieve clinical-grade data equivalence, All of Us required protocol equivalence at both sequencing production level and data processing across the sequencing centres. Furthermore, previous work has demonstrated the value of joint calling at scale 10 , 18 . The new GVS framework developed by the All of Us programme enables joint calling at extreme scales (Code availability). Finally, the provision of data access through cloud-native tools enables scalable and secure access and analysis to researchers while simultaneously enabling the trust of research participants and transparency underlying the All of Us data passport access model.

The clinical-grade sequencing carried out by All of Us enables not only research, but also the return of value to participants through clinically relevant genetic results and health-related traits to those who opt-in to receiving this information. In the years ahead, we anticipate that this partnership with All of Us participants will enable researchers to move beyond large-scale genomic discovery to understanding the consequences of implementing genomic medicine at scale.

The All of Us cohort

All of Us aims to engage a longitudinal cohort of one million or more US participants, with a focus on including populations that have historically been under-represented in biomedical research. Details of the All of Us cohort have been described previously 5 . Briefly, the primary objective is to build a robust research resource that can facilitate the exploration of biological, clinical, social and environmental determinants of health and disease. The programme will collect and curate health-related data and biospecimens, and these data and biospecimens will be made broadly available for research uses. Health data are obtained through the electronic medical record and through participant surveys. Survey templates can be found on our public website: https://www.researchallofus.org/data-tools/survey-explorer/ . Adults 18 years and older who have the capacity to consent and reside in the USA or a US territory at present are eligible. Informed consent for all participants is conducted in person or through an eConsent platform that includes primary consent, HIPAA Authorization for Research use of EHRs and other external health data, and Consent for Return of Genomic Results. The protocol was reviewed by the Institutional Review Board (IRB) of the All of Us Research Program. The All of Us IRB follows the regulations and guidance of the NIH Office for Human Research Protections for all studies, ensuring that the rights and welfare of research participants are overseen and protected uniformly.

Data accessibility through a ‘data passport’

Authorization for access to participant-level data in All of Us is based on a ‘data passport’ model, through which authorized researchers do not need IRB review for each research project. The data passport is required for gaining data access to the Researcher Workbench and for creating workspaces to carry out research projects using All of Us data. At present, data passports are authorized through a six-step process that includes affiliation with an institution that has signed a Data Use and Registration Agreement, account creation, identity verification, completion of ethics training, and attestation to a data user code of conduct. Results reported follow the All of Us Data and Statistics Dissemination Policy, which disallows disclosure of group counts under 20 without prior approval in order to protect participant privacy 40 .
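
The small-cell rule in the dissemination policy (no group counts under 20 without prior approval) maps naturally onto a suppression step when preparing summary output. The sketch below is illustrative only; the groups and counts are invented, and only the threshold of 20 comes from the text.

```python
# Illustrative small-cell suppression: counts below the policy threshold of
# 20 are not reported. Group names and counts are invented.
MIN_REPORTABLE_COUNT = 20

group_counts = {"group A": 1250, "group B": 34, "group C": 12}
for group, n in group_counts.items():
    shown = str(n) if n >= MIN_REPORTABLE_COUNT else "<20 (suppressed)"
    print(f"{group}: {shown}")
```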

At present, All of Us gathers EHR data from about 50 health care organizations that are funded to recruit and enrol participants as well as transfer EHR data for those participants who have consented to provide them. Data stewards at each provider organization harmonize their local data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, and then submit it to the All of Us Data and Research Center (DRC) so that it can be linked with other participant data and further curated for research use. OMOP is a common data model standardizing health information from disparate EHRs to common vocabularies and organized into tables according to data domains. EHR data are updated from the recruitment sites and sent to the DRC quarterly. Updated data releases to the research community occur approximately once a year. Supplementary Table 6 outlines the OMOP concepts collected by the DRC quarterly from the recruitment sites.
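
Because EHR data are harmonized to the OMOP Common Data Model, analyses can be written once against its standard tables. The sketch below runs an OMOP-style count against a toy condition_occurrence table in SQLite; the rows and the concept identifier are placeholders, and this is not an All of Us-provided query.

```python
# OMOP-style query against a toy condition_occurrence table: count distinct
# participants with at least one record of a given condition concept.
# Table/column names follow OMOP CDM conventions; rows and the concept id
# are placeholders for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence ("
    " person_id INTEGER, condition_concept_id INTEGER, condition_start_date TEXT)"
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    [(1, 9999001, "2020-01-15"), (1, 9999001, "2021-06-02"), (2, 9999002, "2019-11-30")],
)

n = conn.execute(
    "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence"
    " WHERE condition_concept_id = ?",
    (9999001,),
).fetchone()[0]
print(f"participants with the example condition concept: {n}")
```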

Biospecimen collection and processing

Participants who consented to participate in All of Us donated fresh whole blood (4 ml EDTA and 10 ml EDTA) as a primary source of DNA. The All of Us Biobank managed by the Mayo Clinic extracted DNA from 4 ml EDTA whole blood, and DNA was stored at −80 °C at an average concentration of 150 ng µl⁻¹. The buffy coat isolated from 10 ml EDTA whole blood has been used for extracting DNA in the case of initial extraction failure or absence of 4 ml EDTA whole blood. The Biobank plated 2.4 µg DNA with a concentration of 60 ng µl⁻¹ in duplicate for array and WGS samples. The samples are distributed to All of Us Genome Centers weekly, and a negative (empty well) control and National Institute of Standards and Technology controls are incorporated every two months for QC purposes.

Genome Center sample receipt, accession and QC

On receipt of DNA sample shipments, the All of Us Genome Centers carry out an inspection of the packaging and sample containers to ensure that sample integrity has not been compromised during transport and to verify that the sample containers correspond to the shipping manifest. QC of the submitted samples also includes DNA quantification, using routine procedures to confirm volume and concentration (Supplementary Table 7 ). Any issues or discrepancies are recorded, and affected samples are put on hold until resolved. Samples that meet quality thresholds are accessioned in the Laboratory Information Management System, and sample aliquots are prepared for library construction processing (for example, normalized with respect to concentration and volume).

WGS library construction, sequencing and primary data QC

The DNA sample is first sheared using a Covaris sonicator and is then size-selected using AMPure XP beads to restrict the range of library insert sizes. Using the PCR Free Kapa HyperPrep library construction kit, enzymatic steps are completed to repair the jagged ends of DNA fragments, add proper A-base segments, and ligate indexed adapter barcode sequences onto samples. Excess adaptors are removed using AMPure XP beads for a final clean-up. Libraries are quantified using quantitative PCR with the Illumina Kapa DNA Quantification Kit and then normalized and pooled for sequencing (Supplementary Table 7 ).

Pooled libraries are loaded on the Illumina NovaSeq 6000 instrument. The data from the initial sequencing run are used to QC individual libraries and to remove non-conforming samples from the pipeline. The data are also used to calibrate the pooling volume of each individual library and re-pool the libraries for additional NovaSeq sequencing to reach an average coverage of 30×.

After demultiplexing, WGS analysis occurs on the Illumina DRAGEN platform. The DRAGEN pipeline consists of highly optimized algorithms for mapping, aligning, sorting, duplicate marking and haplotype variant calling and makes use of platform features such as compression and BCL conversion. Alignment uses the GRCh38dh reference genome. QC data are collected at every stage of the analysis protocol, providing high-resolution metrics required to ensure data consistency for large-scale multiplexing. The DRAGEN pipeline produces a large number of metrics that cover lane, library, flow cell, barcode and sample-level metrics for all runs as well as assessing contamination and mapping quality. The All of Us Genome Centers use these metrics to determine pass or fail for each sample before submitting the CRAM files to the All of Us DRC. For mapping and variant calling, all Genome Centers have harmonized on a set of DRAGEN parameters, which ensures consistency in processing (Supplementary Table 2 ).

Every step through the WGS procedure is rigorously controlled by predefined QC measures. Various control mechanisms and acceptance criteria were established during WGS assay validation. Specific metrics for reviewing and releasing genome data are: mean coverage (threshold of ≥30×), genome coverage (threshold of ≥90% at 20×), coverage of hereditary disease risk genes (threshold of ≥95% at 20×), aligned Q30 bases (threshold of ≥8 × 10¹⁰), contamination (threshold of ≤1%) and concordance to independently processed array data.
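
The release criteria listed above can be expressed as a simple threshold check. The function and example metric values below are invented for illustration; only the thresholds come from the text.

```python
# Pass/fail check against the WGS release thresholds quoted in the text.
# Metric names and the example sample are invented for illustration.
def wgs_release_check(metrics):
    checks = {
        "mean_coverage_30x":           metrics["mean_coverage"] >= 30,
        "genome_coverage_20x_90pct":   metrics["genome_coverage_20x"] >= 0.90,
        "hdr_gene_coverage_20x_95pct": metrics["hdr_gene_coverage_20x"] >= 0.95,
        "aligned_q30_bases_8e10":      metrics["aligned_q30_bases"] >= 8e10,
        "contamination_max_1pct":      metrics["contamination"] <= 0.01,
    }
    return all(checks.values()), checks

example_sample = {
    "mean_coverage": 34.2,
    "genome_coverage_20x": 0.93,
    "hdr_gene_coverage_20x": 0.97,
    "aligned_q30_bases": 9.1e10,
    "contamination": 0.004,
}
passed, detail = wgs_release_check(example_sample)
print("release" if passed else "hold", detail)
```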

Array genotyping

Samples are processed for genotyping at three All of Us Genome Centers (Broad, Johns Hopkins University and University of Washington). DNA samples are received from the Biobank and the process is facilitated by the All of Us genomics workflow described above. All three centres used an identical array product, scanners, resource files and genotype calling software for array processing to reduce batch effects. Each centre has its own Laboratory Information Management System that manages workflow control, sample and reagent tracking, and centre-specific liquid handling robotics.

Samples are processed using the Illumina Global Diversity Array (GDA) with Illumina Infinium LCG chemistry using the automated protocol and scanned on Illumina iSCANs with Automated Array Loaders. Illumina IAAP software converts raw data (IDAT files; 2 per sample) into a single GTC file per sample using the BPM file (defines strand, probe sequences and illumicode address) and the EGT file (defines the relationship between intensities and genotype calls). Files used for this data release are: GDA-8v1-0_A5.bpm, GDA-8v1-0_A1_ClusterFile.egt, gentrain v3, reference hg19 and gencall cutoff 0.15. The GDA array assays a total of 1,914,935 variant positions including 1,790,654 single-nucleotide variants, 44,172 indels, 9,935 intensity-only probes for CNV calling, and 70,174 duplicates (same position, different probes). Picard GtcToVcf is used to convert the GTC files to VCF format. Resulting VCF and IDAT files are submitted to the DRC for ingestion and further processing. The VCF file contains assay name, chromosome, position, genotype calls, quality score, raw and normalized intensities, B allele frequency and log R ratio values. Each genome centre is running the GDA array under Clinical Laboratory Improvement Amendments-compliant protocols. The GTC files are parsed and metrics are uploaded to in-house Laboratory Information Management System systems for QC review.

At batch level (each set of 96-well plates run together in the laboratory at one time), each genome centre includes positive control samples that are required to have >98% call rate and >99% concordance to existing data to approve release of the batch of data. At the sample level, the call rate and sex are the key QC determinants 41 . Contamination is also measured using BAFRegress 42 and reported out as metadata. Any sample with a call rate below 98% is repeated one time in the laboratory. Genotyped sex is determined by plotting normalized x versus normalized y intensity values for a batch of samples. Any sample discordant with ‘sex at birth’ reported by the All of Us participant is flagged for further detailed review and repeated one time in the laboratory. If several sex-discordant samples are clustered on an array or on a 96-well plate, the entire array or plate will have data production repeated. Samples identified with sex chromosome aneuploidies are also reported back as metadata (XXX, XXY, XYY and so on). A final processing status of ‘pass’, ‘fail’ or ‘abandon’ is determined before release of data to the All of Us DRC. An array sample will pass if the call rate is >98% and the genotyped sex and sex at birth are concordant (or the sex at birth is not applicable). An array sample will fail if the genotyped sex and the sex at birth are discordant. An array sample will have the status of abandon if the call rate is <98% after at least two attempts at the genome centre.
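
The pass/fail/abandon rules for array samples can be summarised as a short decision function. The sketch below encodes only the rules stated above (98% call rate, sex concordance, one laboratory repeat, abandon after at least two attempts); the function signature and example values are invented, and the detailed laboratory review steps are omitted.

```python
# Simplified array sample status logic based on the rules quoted in the text.
def array_sample_status(call_rate, genotyped_sex, sex_at_birth, attempts):
    if call_rate > 0.98:
        if sex_at_birth is None or genotyped_sex == sex_at_birth:
            return "pass"        # call rate OK and sex concordant (or not applicable)
        return "fail"            # sex discordant with reported sex at birth
    if attempts >= 2:
        return "abandon"         # still below 98% call rate after repeat
    return "repeat"              # low call rate: repeat once in the laboratory

print(array_sample_status(0.991, "XX", "XX", attempts=1))   # pass
print(array_sample_status(0.991, "XY", "XX", attempts=1))   # fail
print(array_sample_status(0.972, "XX", "XX", attempts=2))   # abandon
```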

Data from the arrays are used for participant return of genetic ancestry and non-health-related traits for those who consent, and they are also used to facilitate additional QC of the matched WGS data. Contamination is assessed in the array data to determine whether DNA re-extraction is required before WGS. Re-extraction is prompted by level of contamination combined with consent status for return of results. The arrays are also used to confirm sample identity between the WGS data and the matched array data by assessing concordance at 100 unique sites. To establish concordance, a fingerprint file of these 100 sites is provided to the Genome Centers to assess concordance with the same sites in the WGS data before CRAM submission.

Genomic data curation

As shown in Extended Data Fig. 2, we generate a joint call set for all WGS samples and make these data available to researchers both in their entirety and by sample subsets. A breakdown of the allele frequencies, stratified by computed ancestries with more than 10,000 participants, can be found in Extended Data Fig. 3. The joint call set process allows us to leverage information across samples to improve QC and increase accuracy.

Single-sample QC

If a sample fails single-sample QC, it is excluded from the release and is not reported in this document. These tests detect sample swaps, cross-individual contamination and sample preparation errors. In some cases, we carry out these tests twice (at both the Genome Center and the DRC), for two reasons: to confirm internal consistency between sites; and to mark samples as passing (or failing) QC on the basis of the research pipeline criteria. The single-sample QC process accepts a higher contamination rate than the clinical pipeline (0.03 for the research pipeline versus 0.01 for the clinical pipeline), but otherwise uses identical thresholds. The list of specific QC processes, passing criteria, error modes addressed and an overview of the results can be found in Supplementary Table 3 .

Joint call set QC

During joint calling, we carry out additional QC steps using information that is available across samples including hard thresholds, population outliers, allele-specific filters, and sensitivity and precision evaluation. Supplementary Table 4 summarizes both the steps that we took and the results obtained for the WGS data. More detailed information about the methods and specific parameters can be found in the All of Us Genomic Research Data Quality Report 36 .

Batch effect analysis

We analysed cross-sequencing-centre batch effects in the joint call set. To quantify batch effects, we calculated Cohen’s d (ref. 43) for four metrics (insertion/deletion ratio, single-nucleotide polymorphism count, indel count and single-nucleotide polymorphism transition/transversion ratio) across the three genome sequencing centres (Baylor College of Medicine, Broad Institute and University of Washington), stratified by computed ancestry and by the whole genome plus seven genomic regions (high-confidence calling, repetitive, GC content >0.85, GC content <0.15, low mappability, the ACMG59 genes and large duplications (>1 kb)). Using random batches as a control set, all comparisons had a Cohen’s d of <0.35. Here we report any Cohen’s d of >0.5, a threshold chosen before the analysis that conventionally corresponds to a medium effect size 44.
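
For reference, Cohen's d for one per-sample metric compared between two centres is the difference in means divided by the pooled standard deviation. The generic Python sketch below shows the standard formula; it is not the authors' code, and the usage comment assumes hypothetical arrays of per-sample indel counts.

import numpy as np

def cohens_d(x, y):
    """Cohen's d between two groups of per-sample metrics (pooled s.d.)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

# Example (hypothetical inputs): abs(cohens_d(broad_indel_counts, uw_indel_counts))
# would be compared against the pre-specified 0.5 threshold.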

We found a medium effect size in indel counts (Cohen’s d of 0.53) across the entire genome between the Broad Institute and the University of Washington, but this was driven by repetitive and low-mappability regions. We found no batch effects with a Cohen’s d of >0.5 in the ratio metrics or in any metric in the high-confidence calling, low or high GC content, or ACMG59 regions. A complete list of the batch effects with Cohen’s d of >0.5 is given in Supplementary Table 8.

Sensitivity and precision evaluation

To determine sensitivity and precision, we included four well-characterized control samples (National Institute of Standards and Technology Genome in a Bottle samples HG-001, HG-003, HG-004 and HG-005). These samples were sequenced with the same protocol as All of Us samples but were not included in the data released to researchers. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations, restricted to the high-confidence calling region defined by Genome in a Bottle v4.2.1. To be counted as a true positive, a variant must match the chromosome, position, reference allele, alternate allele and zygosity. At sites with multiple alternative alleles, each alternative allele is considered separately. Sensitivity and precision results are reported in Supplementary Table 5.
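
In outline, sensitivity and precision follow from counting true positives, false negatives and false positives after matching on chromosome, position, reference allele, alternate allele and zygosity. The sketch below is a simplified illustration under those assumptions; production benchmarking pipelines additionally handle variant normalization and confidence regions.

def sensitivity_precision(truth, calls):
    """Simplified variant benchmarking against a ground-truth call set.

    truth, calls: sets of (chrom, pos, ref, alt, zygosity) tuples, already
    restricted to the high-confidence region and with multi-allelic sites
    split so that each alternate allele is a separate record.
    """
    tp = len(truth & calls)
    fn = len(truth - calls)
    fp = len(calls - truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else float('nan')
    precision = tp / (tp + fp) if (tp + fp) else float('nan')
    return sensitivity, precision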

Genetic ancestry inference

We computed categorical ancestry for all WGS samples in All of Us and made these available to researchers. These predictions are also the basis for population allele frequency calculations in the Genomic Variants section of the public Data Browser. We used the high-quality set of sites to determine an ancestry label for each sample. The ancestry categories are based on the same labels used in gnomAD 18, the Human Genome Diversity Project (HGDP) 45 and 1000 Genomes 1: African (AFR); Latino/admixed American (AMR); East Asian (EAS); Middle Eastern (MID); European (EUR), composed of Finnish (FIN) and Non-Finnish European (NFE); Other (OTH), for samples that do not belong to one of the other ancestries or are admixed; South Asian (SAS).

We trained a random forest classifier 46 on a training set of autosomal variants from the HGDP and 1000 Genomes samples, obtained from gnomAD 11. We generated the first 16 principal components (PCs) of the training sample genotypes (using the hwe_normalized_pca function in Hail) at the high-quality variant sites and used them as the feature vector for each training sample. We used the truth labels from the sample metadata, which can be found alongside the VCFs. Note that we did not train the classifier on samples labelled as Other; instead, the label probabilities (‘confidence’) that the classifier assigns to the other ancestries are used to determine which samples are labelled Other.

To determine the ancestry of All of Us samples, we project the All of Us samples into the PCA space of the training data and apply the classifier. As a proxy for the accuracy of our All of Us predictions, we look at the concordance between the survey results and the predicted ancestry. The concordance between self-reported ethnicity and the ancestry predictions was 87.7%.
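
A minimal sketch of this classification step is shown below, using scikit-learn's random forest and assuming that the 16 PC scores have already been computed for the reference panel and projected for All of Us samples. The variable names and the probability cut-off used to assign 'OTH' are our assumptions; the exact threshold is not stated in the text.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def classify_ancestry(ref_pcs, ref_labels, aou_pcs, confidence_cutoff=0.75):
    """Assign gnomAD-style ancestry labels from 16 PCs (illustrative sketch)."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(ref_pcs, ref_labels)          # reference panel: HGDP + 1000 Genomes
    proba = clf.predict_proba(aou_pcs)    # All of Us samples in the same PC space
    best = clf.classes_[proba.argmax(axis=1)]
    # Assumption: low-confidence samples are labelled 'OTH'; the cut-off
    # shown here is illustrative only.
    return np.where(proba.max(axis=1) >= confidence_cutoff, best, 'OTH')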

PC data from All of Us samples and the HGDP and 1000 Genomes samples were used to compute individual participant genetic ancestry fractions for All of Us samples using the Rye program. Rye uses PC data to carry out rapid and accurate genetic ancestry inference on biobank-scale datasets 47 . HGDP and 1000 Genomes reference samples were used to define a set of six distinct and coherent ancestry groups—African, East Asian, European, Middle Eastern, Latino/admixed American and South Asian—corresponding to participant self-identified race and ethnicity groups. Rye was run on the first 16 PCs, using the defined reference ancestry groups to assign ancestry group fractions to individual All of Us participant samples.

Relatedness

We calculated the kinship score using the Hail pc_relate function and reported any pairs with a kinship score above 0.1. The kinship score is half of the fraction of genetic material shared (it ranges from 0.0 to 0.5). We determined the maximal independent set 41 for related samples and identified a maximally unrelated set of 231,442 samples (94%) at a kinship threshold of 0.1.
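
A short Hail sketch of this step follows the pattern documented for pc_relate and maximal_independent_set; the parameter values shown (for example, the minor allele frequency floor and the number of PCs) are illustrative assumptions rather than the pipeline's actual settings.

import hail as hl

# mt: joint-called Hail MatrixTable restricted to high-quality sites (assumed input).
pairs = hl.pc_relate(mt.GT, min_individual_maf=0.01, k=16, statistics='kin')
related = pairs.filter(pairs.kin > 0.1)          # kinship threshold from the text

# keep=False returns the samples to drop so that no related pair remains;
# the remaining columns form the maximal independent (unrelated) set.
to_remove = hl.maximal_independent_set(related.i, related.j, keep=False)
unrelated_mt = mt.filter_cols(hl.is_defined(to_remove[mt.col_key]), keep=False)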

LDL-C common variant GWAS

The phenotypic data were extracted from the Curated Data Repository (CDR, Controlled Tier Dataset v7) in the All of Us Researcher Workbench. The All of Us Cohort Builder and Dataset Builder were used to extract all LDL cholesterol measurements from the Lab and Measurements criteria in EHR data for all participants who have WGS data. The most recent measurement for each participant was selected as the phenotype and adjusted for statin use 19, age and sex. A rank-based inverse normal transformation was applied to this continuous trait to increase power and deflate type I error. Analysis was carried out on the Hail MatrixTable representation of the All of Us WGS joint-called data after removing monomorphic variants, variants with a call rate of <95% and variants with extreme deviation from Hardy–Weinberg equilibrium (P < 1 × 10^-15). A linear regression was carried out with REGENIE 48 on variants with a minor allele frequency of >5%, further adjusting for relatedness and the first five ancestry PCs. The final analysis included 34,924 participants and 8,589,520 variants.
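
For illustration, a rank-based inverse normal transformation maps phenotype ranks onto standard normal quantiles. The generic sketch below uses the common Blom offset; the exact offset used in the analysis is not stated here, so treat it as an assumption.

import numpy as np
from scipy.stats import norm, rankdata

def inverse_normal_transform(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform (Blom offset by default)."""
    values = np.asarray(values, dtype=float)
    ranks = rankdata(values)                      # average ranks for ties
    return norm.ppf((ranks - c) / (len(values) - 2.0 * c + 1.0))

# In this GWAS the transform is applied to the statin-, age- and sex-adjusted
# LDL-C values before regression.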

Genotype-by-phenotype replication

We tested replication rates of known phenotype–genotype associations in three of the four largest populations: EUR, AFR and EAS. The AMR population was not included because it has no registered GWAS. This method is a conceptual extension of the original GWAS × phenome-wide association study, which replicated 66% of powered associations in a single EHR-linked biobank 49. The phenotype–genotype reference map (PGRM) is an expansion of this work by Bastarache et al., based on associations in the GWAS catalogue 50 as of June 2020 (ref. 51). After directly matching Experimental Factor Ontology terms to phecodes, the authors identified 8,085 unique loci and 170 unique phecodes that compose the PGRM, and showed replication rates in several EHR-linked biobanks ranging from 76% to 85%. For this analysis, we used the EUR-, AFR- and EAS-based maps, considering only catalogue associations that reached genome-wide significance (P < 5 × 10^-8).

The main tools used were the Python package Hail for data extraction, plink for genomic associations, and the R packages PheWAS and pgrm for further analysis and visualization. The phenotypes, participant-reported sex at birth and year of birth were extracted from the All of Us CDR (Controlled Tier Dataset v7). These phenotypes were then loaded into a plink-compatible format using the PheWAS package, and related samples were removed by subsetting to the maximally unrelated dataset (n = 231,442). Only samples with EHR data were kept; these were filtered to the selected loci and annotated with demographic and phenotypic information extracted from the CDR and with the ancestry predictions provided by All of Us, ultimately resulting in 181,345 participants for downstream analysis. The variants in the PGRM were filtered by a minimum population-specific allele frequency of >1% or a population-specific allele count of >100, leaving 4,986 variants. Only results with at least 20 cases in the ancestry group were included. A series of Firth logistic regression tests was then carried out with phecodes as the outcome and variants as the predictor, adjusting for age, sex (for non-sex-specific phenotypes) and the first three genomic PCs as covariates. The PGRM was annotated with power calculations based on the case counts and reported allele frequencies; associations with 80% or greater power were considered powered for this analysis.
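
In pseudocode terms, the per-ancestry filtering and association test might look like the sketch below. The column names are assumptions, and ordinary logistic regression from statsmodels is used as a stand-in for the Firth-penalized regression actually employed (Firth's correction matters when case counts are small).

import statsmodels.api as sm

def variant_eligible(pop_af, pop_ac):
    """Population-specific allele frequency >1% or allele count >100."""
    return pop_af > 0.01 or pop_ac > 100

def test_association(df):
    """df columns (assumed): case (0/1 phecode status), dosage, age, sex, PC1-PC3."""
    if df['case'].sum() < 20:                    # require at least 20 cases
        return None
    covariates = df[['dosage', 'age', 'sex', 'PC1', 'PC2', 'PC3']]
    X = sm.add_constant(covariates)
    # Stand-in for Firth logistic regression (for example, R's logistf or the
    # PheWAS tooling referenced in the text).
    return sm.Logit(df['case'], X).fit(disp=0)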

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The All of Us Research Hub has a tiered data access data passport model with three data access tiers. The Public Tier dataset contains only aggregate data with identifiers removed. These data are available to the public through Data Snapshots ( https://www.researchallofus.org/data-tools/data-snapshots/ ) and the public Data Browser ( https://databrowser.researchallofus.org/ ). The Registered Tier curated dataset contains individual-level data, available only to approved researchers on the Researcher Workbench. At present, the Registered Tier includes data from EHRs, wearables and surveys, as well as physical measurements taken at the time of participant enrolment. The Controlled Tier dataset contains all data in the Registered Tier and additionally genomic data in the form of WGS and genotyping arrays, previously suppressed demographic data fields from EHRs and surveys, and unshifted dates of events. At present, Registered Tier and Controlled Tier data are available to researchers at academic institutions, non-profit institutions, and both non-profit and for-profit health care institutions. Work is underway to begin extending access to additional audiences, including industry-affiliated researchers. Researchers have the option to register for Registered Tier and/or Controlled Tier access by completing the All of Us Researcher Workbench access process, which includes identity verification and All of Us-specific training in research involving human participants ( https://www.researchallofus.org/register/ ). Researchers may create a new workspace at any time to conduct any research study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is made accessible publicly on the All of Us Research Projects Directory at https://allofus.nih.gov/protecting-data-and-privacy/research-projects-all-us-data .

Code availability

The GVS code is available at https://github.com/broadinstitute/gatk/tree/ah_var_store/scripts/variantstore . The LDL GWAS pipeline is available as a demonstration project in the Featured Workspace Library on the Researcher Workbench ( https://workbench.researchallofus.org/workspaces/aou-rw-5981f9dc/aouldlgwasregeniedsubctv6duplicate/notebooks ).

The 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).

Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).

Lewis, A. C. F. et al. Getting genetic ancestry right for science and society. Science 376, 250–252 (2022).

All of Us Program Investigators. The “All of Us” Research Program. N. Engl. J. Med. 381, 668–676 (2019).

Ramirez, A. H., Gebo, K. A. & Harris, P. A. Progress with the All of Us Research Program: opening access for researchers. JAMA 325, 2441–2442 (2021).

Ramirez, A. H. et al. The All of Us Research Program: data quality, utility, and diversity. Patterns 3, 100570 (2022).

Overhage, J. M., Ryan, P. B., Reich, C. G., Hartzema, A. G. & Stang, P. E. Validation of a common data model for active safety surveillance research. J. Am. Med. Inform. Assoc. 19, 54–60 (2012).

Venner, E. et al. Whole-genome sequencing as an investigational device for return of hereditary disease risk and pharmacogenomic results as part of the All of Us Research Program. Genome Med. 14, 34 (2022).

Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).

Tiao, G. & Goodrich, J. gnomAD v3.1 New Content, Methods, Annotations, and Data Availability; https://gnomad.broadinstitute.org/news/2020-10-gnomad-v3-1-new-content-methods-annotations-and-data-availability/.

Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100 (2024).

Zook, J. M. et al. An open resource for accurately benchmarking small variant and reference calls. Nat. Biotechnol. 37, 561–566 (2019).

Krusche, P. et al. Best practices for benchmarking germline small-variant calls in human genomes. Nat. Biotechnol. 37, 555–560 (2019).

Stromberg, M. et al. Nirvana: clinical grade variant annotator. In Proc. 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics 596 (Association for Computing Machinery, 2017).

Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).

Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. https://doi.org/10.1038/s42003-023-05708-y (2024).

Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

Selvaraj, M. S. et al. Whole genome sequence analysis of blood lipid levels in >66,000 individuals. Nat. Commun. 13, 5995 (2022).

Wang, X. et al. Common and rare variants associated with cardiometabolic traits across 98,622 whole-genome sequences in the All of Us research program. J. Hum. Genet. 68, 565–570 (2023).

Bastarache, L. et al. The phenotype–genotype reference map: improving biobank data science through replication. Am. J. Hum. Genet. 110, 1522–1533 (2023).

Bianchi, D. W. et al. The All of Us Research Program is an opportunity to enhance the diversity of US biomedical research. Nat. Med. https://doi.org/10.1038/s41591-023-02744-3 (2024).

Van Driest, S. L. et al. Association between a common, benign genotype and unnecessary bone marrow biopsies among African American patients. JAMA Intern. Med. 181, 1100–1105 (2021).

Chen, M.-H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213 (2020).

Chiou, J. et al. Interpreting type 1 diabetes risk with genetics and single-cell epigenomics. Nature 594, 398–402 (2021).

Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47, 898–905 (2015).

Grant, S. F. A. et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat. Genet. 38, 320–323 (2006).

All of Us Research Program. Framework for Access to All of Us Data Resources v1.1 (2021); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/data&tools/data-access-use/AoU_Data_Access_Framework_508.pdf.

Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177, 58–69 (2019).

Mapes, B. M. et al. Diversity and inclusion for the All of Us research program: A scoping review. PLoS ONE 15, e0234962 (2020).

Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).

Kurniansyah, N. et al. Evaluating the use of blood pressure polygenic risk scores across race/ethnic background groups. Nat. Commun. 14, 3202 (2023).

Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55, 549–558 (2023).

Linder, J. E. et al. Returning integrated genomic risk and clinical recommendations: the eMERGE study. Genet. Med. 25, 100006 (2023).

Lennon, N. J. et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat. Med. https://doi.org/10.1038/s41591-024-02796-z (2024).

Deflaux, N. et al. Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis. Nat. Commun. 14, 5419 (2023).

Regier, A. A. et al. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun. 9, 4038 (2018).

All of Us Research Program. Data and Statistics Dissemination Policy (2020); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/2020/05/AoU_Policy_Data_and_Statistics_Dissemination_508.pdf.

Laurie, C. C. et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet. Epidemiol. 34, 591–602 (2010).

Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012).

Cohen, J. Statistical Power Analysis for the Behavioral Sciences (Routledge, 2013).

Andrade, C. Mean difference, standardized mean difference (SMD), and their use in meta-analysis. J. Clin. Psychiatry 81, 20f13681 (2020).

Cavalli-Sforza, L. L. The Human Genome Diversity Project: past, present and future. Nat. Rev. Genet. 6, 333–340 (2005).

Ho, T. K. Random decision forests. In Proc. 3rd International Conference on Document Analysis and Recognition (IEEE Computer Society Press, 1995).

Conley, A. B. et al. Rye: genetic ancestry inference at biobank scale. Nucleic Acids Res. 51, e44 (2023).

Mbatchou, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 53, 1097–1103 (2021).

Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31, 1102–1111 (2013).

Buniello, A. et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

Bastarache, L. et al. The phenotype–genotype reference map: improving biobank data science through replication. Am. J. Hum. Genet. 110, 1522–1533 (2023).

Acknowledgements

The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers (OT2 OD026549; OT2 OD026554; OT2 OD026557; OT2 OD026556; OT2 OD026550; OT2 OD 026552; OT2 OD026553; OT2 OD026548; OT2 OD026551; OT2 OD026555); Inter agency agreement AOD 16037; Federally Qualified Health Centers HHSN 263201600085U; Data and Research Center: U2C OD023196; Genome Centers (OT2 OD002748; OT2 OD002750; OT2 OD002751); Biobank: U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: U24 OD023163; Communications and Engagement: OT2 OD023205; OT2 OD023206; and Community Partners (OT2 OD025277; OT2 OD025315; OT2 OD025337; OT2 OD025276). In addition, the All of Us Research Program would not be possible without the partnership of its participants. All of Us and the All of Us logo are service marks of the US Department of Health and Human Services. E.E.E. is an investigator of the Howard Hughes Medical Institute. We acknowledge the foundational contributions of our friend and colleague, the late Deborah A. Nickerson. Debbie’s years of insightful contributions throughout the formation of the All of Us genomics programme are permanently imprinted, and she shares credit for all of the successes of this programme.

Author information

Authors and Affiliations

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Alexander G. Bick & Henry R. Condon

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA

Ginger A. Metcalf, Richard A. Gibbs, Donna M. Muzny, Eric Venner, Kimberly Walker, Jianhong Hu, Harsha Doddapaneni, Christie L. Kovar, Mullai Murugan, Shannon Dugan, Ziad Khan & Eric Boerwinkle

Vanderbilt Institute of Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA

Kelsey R. Mayo, Jodell E. Linder, Melissa Basford, Ashley Able, Ashley E. Green, Robert J. Carroll, Jennifer Zhang & Yuanyuan Wang

Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA

Lee Lichtenstein, Anthony Philippakis, Sophie Schwartz, M. Morgan T. Aster, Kristian Cibulskis, Andrea Haessly, Rebecca Asch, Aurora Cremer, Kylee Degatano, Akum Shergill, Laura D. Gauthier, Samuel K. Lee, Aaron Hatcher, George B. Grant, Genevieve R. Brandt, Miguel Covarrubias, Eric Banks & Wail Baalawi

Verily, South San Francisco, CA, USA

Shimon Rura, David Glazer, Moira K. Dillon & C. H. Albach

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA

Robert J. Carroll, Paul A. Harris & Dan M. Roden

All of Us Research Program, National Institutes of Health, Bethesda, MD, USA

Anjene Musick, Andrea H. Ramirez, Sokny Lim, Siddhartha Nambiar, Bradley Ozenberger, Anastasia L. Wise, Chris Lunt, Geoffrey S. Ginsburg & Joshua C. Denny

School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA

I. King Jordan, Shashwat Deepali Nagar & Shivam Sharma

Neuroscience Institute, Institute of Translational Genomic Medicine, Morehouse School of Medicine, Atlanta, GA, USA

Robert Meller

Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA

Mine S. Cicek & Stephen N. Thibodeau

Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA

Kimberly F. Doheny, Michelle Z. Mawhinney, Sean M. L. Griffith, Elvin Hsu, Hua Ling & Marcia K. Adams

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA

Evan E. Eichler, Joshua D. Smith, Christian D. Frazar, Colleen P. Davis, Karynne E. Patterson, Marsha M. Wheeler, Sean McGee, Mitzi L. Murray, Valeria Vasta, Dru Leistritz, Matthew A. Richardson, Aparna Radhakrishnan & Brenna W. Ehmen

Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA

Evan E. Eichler

Broad Institute of MIT and Harvard, Cambridge, MA, USA

Stacey Gabriel, Heidi L. Rehm, Niall J. Lennon, Christina Austin-Tse, Eric Banks, Michael Gatzen, Namrata Gupta, Katie Larsson, Sheli McDonough, Steven M. Harrison, Christopher Kachulis, Matthew S. Lebo, Seung Hoan Choi & Xin Wang

Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA

Gail P. Jarvik & Elisabeth A. Rosenthal

Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Dan M. Roden

Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA

Center for Individualized Medicine, Biorepository Program, Mayo Clinic, Rochester, MN, USA

Stephen N. Thibodeau, Ashley L. Blegen, Samantha J. Wirkus, Victoria A. Wagner, Jeffrey G. Meyer & Mine S. Cicek

Color Health, Burlingame, CA, USA

Scott Topper, Cynthia L. Neben, Marcie Steeves & Alicia Y. Zhou

School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA

Eric Boerwinkle

Laboratory for Molecular Medicine, Massachusetts General Brigham Personalized Medicine, Cambridge, MA, USA

Christina Austin-Tse, Emma Henricks & Matthew S. Lebo

Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA, USA

Christina M. Lockwood, Brian H. Shirts, Colin C. Pritchard, Jillian G. Buchan & Niklas Krumm

Manuscript Writing Group

Alexander G. Bick, Ginger A. Metcalf, Kelsey R. Mayo, Lee Lichtenstein, Shimon Rura, Robert J. Carroll, Anjene Musick, Jodell E. Linder, I. King Jordan, Shashwat Deepali Nagar, Shivam Sharma & Robert Meller

All of Us Research Program Genomics Principal Investigators

Melissa Basford, Eric Boerwinkle, Mine S. Cicek, Kimberly F. Doheny, Evan E. Eichler, Stacey Gabriel, Richard A. Gibbs, David Glazer, Paul A. Harris, Gail P. Jarvik, Anthony Philippakis, Heidi L. Rehm, Dan M. Roden, Stephen N. Thibodeau & Scott Topper

Biobank, Mayo

Ashley L. Blegen, Samantha J. Wirkus, Victoria A. Wagner, Jeffrey G. Meyer & Stephen N. Thibodeau

Genome Center: Baylor-Hopkins Clinical Genome Center

Donna M. Muzny, Eric Venner, Michelle Z. Mawhinney, Sean M. L. Griffith, Elvin Hsu, Marcia K. Adams, Kimberly Walker, Jianhong Hu, Harsha Doddapaneni, Christie L. Kovar, Mullai Murugan, Shannon Dugan, Ziad Khan & Richard A. Gibbs

Genome Center: Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine

Niall J. Lennon, Christina Austin-Tse, Eric Banks, Michael Gatzen, Namrata Gupta, Emma Henricks, Katie Larsson, Sheli McDonough, Steven M. Harrison, Christopher Kachulis, Matthew S. Lebo, Cynthia L. Neben, Marcie Steeves, Alicia Y. Zhou, Scott Topper & Stacey Gabriel

Genome Center: University of Washington

Gail P. Jarvik, Joshua D. Smith, Christian D. Frazar, Colleen P. Davis, Karynne E. Patterson, Marsha M. Wheeler, Sean McGee, Christina M. Lockwood, Brian H. Shirts, Colin C. Pritchard, Mitzi L. Murray, Valeria Vasta, Dru Leistritz, Matthew A. Richardson, Jillian G. Buchan, Aparna Radhakrishnan, Niklas Krumm & Brenna W. Ehmen

Data and Research Center

Lee Lichtenstein, Sophie Schwartz, M. Morgan T. Aster, Kristian Cibulskis, Andrea Haessly, Rebecca Asch, Aurora Cremer, Kylee Degatano, Akum Shergill, Laura D. Gauthier, Samuel K. Lee, Aaron Hatcher, George B. Grant, Genevieve R. Brandt, Miguel Covarrubias, Melissa Basford, Alexander G. Bick, Ashley Able, Ashley E. Green, Jennifer Zhang, Henry R. Condon, Yuanyuan Wang, Moira K. Dillon, C. H. Albach, Wail Baalawi & Dan M. Roden

All of Us Research Demonstration Project Teams

Seung Hoan Choi & Elisabeth A. Rosenthal

NIH All of Us Research Program Staff

Andrea H. Ramirez, Sokny Lim, Siddhartha Nambiar, Bradley Ozenberger, Anastasia L. Wise, Chris Lunt, Geoffrey S. Ginsburg & Joshua C. Denny

Contributions

The All of Us Biobank (Mayo Clinic) collected, stored and plated participant biospecimens. The All of Us Genome Centers (Baylor-Hopkins Clinical Genome Center; Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine; and University of Washington School of Medicine) generated and QCed the whole-genomic data. The All of Us Data and Research Center (Vanderbilt University Medical Center, Broad Institute of MIT and Harvard, and Verily) generated the WGS joint call set, carried out quality assurance and QC analyses and developed the Researcher Workbench. All of Us Research Demonstration Project Teams contributed analyses. The other All of Us Genomics Investigators and NIH All of Us Research Program Staff provided crucial programmatic support. Members of the manuscript writing group (A.G.B., G.A.M., K.R.M., L.L., S.R., R.J.C. and A.M.) wrote the first draft of this manuscript, which was revised with contributions and feedback from all authors.

Corresponding author

Correspondence to Alexander G. Bick .

Ethics declarations

Competing interests.

D.M.M., G.A.M., E.V., K.W., J.H., H.D., C.L.K., M.M., S.D., Z.K., E. Boerwinkle and R.A.G. declare that Baylor Genetics is a Baylor College of Medicine affiliate that derives revenue from genetic testing. Eric Venner is affiliated with Codified Genomics, a provider of genetic interpretation. E.E.E. is a scientific advisory board member of Variant Bio, Inc. A.G.B. is a scientific advisory board member of TenSixteen Bio. The remaining authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Timothy Frayling and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Historic availability of EHR records in the All of Us v7 Controlled Tier Curated Data Repository (n = 413,457).

For better visibility, the plot shows growth starting in 2010.

Extended Data Fig. 2 Overview of the Genomic Data Curation Pipeline for WGS samples.

The Data and Research Center (DRC) performs additional single sample quality control (QC) on the data as it arrives from the Genome Centers. The variants from samples that pass this QC are loaded into the Genomic Variant Store (GVS), where we jointly call the variants and apply additional QC. We apply a joint call set QC process, which is stored with the call set. The entire joint call set is rendered as a Hail Variant Dataset (VDS), which can be accessed from the analysis notebooks in the Researcher Workbench. Subsections of the genome are extracted from the VDS and rendered in different formats with all participants. Auxiliary data can also be accessed through the Researcher Workbench. This includes variant functional annotations, joint call set QC results, predicted ancestry, and relatedness. Auxiliary data are derived from GVS (arrow not shown) and the VDS. The Cohort Builder directly queries GVS when researchers request genomic data for subsets of samples. Aligned reads, as cram files, are available in the Researcher Workbench (not shown). The graphics of the dish, gene and computer and the All of Us logo are reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Extended Data Fig. 3 Proportion of allelic frequencies (AF), stratified by computed ancestry with over 10,000 participants.

Bar counts are not cumulative (e.g., “pop AF < 0.01” does not include “pop AF < 0.001”).

Extended Data Fig. 4 Distribution of pathogenic and likely pathogenic ClinVar variants.

Stratified by ancestry and filtered to variants found in fewer than 40 individuals (allele count (AC) < 40) among 245,388 short-read WGS samples.

Extended Data Fig. 5 Ancestry-specific HLA-DQB1 (rs9273363) locus associations in 231,442 unrelated individuals.

Phenome-wide association study (PheWAS) results highlight ancestry-specific consequences across ancestries.

Extended Data Fig. 6 Ancestry-specific TCF7L2 (rs7903146) locus associations in 231,442 unrelated individuals.

Phenome-wide association study (PheWAS) results highlight diabetes-related consequences across ancestries.

Supplementary information

Supplementary information.

Supplementary Figs. 1–7, Tables 1–8 and Note.

Reporting Summary

Supplementary dataset 1.

Associations of ACKR1, HLA-DQB1 and TCF7L2 loci with all Phecodes stratified by genetic ancestry.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

The All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature (2024). https://doi.org/10.1038/s41586-023-06957-x

Received : 22 July 2022

Accepted : 08 December 2023

Published : 19 February 2024

DOI : https://doi.org/10.1038/s41586-023-06957-x

Stanford Medicine study identifies distinct brain organization patterns in women and men

Stanford Medicine researchers have developed a powerful new artificial intelligence model that can distinguish between male and female brains.

February 20, 2024

A new study by Stanford Medicine investigators unveils a new artificial intelligence model that was more than 90% successful at determining whether scans of brain activity came from a woman or a man.

The findings, published Feb. 20 in the Proceedings of the National Academy of Sciences, help resolve a long-term controversy about whether reliable sex differences exist in the human brain and suggest that understanding these differences may be critical to addressing neuropsychiatric conditions that affect women and men differently.

“A key motivation for this study is that sex plays a crucial role in human brain development, in aging, and in the manifestation of psychiatric and neurological disorders,” said Vinod Menon , PhD, professor of psychiatry and behavioral sciences and director of the Stanford Cognitive and Systems Neuroscience Laboratory . “Identifying consistent and replicable sex differences in the healthy adult brain is a critical step toward a deeper understanding of sex-specific vulnerabilities in psychiatric and neurological disorders.”

Menon is the study’s senior author. The lead authors are senior research scientist Srikanth Ryali , PhD, and academic staff researcher Yuan Zhang , PhD.

“Hotspots” that most helped the model distinguish male brains from female ones include the default mode network, a brain system that helps us process self-referential information, and the striatum and limbic network, which are involved in learning and how we respond to rewards.

The investigators noted that this work does not weigh in on whether sex-related differences arise early in life or may be driven by hormonal differences or the different societal circumstances that men and women may be more likely to encounter.

Uncovering brain differences

The extent to which a person’s sex affects how their brain is organized and operates has long been a point of dispute among scientists. While we know the sex chromosomes we are born with help determine the cocktail of hormones our brains are exposed to — particularly during early development, puberty and aging — researchers have long struggled to connect sex to concrete differences in the human brain. Brain structures tend to look much the same in men and women, and previous research examining how brain regions work together has also largely failed to turn up consistent brain indicators of sex.

In their current study, Menon and his team took advantage of recent advances in artificial intelligence, as well as access to multiple large datasets, to pursue a more powerful analysis than has previously been employed. First, they created a deep neural network model, which learns to classify brain imaging data: As the researchers showed brain scans to the model and told it that it was looking at a male or female brain, the model started to “notice” what subtle patterns could help it tell the difference.

This model demonstrated superior performance compared with those in previous studies, in part because it used a deep neural network that analyzes dynamic MRI scans. This approach captures the intricate interplay among different brain regions. When the researchers tested the model on around 1,500 brain scans, it could almost always tell if the scan came from a woman or a man.

The model’s success suggests that detectable sex differences do exist in the brain but just haven’t been picked up reliably before. The fact that it worked so well in different datasets, including brain scans from multiple sites in the U.S. and Europe, makes the findings especially convincing, because this replication controls for many confounds that can plague studies of this kind.

“This is a very strong piece of evidence that sex is a robust determinant of human brain organization,” Menon said.

Making predictions

Until recently, a model like the one Menon’s team employed would help researchers sort brains into different groups but wouldn’t provide information about how the sorting happened. Today, however, researchers have access to a tool called “explainable AI,” which can sift through vast amounts of data to explain how a model’s decisions are made.

Using explainable AI, Menon and his team identified the brain networks that were most important to the model’s judgment of whether a brain scan came from a man or a woman. They found the model was most often looking to the default mode network, striatum, and the limbic network to make the call.

The team then wondered if they could create another model that could predict how well participants would do on certain cognitive tasks based on functional brain features that differ between women and men. They developed sex-specific models of cognitive abilities: One model effectively predicted cognitive performance in men but not women, and another in women but not men. The findings indicate that functional brain characteristics varying between sexes have significant behavioral implications.

“These models worked really well because we successfully separated brain patterns between sexes,” Menon said. “That tells me that overlooking sex differences in brain organization could lead us to miss key factors underlying neuropsychiatric disorders.”

While the team applied their deep neural network model to questions about sex differences, Menon says the model can be applied to answer questions regarding how just about any aspect of brain connectivity might relate to any kind of cognitive ability or behavior. He and his team plan to make their model publicly available for any researcher to use.

“Our AI models have very broad applicability,” Menon said. “A researcher could use our models to look for brain differences linked to learning impairments or social functioning differences, for instance — aspects we are keen to understand better to aid individuals in adapting to and surmounting these challenges.”

The research was sponsored by the National Institutes of Health (grants MH084164, EB022907, MH121069, K25HD074652 and AG072114), the Transdisciplinary Initiative, the Uytengsu-Hamilton 22q11 Programs, the Stanford Maternal and Child Health Research Institute, and the NARSAD Young Investigator Award.

About Stanford Medicine

Stanford Medicine is an integrated academic health system comprising the Stanford School of Medicine and adult and pediatric health care delivery systems. Together, they harness the full potential of biomedicine through collaborative research, education and clinical care for patients. For more information, please visit med.stanford.edu .

Artificial intelligence

Exploring ways AI is applied to health care

Stanford Medicine Magazine: AI

A Columbia Surgeon’s Study Was Pulled. He Kept Publishing Flawed Data.

The quiet withdrawal of a 2021 cancer study by Dr. Sam Yoon highlights scientific publishers’ lack of transparency around data problems.

By Benjamin Mueller

Benjamin Mueller covers medical science and has reported on several research scandals.

Feb. 15, 2024

The stomach cancer study was shot through with suspicious data. Identical constellations of cells were said to depict separate experiments on wholly different biological lineages. Photos of tumor-stricken mice, used to show that a drug reduced cancer growth, had been featured in two previous papers describing other treatments.

Problems with the study were severe enough that its publisher, after finding that the paper violated ethics guidelines, formally withdrew it within a few months of its publication in 2021. The study was then wiped from the internet, leaving behind a barren web page that said nothing about the reasons for its removal.

As it turned out, the flawed study was part of a pattern. Since 2008, two of its authors — Dr. Sam S. Yoon, chief of a cancer surgery division at Columbia University’s medical center, and a more junior cancer biologist — have collaborated with a rotating cast of researchers on a combined 26 articles that a British scientific sleuth has publicly flagged for containing suspect data. A medical journal retracted one of them this month after inquiries from The New York Times.

Memorial Sloan Kettering Cancer Center, where Dr. Yoon worked when much of the research was done, is now investigating the studies. Columbia’s medical center declined to comment on specific allegations, saying only that it reviews “any concerns about scientific integrity brought to our attention.”

Dr. Yoon, who has said his research could lead to better cancer treatments, did not answer repeated questions. Attempts to speak to the other researcher, Changhwan Yoon, an associate research scientist at Columbia, were also unsuccessful.

The allegations were aired in recent months in online comments on a science forum and in a blog post by Sholto David, an independent molecular biologist. He has ferreted out problems in a raft of high-profile cancer research, including dozens of papers at a Harvard cancer center that were subsequently referred for retractions or corrections.

From his flat in Wales, Dr. David pores over published images of cells, tumors and mice in his spare time and then reports slip-ups, trying to close the gap between people’s regard for academic research and the sometimes shoddier realities of the profession.

When evaluating scientific images, it is difficult to distinguish sloppy copy-and-paste errors from deliberate doctoring of data. Two other imaging experts who reviewed the allegations at the request of The Times said some of the discrepancies identified by Dr. David bore signs of manipulation, like flipped, rotated or seemingly digitally altered images.

Armed with A.I.-powered detection tools, scientists and bloggers have recently exposed a growing body of such questionable research, like the faulty papers at Harvard’s Dana-Farber Cancer Institute and studies by Stanford’s president that led to his resignation last year.

But those high-profile cases were merely the tip of the iceberg, experts said. A deeper pool of unreliable research has gone unaddressed for years, shielded in part by powerful scientific publishers driven to put out huge volumes of studies while avoiding the reputational damage of retracting them publicly.

The quiet removal of the 2021 stomach cancer study from Dr. Yoon’s lab, a copy of which was reviewed by The Times, illustrates how that system of scientific publishing has helped enable faulty research, experts said. In some cases, critical medical fields have remained seeded with erroneous studies.

“The journals do the bare minimum,” said Elisabeth Bik, a microbiologist and image expert who described Dr. Yoon’s papers as showing a worrisome pattern of copied or doctored data. “There’s no oversight.”

Memorial Sloan Kettering, where portions of the stomach cancer research were done, said no one — not the journal nor the researchers — had ever told administrators that the paper was withdrawn or why it had been. The study said it was supported in part by federal funding given to the cancer center.

Dr. Yoon, a stomach cancer specialist and a proponent of robotic surgery, kept climbing the academic ranks, bringing his junior researcher along with him. In September 2021, around the time the study was published, he joined Columbia, which celebrated his prolific research output in a news release. His work was financed in part by half a million dollars in federal research money that year, adding to a career haul of nearly $5 million in federal funds.

The decision by the stomach cancer study’s publisher, Elsevier, not to post an explanation for the paper’s removal made it less likely that the episode would draw public attention or affect the duo’s work. That very study continued to be cited in papers by other scientists.

And as recently as last year, Dr. Yoon’s lab published more studies containing identical images that were said to depict separate experiments, according to Dr. David’s analyses.

The researchers’ suspicious publications stretch back 16 years. Over time, relatively minor image copies in papers by Dr. Yoon gave way to more serious discrepancies in studies he collaborated on with Changhwan Yoon, Dr. David said. The pair, who are not related, began publishing articles together around 2013.

But neither their employers nor their publishers seemed to start investigating their work until this past fall, when Dr. David published his initial findings on For Better Science, a blog, and notified Memorial Sloan Kettering, Columbia and the journals. Memorial Sloan Kettering said it began its investigation then.

None of those flagged studies was retracted until last week. Three days after The Times asked publishers about the allegations, the journal Oncotarget retracted a 2016 study on combating certain pernicious cancers. In a retraction notice, the journal said the authors’ explanations for copied images “were deemed unacceptable.”

The belated action was symptomatic of what experts described as a broken system for policing scientific research.

A proliferation of medical journals, they said, has helped fuel demand for ever more research articles. But those same journals, many of them operated by multibillion-dollar publishing companies, often respond slowly or do nothing at all once one of those articles is shown to contain copied data. Journals retract papers at a fraction of the rate at which they publish ones with problems.

Springer Nature, which published nine of the articles that Dr. David said contained discrepancies across five journals, said it was investigating concerns. So did the American Association for Cancer Research, which published 10 articles under question from Dr. Yoon’s lab across four journals.

It is difficult to know who is responsible for errors in articles. Eleven of the scientists’ co-authors, including researchers at Harvard, Duke and Georgetown, did not answer emailed inquiries.

The articles under question examined why certain stomach and soft-tissue cancers withstood treatment, and how that resistance could be overcome.

The two independent image specialists said the volume of copied data, along with signs that some images had been rotated or similarly manipulated, suggested considerable sloppiness or worse.

“There are examples in this set that raise pretty serious red flags for the possibility of misconduct,” said Dr. Matthew Schrag, a Vanderbilt University neurologist who commented as part of his outside work on research integrity.

One set of 10 articles identified by Dr. David showed repeated reuse of identical or overlapping black-and-white images of cancer cells supposedly under different experimental conditions, he said.

“There’s no reason to have done that unless you weren’t doing the work,” Dr. David said.

One of those papers, published in 2012, was formally tagged with corrections. Unlike later studies, which were largely overseen by Dr. Yoon in New York, this paper was written by South Korea-based scientists, including Changhwan Yoon, who then worked in Seoul.

An immunologist in Norway randomly selected the paper as part of a screening of copied data in cancer journals. That led the paper’s publisher, the medical journal Oncogene, to add corrections in 2016.

But the journal did not catch all of the duplicated data, Dr. David said. And, he said, images from the study later turned up in identical form in another paper that remains uncorrected.

Copied cancer data kept recurring, Dr. David said. A picture of a small red tumor from a 2017 study reappeared in papers in 2020 and 2021 under different descriptions, he said. A ruler included in the pictures for scale wound up in two different positions.

The 2020 study included another tumor image that Dr. David said appeared to be a mirror image of one previously published by Dr. Yoon’s lab. And the 2021 study featured a color version of a tumor that had appeared in an earlier paper atop a different section of ruler, Dr. David said.

“This is another example where this looks intentionally done,” Dr. Bik said.

The researchers were faced with more serious action when the publisher Elsevier withdrew the stomach cancer study that had been published online in 2021. “The editors determined that the article violated journal publishing ethics guidelines,” Elsevier said.

Roland Herzog, the editor of Molecular Therapy, the journal where the article appeared, said that “image duplications were noticed” as part of a process of screening for discrepancies that the journal has since continued to beef up.

Because the problems were detected before the study was ever published in the print journal, Elsevier’s policy dictated that the article be taken down and no explanation posted online.

But that decision appeared to conflict with industry guidelines from the Committee on Publication Ethics. Posting articles online “usually constitutes publication,” those guidelines state. And when publishers pull such articles, the guidelines say, they should keep the work online for the sake of transparency and post “a clear notice of retraction.”

Dr. Herzog said he personally hoped that such an explanation could still be posted for the stomach cancer study. The journal editors and Elsevier, he said, are examining possible options.

The editors notified Dr. Yoon and Changhwan Yoon of the article’s removal, but neither scientist alerted Memorial Sloan Kettering, the hospital said. Columbia did not say whether it had been told.

Experts said the handling of the article was symptomatic of a tendency on the part of scientific publishers to obscure reports of lapses.

“This is typical, sweeping-things-under-the-rug kind of nonsense,” said Dr. Ivan Oransky, co-founder of Retraction Watch, which keeps a database of 47,000-plus retracted papers. “This is not good for the scientific record, to put it mildly.”

Susan C. Beachy contributed research.

Benjamin Mueller reports on health and medicine. He was previously a U.K. correspondent in London and a police reporter in New York.

Avicenna J Med, 10(2), Apr–Jun 2020

How to read a published clinical trial: A practical guide for clinicians

Mohamad B. Sonbol

1 Mayo Clinic Cancer Center, Phoenix, Arizona, USA

Belal M Firwana

2 Heartland Cancer Research, National Cancer Institute Community Oncology Research Program (NCORP), Missouri Baptist Medical Center, St. Louis, Missouri, USA

Talal Hilal

3 University of Mississippi Medical Center, Jackson, Mississippi, USA

Mohammad Hassan Murad

4 Evidence-Based Practice Center, Mayo Clinic, Rochester, Minnesota, USA

Over the last 5 years, there have been more than 140 new drug approvals in the field of Oncology alone, all based on newly published clinical trials. These approvals have led to an ongoing change in clinical practice, offering new therapeutic options for patients. Therefore, it is important for healthcare providers to be able to appraise a clinical trial and determine its validity, understand its results, and be able to apply such results to their patients. In this guide, we provide a simplified approach tailored to practicing clinicians and trainees. The same concepts and principles apply to other medical specialties.

INTRODUCTION

The modern practice of oncology is based on clinical trials, which have been increasingly conducted and published in the last 20 years. Over the last 5 years, there have been more than 140 anticancer drug approvals in the United States.[1] These approvals have led to an ongoing change in clinical practice, offering new therapeutic options for patients with cancer. Therefore, it is important for physicians to be able to appraise a clinical trial and determine its validity, understand its results, and be able to apply such results to their patients. In this guide, we provide a simplified approach based on the Users’ Guides to the Medical Literature series tailored to practicing clinicians and trainees.[2] Although most of the included examples are from the oncology literature, the same concepts and principles would apply to other medical and surgical specialties.

Clinical case

A 56-year-old man with a history of diabetes mellitus who was recently diagnosed with metastatic non-small cell lung cancer comes to your oncology clinic for clinical care. You decide to start him on chemotherapy with carboplatin and pemetrexed. He is otherwise healthy and asymptomatic. His body mass index (BMI) is 41. He has reasonable functional status as measured by the Eastern Cooperative Oncology Group (ECOG) performance status (PS) score value of 1. His mother recently died of a pulmonary embolism (PE) and he is asking you about prevention of PE. You calculate his risk for chemotherapy-associated thrombosis using the Khorana score[ 3 ] and find it to be 2, which suggests an intermediate risk for venous thromboembolism. You are contemplating thromboprophylaxis and proceed to review the evidence.
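For readers who want to see how the score in this vignette is arrived at, the short Python sketch below tallies the Khorana items as we recall them from the original publication[ 3 ]; the item definitions are not quoted from that source and should be verified against it, and the laboratory values shown for this patient are hypothetical.

    # Illustrative sketch of the Khorana score; item definitions recalled from [3],
    # not quoted from it, so verify before relying on them. Lab values are hypothetical.
    def khorana_score(cancer_site, platelets_per_ul, hemoglobin_g_dl, uses_esa,
                      leukocytes_per_ul, bmi):
        very_high_risk_sites = {"stomach", "pancreas"}            # 2 points
        high_risk_sites = {"lung", "lymphoma", "gynecologic",     # 1 point
                           "bladder", "testicular"}
        score = 0
        if cancer_site in very_high_risk_sites:
            score += 2
        elif cancer_site in high_risk_sites:
            score += 1
        score += int(platelets_per_ul >= 350_000)    # pre-chemotherapy platelet count
        score += int(hemoglobin_g_dl < 10 or uses_esa)
        score += int(leukocytes_per_ul > 11_000)
        score += int(bmi >= 35)
        return score

    # Our patient: lung primary (1 point) + BMI of 41 (1 point) = 2, as stated above.
    print(khorana_score("lung", 250_000, 13.5, False, 8_000, 41))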

What is a clinical trial?

A clinical trial is any research study that prospectively assigns human participants or groups of humans to one or more health-related interventions to evaluate the effects on health outcomes. Therefore, a clinical trial can be randomized (i.e., a randomized controlled trial [RCT]) or nonrandomized. For inference purposes, nonrandomized trials are at similar risk of bias to that of observational studies and can be appraised by focusing on cohort selection, comparability of study groups, and adjustment for confounders (discussed in the Comparability of the groups at the baseline section) (i.e., just like an observational study). For the most part, when clinicians think about trials, they are usually referring to RCTs which are the gold standard study design to ascertain the effect of therapy. The RCT design creates groups of patients that are similar in all known and unknown prognostic factors (i.e., confounders) except the intervention. RCTs can randomize the patients to groups and follow them prospectively (parallel RCTs) or can switch patients at random to different treatment regimens during the course of the trial (crossover RCTs). This guide will focus on parallel-design RCTs because they are more common and are critical for the practice of internal medicine and oncology. We will also focus on an example of a superiority trial for simplicity (i.e., a trial that aims to evaluate if the experimental treatment is better than the standard treatment or placebo), although many of the constructs and principles discussed here apply to noninferiority trials (i.e., a trial that aims to evaluate whether an experimental treatment is not importantly worse than a standard treatment).

Lastly, although we are discussing an approach to appraise and apply an RCT, it is important to keep in mind that having a systematic review and meta-analysis of multiple RCTs would likely give more precise and valid estimates and would be preferred if available.[ 4 ] Moreover, the fundamental principles of evidence-based medicine (EBM) are assumed, which include formulating an answerable question, identifying the best evidence, critically appraising the evidence, applying the evidence, and integrating clinical expertise and patient’s values with the evidence.[ 5 ] In this concise guide, we will critically appraise an RCT that aims to answer the following clinical question: What is the evidence supporting prophylactic anticoagulation in patients with cancer?

When reading a manuscript reporting the conduct and results of an RCT, one should ask three questions. How valid are the results (that is, to what extent does the risk of bias affect the trustworthiness of the results)? What are the results? How do I apply these results to patient care? This simplified approach is based on the User’s Guide to the Medical Literature series and has also been adapted for oncology.[ 2 , 6 ]

We have identified one RCT that addresses the clinical question of interest to our patient––“Apixaban to Prevent Venous Thromboembolism in Patients with Cancer”––the AVERT trial.[ 7 ]

The trial evaluated the efficacy and safety of apixaban (2.5 mg twice daily) for thromboprophylaxis in ambulatory patients with cancer who were at intermediate-to-high risk for venous thromboembolism (Khorana score ≥2) and were initiating chemotherapy.[ 7 ]

How valid are the results (to what extent does the risk of bias affect the trustworthiness of the results)?

The validity of the RCT focuses on how well the study is conducted and addresses different types of bias.[ 8 ] Appraising the study’s internal validity can be achieved by evaluating the methods section and following a stepwise approach. Did the study: start well, run well, and finish well?[ 6 ] [ Figure 1 ].

[ Figure 1 ] A framework summarizing the steps in critically appraising a randomized controlled trial. ITT = intention to treat

START WELL

Was the allocation sequence random?

Randomization (also known as allocation sequence generation) ensures that study participants have an equal chance of being assigned to either the intervention group or the control group, thereby decreasing the likelihood of an imbalance in baseline prognostic factors, an imbalance that would cause what is called selection bias. For example, if the fitter and younger patients were assigned to one arm of a study, that arm would have better outcomes that are not caused by the intervention. Randomization is commonly performed using a computer-generated algorithm.
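As a concrete illustration of a computer-generated allocation sequence, the Python sketch below builds a simple 1:1 sequence using permuted blocks; this is a generic, hypothetical example, not the centralized web-based system the AVERT investigators actually used.

    import random

    def permuted_block_sequence(n_participants, block_size=4, seed=None):
        """Return a 1:1 allocation sequence built from randomly shuffled blocks."""
        rng = random.Random(seed)
        sequence = []
        while len(sequence) < n_participants:
            block = ["intervention"] * (block_size // 2) + ["control"] * (block_size // 2)
            rng.shuffle(block)   # random order within each block keeps the groups balanced
            sequence.extend(block)
        return sequence[:n_participants]

    print(permuted_block_sequence(8, seed=2024))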

In the AVERT trial, the authors state in the methods section “eligible patients underwent randomization by means of a centralized, web-based randomization system to receive apixaban or placebo in a 1:1 ratio.”[ 7 ] The randomization in this trial is adequate.

Was the allocation sequence concealed until participants were enrolled and assigned to interventions?

When appraising the validity of a study, it is important to look at the method of randomization and whether it prevents prediction of the allocation (also known as concealment). Concealment means that both study participants and investigators are unaware of, and cannot predict, which group the next participant will be assigned to. This is not to be confused with blinding of assigned interventions (discussed below). Allocation concealment happens prior to and at the time of randomization, whereas blinding occurs after randomization[ 9 , 10 , 11 , 12 ] [ Figure 2 ]. A trial can have concealed allocation without being blinded; an example is the biliary tract cancer (BILCAP) trial, where treatments were not masked but allocation concealment was achieved.[ 13 ]

[ Figure 2 ] A flow diagram showing blinding, concealment, and randomization

In the AVERT trial, the authors used a “centralized, web-based randomization” method which ensures that both participants and investigators could not foresee assignment.[ 7 ]

Comparability of the groups at the baseline

The benefit of randomization is in minimizing the imbalance and differences in baseline characteristics and prognostic factors between the groups. These differences are sometimes referred to as “confounders.” These baseline characteristics are almost always reported in Table 1 in RCTs. When detected, it is important to evaluate the importance of the prognostic factor imbalances (confounding) by asking the following questions: (1) Does the prognostic factor affect the outcome?; (2) If yes, which group is favored?; (3) Does this change the conclusion? This accounts for known confounders, but unknown confounders can always introduce bias. Potential unknown confounder imbalance can be minimized with appropriate randomization.

[ Table 1 ] Primary efficacy and safety outcomes of the AVERT trial. N = total number of patients; n = number of events; RR = relative risk; ARR = absolute risk reduction; RD = risk difference; NNT = number needed to treat; HR = hazard ratio; CI = confidence interval. *Value calculated. †Value extracted from the reported trial results.
In the AVERT trial, Table 1 shows that the groups were comparable at baseline in terms of tumor type, Khorana score, PS, and other factors, including the use of concomitant antiplatelet medications.[ 7 ] It is important to look at the proportions in Table 1 and determine whether any differences are clinically meaningful, rather than depending on reported P values. These P values are not meaningful (although commonly reported) because the trial is usually underpowered to show significant differences in these variables.

RUN WELL

This series of questions concerns performance bias and bias due to deviations from the intended intervention, and includes blinding, contamination, co-intervention, and compliance [ Figure 1 ].

Were participants and investigators aware of their assigned intervention during the trial?

Blinding refers to the process by which the study participants (patients), providers (nurses and physicians), investigators, and outcome assessors are kept unaware of treatment assignment throughout the study.[ 8 , 14 ] Blinding of patients and study personnel helps reduce performance bias that could occur upon knowledge of the assignment. Performance biases arise from deviations from the intended interventions. For example, if a study investigator is aware of treatment assignment, they might elect to monitor and see patients in the novel therapy group more frequently than those in the control group. In addition, blinding of study participants helps reduce the risk of the “placebo effect” that can appear in more subjective outcomes such as pain.[ 15 , 16 ] For example, in an RCT of patients with nasopharyngeal carcinoma, acupuncture significantly lowered radiation-induced xerostomia compared with a standard care group (no acupuncture).[ 17 ] In this example, participants were not blinded; it is therefore hard to draw a clear conclusion from such a trial when the outcomes (xerostomia and quality of life [QOL]) are subjective and could be affected by the placebo effect. This has been described before: trials of acupuncture found a benefit in treating pain compared with no treatment, but this benefit was smaller when acupuncture was compared with a sham control.[ 18 ] The effect of blinding in a study should be assessed for each individual outcome; it may be less important for more objective outcomes such as overall survival (OS).

In the AVERT trial, the authors state that “The AVERT trial was a randomized, placebo controlled, double-blind clinical trial.”[ 8 ] One can assume that “double blind” implies that patients and investigators were blinded. However, it is important to read the methods section to find out who was actually blinded.[ 19 ]

Was there any contamination or co-intervention?

The study protocol usually specifies the intended interventions in each study group. When a study participant (patient) receives a non-protocol intervention, this is usually referred to as “co-intervention.” Conversely, when a study participant receives the intervention assigned to the other study group, this is referred to as “contamination.”

In the AVERT trial, 23% and 22.6% of patients in the apixaban and placebo groups, respectively, received a concomitant antiplatelet medication (a co-intervention), which could potentially affect the primary outcomes of bleeding and clotting in such a trial.[ 8 ] However, because both groups received this co-intervention at similar rates, it is unlikely to bias the results.

Was there nonadherence to the assigned intervention regimen that could have affected participants’ outcomes?

Compliance of the study participants with the intervention to which they are assigned is referred to as “adherence.” When appraising a trial, it is important to look at the reported adherence and whether there is a significant difference between groups. This is especially important in oncology RCTs, where the adverse events and safety profile of the studied medications play a major role in patients’ compliance.[ 20 ] For example, in the recently reported BILCAP trial studying the effect of adjuvant capecitabine compared with observation following surgery in patients with biliary tract cancer, only about half of the patients (55%) completed the planned eight cycles of capecitabine, with a third of patients discontinuing treatment secondary to toxicity.[ 13 ]

In the AVERT trial, the authors state that “The rate of adherence to the trial regimen was high in both groups, at 83.6% in the apixaban group and 84.1% in the placebo group.”

FINISH WELL

The method of analysis and completion of follow-up are important factors that affect trial validity.

Were all patients who entered the trial accounted for? Were they analyzed in the groups to which they were randomized? Were any patients lost to follow-up?

The principle of intention to treat (ITT) analysis indicates that participants should be analyzed in the intervention group to which they were assigned, regardless of their adherence to the intervention or loss to follow-up (when a participant cannot be located).[ 21 ]

This is in contrast to the per-protocol analysis, which only analyzes the individuals who adhered to the intervention. ITT analysis maintains the benefit of randomization in minimizing any prognostic differences between groups. In contrast, the problem with the per-protocol analysis is that prognostic factors might influence whether individuals receive their allocated intervention. In RCTs assessing a superiority outcome, ITT is suggested for the most part. Some trials report both ITT and per-protocol analysis; for example, the previously mentioned BILCAP trial reported the OS results using both ITT and per-protocol analyses, with significant improvement in outcome seen with per-protocol analysis, but not with ITT analysis, reducing the trustworthiness or believability of the results.[ 13 ]

In some trials, instead of ITT, a modified intention to treat (mITT) analysis is reported. The definition of such an analysis varies between trials and mostly generates post-randomization exclusions that can bias results, making interpretation of such analyses challenging.[ 22 ]

In the AVERT trial, the primary analysis was performed in the “modified intention-to-treat” population, which included all patients who underwent randomization and received at least one dose of apixaban or placebo on or before day 180.[ 7 ] Although ITT is the preferred approach, in this study the mITT is likely adequate and would not be expected to greatly alter the observed effect size compared with ITT. This modification––analyzing patients who received at least one dose of the study drug––is commonly seen in studies assessing differences in adverse drug events between treatment groups, because it could be considered inappropriate to attribute an adverse drug event to a medication never received by the patient.[ 23 ] Although a threshold of >20% of patients lost to follow-up is sometimes used to judge whether the amount of loss to follow-up is unacceptable, such arbitrary cutoffs can be misleading. It is important to compare the proportion lost to follow-up with the event rate in the trial. It is also important to conduct what is called a worst-case scenario analysis, in which we assume that patients lost to follow-up had a bad outcome. If this new analysis shows results that differ from the original analysis, validity is reduced.
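To make the worst-case scenario concrete, the sketch below (with hypothetical loss-to-follow-up counts, not figures from the AVERT report) recounts patients lost in the experimental arm as having had the event and patients lost in the control arm as event-free, then recomputes the risk ratio; if the conclusion changes, the missing data matter.

    def worst_case_risk_ratio(events_exp, n_exp, lost_exp, events_ctl, n_ctl, lost_ctl):
        """Observed risk ratio vs. a worst-case reanalysis: patients lost to follow-up in
        the experimental arm are counted as events, those lost in the control arm as
        non-events."""
        observed = (events_exp / n_exp) / (events_ctl / n_ctl)
        worst = ((events_exp + lost_exp) / (n_exp + lost_exp)) / (events_ctl / (n_ctl + lost_ctl))
        return observed, worst

    # Hypothetical example: 10 patients lost per arm on top of the analysed VTE counts.
    print(worst_case_risk_ratio(12, 288, 10, 28, 275, 10))   # roughly (0.41, 0.75)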

Was outcome assessment blinded?

As described above, in addition to blinding patients and investigators, it is important to have blinding of outcome assessors. Indeed, the effect of blinding in a study should be assessed for each individual outcome as it is probably less important in objective outcomes as OS (death or alive) compared to progression-free survival.[ 24 ]

In the AVERT trial, outcomes were assessed by blinded investigators: “All trial outcomes were adjudicated by an independent adjudication committee whose members were unaware of the treatment assignment.”

WHAT ARE THE RESULTS?

Once trial validity is established (i.e., risk of bias is low or unlikely to impact the conclusions), results need to be interpreted by asking about the magnitude of the effect and its precision.

What is the magnitude of the treatment effect?

There are several commonly used methods that are referred to as “measures of association” to assess the magnitude of treatment effect in clinical trials, including but not limited to relative risk (RR), odds ratio (OR), risk difference (RD), and hazard ratio (HR).

Relative risk and relative risk reduction

RR is the risk of disease or outcome in the treatment or exposed arm compared (relative) to the risk of the outcome in the control arm, hence the name RR.

In contrast, the relative risk reduction (RRR) is an estimate of the percentage of the baseline risk (the control arm risk) that is removed by receiving the experimental therapy; it is calculated by subtracting the RR from 1 (RRR = 1 – RR). For example, looking at the outcome table for the AVERT trial [ Table 1 ], the risk of venous thromboembolism (VTE) in the apixaban group is 12/288 = 4.2% (also known as the experimental event rate, or EER) and the risk of VTE in the placebo group is 28/275 = 10.2% (also known as the control event rate, or CER). Patients assigned to apixaban therefore had less than half the risk of those in the placebo group: RR = 4.2/10.2 = 0.41 (41%). In other words, apixaban reduced the relative risk by 1 – 0.41 = 0.59, that is, an RRR of 59%.
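The same arithmetic, written out as a short Python sketch with the AVERT counts quoted above:

    events_apixaban, n_apixaban = 12, 288        # VTE events / patients, apixaban arm
    events_placebo, n_placebo = 28, 275          # VTE events / patients, placebo arm

    eer = events_apixaban / n_apixaban           # experimental event rate, ~4.2%
    cer = events_placebo / n_placebo             # control event rate, ~10.2%

    rr = eer / cer                               # relative risk, ~0.41
    rrr = 1 - rr                                 # relative risk reduction, ~0.59 (59%)
    print(f"EER={eer:.1%}  CER={cer:.1%}  RR={rr:.2f}  RRR={rrr:.0%}")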

One example of using RR in cancer clinical trials is when assessing response rates in the experimental and control arms. For example, in the Keynote-189 trial, comparing pembrolizumab plus chemotherapy versus chemotherapy alone in metastatic non-small-cell lung cancer,[ 25 ] objective response rates were 47.6% versus 18.9%, an RR of 2.5, meaning that the experimental regimen produced a response rate 2.5 times that of the control arm.

OR is another relative association measure that is similar to RR. However, it is a ratio of odds, not risks. Odds are events/nonevents, whereas risk is events/total exposed sample. When the event rate is low (<10%), OR and RR become very similar.[ 26 ]
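A quick numerical check of that statement, using the same AVERT counts (event rates of roughly 4% and 10%, so the two measures are close but not identical):

    def risk(events, total):
        return events / total

    def odds(events, total):
        return events / (total - events)          # events divided by non-events

    rr = risk(12, 288) / risk(28, 275)            # ~0.41
    odds_ratio = odds(12, 288) / odds(28, 275)    # ~0.38
    print(round(rr, 2), round(odds_ratio, 2))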

Risk difference

Although relative measures (RR and RRR, OR and HR) are very helpful for depicting the direction of the association, they do not give the full picture, especially when interpreting data or discussing them with patients. Therefore, reporting absolute measures is equally important, namely the RD, which is the proportion of the event in the experimental arm subtracted from the proportion of the event in the control arm. In other words, it is the proportion of patients who are spared the undesired outcome by having received the experimental rather than the control treatment. An RD of 0 means the event occurred equally in both groups. The RD is sometimes called the absolute risk reduction (ARR) or absolute risk increase (ARI) depending on the direction of the effect. When interpreting RCTs, it is important to look at both the ARR and the RRR, as looking only at relative measures can be deceiving and tends to overstate results. In a hypothetical example, an RR of 50% could represent an ARR of 30% (if the absolute risk improved from 60% to 30%), or that same RR of 50% could represent an ARR of 2% (if the absolute risk improved from 4% to 2%).
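The hypothetical contrast above can be verified in a few lines: the relative risk is 50% in both scenarios, but the absolute benefit differs by more than a factor of ten.

    for cer, eer in [(0.60, 0.30), (0.04, 0.02)]:     # (control risk, experimental risk)
        rr = eer / cer                                # 0.5 in both scenarios
        arr = cer - eer                               # 30% in the first, 2% in the second
        print(f"RR={rr:.0%}  ARR={arr:.0%}")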

In our example [ Table 1 ], in the AVERT trial, the baseline risk of VTE in the placebo group is 10.2% and is decreased to 4.2% in the apixaban group. Therefore, giving apixaban decreased the risk of VTE by 10.2% – 4.2% = 6%, which is the RD.

Number needed to treat/harm

Another important measure of association is the number needed to treat (NNT). This reflects the number of patients who need to be treated in order to prevent one event (in this case, VTE). NNT = 1/ARR (when the ARR is expressed as a percentage, this becomes NNT = 100/ARR). In the AVERT trial, the NNT = 100/6 = 16.7, or about 17 patients. In the same way, we can calculate the number needed to harm (NNH), which is the number of patients who need to be treated in order to harm one patient or cause one undesired event. The risk of bleeding in the apixaban group is 3.5% and in the placebo group it is 1.8%. The RD is 1.7% (3.5 – 1.8). For every 100 patients treated, about 1.7 additional patients are harmed. The NNH = 100/1.7 = 58.8, or about 59 patients.
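The same calculations in a short sketch, using the event rates quoted above (note that 100/6 and 100/1.7 are conventionally rounded up to 17 and 59):

    cer_vte, eer_vte = 0.102, 0.042         # VTE risk: placebo vs apixaban
    arr = cer_vte - eer_vte                 # absolute risk reduction = 6.0%
    nnt = 1 / arr                           # ~16.7 -> treat about 17 patients to prevent one VTE

    eer_bleed, cer_bleed = 0.035, 0.018     # bleeding risk: apixaban vs placebo
    ari = eer_bleed - cer_bleed             # absolute risk increase = 1.7%
    nnh = 1 / ari                           # ~58.8 -> about 59 patients treated per extra bleed

    print(f"ARR={arr:.1%}  NNT={nnt:.1f}  ARI={ari:.1%}  NNH={nnh:.1f}")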

These numbers are useful when evaluating the magnitude of effect and safety of an intervention by comparing the NNH with the NNT. For apixaban, for roughly every 17 patients we treat, 1 benefits, and for roughly every 59 patients we treat, 1 is harmed. We obviously seek drugs with a low NNT and a high NNH.

Hazard ratio

The HR is a relative association measure used for survival outcomes in cancer clinical trials. Although calculated differently,[ 27 ] for practical purposes it can be interpreted as an RR averaged over the course of a trial that applies at any given time during follow-up. The calculation of the HR incorporates the element of time (i.e., how long an event took to occur, rather than simply whether it occurred). An HR of 1 means no effect; an HR of 2 means that the intervention doubles the risk of the outcome; and an HR of 0.5 means that the intervention halves the risk of the outcome. The HR should always be interpreted alongside the associated length of survival. In the trial of erlotinib plus gemcitabine compared with gemcitabine in patients with advanced pancreatic cancer,[ 28 ] median survival time was 6.24 months in the experimental arm of combination therapy versus 5.91 months in the gemcitabine arm. Thus, although the HR of 0.82 suggests improved survival, the actual difference in survival could be trivial.

How precise is the estimate of treatment effect?

Confidence intervals (CIs) in RCTs identify a range of values within which the true effect of treatment is likely to lie. In most trials, a 95% CI is estimated, meaning that if the trial were repeated 100 times, 95% of the resulting intervals would include the true effect; the wider the CI, the less precise the estimate. For example, in the Keynote-189 trial, the HR for death was 0.49 with a reported P < 0.001 (which means that this effect is statistically significant because it is <0.05, the arbitrary cutoff for significance). This HR of 0.49 had a 95% CI of 0.38–0.64. When making a decision, one should consider precision. If our decision would be the same whether the lower or the upper boundary were the truth, then the results are sufficiently precise. In this case, the precision is adequate.
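For readers who want to see where such an interval comes from, the sketch below computes a conventional (log-scale) 95% CI for a crude relative risk from the AVERT VTE counts; this is an unadjusted, illustrative calculation, so it will not reproduce the adjusted estimates reported by the trial.

    import math

    def rr_with_ci(a, n1, c, n2, z=1.96):
        """Crude relative risk and a log-scale 95% confidence interval."""
        rr = (a / n1) / (c / n2)
        se_log_rr = math.sqrt(1/a - 1/n1 + 1/c - 1/n2)
        lower = math.exp(math.log(rr) - z * se_log_rr)
        upper = math.exp(math.log(rr) + z * se_log_rr)
        return rr, lower, upper

    print(rr_with_ci(12, 288, 28, 275))   # roughly (0.41, 0.21, 0.79): a fairly wide interval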

HOW DO I APPLY THE RESULTS TO MY PATIENTS?

Applicability is a form of external validity.[ 29 ] To assess applicability, one should ask the following questions:

Were the study patients similar to my patients?

This question can be answered by looking at the inclusion and exclusion criteria of the RCT and comparing them with the characteristics of the patient of interest. RCTs with long lists of exclusion criteria (e.g., comorbid conditions) may be less applicable in real practice. Furthermore, RCTs in oncology can be regional (conducted in a few countries in the same region) or international (spanning multiple regions and countries), which makes generalizability variable depending on the regions where the RCT was conducted. For example, the oral fluoropyrimidine S-1 was shown to improve OS as an adjuvant chemotherapy option in patients with curatively resected gastric cancer only in Japan.[ 30 ] It has yet to be approved in the United States because of this regional variation, which limits the generalizability of drug metabolism and efficacy data to Western patients. However, we should not expect a perfect match, and we should anticipate that most of the time relative treatment effects apply to patients with various characteristics.

In the AVERT trial, one of the inclusion criteria was a Khorana score of 2, an intermediate risk category associated with only 1%–2% risk of VTE.[ 3 ] Approximately two-thirds of participants in the AVERT trial had a Khorana score of 2. Using apixaban in this group of patients may be associated with greater harm than benefit as the baseline risk of VTE is very small.

Were all clinically meaningful outcomes considered?

When a drug produces a small increment in hemoglobin level, or a chemotherapy agent causes tumors to shrink beyond a specific threshold (i.e., response rate), this may not provide sufficient justification for recommending these interventions to patients. These are surrogate outcomes that may or may not translate into an improvement in clinically meaningful, patient-important outcomes, such as QOL or OS.

In the AVERT trial, investigators preemptively evaluated for the presence of VTE with imaging even in the absence of symptoms or signs of VTE, a practice that is not commonly performed or indicated for most patients. This probably led to the diagnosis of many incidental VTEs, which otherwise might never have been found and might not have caused important morbidity or mortality.

Do treatment benefits outweigh the potential risks (harm and costs)?

We evaluate the patient’s baseline risk to determine whether introducing an intervention would be worthwhile. A low baseline risk usually means the RD will be low and NNT will be high. Knowing these absolute measures can assist clinicians in helping patients weigh the benefits and risks of each potential intervention. Ultimately, the values and preferences for each individual patient need to be considered before recommending one therapy over another.

CONCLUSION OF THE CLINICAL SCENARIO

After applying the framework [ Figure 1 ], you found that this RCT (the AVERT trial) was at low risk of bias. However, after you discussed the efficacy data along with this patient’s underlying risk of bleeding, the patient decided not to start the medication. A different patient with similar characteristics (Khorana score of 2) might elect to accept such a risk in return for the benefits seen. This emphasizes the importance of shared decision making when applying evidence to individual patients.

Financial support and sponsorship

Conflicts of interest

There are no conflicts of interest.

