AI Writes Scientific Papers That Sound Great—but Aren’t Accurate

First came the students, who wanted help with their homework and essays. Now, ChatGPT is luring scientists, who are under pressure to publish papers in reputable scientific journals.

AI is already disrupting the archaic world of scientific publishing. When Melissa Kacena, vice chair of orthopaedic surgery at Indiana University School of Medicine, reviews articles submitted for publication in journals, she now knows to look out for ones that might have been written by the AI program. “I have a rule of thumb now that if I pull up 10 random references cited in the paper, and if more than one isn’t accurate, then I reject the paper,” she says.

But despite the pitfalls, there is also promise. Writing review articles, for example, is a task well suited to AI: it involves sifting through the existing research on a subject, analyzing the results, reaching a conclusion about the state of the science on the topic, and providing some new insight. ChatGPT can do all of those things well.

Kacena decided to see who is better at writing review articles: people or ChatGPT. For her study published in Current Osteoporosis Reports , she sorted nine students and the AI program into three groups and asked each group to write a review article on a different topic. For one group, she asked the students to write review articles on the topics; for another, she instructed ChatGPT to write articles on the same topics; and for the last group, she gave each of the students their own ChatGPT account and told them to work together with the AI program to write articles. That allowed her to compare articles written by people, by AI, and a combination of people and AI. She asked faculty member colleagues and the students to fact check each of the articles, and compared the three types of articles on measures like accuracy, ease of reading, and use of appropriate language.


The results were eye-opening. The articles written by ChatGPT were easy to read and were even better written than the students'. But up to 70% of the cited references were inaccurate: they were either incoherently merged from several different studies or completely fictitious. The AI versions were also more likely to be plagiarized.

“ChatGPT was pretty convincing with some of the phony statements it made, to be honest,” says Kacena. “It used the proper syntax and integrated them with proper statements in a paragraph, so sometimes there were no warning bells. It was only because the faculty members had a good understanding of the data, or because the students fact checked everything, that they were detected.”

There were some advantages to the AI-generated articles. The algorithm was faster and more efficient in processing all the required data, and in general, ChatGPT used better grammar than the students. But it couldn't always read the room: AI tended to use more flowery language that wasn’t always appropriate for scientific journals (unless the students had told ChatGPT to write it from the perspective of a graduate-level science student).


That reflects a truth about the use of AI: it's only as good as the information it receives. While ChatGPT isn’t quite ready to author scientific journal articles, with the proper programming and training, it could improve and become a useful tool for researchers. “Right now it’s not great by itself, but it can be made to work,” says Kacena. For example, if queried, the algorithm was good at recommending ways to summarize data in figures and graphical depictions. “The advice it gave on those were spot on, and exactly what I would have done,” she says.

The more feedback the students provided on ChatGPT's work, the better it learned, and that represents its greatest promise. In the study, some students found that when they worked together with ChatGPT to write the article, the program continued to improve and provide better results if they told it what it was doing right and what was less helpful. That means that problems like questionable references and plagiarism could potentially be fixed. ChatGPT could be programmed, for example, to not merge references and to treat each scientific journal article as its own separate reference, and to limit copying consecutive words to avoid plagiarism.

With more input and some fixes, Kacena believes that AI could help researchers smooth out the writing process and even gain scientific insights. “I think ChatGPT is here to stay, and figuring out how to make it better, and how to use it in an ethical and conscientious and scientifically sound manner, is going to be really important,” she says.


A Columbia Surgeon’s Study Was Pulled. He Kept Publishing Flawed Data.

The quiet withdrawal of a 2021 cancer study by Dr. Sam Yoon highlights scientific publishers’ lack of transparency around data problems.


By Benjamin Mueller

Benjamin Mueller covers medical science and has reported on several research scandals.

Feb. 15, 2024

The stomach cancer study was shot through with suspicious data. Identical constellations of cells were said to depict separate experiments on wholly different biological lineages. Photos of tumor-stricken mice, used to show that a drug reduced cancer growth, had been featured in two previous papers describing other treatments.

Problems with the study were severe enough that its publisher, after finding that the paper violated ethics guidelines, formally withdrew it within a few months of its publication in 2021. The study was then wiped from the internet, leaving behind a barren web page that said nothing about the reasons for its removal.

As it turned out, the flawed study was part of a pattern. Since 2008, two of its authors — Dr. Sam S. Yoon, chief of a cancer surgery division at Columbia University’s medical center, and a more junior cancer biologist — have collaborated with a rotating cast of researchers on a combined 26 articles that a British scientific sleuth has publicly flagged for containing suspect data. A medical journal retracted one of them this month after inquiries from The New York Times.


Memorial Sloan Kettering Cancer Center, where Dr. Yoon worked when much of the research was done, is now investigating the studies. Columbia’s medical center declined to comment on specific allegations, saying only that it reviews “any concerns about scientific integrity brought to our attention.”

Dr. Yoon, who has said his research could lead to better cancer treatments, did not answer repeated questions. Attempts to speak to the other researcher, Changhwan Yoon, an associate research scientist at Columbia, were also unsuccessful.

The allegations were aired in recent months in online comments on a science forum and in a blog post by Sholto David, an independent molecular biologist. He has ferreted out problems in a raft of high-profile cancer research, including dozens of papers at a Harvard cancer center that were subsequently referred for retractions or corrections.

From his flat in Wales, Dr. David pores over published images of cells, tumors and mice in his spare time and then reports slip-ups, trying to close the gap between people’s regard for academic research and the sometimes shoddier realities of the profession.

When evaluating scientific images, it is difficult to distinguish sloppy copy-and-paste errors from deliberate doctoring of data. Two other imaging experts who reviewed the allegations at the request of The Times said some of the discrepancies identified by Dr. David bore signs of manipulation, like flipped, rotated or seemingly digitally altered images.

Armed with A.I.-powered detection tools, scientists and bloggers have recently exposed a growing body of such questionable research, like the faulty papers at Harvard’s Dana-Farber Cancer Institute and studies by Stanford’s president that led to his resignation last year.

But those high-profile cases were merely the tip of the iceberg, experts said. A deeper pool of unreliable research has gone unaddressed for years, shielded in part by powerful scientific publishers driven to put out huge volumes of studies while avoiding the reputational damage of retracting them publicly.

The quiet removal of the 2021 stomach cancer study from Dr. Yoon’s lab, a copy of which was reviewed by The Times, illustrates how that system of scientific publishing has helped enable faulty research, experts said. In some cases, critical medical fields have remained seeded with erroneous studies.

“The journals do the bare minimum,” said Elisabeth Bik, a microbiologist and image expert who described Dr. Yoon’s papers as showing a worrisome pattern of copied or doctored data. “There’s no oversight.”

Memorial Sloan Kettering, where portions of the stomach cancer research were done, said no one — not the journal nor the researchers — had ever told administrators that the paper was withdrawn or why it had been. The study said it was supported in part by federal funding given to the cancer center.

Dr. Yoon, a stomach cancer specialist and a proponent of robotic surgery, kept climbing the academic ranks, bringing his junior researcher along with him. In September 2021, around the time the study was published, he joined Columbia, which celebrated his prolific research output in a news release. His work was financed in part by half a million dollars in federal research money that year, adding to a career haul of nearly $5 million in federal funds.

The decision by the stomach cancer study’s publisher, Elsevier, not to post an explanation for the paper’s removal made it less likely that the episode would draw public attention or affect the duo’s work. That very study continued to be cited in papers by other scientists.

And as recently as last year, Dr. Yoon’s lab published more studies containing identical images that were said to depict separate experiments, according to Dr. David’s analyses.

The researchers’ suspicious publications stretch back 16 years. Over time, relatively minor image copies in papers by Dr. Yoon gave way to more serious discrepancies in studies he collaborated on with Changhwan Yoon, Dr. David said. The pair, who are not related, began publishing articles together around 2013.

But neither their employers nor their publishers seemed to start investigating their work until this past fall, when Dr. David published his initial findings on For Better Science, a blog, and notified Memorial Sloan Kettering, Columbia and the journals. Memorial Sloan Kettering said it began its investigation then.

None of those flagged studies was retracted until last week. Three days after The Times asked publishers about the allegations, the journal Oncotarget retracted a 2016 study on combating certain pernicious cancers. In a retraction notice, the journal said the authors’ explanations for copied images “were deemed unacceptable.”

The belated action was symptomatic of what experts described as a broken system for policing scientific research.

A proliferation of medical journals, they said, has helped fuel demand for ever more research articles. But those same journals, many of them operated by multibillion-dollar publishing companies, often respond slowly or do nothing at all once one of those articles is shown to contain copied data. Journals retract papers at a fraction of the rate at which they publish ones with problems.

Springer Nature, which published nine of the articles that Dr. David said contained discrepancies across five journals, said it was investigating concerns. So did the American Association for Cancer Research, which published 10 articles under question from Dr. Yoon’s lab across four journals.

It is difficult to know who is responsible for errors in articles. Eleven of the scientists’ co-authors, including researchers at Harvard, Duke and Georgetown, did not answer emailed inquiries.

The articles under question examined why certain stomach and soft-tissue cancers withstood treatment, and how that resistance could be overcome.

The two independent image specialists said the volume of copied data, along with signs that some images had been rotated or similarly manipulated, suggested considerable sloppiness or worse.

“There are examples in this set that raise pretty serious red flags for the possibility of misconduct,” said Dr. Matthew Schrag, a Vanderbilt University neurologist who commented as part of his outside work on research integrity.

One set of 10 articles identified by Dr. David showed repeated reuse of identical or overlapping black-and-white images of cancer cells supposedly under different experimental conditions, he said.

“There’s no reason to have done that unless you weren’t doing the work,” Dr. David said.

One of those papers, published in 2012, was formally tagged with corrections. Unlike later studies, which were largely overseen by Dr. Yoon in New York, this paper was written by South Korea-based scientists, including Changhwan Yoon, who then worked in Seoul.

An immunologist in Norway randomly selected the paper as part of a screening of copied data in cancer journals. That led the paper’s publisher, the medical journal Oncogene, to add corrections in 2016.

But the journal did not catch all of the duplicated data, Dr. David said. And, he said, images from the study later turned up in identical form in another paper that remains uncorrected.

Copied cancer data kept recurring, Dr. David said. A picture of a small red tumor from a 2017 study reappeared in papers in 2020 and 2021 under different descriptions, he said. A ruler included in the pictures for scale wound up in two different positions.

The 2020 study included another tumor image that Dr. David said appeared to be a mirror image of one previously published by Dr. Yoon’s lab. And the 2021 study featured a color version of a tumor that had appeared in an earlier paper atop a different section of ruler, Dr. David said.

“This is another example where this looks intentionally done,” Dr. Bik said.

The researchers were faced with more serious action when the publisher Elsevier withdrew the stomach cancer study that had been published online in 2021. “The editors determined that the article violated journal publishing ethics guidelines,” Elsevier said.

Roland Herzog, the editor of Molecular Therapy, the journal where the article appeared, said that “image duplications were noticed” as part of a process of screening for discrepancies that the journal has since continued to beef up.

Because the problems were detected before the study was ever published in the print journal, Elsevier’s policy dictated that the article be taken down and no explanation posted online.

But that decision appeared to conflict with industry guidelines from the Committee on Publication Ethics. Posting articles online “usually constitutes publication,” those guidelines state. And when publishers pull such articles, the guidelines say, they should keep the work online for the sake of transparency and post “a clear notice of retraction.”

Dr. Herzog said he personally hoped that such an explanation could still be posted for the stomach cancer study. The journal editors and Elsevier, he said, are examining possible options.

The editors notified Dr. Yoon and Changhwan Yoon of the article’s removal, but neither scientist alerted Memorial Sloan Kettering, the hospital said. Columbia did not say whether it had been told.

Experts said the handling of the article was symptomatic of a tendency on the part of scientific publishers to obscure reports of lapses.

“This is typical, sweeping-things-under-the-rug kind of nonsense,” said Dr. Ivan Oransky, co-founder of Retraction Watch, which keeps a database of 47,000-plus retracted papers. “This is not good for the scientific record, to put it mildly.”

Susan C. Beachy contributed research.

Benjamin Mueller reports on health and medicine. He was previously a U.K. correspondent in London and a police reporter in New York.

Open access
Published: 19 February 2024

Genomic data in the All of Us Research Program

The All of Us Research Program Genomics Investigators

Nature (2024)


  • Genetic variation
  • Genome-wide association studies

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics 1 , 2 , 3 , 4 . The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health 5 , 6 . Here we describe the programme’s genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.

Comprehensively identifying genetic variation and cataloguing its contribution to health and disease, in conjunction with environmental and lifestyle factors, is a central goal of human health research 1 , 2 . A key limitation in efforts to build this catalogue has been the historic under-representation of large subsets of individuals in biomedical research including individuals from diverse ancestries, individuals with disabilities and individuals from disadvantaged backgrounds 3 , 4 . The All of Us Research Program (All of Us) aims to address this gap by enrolling and collecting comprehensive health data on at least one million individuals who reflect the diversity across the USA 5 , 6 . An essential component of All of Us is the generation of whole-genome sequence (WGS) and genotyping data on one million participants. All of Us is committed to making this dataset broadly useful—not only by democratizing access to this dataset across the scientific community but also to return value to the participants themselves by returning individual DNA results, such as genetic ancestry, hereditary disease risk and pharmacogenetics according to clinical standards, to those who wish to receive these research results.

Here we describe the release of WGS data from 245,388 All of Us participants and demonstrate the impact of this high-quality data in genetic and health studies. We carried out a series of data harmonization and quality control (QC) procedures and conducted analyses characterizing the properties of the dataset including genetic ancestry and relatedness. We validated the data by replicating well-established genotype–phenotype associations including low-density lipoprotein cholesterol (LDL-C) and 117 additional diseases. These data are available through the All of Us Researcher Workbench, a cloud platform that embodies and enables programme priorities, facilitating equitable data and compute access while ensuring responsible conduct of research and protecting participant privacy through a passport data access model.

The All of Us Research Program

To accelerate health research, All of Us is committed to curating and releasing research data early and often 6 . Less than five years after national enrolment began in 2018, this fifth data release includes data from more than 413,000 All of Us participants. Summary data are made available through a public Data Browser, and individual-level participant data are made available to researchers through the Researcher Workbench (Fig. 1a and Data availability).

Figure 1

a, The All of Us Research Hub contains a publicly accessible Data Browser for exploration of summary phenotypic and genomic data. The Researcher Workbench is a secure cloud-based environment of participant-level data in a Controlled Tier that is widely accessible to researchers. b, All of Us participants have rich phenotype data from a combination of physical measurements, survey responses, EHRs, wearables and genomic data. Dots indicate the presence of the specific data type for the given number of participants. c, Overall summary of participants under-represented in biomedical research (UBR) with data available in the Controlled Tier. The All of Us logo in a is reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Participant data include a rich combination of phenotypic and genomic data (Fig. 1b ). Participants are asked to complete consent for research use of data, sharing of electronic health records (EHRs), donation of biospecimens (blood or saliva, and urine), in-person provision of physical measurements (height, weight and blood pressure) and surveys initially covering demographics, lifestyle and overall health 7 . Participants are also consented for recontact. EHR data, harmonized using the Observational Medical Outcomes Partnership Common Data Model 8 ( Methods ), are available for more than 287,000 participants (69.42%) from more than 50 health care provider organizations. The EHR dataset is longitudinal, with a quarter of participants having 10 years of EHR data (Extended Data Fig. 1 ). Data include 245,388 WGSs and genome-wide genotyping on 312,925 participants. Sequenced and genotyped individuals in this data release were not prioritized on the basis of any clinical or phenotypic feature. Notably, 99% of participants with WGS data also have survey data and physical measurements, and 84% also have EHR data. In this data release, 77% of individuals with genomic data identify with groups historically under-represented in biomedical research, including 46% who self-identify with a racial or ethnic minority group (Fig. 1c , Supplementary Table 1 and Supplementary Note ).

Scaling the All of Us infrastructure

The genomic dataset generated from All of Us participants is a resource for research and discovery and serves as the basis for return of individual health-related DNA results to participants. Consequently, the US Food and Drug Administration determined that All of Us met the criteria for a significant risk device study. As such, the entire All of Us genomics effort from sample acquisition to sequencing meets clinical laboratory standards 9 .

All of Us participants were recruited through a national network of partners, starting in 2018, as previously described 5 . Participants may enrol through All of Us-funded health care provider organizations or direct volunteer pathways and all biospecimens, including blood and saliva, are sent to the central All of Us Biobank for processing and storage. Genomics data for this release were generated from blood-derived DNA. The programme began return of actionable genomic results in December 2022. As of April 2023, approximately 51,000 individuals were sent notifications asking whether they wanted to view their results, and approximately half have accepted. Return continues on an ongoing basis.

The All of Us Data and Research Center maintains all participant information and biospecimen ID linkage to ensure that participant confidentiality and coded identifiers (participant and aliquot level) are used to track each sample through the All of Us genomics workflow. This workflow facilitates weekly automated aliquot and plating requests to the Biobank, supplies relevant metadata for the sample shipments to the Genome Centers, and contains a feedback loop to inform action on samples that fail QC at any stage. Further, the consent status of each participant is checked before sample shipment to confirm that they are still active. Although all participants with genomic data are consented for the same general research use category, the programme accommodates different preferences for the return of genomic data to participants and only data for those individuals who have consented for return of individual health-related DNA results are distributed to the All of Us Clinical Validation Labs for further evaluation and health-related clinical reporting. All participants in All of Us that choose to get health-related DNA results have the option to schedule a genetic counselling appointment to discuss their results. Individuals with positive findings who choose to obtain results are required to schedule an appointment with a genetic counsellor to receive those findings.

Genome sequencing

To satisfy the requirements for clinical accuracy, precision and consistency across DNA sample extraction and sequencing, the All of Us Genome Centers and Biobank harmonized laboratory protocols, established standard QC methodologies and metrics, and conducted a series of validation experiments using previously characterized clinical samples and commercially available reference standards 9 . Briefly, PCR-free barcoded WGS libraries were constructed with the Illumina Kapa HyperPrep kit. Libraries were pooled and sequenced on the Illumina NovaSeq 6000 instrument. After demultiplexing, initial QC analysis is performed with the Illumina DRAGEN pipeline (Supplementary Table 2), leveraging lane, library, flow cell, barcode and sample-level metrics as well as assessing contamination, mapping quality and concordance to genotyping array data independently processed from a different aliquot of DNA. The Genome Centers use these metrics to determine whether each sample meets programme specifications and then submit sequencing data to the Data and Research Center for further QC, joint calling and distribution to the research community (Methods).

This effort to harmonize sequencing methods, multi-level QC and use of identical data processing protocols mitigated the variability in sequencing location and protocols that often leads to batch effects in large genomic datasets 9 . As a result, the data are not only of clinical-grade quality, but also consistent in coverage (≥30× mean) and uniformity across Genome Centers (Supplementary Figs. 1 – 5 ).

Joint calling and variant discovery

We carried out joint calling across the entire All of Us WGS dataset (Extended Data Fig. 2 ). Joint calling leverages information across samples to prune artefact variants, which increases sensitivity, and enables flagging samples with potential issues that were missed during single-sample QC 10 (Supplementary Table 3 ). Scaling conventional approaches to whole-genome joint calling beyond 50,000 individuals is a notable computational challenge 11 , 12 . To address this, we developed a new cloud variant storage solution, the Genomic Variant Store (GVS), which is based on a schema designed for querying and rendering variants in which the variants are stored in GVS and rendered to an analysable variant file, as opposed to the variant file being the primary storage mechanism (Code availability). We carried out QC on the joint call set on the basis of the approach developed for gnomAD 3.1 (ref.  13 ). This included flagging samples with outlying values in eight metrics (Supplementary Table 4 , Supplementary Fig. 2 and Methods ).

To calculate the sensitivity and precision of the joint call dataset, we included four well-characterized samples. We sequenced the National Institute of Standards and Technology reference materials (DNA samples) from the Genome in a Bottle consortium 13 and carried out variant calling as described above. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations 14 . The overall sensitivity for single-nucleotide variants was over 98.7% and precision was more than 99.9%. For short insertions or deletions, the sensitivity was over 97% and precision was more than 99.6% (Supplementary Table 5 and Methods ).
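The sensitivity and precision figures above follow from a set comparison between the called variants and the published truth set. As a minimal sketch only (the function name and the toy variants are invented for illustration, and real benchmarking also has to handle variant normalization and confident regions, which this ignores):

```python
def sensitivity_precision(called, truth):
    """Compare a call set with a truth set; each variant is a hashable key."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)  # true positives: called and present in truth
    fp = len(called - truth)  # false positives: called but absent from truth
    fn = len(truth - called)  # false negatives: in truth but not called
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, precision


# Toy variants keyed as (chromosome, position, ref, alt); values are invented.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}
called = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
          ("chr2", 50, "G", "A"), ("chr3", 10, "A", "T")}

sens, prec = sensitivity_precision(called, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In this toy case three of four truth variants are recovered and three of four calls are correct, so both values are 0.75.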

The joint call set included more than 1 billion genetic variants. We annotated the joint call dataset on the basis of functional annotation (for example, gene symbol and protein change) using Illumina Nirvana 15 . We defined coding variants as those inducing an amino acid change on a canonical ENSEMBL transcript and found 272,051,104 non-coding and 3,913,722 coding variants that have not been described previously in dbSNP 16 v153 (Extended Data Table 1). A total of 3,912,832 (99.98%) of the coding variants are rare (allelic frequency < 0.01) and the remaining 883 (0.02%) are common (allelic frequency > 0.01). Of the coding variants, 454 (0.01%) are common in one or more of the non-European computed ancestries in All of Us, rare among participants of European ancestry, and have an allelic number greater than 1,000 (Extended Data Table 2 and Extended Data Fig. 3). The distributions of pathogenic, or likely pathogenic, ClinVar variant counts per participant, stratified by computed ancestry and filtered to only those variants that are found in individuals with an allele count of <40, are shown in Extended Data Fig. 4. The potential medical implications of these known and new variants with respect to variant pathogenicity by ancestry are highlighted in a companion paper 17 . In particular, we find that the European ancestry subset has the highest rate of pathogenic variation (2.1%), which was twice the rate of pathogenic variation in individuals of East Asian ancestry 17 . The lower frequency of variants in East Asian individuals may be partially explained by the fact that the sample size in that group is small and that there may be knowledge bias in the variant databases that is reducing the number of findings in some of the less-studied ancestry groups.
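The rare/common split and the ancestry-specific filter described above can be illustrated with a short sketch. The record layout (a dict of ancestry label to allele count and allele number) and the function names are hypothetical, and the sketch assumes that the allelic-number requirement applies to the non-European group in which the variant is common, which the text does not spell out.

```python
RARE_THRESHOLD = 0.01  # allelic frequency cut-off quoted in the text


def allele_frequency(allele_count, allele_number):
    """Allelic frequency = observed alternate alleles / total alleles typed."""
    return allele_count / allele_number


def common_outside_europe(counts, min_allele_number=1000):
    """counts maps an ancestry label to (allele_count, allele_number).

    Returns True when the variant is rare in "eur" and common in at least
    one other ancestry with allele_number above min_allele_number
    (assumed interpretation of the filter).
    """
    eur_ac, eur_an = counts["eur"]
    if eur_an == 0 or allele_frequency(eur_ac, eur_an) >= RARE_THRESHOLD:
        return False
    for ancestry, (ac, an) in counts.items():
        if ancestry == "eur":
            continue
        if an > min_allele_number and allele_frequency(ac, an) > RARE_THRESHOLD:
            return True
    return False


# Invented counts for three hypothetical variants.
variant_a = {"eur": (5, 200_000), "afr": (900, 50_000), "eas": (0, 4_000)}
variant_b = {"eur": (3_000, 200_000), "afr": (900, 50_000)}
variant_c = {"eur": (5, 200_000), "afr": (20, 800)}

flags = [common_outside_europe(v) for v in (variant_a, variant_b, variant_c)]
print(flags)
```

Only the first toy variant passes: the second is common among European-ancestry participants, and the third is common in a group whose allelic number falls below the threshold.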

Genetic ancestry and relatedness

Genetic ancestry inference confirmed that 51.1% of the All of Us WGS dataset is derived from individuals of non-European ancestry. Briefly, the ancestry categories are based on the same labels used in gnomAD 18 . We trained a classifier on a 16-dimensional principal component analysis (PCA) space of a diverse reference based on 3,202 samples and 151,159 autosomal single-nucleotide polymorphisms. We projected the All of Us samples into the PCA space of the training data, based on the same single-nucleotide polymorphisms from the WGS data, and generated categorical ancestry predictions from the trained classifier ( Methods ). Continuous genetic ancestry fractions for All of Us samples were inferred using the same PCA data, and participants’ patterns of ancestry and admixture were compared to their self-identified race and ethnicity (Fig. 2 and Methods ). Continuous ancestry inference carried out using genome-wide genotypes yields highly concordant estimates.
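The project-then-classify step can be sketched in a few lines. This is an illustration under stated assumptions, not the programme's pipeline: the section does not name the classifier, so a nearest-centroid rule stands in for it here, and the SNP means, loadings, genotypes and labels (`group_a`, `group_b`) are all invented, with two components instead of sixteen.

```python
import math


def project(genotype, snp_means, loadings):
    """Project one sample's genotype vector onto precomputed PC axes."""
    centred = [g - m for g, m in zip(genotype, snp_means)]
    return [sum(c * w for c, w in zip(centred, axis)) for axis in loadings]


def centroids(points_by_label):
    """Mean position of each reference label in PC space."""
    result = {}
    for label, points in points_by_label.items():
        dims = len(points[0])
        result[label] = [sum(p[d] for p in points) / len(points)
                         for d in range(dims)]
    return result


def predict(point, label_centroids):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    return min(label_centroids,
               key=lambda label: math.dist(point, label_centroids[label]))


# Toy reference panel: 3 SNPs, 2 principal components (all values invented).
snp_means = [1.0, 1.0, 1.0]
loadings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
reference = {
    "group_a": [[2, 0, 1], [2, 0, 0]],
    "group_b": [[0, 2, 1], [0, 2, 2]],
}
reference_pcs = {label: [project(g, snp_means, loadings) for g in genotypes]
                 for label, genotypes in reference.items()}
label_centroids = centroids(reference_pcs)

# A new sample is projected with the reference loadings, never refitted.
new_sample_pcs = project([2, 1, 1], snp_means, loadings)
predicted = predict(new_sample_pcs, label_centroids)
print(new_sample_pcs, predicted)
```

The point of the design is that the PCA space is fitted once on the reference panel and study samples are only projected into it, so the predicted categories stay comparable as more samples are added.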

Figure 2

a, b, Uniform manifold approximation and projection (UMAP) representations of All of Us WGS PCA data with self-described race (a) and ethnicity (b) labels. c, Proportion of genetic ancestry per individual in six distinct and coherent ancestry groups defined by Human Genome Diversity Project and 1000 Genomes samples.

Kinship estimation confirmed that All of Us WGS data consist largely of unrelated individuals with about 85% (215,107) having no first- or second-degree relatives in the dataset (Supplementary Fig. 6 ). As many genomic analyses leverage unrelated individuals, we identified the smallest set of samples that are required to be removed from the remaining individuals that had first- or second-degree relatives and retained one individual from each kindred. This procedure yielded a maximal independent set of 231,442 individuals (about 94%) with genome sequence data in the current release ( Methods ).
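Selecting unrelated individuals can be framed as a graph problem: samples are nodes, first- or second-degree relationships are edges, and samples are removed until no edges remain. The sketch below uses a greedy heuristic (drop the sample with the most remaining relatives first); it is an assumed stand-in, since finding the truly smallest removal set is computationally hard in general and the section defers the actual procedure to the Methods. Sample IDs and pairs are invented.

```python
from collections import defaultdict


def prune_related(samples, related_pairs):
    """Greedily drop samples until no related pair remains.

    Returns the retained samples in their original order.
    """
    neighbours = defaultdict(set)
    for a, b in related_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    removed = set()
    while True:
        # Sample with the most remaining relatives; ties broken by ID.
        candidate = max(neighbours,
                        key=lambda s: (len(neighbours[s]), s),
                        default=None)
        if candidate is None or not neighbours[candidate]:
            break  # no related pairs left
        for other in neighbours.pop(candidate):
            neighbours[other].discard(candidate)
        removed.add(candidate)
    return [s for s in samples if s not in removed]


samples = ["s1", "s2", "s3", "s4", "s5"]
related_pairs = [("s1", "s2"), ("s2", "s3"), ("s4", "s5")]
kept = prune_related(samples, related_pairs)
print(kept)
```

Here `s2` is dropped first because it is related to two others, which lets both `s1` and `s3` stay; one member of the `s4`/`s5` pair is then removed.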

Genetic determinants of LDL-C

As a measure of data quality and utility, we carried out a single-variant genome-wide association study (GWAS) for LDL-C, a trait with well-established genomic architecture (Methods). Of the 245,388 WGS participants, 91,749 had one or more LDL-C measurements. The All of Us LDL-C GWAS identified 20 well-established genome-wide significant loci, with minimal genomic inflation (Fig. 3, Extended Data Table 3 and Supplementary Fig. 7). We compared the results to those of a recent multi-ethnic LDL-C GWAS in the National Heart, Lung, and Blood Institute (NHLBI) TOPMed study that included 66,329 ancestrally diverse (56% non-European ancestry) individuals 19 . We found a strong correlation between the effect estimates for NHLBI TOPMed genome-wide significant loci and those of All of Us (R² = 0.98, P < 1.61 × 10⁻⁴⁵; Fig. 3, inset). Notably, the per-locus effect sizes observed in All of Us are decreased compared to those in TOPMed, which is in part due to differences in the underlying statistical model, differences in the ancestral composition of these datasets and differences in laboratory value ascertainment between EHR-derived data and epidemiology studies. A companion manuscript extended this work to identify common and rare genetic associations for three diseases (atrial fibrillation, coronary artery disease and type 2 diabetes) and two quantitative traits (height and LDL-C) in the All of Us dataset and identified very high concordance with previous efforts across all of these diseases and traits 20 .
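A high R² alongside smaller per-locus effects is not contradictory, because the squared correlation is insensitive to a uniform rescaling of one set of estimates. The sketch below shows this with invented effect estimates; the function name and the numbers are illustrative assumptions, not values from either study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Invented effect estimates (beta) for the same variants in two studies.
study_one_beta = [0.10, -0.20, 0.30, 0.40, -0.05]
# The second study sees every effect shrunk to half its size.
study_two_beta = [b * 0.5 for b in study_one_beta]

r2 = r_squared(study_one_beta, study_two_beta)
print(round(r2, 6))
```

The estimates in the second list are systematically smaller, yet R² stays at 1 up to floating-point error, so agreement in R² speaks to the ordering and relative size of effects rather than their absolute magnitude.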

figure 3

Manhattan plot demonstrating robust replication of 20 well-established LDL-C genetic loci among 91,749 individuals with 1 or more LDL-C measurements. The red horizontal line denotes the genome-wide significance threshold of P = 5 × 10⁻⁸. Inset, effect estimate (β) comparison between NHLBI TOPMed LDL-C GWAS (x axis) and All of Us LDL-C GWAS (y axis) for the subset of 194 independent variants clumped (window 250 kb, r² 0.5) that reached genome-wide significance in NHLBI TOPMed.

Genotype-by-phenotype associations

As another measure of data quality and utility, we tested replication rates of previously reported phenotype–genotype associations in the five predicted genetic ancestry populations present in the Phenotype/Genotype Reference Map (PGRM): AFR, African ancestry; AMR, Latino/admixed American ancestry; EAS, East Asian ancestry; EUR, European ancestry; SAS, South Asian ancestry. The PGRM contains published associations in the GWAS catalogue in these ancestry populations that map to International Classification of Diseases-based phenotype codes 21. This replication study specifically looked across 4,947 variants, calculating replication rates for powered associations in each ancestry population. The overall replication rates for associations powered at 80% were: 72.0% (18/25) in AFR, 100% (13/13) in AMR, 46.6% (7/15) in EAS, 74.9% (1,064/1,421) in EUR, and 100% (1/1) in SAS. With the exception of the EAS ancestry results, these powered replication rates are comparable to those of the published PGRM analysis, in which the replication rates of several single-site EHR-linked biobanks range from 76% to 85%. These results demonstrate the utility of the data and highlight opportunities for further work on understanding the specifics of the All of Us population and the potential contribution of gene–environment interactions to genotype–phenotype mapping. They also motivate the development of methods for multi-site EHR phenotype data extraction, harmonization and genetic association studies.

More broadly, the All of Us resource highlights the opportunities to identify genotype–phenotype associations that differ across diverse populations 22 . For example, the Duffy blood group locus ( ACKR1 ) is more prevalent in individuals of AFR ancestry and individuals of AMR ancestry than in individuals of EUR ancestry. Although the phenome-wide association study of this locus highlights the well-established association of the Duffy blood group with lower white blood cell counts both in individuals of AFR and AMR ancestry 23 , 24 , it also revealed genetic-ancestry-specific phenotype patterns, with minimal phenotypic associations in individuals of EAS ancestry and individuals of EUR ancestry (Fig. 4 and Extended Data Table 4 ). Conversely, rs9273363 in the HLA-DQB1 locus is associated with increased risk of type 1 diabetes 25 , 26 and diabetic complications across ancestries, but only associates with increased risk of coeliac disease in individuals of EUR ancestry (Extended Data Fig. 5 ). Similarly, the TCF7L2 locus 27 strongly associates with increased risk of type 2 diabetes and associated complications across several ancestries (Extended Data Fig. 6 ). Association testing results are available in Supplementary Dataset 1 .

figure 4

Results of genetic-ancestry-stratified phenome-wide association analysis among unrelated individuals highlighting ancestry-specific disease associations across the four most common genetic ancestries of participants. The Bonferroni-adjusted phenome-wide significance threshold (<2.88 × 10⁻⁵) is plotted as a red horizontal line. AFR (n = 34,037, minor allele fraction (MAF) 0.82); AMR (n = 28,901, MAF 0.10); EAS (n = 3,255, MAF 0.003); EUR (n = 101,613, MAF 0.007).

The cloud-based Researcher Workbench

All of Us genomic data are available in a secure, access-controlled cloud-based analysis environment: the All of Us Researcher Workbench. Unlike traditional data access models that require per-project approval, access in the Researcher Workbench is governed by a data passport model based on a researcher’s authenticated identity, institutional affiliation, and completion of self-service training and compliance attestation 28 . After gaining access, a researcher may create a new workspace at any time to conduct a study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is regularly audited and made accessible publicly on the All of Us Research Projects Directory. This streamlined access model is guided by the principles that: participants are research partners and maintaining their privacy and data security is paramount; their data should be made as accessible as possible for authorized researchers; and we should continually seek to remove unnecessary barriers to accessing and using All of Us data.

For researchers at institutions with an existing institutional data use agreement, access can be gained as soon as they complete the required verification and compliance steps. As of August 2023, 556 institutions have agreements in place, allowing more than 5,000 approved researchers to actively work on more than 4,400 projects. The median time for a researcher from initial registration to completion of these requirements is 28.6 h (10th percentile: 48 min, 90th percentile: 14.9 days), a fraction of the weeks to months it can take to assemble a project-specific application and have it reviewed by an access board with conventional access models.

Given that the size of the project’s phenotypic and genomic dataset is expected to reach 4.75 PB in 2023, the use of a central data store and cloud analysis tools will save funders an estimated US$16.5 million per year when compared to the typical approach of allowing researchers to download genomic data. Storing one copy per institution of this data at 556 registered institutions would cost about US$1.16 billion per year. By contrast, storing a central cloud copy costs about US$1.14 million per year, a 99.9% saving. Importantly, cloud infrastructure also democratizes data access particularly for researchers who do not have high-performance local compute resources.
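The 99.9% figure follows directly from the two annual storage costs quoted above. A minimal sketch of that arithmetic, using only the numbers stated in the text (the function and constant names are illustrative, not part of any All of Us tooling):

```python
# Annual storage costs quoted in the text (US$ per year).
DISTRIBUTED_COST = 1.16e9  # one copy at each of the 556 registered institutions
CENTRAL_COST = 1.14e6      # a single central cloud copy


def fractional_saving(distributed_cost: float, central_cost: float) -> float:
    """Fraction of the distributed cost avoided by keeping one central copy."""
    return 1.0 - central_cost / distributed_cost


saving = fractional_saving(DISTRIBUTED_COST, CENTRAL_COST)
print(f"{saving:.1%}")  # 99.9%
```

The sketch does not attempt to reproduce the separate US$16.5 million estimate, because the text does not give the inputs behind it.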

Here we present the All of Us Research Program’s approach to generating diverse clinical-grade genomic data at an unprecedented scale. We present the data release of about 245,000 genome sequences as part of a scalable framework that will grow to include genetic information and health data for one million or more people living across the USA. Our observations permit several conclusions.

First, the All of Us programme is making a notable contribution to improving the study of human biology through purposeful inclusion of under-represented individuals at scale 29 , 30 . Of the participants with genomic data in All of Us, 45.92% self-identified as a non-European race or ethnicity. This diversity enabled identification of more than 275 million new genetic variants across the dataset not previously captured by other large-scale genome aggregation efforts with diverse participants that have submitted variation to dbSNP v153, such as NHLBI TOPMed 31 freeze 8 (Extended Data Table 1 ). In contrast to gnomAD, All of Us permits individual-level genotype access with detailed phenotype data for all participants. Furthermore, unlike many genomics resources, All of Us is uniformly consented for general research use and enables researchers to go from initial account creation to individual-level data access in as little as a few hours. The All of Us cohort is significantly more diverse than those of other large contemporary research studies generating WGS data 32 , 33 . This enables a more equitable future for precision medicine (for example, through constructing polygenic risk scores that are appropriately calibrated to diverse populations 34 , 35 as the eMERGE programme has done leveraging All of Us data 36 , 37 ). Developing new tools and regulatory frameworks to enable analyses across multiple biobanks in the cloud to harness the unique strengths of each is an active area of investigation addressed in a companion paper to this work 38 .

Second, the All of Us Researcher Workbench embodies the programme’s design philosophy of open science, reproducible research, equitable access and transparency to researchers and to research participants 26 . Importantly, for research studies, no group of data users should have privileged access to All of Us resources based on anything other than data protection criteria. Although the All of Us Researcher Workbench initially targeted onboarding US academic, health care and non-profit organizations, it has recently expanded to international researchers. We anticipate further genomic and phenotypic data releases at regular intervals with data available to all researcher communities. We also anticipate additional derived data and functionality to be made available, such as reference data, structural variants and a service for array imputation using the All of Us genomic data.

Third, All of Us enables studying human biology at an unprecedented scale. The programmatic goal of sequencing one million or more genomes has required harnessing the output of multiple sequencing centres. Previous work has focused on achieving functional equivalence in data processing and joint calling pipelines 39 . To achieve clinical-grade data equivalence, All of Us required protocol equivalence at both sequencing production level and data processing across the sequencing centres. Furthermore, previous work has demonstrated the value of joint calling at scale 10 , 18 . The new GVS framework developed by the All of Us programme enables joint calling at extreme scales (Code availability). Finally, the provision of data access through cloud-native tools enables scalable and secure access and analysis to researchers while simultaneously enabling the trust of research participants and transparency underlying the All of Us data passport access model.

The clinical-grade sequencing carried out by All of Us enables not only research, but also the return of value to participants through clinically relevant genetic results and health-related traits to those who opt-in to receiving this information. In the years ahead, we anticipate that this partnership with All of Us participants will enable researchers to move beyond large-scale genomic discovery to understanding the consequences of implementing genomic medicine at scale.

The All of Us cohort

All of Us aims to engage a longitudinal cohort of one million or more US participants, with a focus on including populations that have historically been under-represented in biomedical research. Details of the All of Us cohort have been described previously 5 . Briefly, the primary objective is to build a robust research resource that can facilitate the exploration of biological, clinical, social and environmental determinants of health and disease. The programme will collect and curate health-related data and biospecimens, and these data and biospecimens will be made broadly available for research uses. Health data are obtained through the electronic medical record and through participant surveys. Survey templates can be found on our public website: https://www.researchallofus.org/data-tools/survey-explorer/ . Adults 18 years and older who have the capacity to consent and reside in the USA or a US territory at present are eligible. Informed consent for all participants is conducted in person or through an eConsent platform that includes primary consent, HIPAA Authorization for Research use of EHRs and other external health data, and Consent for Return of Genomic Results. The protocol was reviewed by the Institutional Review Board (IRB) of the All of Us Research Program. The All of Us IRB follows the regulations and guidance of the NIH Office for Human Research Protections for all studies, ensuring that the rights and welfare of research participants are overseen and protected uniformly.

Data accessibility through a ‘data passport’

Authorization for access to participant-level data in All of Us is based on a ‘data passport’ model, through which authorized researchers do not need IRB review for each research project. The data passport is required for gaining data access to the Researcher Workbench and for creating workspaces to carry out research projects using All of Us data. At present, data passports are authorized through a six-step process that includes affiliation with an institution that has signed a Data Use and Registration Agreement, account creation, identity verification, completion of ethics training, and attestation to a data user code of conduct. Results are reported in accordance with the All of Us Data and Statistics Dissemination Policy, which, to protect participant privacy, disallows disclosure of group counts under 20 without prior approval 40.
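The small-count rule can be illustrated with a short sketch. This is not the programme's own tooling: the function name and the "<20" display string are assumptions, and because the text does not say how zero counts are handled, the sketch masks every count below 20.

```python
MIN_REPORTABLE_COUNT = 20  # threshold stated in the dissemination policy


def suppress_small_counts(counts: dict[str, int]) -> dict[str, str]:
    """Return display values, masking any group count below the threshold.

    Assumption: every count under 20 (including zero) is masked.
    """
    return {
        group: str(n) if n >= MIN_REPORTABLE_COUNT else "<20"
        for group, n in counts.items()
    }


table = suppress_small_counts({"group_a": 412, "group_b": 7, "group_c": 20})
print(table)  # {'group_a': '412', 'group_b': '<20', 'group_c': '20'}
```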

At present, All of Us gathers EHR data from about 50 health care organizations that are funded to recruit and enrol participants as well as transfer EHR data for those participants who have consented to provide them. Data stewards at each provider organization harmonize their local data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, and then submit it to the All of Us Data and Research Center (DRC) so that it can be linked with other participant data and further curated for research use. OMOP is a common data model standardizing health information from disparate EHRs to common vocabularies and organized into tables according to data domains. EHR data are updated from the recruitment sites and sent to the DRC quarterly. Updated data releases to the research community occur approximately once a year. Supplementary Table 6 outlines the OMOP concepts collected by the DRC quarterly from the recruitment sites.

Biospecimen collection and processing

Participants who consented to participate in All of Us donated fresh whole blood (4 ml EDTA and 10 ml EDTA) as a primary source of DNA. The All of Us Biobank managed by the Mayo Clinic extracted DNA from 4 ml EDTA whole blood, and DNA was stored at −80 °C at an average concentration of 150 ng µl⁻¹. The buffy coat isolated from 10 ml EDTA whole blood has been used for extracting DNA in the case of initial extraction failure or absence of 4 ml EDTA whole blood. The Biobank plated 2.4 µg DNA with a concentration of 60 ng µl⁻¹ in duplicate for array and WGS samples. The samples are distributed to All of Us Genome Centers weekly, and a negative (empty well) control and National Institute of Standards and Technology controls are incorporated every two months for QC purposes.

Genome Center sample receipt, accession and QC

On receipt of DNA sample shipments, the All of Us Genome Centers carry out an inspection of the packaging and sample containers to ensure that sample integrity has not been compromised during transport and to verify that the sample containers correspond to the shipping manifest. QC of the submitted samples also includes DNA quantification, using routine procedures to confirm volume and concentration (Supplementary Table 7 ). Any issues or discrepancies are recorded, and affected samples are put on hold until resolved. Samples that meet quality thresholds are accessioned in the Laboratory Information Management System, and sample aliquots are prepared for library construction processing (for example, normalized with respect to concentration and volume).

WGS library construction, sequencing and primary data QC

The DNA sample is first sheared using a Covaris sonicator and is then size-selected using AMPure XP beads to restrict the range of library insert sizes. Using the PCR Free Kapa HyperPrep library construction kit, enzymatic steps are completed to repair the jagged ends of DNA fragments, add proper A-base segments, and ligate indexed adapter barcode sequences onto samples. Excess adaptors are removed using AMPure XP beads for a final clean-up. Libraries are quantified using quantitative PCR with the Illumina Kapa DNA Quantification Kit and then normalized and pooled for sequencing (Supplementary Table 7 ).

Pooled libraries are loaded on the Illumina NovaSeq 6000 instrument. The data from the initial sequencing run are used to QC individual libraries and to remove non-conforming samples from the pipeline. The data are also used to calibrate the pooling volume of each individual library and re-pool the libraries for additional NovaSeq sequencing to reach an average coverage of 30×.

After demultiplexing, WGS analysis occurs on the Illumina DRAGEN platform. The DRAGEN pipeline consists of highly optimized algorithms for mapping, aligning, sorting, duplicate marking and haplotype variant calling and makes use of platform features such as compression and BCL conversion. Alignment uses the GRCh38dh reference genome. QC data are collected at every stage of the analysis protocol, providing high-resolution metrics required to ensure data consistency for large-scale multiplexing. The DRAGEN pipeline produces a large number of metrics that cover lane, library, flow cell, barcode and sample-level metrics for all runs as well as assessing contamination and mapping quality. The All of Us Genome Centers use these metrics to determine pass or fail for each sample before submitting the CRAM files to the All of Us DRC. For mapping and variant calling, all Genome Centers have harmonized on a set of DRAGEN parameters, which ensures consistency in processing (Supplementary Table 2 ).

Every step through the WGS procedure is rigorously controlled by predefined QC measures. Various control mechanisms and acceptance criteria were established during WGS assay validation. Specific metrics for reviewing and releasing genome data are: mean coverage (threshold of ≥30×), genome coverage (threshold of ≥90% at 20×), coverage of hereditary disease risk genes (threshold of ≥95% at 20×), aligned Q30 bases (threshold of ≥8 × 10¹⁰), contamination (threshold of ≤1%) and concordance to independently processed array data.
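The numeric release thresholds listed above can be written as a simple per-sample check. The following is a sketch only: the class and field names are hypothetical, metrics are assumed to be expressed as fractions rather than percentages, and the array-concordance criterion is left out because the text gives no numeric threshold for it.

```python
from dataclasses import dataclass


@dataclass
class WgsMetrics:
    """Hypothetical container for per-sample WGS QC metrics."""
    mean_coverage: float          # mean depth, e.g. 34.2 for 34.2x
    frac_genome_at_20x: float     # fraction of the genome covered at >=20x
    frac_hdr_genes_at_20x: float  # fraction of hereditary disease risk genes at >=20x
    aligned_q30_bases: float      # number of aligned bases with quality >= Q30
    contamination: float          # estimated contamination fraction


def failed_checks(m: WgsMetrics) -> list[str]:
    """Return the names of the thresholds from the text that the sample misses."""
    checks = {
        "mean_coverage": m.mean_coverage >= 30.0,
        "genome_coverage": m.frac_genome_at_20x >= 0.90,
        "hdr_gene_coverage": m.frac_hdr_genes_at_20x >= 0.95,
        "aligned_q30_bases": m.aligned_q30_bases >= 8e10,
        "contamination": m.contamination <= 0.01,
    }
    return [name for name, passed in checks.items() if not passed]


good = WgsMetrics(34.2, 0.97, 0.99, 9.5e10, 0.002)
bad = WgsMetrics(28.0, 0.97, 0.99, 9.5e10, 0.02)
print(failed_checks(good))  # []
print(failed_checks(bad))   # ['mean_coverage', 'contamination']
```

Returning the list of failed checks rather than a single boolean keeps the reason for a failure visible, which matches the text's emphasis on metrics being reviewed before release.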

Array genotyping

Samples are processed for genotyping at three All of Us Genome Centers (Broad, Johns Hopkins University and University of Washington). DNA samples are received from the Biobank and the process is facilitated by the All of Us genomics workflow described above. All three centres used an identical array product, scanners, resource files and genotype calling software for array processing to reduce batch effects. Each centre has its own Laboratory Information Management System that manages workflow control, sample and reagent tracking, and centre-specific liquid handling robotics.

Samples are processed using the Illumina Global Diversity Array (GDA) with Illumina Infinium LCG chemistry using the automated protocol and scanned on Illumina iSCANs with Automated Array Loaders. Illumina IAAP software converts raw data (IDAT files; 2 per sample) into a single GTC file per sample using the BPM file (defines strand, probe sequences and illumicode address) and the EGT file (defines the relationship between intensities and genotype calls). Files used for this data release are: GDA-8v1-0_A5.bpm, GDA-8v1-0_A1_ClusterFile.egt, gentrain v3, reference hg19 and gencall cutoff 0.15. The GDA assays a total of 1,914,935 variant positions including 1,790,654 single-nucleotide variants, 44,172 indels, 9,935 intensity-only probes for CNV calling, and 70,174 duplicates (same position, different probes). Picard GtcToVcf is used to convert the GTC files to VCF format. Resulting VCF and IDAT files are submitted to the DRC for ingestion and further processing. The VCF file contains assay name, chromosome, position, genotype calls, quality score, raw and normalized intensities, B allele frequency and log R ratio values. Each genome centre runs the GDA under Clinical Laboratory Improvement Amendments-compliant protocols. The GTC files are parsed and metrics are uploaded to in-house Laboratory Information Management Systems for QC review.

At batch level (each set of 96-well plates run together in the laboratory at one time), each genome centre includes positive control samples that are required to have >98% call rate and >99% concordance to existing data to approve release of the batch of data. At the sample level, the call rate and sex are the key QC determinants 41 . Contamination is also measured using BAFRegress 42 and reported out as metadata. Any sample with a call rate below 98% is repeated one time in the laboratory. Genotyped sex is determined by plotting normalized x versus normalized y intensity values for a batch of samples. Any sample discordant with ‘sex at birth’ reported by the All of Us participant is flagged for further detailed review and repeated one time in the laboratory. If several sex-discordant samples are clustered on an array or on a 96-well plate, the entire array or plate will have data production repeated. Samples identified with sex chromosome aneuploidies are also reported back as metadata (XXX, XXY, XYY and so on). A final processing status of ‘pass’, ‘fail’ or ‘abandon’ is determined before release of data to the All of Us DRC. An array sample will pass if the call rate is >98% and the genotyped sex and sex at birth are concordant (or the sex at birth is not applicable). An array sample will fail if the genotyped sex and the sex at birth are discordant. An array sample will have the status of abandon if the call rate is <98% after at least two attempts at the genome centre.
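The sample-level decision rules above can be summarized in a few lines. This sketch makes several assumptions that the text leaves open: call rate is checked before sex concordance, a call rate of exactly 98% is treated as not passing, and a sample with a low call rate after fewer than two attempts is given the interim label "repeat" (a name invented here, not a status used by the Genome Centers). The one-time repeat of sex-discordant samples is not modelled.

```python
from typing import Optional


def array_status(call_rate: float,
                 genotyped_sex: str,
                 sex_at_birth: Optional[str],
                 attempts: int) -> str:
    """Classify an array sample as 'pass', 'fail', 'abandon' or 'repeat'.

    sex_at_birth=None stands for 'not applicable'.
    """
    if call_rate <= 0.98:
        # Low call rate: abandoned only after at least two attempts.
        return "abandon" if attempts >= 2 else "repeat"
    if sex_at_birth is not None and genotyped_sex != sex_at_birth:
        return "fail"
    return "pass"


print(array_status(0.995, "F", "F", attempts=1))   # pass
print(array_status(0.995, "M", "F", attempts=1))   # fail
print(array_status(0.970, "F", "F", attempts=2))   # abandon
print(array_status(0.990, "F", None, attempts=1))  # pass
```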

Data from the arrays are used for participant return of genetic ancestry and non-health-related traits for those who consent, and they are also used to facilitate additional QC of the matched WGS data. Contamination is assessed in the array data to determine whether DNA re-extraction is required before WGS. Re-extraction is prompted by level of contamination combined with consent status for return of results. The arrays are also used to confirm sample identity between the WGS data and the matched array data by assessing concordance at 100 unique sites. To establish concordance, a fingerprint file of these 100 sites is provided to the Genome Centers to assess concordance with the same sites in the WGS data before CRAM submission.

Genomic data curation

As seen in Extended Data Fig. 2 , we generate a joint call set for all WGS samples and make these data available in their entirety and by sample subsets to researchers. A breakdown of the frequencies, stratified by computed ancestries for which we had more than 10,000 participants can be found in Extended Data Fig. 3 . The joint call set process allows us to leverage information across samples to improve QC and increase accuracy.

Single-sample QC

If a sample fails single-sample QC, it is excluded from the release and is not reported in this document. These tests detect sample swaps, cross-individual contamination and sample preparation errors. In some cases, we carry out these tests twice (at both the Genome Center and the DRC), for two reasons: to confirm internal consistency between sites; and to mark samples as passing (or failing) QC on the basis of the research pipeline criteria. The single-sample QC process accepts a higher contamination rate than the clinical pipeline (0.03 for the research pipeline versus 0.01 for the clinical pipeline), but otherwise uses identical thresholds. The list of specific QC processes, passing criteria, error modes addressed and an overview of the results can be found in Supplementary Table 3 .

Joint call set QC

During joint calling, we carry out additional QC steps using information that is available across samples including hard thresholds, population outliers, allele-specific filters, and sensitivity and precision evaluation. Supplementary Table 4 summarizes both the steps that we took and the results obtained for the WGS data. More detailed information about the methods and specific parameters can be found in the All of Us Genomic Research Data Quality Report 36 .

Batch effect analysis

We analysed cross-sequencing centre batch effects in the joint call set. To quantify the batch effect, we calculated Cohen’s d (ref. 43) for four metrics (insertion/deletion ratio, single-nucleotide polymorphism count, indel count and single-nucleotide polymorphism transition/transversion ratio) across the three genome sequencing centres (Baylor College of Medicine, Broad Institute and University of Washington), stratified by computed ancestry and seven regions of the genome (whole genome, high-confidence calling, repetitive, GC content of >0.85, GC content of <0.15, low mappability, the ACMG59 genes and regions of large duplications (>1 kb)). Using random batches as a control set, all comparisons had a Cohen’s d of <0.35. Here we report any Cohen’s d results >0.5, a cut-off that we chose before this analysis and that is conventionally the threshold of a medium effect size 44.

We found an effect in indel counts (Cohen’s d of 0.53) across the entire genome between Broad Institute and University of Washington, but this was driven by repetitive and low-mappability regions. We found no batch effects with Cohen’s d of >0.5 in the ratio metrics or in any metrics in the high-confidence calling, low or high GC content, or ACMG59 regions. A complete list of the batch effects with Cohen’s d of >0.5 is found in Supplementary Table 8.
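For readers unfamiliar with the statistic, the sketch below computes Cohen's d for one metric measured in two batches and applies the 0.5 reporting cut-off. The text does not say which variant of Cohen's d was used; the pooled-standard-deviation form shown here is the classic definition and is an assumption, as are the function names and the toy values.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a: list[float], b: list[float]) -> float:
    """Cohen's d with a pooled standard deviation (sample variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def is_reportable(d: float, cutoff: float = 0.5) -> bool:
    """Flag effects larger in magnitude than the cut-off used in the text."""
    return abs(d) > cutoff


# Toy per-sample metric values for two hypothetical sequencing centres.
centre_1 = [2.0, 4.0, 6.0]
centre_2 = [1.0, 3.0, 5.0]
d = cohens_d(centre_1, centre_2)
print(d, is_reportable(d))  # 0.5 False
```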

Sensitivity and precision evaluation

To determine sensitivity and precision, we included four well-characterized control samples: the National Institute of Standards and Technology Genome in a Bottle samples HG-001, HG-003, HG-004 and HG-005. The samples were sequenced with the same protocol as All of Us. Of note, these samples were not included in data released to researchers. We used the corresponding published set of variant calls for each sample as the ground truth in our sensitivity and precision calculations, restricted to the high-confidence calling region defined by Genome in a Bottle v4.2.1. To be called a true positive, a variant must match the chromosome, position, reference allele, alternate allele and zygosity. In cases of sites with multiple alternative alleles, each alternative allele is considered separately. Sensitivity and precision results are reported in Supplementary Table 5.
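The matching rule described above lends itself to a set-based sketch. Each variant is represented as a (chromosome, position, reference allele, alternate allele, zygosity) tuple, with one tuple per alternate allele at multi-allelic sites. The example assumes both call sets have already been restricted to the high-confidence region; the variant records are invented for illustration.

```python
# A variant is (chromosome, position, ref allele, alt allele, zygosity).
Variant = tuple[str, int, str, str, str]


def sensitivity_precision(calls: set[Variant], truth: set[Variant]) -> tuple[float, float]:
    """Sensitivity = TP / |truth|, precision = TP / |calls|.

    A call is a true positive only if all five fields match a truth record.
    """
    true_positives = len(calls & truth)
    return true_positives / len(truth), true_positives / len(calls)


truth = {
    ("chr1", 100, "A", "G", "het"),
    ("chr1", 200, "C", "T", "hom"),
    ("chr2", 50, "G", "GA", "het"),
    ("chr2", 75, "T", "C", "het"),
}
calls = {
    ("chr1", 100, "A", "G", "het"),
    ("chr1", 200, "C", "T", "het"),  # zygosity mismatch, so not a true positive
    ("chr2", 50, "G", "GA", "het"),
    ("chr3", 10, "A", "C", "het"),   # not in the truth set
}
sens, prec = sensitivity_precision(calls, truth)
print(sens, prec)  # 0.5 0.5
```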

Genetic ancestry inference

We computed categorical ancestry for all WGS samples in All of Us and made these available to researchers. These predictions are also the basis for population allele frequency calculations in the Genomic Variants section of the public Data Browser. We used the high-quality set of sites to determine an ancestry label for each sample. The ancestry categories are based on the same labels used in gnomAD (ref. 18), the Human Genome Diversity Project (HGDP; ref. 45) and 1000 Genomes (ref. 1): African (AFR); Latino/admixed American (AMR); East Asian (EAS); Middle Eastern (MID); European (EUR), composed of Finnish (FIN) and Non-Finnish European (NFE); Other (OTH), not belonging to one of the other ancestries or being an admixture; South Asian (SAS).

We trained a random forest classifier 46 on a training set of autosomal variants from the HGDP and 1000 Genomes samples, obtained from gnomAD 11. We generated the first 16 principal components (PCs) of the training sample genotypes (using the hwe_normalized_pca in Hail) at the high-quality variant sites for use as the feature vector for each training sample. We used the truth labels from the sample metadata, which can be found alongside the VCFs. Note that we do not train the classifier on the samples labelled as Other. Instead, we use the label probabilities (‘confidence’) of the classifier on the other ancestries to decide whether a sample is assigned to Other.
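One way to read the handling of Other is as a confidence cut-off on the classifier output: a sample keeps its most probable label only when that probability is high enough, and is otherwise labelled OTH. The sketch below illustrates this reading. The cut-off value of 0.75 and the function name are assumptions made for illustration; the text does not state the threshold.

```python
def assign_ancestry(probabilities: dict[str, float],
                    min_confidence: float = 0.75) -> str:
    """Pick the most probable trained label, or 'OTH' if confidence is too low.

    min_confidence is a hypothetical value, not taken from the source.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return label if confidence >= min_confidence else "OTH"


confident = {"AFR": 0.93, "AMR": 0.04, "EAS": 0.0, "EUR": 0.02, "MID": 0.0, "SAS": 0.01}
admixed = {"AFR": 0.40, "AMR": 0.35, "EAS": 0.0, "EUR": 0.25, "MID": 0.0, "SAS": 0.0}
print(assign_ancestry(confident))  # AFR
print(assign_ancestry(admixed))    # OTH
```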

To determine the ancestry of All of Us samples, we project the All of Us samples into the PCA space of the training data and apply the classifier. As a proxy for the accuracy of our All of Us predictions, we look at the concordance between the survey results and the predicted ancestry. The concordance between self-reported ethnicity and the ancestry predictions was 87.7%.

PC data from All of Us samples and the HGDP and 1000 Genomes samples were used to compute individual participant genetic ancestry fractions for All of Us samples using the Rye program. Rye uses PC data to carry out rapid and accurate genetic ancestry inference on biobank-scale datasets 47 . HGDP and 1000 Genomes reference samples were used to define a set of six distinct and coherent ancestry groups—African, East Asian, European, Middle Eastern, Latino/admixed American and South Asian—corresponding to participant self-identified race and ethnicity groups. Rye was run on the first 16 PCs, using the defined reference ancestry groups to assign ancestry group fractions to individual All of Us participant samples.

Relatedness

We calculated the kinship score using the Hail pc_relate function and reported any pairs with a kinship score above 0.1. The kinship score is half of the fraction of the genetic material shared (ranges from 0.0 to 0.5). We determined the maximal independent set 41 for related samples. Using the kinship score threshold of 0.1, we identified a maximally unrelated set of 231,442 samples (94%).
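The idea of pruning related pairs down to an unrelated set can be sketched with a simple greedy heuristic: repeatedly drop the sample that still has the most relatives in the kept set until no related pair remains. This is an illustration only, not the algorithm used for the release; a greedy pass yields an independent set but is not guaranteed to yield the largest possible one. Sample identifiers and kinship values are invented.

```python
from collections import defaultdict

KINSHIP_THRESHOLD = 0.1  # pairs above this score are treated as related


def unrelated_subset(samples: list[str],
                     kinship_pairs: list[tuple[str, str, float]]) -> set[str]:
    """Greedy sketch: remove the most-connected sample until no related pair is left."""
    relatives = defaultdict(set)
    for a, b, score in kinship_pairs:
        if score > KINSHIP_THRESHOLD:
            relatives[a].add(b)
            relatives[b].add(a)

    keep = set(samples)
    while True:
        # Sample with the most remaining relatives (ties broken by identifier).
        worst = max(keep, key=lambda s: (len(relatives[s] & keep), s), default=None)
        if worst is None or not (relatives[worst] & keep):
            return keep
        keep.remove(worst)


pairs = [("A", "B", 0.25), ("B", "C", 0.24), ("D", "E", 0.49), ("A", "E", 0.03)]
kept = unrelated_subset(["A", "B", "C", "D", "E"], pairs)
print(sorted(kept))  # ['A', 'C', 'D']
```

In the toy input, B is related to both A and C, so removing B alone resolves two pairs; one of D and E must also go, and the A–E pair is ignored because its score is below the threshold.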

LDL-C common variant GWAS

The phenotypic data were extracted from the Curated Data Repository (CDR, Controlled Tier Dataset v7) in the All of Us Researcher Workbench. The All of Us Cohort Builder and Dataset Builder were used to extract all LDL cholesterol measurements from the Lab and Measurements criteria in EHR data for all participants who have WGS data. The most recent measurements were selected as the phenotype and adjusted for statin use (ref. 19), age and sex. A rank-based inverse normal transformation was applied for this continuous trait to increase power and deflate type I error. Analysis was carried out on the Hail MatrixTable representation of the All of Us WGS joint-called data including removing monomorphic variants, variants with a call rate of <95% and variants with extreme Hardy–Weinberg equilibrium values (P < 10⁻¹⁵). A linear regression was carried out with REGENIE 48 on variants with a minor allele frequency >5%, further adjusting for relatedness to the first five ancestry PCs. The final analysis included 34,924 participants and 8,589,520 variants.
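A rank-based inverse normal transformation replaces each phenotype value with the standard normal quantile of its rank. The sketch below shows one common form. The text does not specify the rank offset or tie handling; the Blom offset (c = 3/8) and average ranks for ties used here are assumptions, and the input values are invented.

```python
from statistics import NormalDist


def inverse_normal_transform(values: list[float], c: float = 3 / 8) -> list[float]:
    """Map values to normal quantiles of their ranks: z = Phi^-1((r - c) / (n - 2c + 1)).

    Ties receive the average of the ranks they span. c = 3/8 (Blom) is an assumption.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


z = inverse_normal_transform([3.1, 1.2, 2.5])
print(z)  # middle-ranked value maps to 0; the other two are symmetric about it
```

Because the output depends only on ranks, extreme laboratory values cannot dominate the regression, which is the usual motivation for applying the transformation to a skewed quantitative trait.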

Genotype-by-phenotype replication

We tested replication rates of known phenotype–genotype associations in three of the four largest populations: EUR, AFR and EAS. The AMR population was not included because it has no registered GWAS. This method is a conceptual extension of the original GWAS × phenome-wide association study, which replicated 66% of powered associations in a single EHR-linked biobank 49. The PGRM is an expansion of this work by Bastarache et al., based on associations in the GWAS catalogue 50 in June 2020 (ref. 51). After directly matching the Experimental Factor Ontology terms to phecodes, the authors identified 8,085 unique loci and 170 unique phecodes that compose the PGRM. They showed replication rates in several EHR-linked biobanks ranging from 76% to 85%. For this analysis, we used the EUR- and AFR-based maps, considering only catalogue associations that were genome-wide significant (P < 5 × 10⁻⁸).

The main tools used were the Python package Hail for data extraction, plink for genomic associations, and the R packages PheWAS and pgrm for further analysis and visualization. The phenotypes, participant-reported sex at birth, and year of birth were extracted from the All of Us CDR (Controlled Tier Dataset v7). These phenotypes were then loaded into a plink-compatible format using the PheWAS package, and related samples were removed by sub-setting to the maximally unrelated dataset ( n  = 231,442). Only samples with EHR data were kept, filtered by selected loci, annotated with demographic and phenotypic information extracted from the CDR and ancestry prediction information provided by All of Us, ultimately resulting in 181,345 participants for downstream analysis. The variants in the PGRM were filtered by a minimum population-specific allele frequency of >1% or population-specific allele count of >100, leaving 4,986 variants. Results for which there were at least 20 cases in the ancestry group were included. Then, a series of Firth logistic regression tests with phecodes as the outcome and variants as the predictor were carried out, adjusting for age, sex (for non-sex-specific phenotypes) and the first three genomic PC features as covariates. The PGRM was annotated with power calculations based on the case counts and reported allele frequencies. Power of 80% or greater was considered powered for this analysis.
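The powered replication rate is the share of associations with at least 80% power that replicate. The sketch below shows the bookkeeping. The replication criterion used here (nominal P < 0.05 with the same direction of effect as the catalogue) is an assumption, since the text does not state it; the class, field names and example records are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Association:
    """Hypothetical record for one PGRM association tested in the cohort."""
    power: float          # estimated power to replicate, from case counts and allele frequency
    p_value: float        # P value from the regression in the cohort
    same_direction: bool  # effect direction agrees with the catalogue


def powered_replication_rate(associations: list[Association],
                             min_power: float = 0.8,
                             alpha: float = 0.05) -> Optional[tuple[int, int, float]]:
    """Return (replicated, powered, rate), or None if nothing is powered.

    alpha and the direction requirement are assumed replication criteria.
    """
    powered = [a for a in associations if a.power >= min_power]
    if not powered:
        return None
    replicated = sum(a.p_value < alpha and a.same_direction for a in powered)
    return replicated, len(powered), replicated / len(powered)


tested = [
    Association(0.90, 0.01, True),
    Association(0.85, 0.20, True),    # powered but not significant
    Association(0.95, 0.001, False),  # significant but opposite direction
    Association(0.50, 0.0001, True),  # not powered, so excluded from the rate
    Association(0.80, 0.04, True),
]
print(powered_replication_rate(tested))  # (2, 4, 0.5)
```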

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The All of Us Research Hub has a tiered data access data passport model with three data access tiers. The Public Tier dataset contains only aggregate data with identifiers removed. These data are available to the public through Data Snapshots ( https://www.researchallofus.org/data-tools/data-snapshots/ ) and the public Data Browser ( https://databrowser.researchallofus.org/ ). The Registered Tier curated dataset contains individual-level data, available only to approved researchers on the Researcher Workbench. At present, the Registered Tier includes data from EHRs, wearables and surveys, as well as physical measurements taken at the time of participant enrolment. The Controlled Tier dataset contains all data in the Registered Tier and additionally genomic data in the form of WGS and genotyping arrays, previously suppressed demographic data fields from EHRs and surveys, and unshifted dates of events. At present, Registered Tier and Controlled Tier data are available to researchers at academic institutions, non-profit institutions, and both non-profit and for-profit health care institutions. Work is underway to begin extending access to additional audiences, including industry-affiliated researchers. Researchers have the option to register for Registered Tier and/or Controlled Tier access by completing the All of Us Researcher Workbench access process, which includes identity verification and All of Us-specific training in research involving human participants ( https://www.researchallofus.org/register/ ). Researchers may create a new workspace at any time to conduct any research study, provided that they comply with all Data Use Policies and self-declare their research purpose. This information is made accessible publicly on the All of Us Research Projects Directory at https://allofus.nih.gov/protecting-data-and-privacy/research-projects-all-us-data .

Code availability

The GVS code is available at https://github.com/broadinstitute/gatk/tree/ah_var_store/scripts/variantstore . The LDL GWAS pipeline is available as a demonstration project in the Featured Workspace Library on the Researcher Workbench ( https://workbench.researchallofus.org/workspaces/aou-rw-5981f9dc/aouldlgwasregeniedsubctv6duplicate/notebooks ).

The 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526 , 68–74 (2015).

Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577 , 179–189 (2020).

Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570 , 514–518 (2019).

Lewis, A. C. F. et al. Getting genetic ancestry right for science and society. Science 376 , 250–252 (2022).

All of Us Program Investigators. The “All of Us” Research Program. N. Engl. J. Med. 381 , 668–676 (2019).

Ramirez, A. H., Gebo, K. A. & Harris, P. A. Progress with the All of Us Research Program: opening access for researchers. JAMA 325 , 2441–2442 (2021).

Ramirez, A. H. et al. The All of Us Research Program: data quality, utility, and diversity. Patterns 3 , 100570 (2022).

Overhage, J. M., Ryan, P. B., Reich, C. G., Hartzema, A. G. & Stang, P. E. Validation of a common data model for active safety surveillance research. J. Am. Med. Inform. Assoc. 19 , 54–60 (2012).

Venner, E. et al. Whole-genome sequencing as an investigational device for return of hereditary disease risk and pharmacogenomic results as part of the All of Us Research Program. Genome Med. 14 , 34 (2022).

Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536 , 285–291 (2016).

Tiao, G. & Goodrich, J. gnomAD v3.1 New Content, Methods, Annotations, and Data Availability ; https://gnomad.broadinstitute.org/news/2020-10-gnomad-v3-1-new-content-methods-annotations-and-data-availability/ .

Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625 , 92–100 (2024).

Zook, J. M. et al. An open resource for accurately benchmarking small variant and reference calls. Nat. Biotechnol. 37 , 561–566 (2019).

Krusche, P. et al. Best practices for benchmarking germline small-variant calls in human genomes. Nat. Biotechnol. 37 , 555–560 (2019).

Stromberg, M. et al. Nirvana: clinical grade variant annotator. In Proc. 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics 596 (Association for Computing Machinery, 2017).

Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29 , 308–311 (2001).

Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. https://doi.org/10.1038/s42003-023-05708-y (2024).

Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581 , 434–443 (2020).

Selvaraj, M. S. et al. Whole genome sequence analysis of blood lipid levels in >66,000 individuals. Nat. Commun. 13 , 5995 (2022).

Wang, X. et al. Common and rare variants associated with cardiometabolic traits across 98,622 whole-genome sequences in the All of Us research program. J. Hum. Genet. 68 , 565–570 (2023).

Bastarache, L. et al. The phenotype-genotype reference map: improving biobank data science through replication. Am. J. Hum. Genet. 110 , 1522–1533 (2023).

Bianchi, D. W. et al. The All of Us Research Program is an opportunity to enhance the diversity of US biomedical research. Nat. Med. https://doi.org/10.1038/s41591-023-02744-3 (2024).

Van Driest, S. L. et al. Association between a common, benign genotype and unnecessary bone marrow biopsies among African American patients. JAMA Intern. Med. 181 , 1100–1105 (2021).

Chen, M.-H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182 , 1198–1213 (2020).

Chiou, J. et al. Interpreting type 1 diabetes risk with genetics and single-cell epigenomics. Nature 594 , 398–402 (2021).

Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47 , 898–905 (2015).

Grant, S. F. A. et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat. Genet. 38 , 320–323 (2006).

All of Us Research Program. Framework for Access to All of Us Data Resources v1.1 (2021); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/data&tools/data-access-use/AoU_Data_Access_Framework_508.pdf .

Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177 , 58–69 (2019).

Mapes, B. M. et al. Diversity and inclusion for the All of Us research program: A scoping review. PLoS ONE 15 , e0234962 (2020).

Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590 , 290–299 (2021).

Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562 , 203–209 (2018).

Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607 , 732–740 (2022).

Kurniansyah, N. et al. Evaluating the use of blood pressure polygenic risk scores across race/ethnic background groups. Nat. Commun. 14 , 3202 (2023).

Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55 , 549–558 (2023).

Linder, J. E. et al. Returning integrated genomic risk and clinical recommendations: the eMERGE study. Genet. Med. 25 , 100006 (2023).

Lennon, N. J. et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat. Med. https://doi.org/10.1038/s41591-024-02796-z (2024).

Deflaux, N. et al. Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis. Nat. Commun. 14 , 5419 (2023).

Regier, A. A. et al. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun. 9 , 4038 (2018).

All of Us Research Program. Data and Statistics Dissemination Policy (2020); https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/2020/05/AoU_Policy_Data_and_Statistics_Dissemination_508.pdf .

Laurie, C. C. et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet. Epidemiol. 34 , 591–602 (2010).

Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91 , 839–848 (2012).

Cohen, J. Statistical Power Analysis for the Behavioral Sciences (Routledge, 2013).

Andrade, C. Mean difference, standardized mean difference (SMD), and their use in meta-analysis. J. Clin. Psychiatry 81 , 20f13681 (2020).

Cavalli-Sforza, L. L. The Human Genome Diversity Project: past, present and future. Nat. Rev. Genet. 6 , 333–340 (2005).

Ho, T. K. Random decision forests. In Proc. 3rd International Conference on Document Analysis and Recognition (IEEE Computer Society Press, 2002).

Conley, A. B. et al. Rye: genetic ancestry inference at biobank scale. Nucleic Acids Res. 51 , e44 (2023).

Mbatchou, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 53 , 1097–1103 (2021).

Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31 , 1102–1111 (2013).

Buniello, A. et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47 , D1005–D1012 (2019).

Bastarache, L. et al. The Phenotype-Genotype Reference Map: improving biobank data science through replication. Am. J. Hum. Genet. 110 , 1522–1533 (2023).

Acknowledgements

The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers (OT2 OD026549; OT2 OD026554; OT2 OD026557; OT2 OD026556; OT2 OD026550; OT2 OD026552; OT2 OD026553; OT2 OD026548; OT2 OD026551; OT2 OD026555); Interagency agreement AOD 16037; Federally Qualified Health Centers HHSN 263201600085U; Data and Research Center: U2C OD023196; Genome Centers (OT2 OD002748; OT2 OD002750; OT2 OD002751); Biobank: U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: U24 OD023163; Communications and Engagement: OT2 OD023205; OT2 OD023206; and Community Partners (OT2 OD025277; OT2 OD025315; OT2 OD025337; OT2 OD025276). In addition, the All of Us Research Program would not be possible without the partnership of its participants. All of Us and the All of Us logo are service marks of the US Department of Health and Human Services. E.E.E. is an investigator of the Howard Hughes Medical Institute. We acknowledge the foundational contributions of our friend and colleague, the late Deborah A. Nickerson. Debbie’s years of insightful contributions throughout the formation of the All of Us genomics programme are permanently imprinted, and she shares credit for all of the successes of this programme.

Author information

Authors and affiliations

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Alexander G. Bick & Henry R. Condon

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA

Ginger A. Metcalf, Eric Boerwinkle, Richard A. Gibbs, Donna M. Muzny, Eric Venner, Kimberly Walker, Jianhong Hu, Harsha Doddapaneni, Christie L. Kovar, Mullai Murugan, Shannon Dugan & Ziad Khan

Vanderbilt Institute of Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA

Kelsey R. Mayo, Jodell E. Linder, Melissa Basford, Ashley Able, Ashley E. Green, Robert J. Carroll, Jennifer Zhang & Yuanyuan Wang

Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA

Lee Lichtenstein, Anthony Philippakis, Sophie Schwartz, M. Morgan T. Aster, Kristian Cibulskis, Andrea Haessly, Rebecca Asch, Aurora Cremer, Kylee Degatano, Akum Shergill, Laura D. Gauthier, Samuel K. Lee, Aaron Hatcher, George B. Grant, Genevieve R. Brandt, Miguel Covarrubias, Eric Banks & Wail Baalawi

Verily, South San Francisco, CA, USA

Shimon Rura, David Glazer, Moira K. Dillon & C. H. Albach

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA

Robert J. Carroll, Paul A. Harris & Dan M. Roden

All of Us Research Program, National Institutes of Health, Bethesda, MD, USA

Anjene Musick, Andrea H. Ramirez, Sokny Lim, Siddhartha Nambiar, Bradley Ozenberger, Anastasia L. Wise, Chris Lunt, Geoffrey S. Ginsburg & Joshua C. Denny

School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA

I. King Jordan, Shashwat Deepali Nagar & Shivam Sharma

Neuroscience Institute, Institute of Translational Genomic Medicine, Morehouse School of Medicine, Atlanta, GA, USA

Robert Meller

Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA

Mine S. Cicek & Stephen N. Thibodeau

Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA

Kimberly F. Doheny, Michelle Z. Mawhinney, Sean M. L. Griffith, Elvin Hsu, Hua Ling & Marcia K. Adams

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA

Evan E. Eichler, Joshua D. Smith, Christian D. Frazar, Colleen P. Davis, Karynne E. Patterson, Marsha M. Wheeler, Sean McGee, Mitzi L. Murray, Valeria Vasta, Dru Leistritz, Matthew A. Richardson, Aparna Radhakrishnan & Brenna W. Ehmen

Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA

Evan E. Eichler

Broad Institute of MIT and Harvard, Cambridge, MA, USA

Stacey Gabriel, Heidi L. Rehm, Niall J. Lennon, Christina Austin-Tse, Eric Banks, Michael Gatzen, Namrata Gupta, Katie Larsson, Sheli McDonough, Steven M. Harrison, Christopher Kachulis, Matthew S. Lebo, Seung Hoan Choi & Xin Wang

Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA

Gail P. Jarvik & Elisabeth A. Rosenthal

Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Dan M. Roden

Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA

Center for Individualized Medicine, Biorepository Program, Mayo Clinic, Rochester, MN, USA

Stephen N. Thibodeau, Ashley L. Blegen, Samantha J. Wirkus, Victoria A. Wagner, Jeffrey G. Meyer & Mine S. Cicek

Color Health, Burlingame, CA, USA

Scott Topper, Cynthia L. Neben, Marcie Steeves & Alicia Y. Zhou

School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA

Eric Boerwinkle

Laboratory for Molecular Medicine, Massachusetts General Brigham Personalized Medicine, Cambridge, MA, USA

Christina Austin-Tse, Emma Henricks & Matthew S. Lebo

Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA, USA

Christina M. Lockwood, Brian H. Shirts, Colin C. Pritchard, Jillian G. Buchan & Niklas Krumm

Manuscript Writing Group

Alexander G. Bick, Ginger A. Metcalf, Kelsey R. Mayo, Lee Lichtenstein, Shimon Rura, Robert J. Carroll, Anjene Musick, Jodell E. Linder, I. King Jordan, Shashwat Deepali Nagar, Shivam Sharma & Robert Meller

All of Us Research Program Genomics Principal Investigators

Melissa Basford, Eric Boerwinkle, Mine S. Cicek, Kimberly F. Doheny, Evan E. Eichler, Stacey Gabriel, Richard A. Gibbs, David Glazer, Paul A. Harris, Gail P. Jarvik, Anthony Philippakis, Heidi L. Rehm, Dan M. Roden, Stephen N. Thibodeau & Scott Topper

Biobank, Mayo

Ashley L. Blegen, Samantha J. Wirkus, Victoria A. Wagner, Jeffrey G. Meyer & Stephen N. Thibodeau

Genome Center: Baylor-Hopkins Clinical Genome Center

Donna M. Muzny, Eric Venner, Michelle Z. Mawhinney, Sean M. L. Griffith, Elvin Hsu, Marcia K. Adams, Kimberly Walker, Jianhong Hu, Harsha Doddapaneni, Christie L. Kovar, Mullai Murugan, Shannon Dugan, Ziad Khan & Richard A. Gibbs

Genome Center: Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine

Niall J. Lennon, Christina Austin-Tse, Eric Banks, Michael Gatzen, Namrata Gupta, Emma Henricks, Katie Larsson, Sheli McDonough, Steven M. Harrison, Christopher Kachulis, Matthew S. Lebo, Cynthia L. Neben, Marcie Steeves, Alicia Y. Zhou, Scott Topper & Stacey Gabriel

Genome Center: University of Washington

Gail P. Jarvik, Joshua D. Smith, Christian D. Frazar, Colleen P. Davis, Karynne E. Patterson, Marsha M. Wheeler, Sean McGee, Christina M. Lockwood, Brian H. Shirts, Colin C. Pritchard, Mitzi L. Murray, Valeria Vasta, Dru Leistritz, Matthew A. Richardson, Jillian G. Buchan, Aparna Radhakrishnan, Niklas Krumm & Brenna W. Ehmen

Data and Research Center

Lee Lichtenstein, Sophie Schwartz, M. Morgan T. Aster, Kristian Cibulskis, Andrea Haessly, Rebecca Asch, Aurora Cremer, Kylee Degatano, Akum Shergill, Laura D. Gauthier, Samuel K. Lee, Aaron Hatcher, George B. Grant, Genevieve R. Brandt, Miguel Covarrubias, Melissa Basford, Alexander G. Bick, Ashley Able, Ashley E. Green, Jennifer Zhang, Henry R. Condon, Yuanyuan Wang, Moira K. Dillon, C. H. Albach, Wail Baalawi & Dan M. Roden

All of Us Research Demonstration Project Teams

Seung Hoan Choi, Elisabeth A. Rosenthal

NIH All of Us Research Program Staff

Andrea H. Ramirez, Sokny Lim, Siddhartha Nambiar, Bradley Ozenberger, Anastasia L. Wise, Chris Lunt, Geoffrey S. Ginsburg & Joshua C. Denny

Contributions

The All of Us Biobank (Mayo Clinic) collected, stored and plated participant biospecimens. The All of Us Genome Centers (Baylor-Hopkins Clinical Genome Center; Broad, Color, and Mass General Brigham Laboratory for Molecular Medicine; and University of Washington School of Medicine) generated and QCed the whole-genomic data. The All of Us Data and Research Center (Vanderbilt University Medical Center, Broad Institute of MIT and Harvard, and Verily) generated the WGS joint call set, carried out quality assurance and QC analyses and developed the Researcher Workbench. All of Us Research Demonstration Project Teams contributed analyses. The other All of Us Genomics Investigators and NIH All of Us Research Program Staff provided crucial programmatic support. Members of the manuscript writing group (A.G.B., G.A.M., K.R.M., L.L., S.R., R.J.C. and A.M.) wrote the first draft of this manuscript, which was revised with contributions and feedback from all authors.

Corresponding author

Correspondence to Alexander G. Bick.

Ethics declarations

Competing interests

D.M.M., G.A.M., E.V., K.W., J.H., H.D., C.L.K., M.M., S.D., Z.K., E. Boerwinkle and R.A.G. declare that Baylor Genetics is a Baylor College of Medicine affiliate that derives revenue from genetic testing. Eric Venner is affiliated with Codified Genomics, a provider of genetic interpretation. E.E.E. is a scientific advisory board member of Variant Bio, Inc. A.G.B. is a scientific advisory board member of TenSixteen Bio. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature thanks Timothy Frayling and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Historic availability of EHR records in All of Us v7 Controlled Tier Curated Data Repository (n = 413,457).

For better visibility, the plot shows growth starting in 2010.

Extended Data Fig. 2 Overview of the Genomic Data Curation Pipeline for WGS samples.

The Data and Research Center (DRC) performs additional single sample quality control (QC) on the data as it arrives from the Genome Centers. The variants from samples that pass this QC are loaded into the Genomic Variant Store (GVS), where we jointly call the variants and apply additional QC. We apply a joint call set QC process, which is stored with the call set. The entire joint call set is rendered as a Hail Variant Dataset (VDS), which can be accessed from the analysis notebooks in the Researcher Workbench. Subsections of the genome are extracted from the VDS and rendered in different formats with all participants. Auxiliary data can also be accessed through the Researcher Workbench. This includes variant functional annotations, joint call set QC results, predicted ancestry, and relatedness. Auxiliary data are derived from GVS (arrow not shown) and the VDS. The Cohort Builder directly queries GVS when researchers request genomic data for subsets of samples. Aligned reads, as cram files, are available in the Researcher Workbench (not shown). The graphics of the dish, gene and computer and the All of Us logo are reproduced with permission of the National Institutes of Health’s All of Us Research Program.

Extended Data Fig. 3 Proportion of allelic frequencies (AF), stratified by computed ancestry with over 10,000 participants.

Bar counts are not cumulative (eg, “pop AF < 0.01” does not include “pop AF < 0.001”).

Extended Data Fig. 4 Distribution of pathogenic, and likely pathogenic ClinVar variants.

Stratified by ancestry filtered to only those variants that are found in allele count (AC) < 40 individuals for 245,388 short read WGS samples.

Extended Data Fig. 5 Ancestry specific HLA-DQB1 ( rs9273363 ) locus associations in 231,442 unrelated individuals.

Phenome-wide (PheWAS) associations highlight ancestry specific consequences across ancestries.

Extended Data Fig. 6 Ancestry specific TCF7L2 ( rs7903146 ) locus associations in 231,442 unrelated individuals.

Phenome-wide (PheWAS) associations highlight diabetic consequences across ancestries.

Supplementary information

Supplementary information.

Supplementary Figs. 1–7, Tables 1–8 and Note.

Reporting Summary

Supplementary dataset 1.

Associations of ACKR1, HLA-DQB1 and TCF7L2 loci with all Phecodes stratified by genetic ancestry.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

The All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature (2024). https://doi.org/10.1038/s41586-023-06957-x

Received : 22 July 2022

Accepted : 08 December 2023

Published : 19 February 2024

DOI : https://doi.org/10.1038/s41586-023-06957-x

A once-ignored community of science sleuths now has the research community on its heels

A community of sleuths hunting for errors in scientific research have sent shockwaves through some of the most prestigious research institutions in the world — and the science community at large.

High-profile cases of alleged image manipulations in papers authored by the former president at Stanford University and leaders at the Dana-Farber Cancer Institute have made national media headlines, and some top science leaders think this could be just the start.

“At the rate things are going, we expect another one of these to come up every few weeks,” said Holden Thorp, the editor-in-chief of the Science family of scientific journals, whose namesake publication is one of the two most influential in the field. 

The sleuths argue their work is necessary to correct the scientific record and prevent generations of researchers from pursuing dead-end topics because of flawed papers. And some scientists say it’s time for universities and academic publishers to reform how they address flawed research. 

“I understand why the sleuths finding these things are so pissed off,” said Michael Eisen, a biologist, the former editor of the journal eLife and a prominent voice of reform in scientific publishing. “Everybody — the author, the journal, the institution, everybody — is incentivized to minimize the importance of these things.” 

For about a decade, science sleuths unearthed widespread problems in scientific images in published papers, publishing concerns online but receiving little attention. 

That began to change last summer after then-Stanford President Marc Tessier-Lavigne, who is a neuroscientist, stepped down from his post after scrutiny of alleged image manipulations in studies he helped author and a report criticizing his laboratory culture. Tessier-Lavigne was not found to have engaged in misconduct himself, but members of his lab appeared to manipulate images in dubious ways, a report from a scientific panel hired to examine the allegations said. 

In January, a scathing post from a blogger exposed questionable work from top leaders at the Dana-Farber Cancer Institute, which subsequently asked journals to retract six articles and issue corrections for dozens more.

In a resignation statement, Tessier-Lavigne noted that the panel did not find that he knew of misconduct and that he never submitted papers he didn’t think were accurate. In a statement from its research integrity officer, Dana-Farber said it took decisive action to correct the scientific record and that image discrepancies were not necessarily evidence an author sought to deceive.

“We’re certainly living through a moment — a public awareness — that really hit an inflection when the Marc Tessier-Lavigne matter happened and has continued steadily since then, with Dana-Farber being the latest,” Thorp said. 

Now, the long-standing problem is in the national spotlight, and new artificial intelligence tools are only making it easier to spot problems that range from decades-old errors and sloppy science to images enhanced unethically in photo-editing software.  

This heightened scrutiny is reshaping how some publishers are operating. And it’s pushing universities, journals and researchers to reckon with new technology, a potential backlog of undiscovered errors and how to be more transparent when problems are identified. 

This comes at a fraught time in academic halls. Bill Ackman, a hedge fund manager, in a post on X last month discussed weaponizing artificial intelligence to identify plagiarism by leaders at top-flight universities with which he has had ideological differences, raising questions about political motivations in plagiarism investigations. More broadly, public trust in scientists and science has declined steadily in recent years, according to the Pew Research Center.

Eisen said he didn’t think sleuths’ concerns over scientific images had veered into “McCarthyist” territory.

“I think they’ve been targeting a very specific type of problem in the literature, and they’re right — it’s bad,” Eisen said. 

Scientific publishing builds the base of what scientists understand about their disciplines, and it’s the primary way that researchers with new findings outline their work for colleagues. Before publication, scientific journals consider submissions and send them to outside researchers in the field for vetting and to spot errors or faulty reasoning, which is called peer review. Journal editors will review studies for plagiarism and for copy edits before they’re published. 

That system is not perfect and still relies on good-faith efforts by researchers to not manipulate their findings.

Over the past 15 years, scientists have grown increasingly concerned that some researchers were digitally altering images in their papers to skew or emphasize results. Discovering irregularities in images, typically of experiments involving mice, gels or blots, has become a larger priority of scientific journals’ work.

Jana Christopher, an expert on scientific images who works for the Federation of European Biochemical Societies and its journals, said the field of image integrity screening has grown rapidly since she began working in it about 15 years ago. 

At the time, “nobody was doing this and people were kind of in denial about research fraud,” Christopher said. “The common view was that it was very rare and every now and then you would find someone who fudged their results.” 

Today, scientific journals have entire teams dedicated to dealing with images and trying to ensure their accuracy. More papers are being retracted than ever, with a record 10,000-plus pulled last year, according to a Nature analysis.

A loose group of scientific sleuths have added outside pressure. Sleuths often discover and flag errors or potential manipulations on the online forum PubPeer. Some sleuths receive little or no payment or public recognition for their work.

“To some extent, there is a vigilantism around it,” Eisen said. 

An analysis of comments on more than 24,000 articles posted on PubPeer found that more than 62% of those comments were related to image manipulation.

For years, sleuths relied on sharp eyes, keen pattern recognition and an understanding of photo manipulation tools. In the past few years, rapidly developing artificial intelligence tools, which can scan papers for irregularities, are supercharging their work. 

Now, scientific journals are adopting similar technology to try to prevent errors from reaching publication. In January, Science announced that it was using an artificial intelligence tool called Proofig to scan papers that were being edited and peer-reviewed for publication. 

Thorp, the Science editor-in-chief, said the family of six journals added the tool “quietly” into its workflow about six months before that January announcement. Before, the journal was reliant on eye-checks to catch these types of problems. 

Thorp said Proofig identified several papers late in the editorial process that were not published because of problematic images that were difficult to explain, as well as other instances in which authors had “logical explanations” for issues they corrected before publication.

“The serious errors that cause us not to publish a paper are less than 1%,” Thorp said.

In a statement, Chris Graf, the research integrity director at the publishing company Springer Nature, said his company is developing and testing “in-house AI image integrity software” to check for image duplications. Graf’s research integrity unit currently uses Proofig to help assess articles if concerns are raised after publication. 

Graf said processes varied across its journals, but that some Springer Nature publications manually check images for manipulations with Adobe Photoshop tools and look for inconsistencies in raw data for experiments that visualize cell components or common scientific experiments.

“While the AI-based tools are helpful in speeding up and scaling up the investigations, we still consider the human element of all our investigations to be crucial,” Graf said, adding that image recognition software is not perfect and that human expertise is required to protect against false positives and negatives. 

No tool will catch every mistake or cheat. 

“There’s a lot of human beings in that process. We’re never going to catch everything,” Thorp said. “We need to get much better at managing this when it happens, as journals, institutions and authors.”

Many science sleuths had grown frustrated after their concerns seemed to be ignored or as investigations trickled along slowly and without a public resolution.  

Sholto David, who publicly exposed concerns about Dana-Farber research in a blog post, said he largely “gave up” on writing letters to journal editors about errors he discovered because their responses were so insufficient. 

Elisabeth Bik, a microbiologist and longtime image sleuth, said she has frequently flagged image problems and “nothing happens.” 

Leaving public comments questioning research figures on PubPeer can start a public conversation over questionable research, but authors and research institutions often don’t respond directly to the online critiques. 

While journals can issue corrections or retractions, it’s typically a research institution’s or a university’s responsibility to investigate cases. When cases involve biomedical research supported by federal funding, the federal Office of Research Integrity can investigate. 

Thorp said the institutions need to move more swiftly to take responsibility when errors are discovered and speak plainly and publicly about what happened to earn the public’s trust.  

“Universities are so slow at responding and so slow at running through their processes, and the longer that goes on, the more damage that goes on,” Thorp said. “We don’t know what happened if instead of launching this investigation Stanford said, ‘These papers are wrong. We’re going to retract them. It’s our responsibility. But for now, we’re taking the blame and owning up to this.’” 

Some scientists worry that image concerns are only scratching the surface of science’s integrity issues — problems in images are simply much easier to spot than data errors in spreadsheets. 

And while policing bad papers and seeking accountability is important, some scientists think those measures will be treating symptoms of the larger problem: a culture that rewards the careers of those who publish the most exciting results, rather than the ones that hold up over time. 

“The scientific culture itself does not say we care about being right; it says we care about getting splashy papers,” Eisen said. 

Evan Bush is a science reporter for NBC News.

Shots - Health News

Reproductive rights in America

Research at the heart of a federal case against the abortion pill has been retracted.

Selena Simmons-Duffin

The Supreme Court will hear the case against the abortion pill mifepristone on March 26. It's part of a two-drug regimen with misoprostol for abortions in the first 10 weeks of pregnancy. Anna Moneymaker/Getty Images

A scientific paper that raised concerns about the safety of the abortion pill mifepristone was retracted by its publisher this week. The study was cited three times by a federal judge who ruled against mifepristone last spring. That case, which could limit access to mifepristone throughout the country, will soon be heard in the Supreme Court.

The now retracted study used Medicaid claims data to track E.R. visits by patients in the month after having an abortion. The study found a much higher rate of complications than similar studies that have examined abortion safety.

Sage, the publisher of the journal, retracted the study on Monday along with two other papers, explaining in a statement that "expert reviewers found that the studies demonstrate a lack of scientific rigor that invalidates or renders unreliable the authors' conclusions."

It also noted that most of the authors on the paper worked for the Charlotte Lozier Institute, the research arm of anti-abortion lobbying group Susan B. Anthony Pro-Life America, and that one of the original peer reviewers had also worked for the Lozier Institute.

The Sage journal, Health Services Research and Managerial Epidemiology, published all three research articles, which are still available online along with the retraction notice. In an email to NPR, a spokesperson for Sage wrote that the process leading to the retractions "was thorough, fair, and careful."

The lead author on the paper, James Studnicki, fiercely defends his work. "Sage is targeting us because we have been successful for a long period of time," he says on a video posted online this week. He asserts that the retraction has "nothing to do with real science and has everything to do with a political assassination of science."

He says that because the study's findings have been cited in legal cases like the one challenging the abortion pill, "we have become visible – people are quoting us. And for that reason, we are dangerous, and for that reason, they want to cancel our work," Studnicki says in the video.

In an email to NPR, a spokesperson for the Charlotte Lozier Institute said that they "will be taking appropriate legal action."

Role in abortion pill legal case

Anti-abortion rights groups, including a group of doctors, sued the federal Food and Drug Administration in 2022 over the approval of mifepristone, which is part of a two-drug regimen used in most medication abortions. The pill has been on the market for over 20 years, and is used in more than half of abortions nationally. The FDA stands by its research that finds adverse events from mifepristone are extremely rare.

Judge Matthew Kacsmaryk, the district court judge who initially ruled on the case, pointed to the now-retracted study to support the idea that the anti-abortion rights physicians suing the FDA had the right to do so. "The associations' members have standing because they allege adverse events from chemical abortion drugs can overwhelm the medical system and place 'enormous pressure and stress' on doctors during emergencies and complications," he wrote in his decision, citing Studnicki. He ruled that mifepristone should be pulled from the market nationwide, although his decision never took effect.

Matthew Kacsmaryk at his confirmation hearing for the federal bench in 2017. AP

Kacsmaryk is a Trump appointee who was a vocal abortion opponent before becoming a federal judge.

"I don't think he would view the retraction as delegitimizing the research," says Mary Ziegler, a law professor and expert on the legal history of abortion at U.C. Davis. "There's been so much polarization about what the reality of abortion is on the right that I'm not sure how much a retraction would affect his reasoning."

Ziegler also doubts the retractions will alter much in the Supreme Court case, given its conservative majority. "We've already seen, when it comes to abortion, that the court has a propensity to look at the views of experts that support the results it wants," she says. The decision that overturned Roe v. Wade is an example, she says. "The majority [opinion] relied pretty much exclusively on scholars with some ties to pro-life activism and didn't really cite anybody else even or really even acknowledge that there was a majority scholarly position or even that there was meaningful disagreement on the subject."

In the mifepristone case, "there's a lot of supposition and speculation" in the argument about who has standing to sue, she explains. "There's a probability that people will take mifepristone and then there's a probability that they'll get complications and then there's a probability that they'll get treatment in the E.R. and then there's a probability that they'll encounter physicians with certain objections to mifepristone. So the question is, if this [retraction] knocks out one leg of the stool, does that somehow affect how the court is going to view standing? I imagine not."

It's impossible to know who will win the Supreme Court case, but Ziegler thinks that this retraction probably won't sway the outcome either way. "If the court is skeptical of standing because of all these aforementioned weaknesses, this is just more fuel to that fire," she says. "It's not as if this were an airtight case for standing and this was a potentially game-changing development."

Oral arguments for the case, Alliance for Hippocratic Medicine v. FDA, are scheduled for March 26 at the Supreme Court. A decision is expected by summer. Mifepristone remains available while the legal process continues.
