  • CAREER FEATURE
  • 04 December 2020
  • Correction 09 December 2020

How to write a superb literature review

Andy Tay is a freelance writer based in Singapore.


Literature reviews are important resources for scientists. They provide historical context for a field while offering opinions on its future trajectory. Creating them can provide inspiration for one’s own research, as well as some practice in writing. But few scientists are trained in how to write a review — or in what constitutes an excellent one. Even picking the appropriate software to use can be an involved decision (see ‘Tools and techniques’). So Nature asked editors and working scientists with well-cited reviews for their tips.


doi: https://doi.org/10.1038/d41586-020-03422-x

Interviews have been edited for length and clarity.

Updates & Corrections

Correction 09 December 2020: An earlier version of the tables in this article included some incorrect details about the programs Zotero, Endnote and Manubot. These have now been corrected.



How to Write a Literature Review | Guide, Examples, & Templates

Published on January 2, 2023 by Shona McCombes. Revised on September 11, 2023.

What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research that you can later apply to your paper, thesis, or dissertation topic.

There are five key steps to writing a literature review:

  • Search for relevant literature
  • Evaluate sources
  • Identify themes, debates, and gaps
  • Outline the structure
  • Write your literature review

A good literature review doesn’t just summarize sources: it analyzes, synthesizes, and critically evaluates to give a clear picture of the state of knowledge on the subject.


Table of contents

  • What is the purpose of a literature review?
  • Examples of literature reviews
  • Step 1 – Search for relevant literature
  • Step 2 – Evaluate and select sources
  • Step 3 – Identify themes, debates, and gaps
  • Step 4 – Outline your literature review’s structure
  • Step 5 – Write your literature review
  • Frequently asked questions

What is the purpose of a literature review?

When you write a thesis, dissertation, or research paper, you will likely have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to:

  • Demonstrate your familiarity with the topic and its scholarly context
  • Develop a theoretical framework and methodology for your research
  • Position your work in relation to other researchers and theorists
  • Show how your research addresses a gap or contributes to a debate
  • Evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic.

Writing literature reviews is a particularly important skill if you want to apply for graduate school or pursue a career in research. We’ve written a step-by-step guide that you can follow below.


Examples of literature reviews

Writing literature reviews can be quite challenging! A good starting point could be to look at some examples, depending on what kind of literature review you’d like to write.

  • Example literature review #1: “Why Do People Migrate? A Review of the Theoretical Literature” (Theoretical literature review about the development of economic migration theory from the 1950s to today.)
  • Example literature review #2: “Literature review as a research methodology: An overview and guidelines” (Methodological literature review about interdisciplinary knowledge acquisition and production.)
  • Example literature review #3: “The Use of Technology in English Language Learning: A Literature Review” (Thematic literature review about the effects of technology on language acquisition.)
  • Example literature review #4: “Learners’ Listening Comprehension Difficulties in English Language Learning: A Literature Review” (Chronological literature review about how the concept of listening skills has changed over time.)


Step 1 – Search for relevant literature

Before you begin searching for literature, you need a clearly defined topic.

If you are writing the literature review section of a dissertation or research paper, you will search for literature related to your research problem and questions.

Make a list of keywords

Start by creating a list of keywords related to your research question. Include each of the key concepts or variables you’re interested in, and list any synonyms and related terms. You can add to this list as you discover new keywords in the process of your literature search.

  • Social media, Facebook, Instagram, Twitter, Snapchat, TikTok
  • Body image, self-perception, self-esteem, mental health
  • Generation Z, teenagers, adolescents, youth

Search for relevant sources

Use your keywords to begin searching for sources. Some useful databases to search for journals and articles include:

  • Your university’s library catalogue
  • Google Scholar
  • Project Muse (humanities and social sciences)
  • Medline (life sciences and biomedicine)
  • EconLit (economics)
  • Inspec (physics, engineering and computer science)

You can also use boolean operators to help narrow down your search.
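As a rough illustration of how a keyword list can become a boolean search string, the sketch below joins synonyms with OR inside each concept group and joins the groups with AND. The function name `build_query` and the quoting of multi-word phrases are assumptions made for this example; exact operator syntax varies between databases, so check the help pages of the one you use.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        # Quote multi-word terms so that many databases treat them as exact phrases
        # (an assumption: phrase syntax differs between search tools).
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical keyword groups drawn from the example list above.
groups = [
    ["social media", "Instagram", "Snapchat"],
    ["body image", "self-esteem"],
    ["teenagers", "adolescents"],
]
query = build_query(groups)
print(query)
# ("social media" OR Instagram OR Snapchat) AND ("body image" OR self-esteem) AND (teenagers OR adolescents)
```

Keeping each concept in its own group makes it easy to add a synonym later without rewriting the whole search string.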

Make sure to read the abstract to find out whether an article is relevant to your question. When you find a useful book or article, you can check the bibliography to find other relevant sources.

Step 2 – Evaluate and select sources

You likely won’t be able to read absolutely everything that has been written on your topic, so it will be necessary to evaluate which sources are most relevant to your research question.

For each publication, ask yourself:

  • What question or problem is the author addressing?
  • What are the key concepts and how are they defined?
  • What are the key theories, models, and methods?
  • Does the research use established frameworks or take an innovative approach?
  • What are the results and conclusions of the study?
  • How does the publication relate to other literature in the field? Does it confirm, add to, or challenge established knowledge?
  • What are the strengths and weaknesses of the research?

Make sure the sources you use are credible, and make sure you read any landmark studies and major theories in your field of research.


Take notes and cite your sources

As you read, you should also begin the writing process. Take notes that you can later incorporate into the text of your literature review.

It is important to keep track of your sources with citations to avoid plagiarism. It can be helpful to make an annotated bibliography, where you compile full citation information and write a paragraph of summary and analysis for each source. This helps you remember what you read and saves time later in the process.
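If you keep your notes electronically, one possible way to structure an annotated bibliography is sketched below. The `Annotation` class, its field names, and the two entries are hypothetical placeholders invented for illustration, not real references or a prescribed format.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    citation: str  # full citation information
    summary: str   # what the source says
    analysis: str  # your own evaluation of the source


def format_bibliography(entries):
    """Return the entries sorted by citation, each followed by its annotation."""
    blocks = []
    for entry in sorted(entries, key=lambda e: e.citation.lower()):
        blocks.append(f"{entry.citation}\n    {entry.summary} {entry.analysis}")
    return "\n\n".join(blocks)


# Placeholder entries only; replace with your own sources and notes.
entries = [
    Annotation("Smith, J. (2020). Placeholder title B.", "Summary B.", "Analysis B."),
    Annotation("Adams, K. (2018). Placeholder title A.", "Summary A.", "Analysis A."),
]
text = format_bibliography(entries)
print(text)
```

Separating the summary from the analysis keeps what the source claims distinct from your own evaluation of it, which is useful when you later synthesize sources.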


Step 3 – Identify themes, debates, and gaps

To begin organizing your literature review’s argument and structure, be sure you understand the connections and relationships between the sources you’ve read. Based on your reading and notes, you can look for:

  • Trends and patterns (in theory, method or results): do certain approaches become more or less popular over time?
  • Themes: what questions or concepts recur across the literature?
  • Debates, conflicts and contradictions: where do sources disagree?
  • Pivotal publications: are there any influential theories or studies that changed the direction of the field?
  • Gaps: what is missing from the literature? Are there weaknesses that need to be addressed?

This step will help you work out the structure of your literature review and (if applicable) show how your own research will contribute to existing knowledge.

  • Most research has focused on young women.
  • There is an increasing interest in the visual aspects of social media.
  • But there is still a lack of robust research on highly visual platforms like Instagram and Snapchat—this is a gap that you could address in your own research.

Step 4 – Outline your literature review’s structure

There are various approaches to organizing the body of a literature review. Depending on the length of your literature review, you can combine several of these strategies (for example, your overall structure might be thematic, but each theme is discussed chronologically).

Chronological

The simplest approach is to trace the development of the topic over time. However, if you choose this strategy, be careful to avoid simply listing and summarizing sources in order.

Try to analyze patterns, turning points and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred.

Thematic

If you have found some recurring central themes, you can organize your literature review into subsections that address different aspects of the topic.

For example, if you are reviewing literature about inequalities in migrant health outcomes, key themes might include healthcare policy, language barriers, cultural attitudes, legal status, and economic access.

Methodological

If you draw your sources from different disciplines or fields that use a variety of research methods, you might want to compare the results and conclusions that emerge from different approaches. For example:

  • Look at what results have emerged in qualitative versus quantitative research
  • Discuss how the topic has been approached by empirical versus theoretical scholarship
  • Divide the literature into sociological, historical, and cultural sources

Theoretical

A literature review is often the foundation for a theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts.

You might argue for the relevance of a specific theoretical approach, or combine various theoretical concepts to create a framework for your research.

Step 5 – Write your literature review

Like any other academic text, your literature review should have an introduction, a main body, and a conclusion. What you include in each depends on the objective of your literature review.

The introduction should clearly establish the focus and purpose of the literature review.

Depending on the length of your literature review, you might want to divide the body into subsections. You can use a subheading for each theme, time period, or methodological approach.

As you write, you can follow these tips:

  • Summarize and synthesize: give an overview of the main points of each source and combine them into a coherent whole
  • Analyze and interpret: don’t just paraphrase other researchers — add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
  • Critically evaluate: mention the strengths and weaknesses of your sources
  • Write in well-structured paragraphs: use transition words and topic sentences to draw connections, comparisons and contrasts

In the conclusion, you should summarize the key findings you have taken from the literature and emphasize their significance.

When you’ve finished writing and revising your literature review, don’t forget to proofread thoroughly before submitting.


Frequently asked questions

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarize yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

The literature review usually comes near the beginning of your thesis or dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

A literature review is a survey of credible sources on a topic, often used in dissertations, theses, and research papers. Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts, with an introduction, a main body, and a conclusion.

An annotated bibliography is a list of source references that has a short description (called an annotation) for each of the sources. It is often assigned as part of the research process for a paper.


McCombes, S. (2023, September 11). How to Write a Literature Review | Guide, Examples, & Templates. Scribbr. Retrieved February 26, 2024, from https://www.scribbr.com/dissertation/literature-review/



Article review writing format, steps, examples and illustration PDF

Compiled by Mohammed Yismaw, 2021

The purpose of this document is to help students and researchers understand how a review of an academic journal article is conducted and reported in different fields of study. Review articles in academic journals analyze or discuss research previously published by others, rather than reporting new research results or findings. Summaries and critiques are two ways to write a review of a scientific journal article. Both types of writing ask you first to read and understand an article from the primary literature about your topic. The summary involves briefly but accurately stating the key points of the article for a reader who has not read the original article. The critique begins by summarizing the article and then analyzes and evaluates the author’s research. Summaries and critiques help you learn to synthesize information from different sources and are usually limited to two pages maximum.


Step by Step Guide to Reviewing a Manuscript

Page Content

  • Overview of the review report format
  • The first read-through
  • First read considerations
  • Spotting potential major flaws
  • Concluding the first reading
  • Rejection after the first reading
  • Before starting the second read-through
  • Doing the second read-through
  • The second read-through: section by section guidance
  • How to structure your report
  • On presentation and style
  • Criticisms & confidential comments to editors
  • The recommendation
  • When recommending rejection
  • Additional resources

When you receive an invitation to peer review, you should be sent a copy of the paper's abstract to help you decide whether you wish to do the review. Try to respond to invitations promptly - it will prevent delays. It is also important at this stage to declare any potential Conflict of Interest.

Overview of the review report format

The structure of the review report varies between journals. Some follow an informal structure, while others have a more formal approach.

"Number your comments!!!" (Jonathon Halbesleben, former Editor of Journal of Occupational and Organizational Psychology)

Informal Structure

Many journals don't provide criteria for reviews beyond asking for your 'analysis of merits'. In this case, you may wish to familiarize yourself with examples of other reviews done for the journal, which the editor should be able to provide or, as you gain experience, rely on your own evolving style.

Formal Structure

Other journals require a more formal approach. Sometimes they will ask you to address specific questions in your review via a questionnaire. Or they might want you to rate the manuscript on various attributes using a scorecard. Often you can't see these until you log in to submit your review. So when you agree to the work, it's worth checking for any journal-specific guidelines and requirements. If there are formal guidelines, let them direct the structure of your review.

In Both Cases

Whether specifically required by the reporting format or not, you should expect to compile comments to authors and possibly confidential ones to editors only.


The first read-through

Following the invitation to review, when you'll have received the article abstract, you should already understand the aims, key data and conclusions of the manuscript. If you don't, make a note now that you need to give feedback on how to improve those sections.

The first read-through is a skim-read. It will help you form an initial impression of the paper and get a sense of whether your eventual recommendation will be to accept or reject the paper.

Keep a pen and paper handy when skim-reading.

Try to bear in mind the following questions - they'll help you form your overall impression:

  • What is the main question addressed by the research? Is it relevant and interesting?
  • How original is the topic? What does it add to the subject area compared with other published material?
  • Is the paper well written? Is the text clear and easy to read?
  • Are the conclusions consistent with the evidence and arguments presented? Do they address the main question posed?
  • If the author is disagreeing significantly with the current academic consensus, do they have a substantial case? If not, what would be required to make their case credible?
  • If the paper includes tables or figures, what do they add to the paper? Do they aid understanding or are they superfluous?

Spotting potential major flaws

While you should read the whole paper, making the right choice of what to read first can save time by flagging major problems early on.

Editors say, "Specific recommendations for remedying flaws are VERY welcome."

Examples of possibly major flaws include:

  • Drawing a conclusion that is contradicted by the author's own statistical or qualitative evidence
  • The use of a discredited method
  • Ignoring a process that is known to have a strong influence on the area under study

If experimental design features prominently in the paper, first check that the methodology is sound - if not, this is likely to be a major flaw.

You might examine:

  • The sampling in analytical papers
  • The sufficient use of control experiments
  • The precision of process data
  • The regularity of sampling in time-dependent studies
  • The validity of questions, the use of a detailed methodology and the data analysis being done systematically (in qualitative research)
  • That qualitative research extends beyond the author's opinions, with sufficient descriptive elements and appropriate quotes from interviews or focus groups

Major Flaws in Information

If methodology is less of an issue, it's often a good idea to look at the data tables, figures or images first. Especially in science research, it's all about the information gathered. If there are critical flaws in this, it's very likely the manuscript will need to be rejected. Such issues include:

  • Insufficient data
  • Unclear data tables
  • Contradictory data that either are not self-consistent or disagree with the conclusions
  • Confirmatory data that adds little, if anything, to current understanding - unless strong arguments for such repetition are made

If you find a major problem, note your reasoning and clear supporting evidence (including citations).

Concluding the first reading

After the initial read and using your notes, including those of any major flaws you found, draft the first two paragraphs of your review - the first summarizing the research question addressed and the second the contribution of the work. If the journal has a prescribed reporting format, this draft will still help you compose your thoughts.

The First Paragraph

This should state the main question addressed by the research and summarize the goals, approaches, and conclusions of the paper. It should:

  • Help the editor properly contextualize the research and add weight to your judgement
  • Show the author what key messages are conveyed to the reader, so they can be sure they are achieving what they set out to do
  • Focus on successful aspects of the paper so the author gets a sense of what they've done well

The Second Paragraph

This should provide a conceptual overview of the contribution of the research. So consider:

  • Is the paper's premise interesting and important?
  • Are the methods used appropriate?
  • Do the data support the conclusions?

After drafting these two paragraphs, you should be in a position to decide whether this manuscript is seriously flawed and should be rejected (see the next section), or whether it is publishable in principle and merits a detailed, careful read-through.

Even if you are coming to the opinion that an article has serious flaws, make sure you read the whole paper. This is very important because you may find some really positive aspects that can be communicated to the author. This could help them with future submissions.

A full read-through will also make sure that any initial concerns are indeed correct and fair. After all, you need the context of the whole paper before deciding to reject. If you still intend to recommend rejection, see the section "When recommending rejection."

Once the paper has passed your first read and you've decided the article is publishable in principle, one purpose of the second, detailed read-through is to help prepare the manuscript for publication. You may still decide to recommend rejection following a second reading.

"Offer clear suggestions for how the authors can address the concerns raised. In other words, if you're going to raise a problem, provide a solution." (Jonathon Halbesleben, Editor of Journal of Occupational and Organizational Psychology)

Preparation

To save time and simplify the review:

  • Don't rely solely upon inserting comments on the manuscript document - make separate notes
  • Try to group similar concerns or praise together
  • If using a review program to note directly onto the manuscript, still try grouping the concerns and praise in separate notes - it helps later
  • Note line numbers of text upon which your notes are based - this helps you find items again and also aids those reading your review

Now that you have completed your preparations, you're ready to spend an hour or so reading carefully through the manuscript.

As you're reading through the manuscript for a second time, you'll need to keep in mind the argument's construction, the clarity of the language and content.

With regard to the argument’s construction, you should identify:

  • Any places where the meaning is unclear or ambiguous
  • Any factual errors
  • Any invalid arguments

You may also wish to consider:

  • Does the title properly reflect the subject of the paper?
  • Does the abstract provide an accessible summary of the paper?
  • Do the keywords accurately reflect the content?
  • Is the paper an appropriate length?
  • Are the key messages short, accurate and clear?

Not every submission is well written. Part of your role is to make sure that the text’s meaning is clear.

Editors say, "If a manuscript has many English language and editing issues, please do not try and fix it. If it is too bad, note that in your review and it should be up to the authors to have the manuscript edited."

If the article is difficult to understand, you should have rejected it already. However, if the language is poor but you understand the core message, see if you can suggest improvements to fix the problem:

  • Are there certain aspects that could be communicated better, such as parts of the discussion?
  • Should the authors consider resubmitting to the same journal after language improvements?
  • Would you consider looking at the paper again once these issues are dealt with?

On Grammar and Punctuation

Your primary role is judging the research content. Don't spend time polishing grammar or spelling. Editors will make sure that the text is at a high standard before publication. However, if you spot grammatical errors that affect clarity of meaning, then it's important to highlight these. Expect to suggest such amendments - it's rare for a manuscript to pass review with no corrections.

A 2010 study of nursing journals found that 79% of recommendations by reviewers were influenced by grammar and writing style (Shattell et al., 2010).

1. The Introduction

A well-written introduction:

  • Sets out the argument
  • Summarizes recent research related to the topic
  • Highlights gaps in current understanding or conflicts in current knowledge
  • Establishes the originality of the research aims by demonstrating the need for investigations in the topic area
  • Gives a clear idea of the target readership, why the research was carried out and the novelty and topicality of the manuscript

Originality and Topicality

Originality and topicality can only be established in the light of recent authoritative research. For example, it's impossible to argue that there is a conflict in current understanding by referencing articles that are 10 years old.

Authors may make the case that a topic hasn't been investigated in several years and that new research is required. This point is only valid if researchers can point to recent developments in data gathering techniques or to research in indirectly related fields that suggest the topic needs revisiting. Clearly, authors can only do this by referencing recent literature. Obviously, where older research is seminal or where aspects of the methodology rely upon it, then it is perfectly appropriate for authors to cite some older papers.

Editors say, "Is the report providing new information; is it novel or just confirmatory of well-known outcomes?"

It's common for the introduction to end by stating the research aims. By this point you should already have a good impression of them - if the explicit aims come as a surprise, then the introduction needs improvement.

2. Materials and Methods

Academic research should be replicable, repeatable and robust - and follow best practice.

Replicable Research

This makes sufficient use of:

  • Control experiments
  • Repeated analyses
  • Repeated experiments

These are used to make sure observed trends are not due to chance and that the same experiment could be repeated by other researchers - and result in the same outcome. Statistical analyses will not be sound if methods are not replicable. Where research is not replicable, the paper should be recommended for rejection.

Repeatable Methods

These give enough detail so that other researchers are able to carry out the same research. For example, equipment used or sampling methods should all be described in detail so that others could follow the same steps. Where methods are not detailed enough, it's usual to ask for the methods section to be revised.

Robust Research

This has enough data points to make sure the data are reliable. If there are insufficient data, it might be appropriate to recommend revision. You should also consider whether there is any in-built bias not nullified by the control experiments.

Best Practice

During these checks you should keep in mind best practice:

  • Standard guidelines were followed (e.g. the CONSORT Statement for reporting randomized trials)
  • The health and safety of all participants in the study was not compromised
  • Ethical standards were maintained

If the research fails to reach relevant best practice standards, it's usual to recommend rejection. What's more, you don't then need to read any further.

3. Results and Discussion

This section should tell a coherent story - What happened? What was discovered or confirmed?

Certain patterns of good reporting need to be followed by the author:

  • They should start by describing in simple terms what the data show
  • They should make reference to statistical analyses, such as significance or goodness of fit
  • Once described, they should evaluate the trends observed and explain the significance of the results to wider understanding. This can only be done by referencing published research
  • The outcome should be a critical analysis of the data collected

Discussion should always, at some point, gather all the information together into a single whole. Authors should describe and discuss the overall story formed. If there are gaps or inconsistencies in the story, they should address these and suggest ways future research might confirm the findings or take the research forward.

4. Conclusions

This section is usually no more than a few paragraphs and may be presented as part of the results and discussion, or in a separate section. The conclusions should reflect upon the aims - whether they were achieved or not - and, just like the aims, should not be surprising. If the conclusions are not evidence-based, it's appropriate to ask for them to be re-written.

5. Information Gathered: Images, Graphs and Data Tables

If you find yourself looking at a piece of information from which you cannot discern a story, then you should ask for improvements in presentation. This could be an issue with titles, labels, statistical notation or image quality.

Where information is clear, you should check that:

  • The results seem plausible, in case there is an error in data gathering
  • The trends you can see support the paper's discussion and conclusions
  • There are sufficient data. For example, in studies carried out over time are there sufficient data points to support the trends described by the author?

You should also check whether images have been edited or manipulated to emphasize the story they tell. This may be appropriate but only if authors report on how the image has been edited (e.g. by highlighting certain parts of an image). Where you feel that an image has been edited or manipulated without explanation, you should highlight this in a confidential comment to the editor in your report.

6. List of References

You will need to check referencing for accuracy, adequacy and balance.

Where a cited article is central to the author's argument, you should check the accuracy and format of the reference - and bear in mind different subject areas may use citations differently. Otherwise, it's the editor’s role to exhaustively check the reference section for accuracy and format.

You should consider if the referencing is adequate:

  • Are important parts of the argument poorly supported?
  • Are there published studies that show similar or dissimilar trends that should be discussed?
  • If a manuscript only uses half the citations typical in its field, this may be an indicator that referencing should be improved - but don't be guided solely by quantity
  • References should be relevant, recent and readily retrievable

Check for a well-balanced list of references that is:

  • Helpful to the reader
  • Fair to competing authors
  • Not over-reliant on self-citation
  • Gives due recognition to the initial discoveries and related work that led to the work under assessment

You should be able to evaluate whether the article meets the criteria for balanced referencing without looking up every reference.

7. Plagiarism

By now you will have a deep understanding of the paper's content - and you may have some concerns about plagiarism.

Identified Concern

If you find - or already knew of - a very similar paper, this may be because the author overlooked it in their own literature search. Or it may be because it is very recent or published in a journal slightly outside their usual field.

You may feel you can advise the author how to emphasize the novel aspects of their own study, so as to better differentiate it from similar research. If so, you may ask the author to discuss their aims and results, or modify their conclusions, in light of the similar article. Of course, the research similarities may be so great that they render the work unoriginal and you have no choice but to recommend rejection.

"It's very helpful when a reviewer can point out recent similar publications on the same topic by other groups, or that the authors have already published some data elsewhere." (Editor feedback)

Suspected Concern

If you suspect plagiarism, including self-plagiarism, but cannot recall or locate exactly what is being plagiarized, notify the editor of your suspicion and ask for guidance.

Most editors have access to software that can check for plagiarism.

Editors are not out to police every paper, but when plagiarism is discovered during peer review it can be properly addressed ahead of publication. If plagiarism is discovered only after publication, the consequences are worse for both authors and readers, because a retraction may be necessary.

For detailed guidelines see COPE's Ethical guidelines for reviewers and Wiley's Best Practice Guidelines on Publishing Ethics.

8. Search Engine Optimization (SEO)

After the detailed read-through, you will be in a position to advise whether the title, abstract and keywords are optimized for search purposes. To be effective, good SEO terms will reflect the aims of the research.

A clear title and abstract will improve the paper's search engine rankings and will influence whether the user finds and then decides to navigate to the main article. The title should contain the relevant SEO terms early on. This has a major effect on the impact of a paper, since it helps it appear in search results. A poor abstract can then lose the reader's interest and undo the benefit of an effective title - whilst the paper's abstract may appear in search results, the potential reader may go no further.

So ask yourself, while the abstract may have seemed adequate during earlier checks, does it:

  • Do justice to the manuscript in this context?
  • Highlight important findings sufficiently?
  • Present the most interesting data?

Editors say, "Does the Abstract highlight the important findings of the study?"

If there is a formal report format, remember to follow it. This will often comprise a range of questions followed by comment sections. Try to answer all the questions. They are there because the editor felt that they are important. If you're following an informal report format you could structure your report in three sections: summary, major issues, minor issues.

Summary

  • Give positive feedback first. Authors are more likely to read your review if you do so. But don't overdo it if you will be recommending rejection
  • Briefly summarize what the paper is about and what the findings are
  • Try to put the findings of the paper into the context of the existing literature and current knowledge
  • Indicate the significance of the work and if it is novel or mainly confirmatory
  • Indicate the work's strengths, its quality and completeness
  • State any major flaws or weaknesses and note any special considerations. For example, if previously held theories are being overlooked

Major Issues

  • Are there any major flaws? State what they are and what the severity of their impact is on the paper
  • Has similar work already been published without the authors acknowledging this?
  • Are the authors presenting findings that challenge current thinking? Is the evidence they present strong enough to prove their case? Have they cited all the relevant work that would contradict their thinking and addressed it appropriately?
  • If major revisions are required, try to indicate clearly what they are
  • Are there any major presentational problems? Are figures & tables, language and manuscript structure all clear enough for you to accurately assess the work?
  • Are there any ethical issues? If you are unsure it may be better to disclose these in the confidential comments section

Minor Issues

  • Are there places where meaning is ambiguous? How can this be corrected?
  • Are the correct references cited? If not, which should be cited instead/also? Are citations excessive, limited, or biased?
  • Are there any factual, numerical or unit errors? If so, what are they?
  • Are all tables and figures appropriate, sufficient, and correctly labelled? If not, say which are not

Your review should ultimately help the author improve their article. So be polite, honest and clear. You should also try to be objective and constructive, not subjective and destructive.

You should also:

  • Write clearly and so you can be understood by people whose first language is not English
  • Avoid complex or unusual words, especially ones that would confuse even native speakers
  • Number your points and refer to page and line numbers in the manuscript when making specific comments
  • If you have been asked to only comment on specific parts or aspects of the manuscript, you should indicate clearly which these are
  • Treat the author's work the way you would like your own to be treated

Most journals give reviewers the option to provide some confidential comments to editors. Often this is where editors will want reviewers to state their recommendation - see the next section - but otherwise this area is best reserved for communicating malpractice such as suspected plagiarism, fraud, unattributed work, unethical procedures, duplicate publication, bias or other conflicts of interest.

However, this doesn't give reviewers permission to 'backstab' the author. Authors can't see this feedback and are unable to give their side of the story unless the editor asks them to. So in the spirit of fairness, write comments to editors as though authors might read them too.

Reviewers should check the preferences of individual journals as to where they want review decisions to be stated. In particular, bear in mind that some journals will not want the recommendation included in any comments to authors, as this can cause editors difficulty later - see Section 11 for more advice about working with editors.

You will normally be asked to indicate your recommendation (e.g. accept, reject, revise and resubmit, etc.) from a fixed-choice list and then to enter your comments into a separate text box.

Recommending Acceptance

If you're recommending acceptance, give details outlining why, and if there are any areas that could be improved. Don't just give a short, cursory remark such as 'great, accept'. See Improving the Manuscript

Recommending Revision

Where improvements are needed, a recommendation for major or minor revision is typical. You may also choose to state whether you opt in or out of the post-revision review too. If recommending revision, state specific changes you feel need to be made. The author can then reply to each point in turn.

Some journals offer the option to recommend rejection with the possibility of resubmission – this is most relevant where substantial, major revision is necessary.

What can reviewers do to help? "Be clear in their comments to the author (or editor) which points are absolutely critical if the paper is given an opportunity for revision." (Jonathon Halbesleben, Editor of Journal of Occupational and Organizational Psychology)

Recommending Rejection

If recommending rejection or major revision, state this clearly in your review (and see the next section, 'When recommending rejection').

Where manuscripts have serious flaws you should not spend any time polishing the review you've drafted or give detailed advice on presentation.

Editors say, "If a reviewer suggests a rejection, but her/his comments are not detailed or helpful, it does not help the editor in making a decision."

In your recommendations for the author, you should:

  • Give constructive feedback describing ways that they could improve the research
  • Keep the focus on the research and not the author. This is an extremely important part of your job as a reviewer
  • Avoid making critical confidential comments to the editor while being polite and encouraging to the author - the latter may not understand why their manuscript has been rejected. Also, they won't get feedback on how to improve their research and it could trigger an appeal

Remember to give constructive criticism even if recommending rejection. This helps developing researchers improve their work and explains to the editor why you felt the manuscript should not be published.

"When the comments seem really positive, but the recommendation is rejection…it puts the editor in a tough position of having to reject a paper when the comments make it sound like a great paper." (Jonathon Halbesleben, Editor of Journal of Occupational and Organizational Psychology)



Effect of exercise for depression: systematic review and network meta-analysis of randomised controlled trials

Linked editorial: Exercise for the treatment of depression

  • Michael Noetel, senior lecturer 1,
  • Taren Sanders, senior research fellow 2,
  • Daniel Gallardo-Gómez, doctoral student 3,
  • Paul Taylor, deputy head of school 4,
  • Borja del Pozo Cruz, associate professor 5 6,
  • Daniel van den Hoek, senior lecturer 7,
  • Jordan J Smith, senior lecturer 8,
  • John Mahoney, senior lecturer 9,
  • Jemima Spathis, senior lecturer 9,
  • Mark Moresi, lecturer 4,
  • Rebecca Pagano, senior lecturer 10,
  • Lisa Pagano, postdoctoral fellow 11,
  • Roberta Vasconcellos, doctoral student 2,
  • Hugh Arnott, masters student 2,
  • Benjamin Varley, doctoral student 12,
  • Philip Parker, pro vice chancellor research 13,
  • Stuart Biddle, professor 14 15,
  • Chris Lonsdale, deputy provost 13
  • 1 School of Psychology, University of Queensland, St Lucia, QLD 4072, Australia
  • 2 Institute for Positive Psychology and Education, Australian Catholic University, North Sydney, NSW, Australia
  • 3 Department of Physical Education and Sport, University of Seville, Seville, Spain
  • 4 School of Health and Behavioural Sciences, Australian Catholic University, Strathfield, NSW, Australia
  • 5 Department of Clinical Biomechanics and Sports Science, University of Southern Denmark, Odense, Denmark
  • 6 Biomedical Research and Innovation Institute of Cádiz (INiBICA) Research Unit, University of Cádiz, Spain
  • 7 School of Health and Behavioural Sciences, University of the Sunshine Coast, Petrie, QLD, Australia
  • 8 School of Education, University of Newcastle, Callaghan, NSW, Australia
  • 9 School of Health and Behavioural Sciences, Australian Catholic University, Banyo, QLD, Australia
  • 10 School of Education, Australian Catholic University, Strathfield, NSW, Australia
  • 11 Australian Institute of Health Innovation, Macquarie University, Macquarie Park, NSW, Australia
  • 12 Children’s Hospital Westmead Clinical School, University of Sydney, Westmead, NSW, Australia
  • 13 Australian Catholic University, North Sydney, NSW, Australia
  • 14 Centre for Health Research, University of Southern Queensland, Springfield, QLD, Australia
  • 15 Faculty of Sport and Health Science, University of Jyvaskyla, Jyvaskyla, Finland
  • Correspondence to: M Noetel m.noetel{at}uq.edu.au (or @mnoetel on Twitter)
  • Accepted 15 January 2024

Objective To identify the optimal dose and modality of exercise for treating major depressive disorder, compared with psychotherapy, antidepressants, and control conditions.

Design Systematic review and network meta-analysis.

Methods Screening, data extraction, coding, and risk of bias assessment were performed independently and in duplicate. Bayesian arm based, multilevel network meta-analyses were performed for the primary analyses. Quality of the evidence for each arm was graded using the confidence in network meta-analysis (CINeMA) online tool.

Data sources Cochrane Library, Medline, Embase, SPORTDiscus, and PsycINFO databases.

Eligibility criteria for selecting studies Any randomised trial with exercise arms for participants meeting clinical cut-offs for major depression.

Results 218 unique studies with a total of 495 arms and 14 170 participants were included. Compared with active controls (eg, usual care, placebo tablet), moderate reductions in depression were found for walking or jogging (n=1210, κ=51, Hedges’ g −0.62, 95% credible interval −0.80 to −0.45), yoga (n=1047, κ=33, g −0.55, −0.73 to −0.36), strength training (n=643, κ=22, g −0.49, −0.69 to −0.29), mixed aerobic exercises (n=1286, κ=51, g −0.43, −0.61 to −0.24), and tai chi or qigong (n=343, κ=12, g −0.42, −0.65 to −0.21). The effects of exercise were proportional to the intensity prescribed. Strength training and yoga appeared to be the most acceptable modalities. Results appeared robust to publication bias, but only one study met the Cochrane criteria for low risk of bias. As a result, confidence in accordance with CINeMA was low for walking or jogging and very low for other treatments.

Conclusions Exercise is an effective treatment for depression, with walking or jogging, yoga, and strength training more effective than other exercises, particularly when intense. Yoga and strength training were well tolerated compared with other treatments. Exercise appeared equally effective for people with and without comorbidities and with different baseline levels of depression. To mitigate expectancy effects, future studies could aim to blind participants and staff. These forms of exercise could be considered alongside psychotherapy and antidepressants as core treatments for depression.

Systematic review registration PROSPERO CRD42018118040.
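The effect sizes in the Results above are reported as Hedges' g. As a rough illustration of that statistic only, the sketch below computes g and an approximate standard error for a single two-arm comparison. It is not the Bayesian multilevel model used in this review, the function name `hedges_g` is hypothetical, and the input scores are invented for the example rather than taken from any included trial. The interval it prints is a frequentist confidence interval, not the credible interval reported above.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, se) for a treatment arm versus a control arm.

    A negative g means lower depression scores in the treatment arm.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd  # Cohen's d
    j = 1 - 3 / (4 * df - 1)           # small-sample correction factor
    g = j * d
    # Common large-sample approximation to the variance of d.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    se = j * math.sqrt(var_d)
    return g, se


# Hypothetical post-treatment depression scores (illustration only).
g, se = hedges_g(mean_t=10.0, sd_t=5.0, n_t=20, mean_c=13.0, sd_c=5.0, n_c=20)
low, high = g - 1.96 * se, g + 1.96 * se  # approximate 95% confidence interval
print(f"g = {g:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The correction factor shrinks the estimate slightly in small samples, which matters when many included trials have few participants per arm.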


Introduction

Major depressive disorder is a leading cause of disability worldwide 1 and has been found to lower life satisfaction more than debt, divorce, and diabetes 2 and to exacerbate comorbidities, including heart disease, 3 anxiety, 4 and cancer. 5 Although people with major depressive disorder often respond well to drug treatments and psychotherapy, 6 7 many are resistant to treatment. 8 In addition, access to treatment for many people with depression is limited, with only 51% treatment coverage for high income countries and 20% for low and lower-middle income countries. 9 More evidence based treatments are therefore needed.

Exercise may be an effective complement or alternative to drugs and psychotherapy. 10 11 12 13 14 In addition to mental health benefits, exercise also improves a range of physical and cognitive outcomes. 15 16 17 Clinical practice guidelines in the US, UK, and Australia recommend physical activity as part of treatment for depression. 18 19 20 21 But these guidelines do not provide clear, consistent recommendations about dose or exercise modality. British guidelines recommend group exercise programmes 20 21 and offer general recommendations to increase any form of physical activity, 21 the American Psychiatric Association recommends any dose of aerobic exercise or resistance training, 20 and Australian and New Zealand guidelines suggest a combination of strength and vigorous aerobic exercises, with at least two or three bouts weekly. 19

Authors of guidelines may find it hard to provide consistent recommendations on the basis of existing, mainly pairwise meta-analyses (that is, those assessing a specific modality versus a specific comparator in a distinct group of participants). 12 13 22 These meta-analyses have come under scrutiny for pooling heterogeneous treatments and heterogeneous comparisons, leading to ambiguous effect estimates. 23 Reviews also face the opposite problem, excluding exercise treatments such as yoga, tai chi, and qigong because grouping them with strength training might be inappropriate. 23 Overviews of reviews have tried to deal with this problem by combining pairwise meta-analyses on individual treatments. A recent such overview found no differences between exercise modalities. 13 Comparing effect sizes between different pairwise meta-analyses can also lead to confusion because of differences in the analytical methods used between meta-analyses, such as the choice of a control to use as the referent. Network meta-analyses are a better way to precisely quantify differences between interventions as they simultaneously model the direct and indirect comparisons between interventions. 24
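To make the idea of direct and indirect comparisons concrete, the sketch below shows a simple frequentist simplification: an indirect estimate of treatment A versus treatment B through a shared comparator C, followed by inverse-variance pooling with a direct estimate. This is only an illustration under stated assumptions; it is not the Bayesian arm-based multilevel model used in this review, the function names are hypothetical, and the numbers are invented.

```python
import math


def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Estimate A vs B from A vs C and B vs C.

    The effect is the difference of the two effects, and because the two
    sources are assumed independent, their variances add.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab


def pool_inverse_variance(estimates):
    """Fixed-effect inverse-variance pooling of (effect, se) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * d for w, (d, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1 / total)


# Hypothetical inputs: A = walking, B = yoga, C = active control.
indirect = indirect_estimate(d_ac=-0.6, se_ac=0.3, d_bc=-0.4, se_bc=0.4)
direct = (-0.3, 0.5)  # hypothetical head-to-head trial of A vs B
combined = pool_inverse_variance([direct, indirect])
print(indirect, combined)
```

The indirect estimate is always less precise than either of its inputs, which is one reason a network gains most when direct and indirect evidence are modelled together.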

Network meta-analyses have been used to compare different types of psychotherapy and pharmacotherapy for depression. 6 25 26 For exercise, they have shown that dose and modality influence outcomes for cognition, 16 back pain, 15 and blood pressure. 17 Two network meta-analyses explored the effects of exercise on depression: one among older adults 27 and the other for mental health conditions. 28 Because of the inclusion criteria and search strategies used, these reviews might have been under-powered to explore moderators such as dose and modality (κ=15 and κ=71, respectively). To resolve conflicting findings in existing reviews, we comprehensively searched randomised trials on exercise for depression to ensure our review was adequately powered to identify the optimal dose and modality of exercise. For example, a large overview of reviews found effects on depression to be proportional to intensity, with vigorous exercise appearing to be better, 13 but a later meta-analysis found no such effects. 22 We explored whether recommendations differ based on participants’ sex, age, and baseline level of depression.

Given the challenges presented by behaviour change in people with depression, 29 we also identified autonomy support or behaviour change techniques that might improve the effects of intervention. 30 Behaviour change techniques such as self-monitoring and action planning have been shown to influence the effects of physical activity interventions in adults (>18 years) 31 and older adults (>60 years) 32 with differing effectiveness of techniques in different populations. We therefore tested whether any intervention components from the behaviour change technique taxonomy were associated with higher or lower intervention effects. 30 Other meta-analyses found that physical activity interventions work better when they provide people with autonomy (eg, choices, invitational language). 33 Autonomy is not well captured in the taxonomy for behaviour change technique. We therefore tested whether effects were stronger in studies that provided more autonomy support to patients. Finally, to understand the mechanism of intervention effects, such as self-confidence, affect, and physical fitness, we collated all studies that conducted formal mediation analyses.

Our findings are presented according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses-Network Meta-analyses (PRISMA-NMA) guidelines (see supplementary file, section S0; all supplementary files, data, and code are also available at https://osf.io/nzw6u/ ). 34 We amended our analysis strategy after registering our review; these changes were to better align with new norms established by the Cochrane Comparing Multiple Interventions Methods Group. 35 These norms were introduced between the publication of our protocol and the preparation of this manuscript. The largest change was using the confidence in network meta-analysis (CINeMA) 35 online tool instead of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) guidelines and adopting methods to facilitate assessments—for example, instead of using an omnibus test for all treatments, we assessed publication bias for each treatment compared with active controls. We also modelled acceptability (through dropout rate), which was not predefined but was adopted in response to a reviewer’s comment.

Eligibility criteria

To be eligible for inclusion, studies had to be randomised controlled trials that included exercise as a treatment for depression and included participants who met the criteria for major depressive disorder, either clinician diagnosed or identified through participant self-report as exceeding established clinical thresholds (eg, scored >13 on the Beck depression inventory-II). 36 Studies could meet these criteria when all the participants had depression or when the study reported depression outcomes for a subgroup of participants with depression at the start of the study.

We defined exercise as “planned, structured and repetitive bodily movement done to improve or maintain one or more components of physical fitness.” 37 Unlike recent reviews, 12 22 we included studies with more than one exercise arm and multifaceted interventions (eg, health and exercise counselling) as long as they contained a substantial exercise component. These trials could be included because network meta-analysis methods allow for the grouping of those interventions into homogeneous nodes. Unlike the most recent Cochrane review, 12 we also included participants with physical comorbidities such as arthritis and participants with postpartum depression because the Diagnostic and Statistical Manual of Mental Disorders, fifth edition, removed the postpartum onset specifier after that analysis was completed. 23 Studies were excluded if interventions were shorter than one week, depression was not reported as an outcome, or data were insufficient to calculate an effect size for each arm. Any comparison condition was included, allowing us to quantify the effects against established treatments (eg, selective serotonin reuptake inhibitors (SSRIs), cognitive behavioural therapy), active control conditions (usual care, placebo tablet, stretching, educational control, and social support), or waitlist control conditions. Published and unpublished studies were included, with no restrictions on language applied.

Information sources

We adapted the search strategy from the most recent Cochrane review, 12 adding keywords for yoga, tai chi, and qigong, as they met our definition for exercise. We conducted database searches, without filters or date limits, in The Cochrane Library via CENTRAL, SPORTDiscus via Embase, and Medline, Embase, and PsycINFO via Ovid. Searches of the databases were conducted on 17 December 2018 and 7 August 2020 and last updated on 3 June 2023 (see supplementary file section S1 for full search strategies). We assessed full texts of all included studies from two systematic reviews of exercise for depression. 12 22

Study selection and data collection

To select studies, we removed duplicate records in Covidence 38 and then screened each title and abstract independently and in duplicate. Conflicts were resolved through discussion or consultation with a third reviewer. The same methods were used for full text screening.

We used the Extraction 1.0 randomised controlled trial data extraction forms in Covidence. 38 Data were extracted independently and in duplicate, with conflicts resolved through discussion with a third reviewer.

For each study, we extracted a description of the interventions, including frequency, intensity, and type and time of each exercise intervention. Using the Compendium of Physical Activities, 39 we calculated the energy expenditure dose of exercise for each arm as metabolic equivalents of task (METs) min/week. Two authors evaluated each exercise intervention using the Behaviour Change Taxonomy version 1 30 for behaviour change techniques explicitly described in each exercise arm. They also rated the level of autonomy offered to participants, on a scale from 1 (no choice) to 10 (full autonomy). We also extracted descriptions of the other arms within the randomised trials, including other treatment or control conditions; participants’ age, sex, comorbidities, and baseline severity of depressive symptoms; and each trial’s location and whether or not the trial was funded.
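As a rough illustration of the dose calculation described above, the following Python sketch multiplies an activity's MET value by the minutes per session and the sessions per week. The function name, the formula, and the example MET value are assumptions made for illustration; the review took its MET values from the Compendium of Physical Activities and does not spell out the exact formula in the text.

```python
def met_minutes_per_week(met_value, minutes_per_session, sessions_per_week):
    """Weekly energy expenditure dose in MET-min/week (illustrative formula)."""
    if met_value <= 0:
        raise ValueError("met_value must be positive")
    if minutes_per_session < 0 or sessions_per_week < 0:
        raise ValueError("duration and frequency must be non-negative")
    return met_value * minutes_per_session * sessions_per_week


# Hypothetical arm: three 40 minute sessions per week at an assumed 7.0 METs.
dose = met_minutes_per_week(7.0, 40, 3)
print(dose)  # 840.0
```

Under this formula, two arms with the same weekly dose can still differ in intensity (for example, short vigorous sessions versus long light ones), which is why intensity and dose are treated as separate moderators.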

Risk of bias in individual studies

We used Cochrane’s risk of bias tool for randomised controlled trials. 40 Risk of bias was rated independently and in duplicate, with conflicts resolved through discussion with a third reviewer.

Summary measures and synthesis

For main and moderation analyses, we used bayesian arm based multilevel network meta-analysis models. 41 All network meta-analytical approaches allow users to assess the effects of treatments against a range of comparisons. The bayesian arm based models allowed us to also assess the influence of hypothesised moderators, such as intensity, dose, age, and sex. Many network meta-analyses use contrast based methods, comparing post-test scores between study arms. 41 Arm based meta-analyses instead describe the population-averaged absolute effect size for each treatment arm (ie, each arm’s change score). 41 As a result, the summary measure we used was the standardised mean change from baseline, calculated as standardised mean differences with correction for small studies (Hedges’ g). In keeping with the norms from the included studies, effect sizes describe treatment effects on depression, such that larger negative numbers represent stronger effects on symptoms. Using National Institute for Health and Care Excellence guidelines, 42 we standardised change scores for different depression scales (eg, Beck depression inventory, Hamilton depression rating scale) using an internal reference standard for each scale (for each scale, the average of pooled standard deviations at baseline) reported in our meta-analysis. Because depression scores generally show regression to the mean, even in control conditions, we present effect sizes as improvements beyond active control conditions. This convention makes our results comparable to existing, contrast based meta-analyses.
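The summary measure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code (which is available at the OSF link given earlier): it assumes the common small-sample correction J = 1 − 3/(4df − 1) with df = n − 1, and it takes the internal reference standard deviation for the scale as an input. All names and the example numbers are hypothetical.

```python
def hedges_correction(df):
    """Common approximation to the small-sample correction factor J."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def standardised_mean_change(mean_pre, mean_post, sd_reference, n):
    """Standardised mean change from baseline for one arm, with Hedges' correction.

    sd_reference is the reference SD for the depression scale (assumed here to be
    the average pooled baseline SD for that scale across included studies).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if sd_reference <= 0:
        raise ValueError("sd_reference must be positive")
    raw_change = (mean_post - mean_pre) / sd_reference
    return hedges_correction(n - 1) * raw_change


# Hypothetical arm: mean score falls from 30 to 20, reference SD 10, 25 participants.
g = standardised_mean_change(30.0, 20.0, 10.0, 25)
print(round(g, 3))  # -0.968
```

Negative values indicate a reduction in symptoms, matching the sign convention used in the results.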

Active control conditions (usual care, placebo tablet, stretching, educational control, and social support) were grouped to increase power for moderation analyses, for parsimony in the network graph, and because they all showed similar arm based pooled effect sizes (Hedges’ g between −0.93 and −1.00 for all, with no statistically significant differences). We separated waitlist control from these active control conditions because it typically shows poorer effects in treatment for depression. 43

Bayesian meta-analyses were conducted in R 44 using the brms package. 45 We preregistered informative priors based on the distributional parameters of our meta-analytical model. 46 We nested effects within arms to manage dependency between multiple effect sizes from the same participants. 46 For example, if one study reported two self-reported measures of depression, or reported both self-report and clinician rated depression, we nested these effect sizes within the arm to account for both pieces of information while controlling for dependency between effects. 46 Finally, we compared absolute effect sizes against a standardised minimum clinically important difference, 0.5 standard deviations of the change score. 47 From our data, this corresponded to a large change in before and after scores (Hedges’ g −1.16), a moderate change compared with waitlist control (g −0.55), or a small benefit when compared with active controls (g −0.20). For credibility assessments comparing exercise modalities, we used the netmeta package 48 and CINeMA. 49 We also used netmeta to model acceptability, comparing the odds ratio for drop-out rate in each arm.

Additional analyses

All prespecified moderation and sensitivity analyses were performed. We moderated for participant characteristics, including participants’ sex, age, baseline symptom severity, and presence or absence of comorbidities; duration of the intervention (weeks); weekly dose of the intervention; duration between completion of treatment and measurement, to test robustness to remission (in response to a reviewer’s suggestion); amount of autonomy provided in the exercise prescription; and presence of each behaviour change technique. As preregistered, we moderated for behaviour change techniques in three ways: through meta-regression, including all behaviour change techniques simultaneously for primary analysis; including one behaviour change technique at a time (using 99% credible intervals to somewhat control for multiple comparisons) in exploratory analyses; and through meta-analytical classification and regression trees (metaCART), which allowed for interactions between moderating variables (eg, if goal setting combined with feedback had synergistic effects). 50 We conducted sensitivity analyses for risk of bias, assessing whether studies with low versus unclear or high risk of bias on each domain showed statistically significant differences in effect sizes.

Credibility assessment

To assess the credibility of each comparison against active control, we used CINeMA. 35 49 This online tool was designed by the Cochrane Comparing Multiple Interventions Methods Group as an adaptation of GRADE for network meta-analyses. 35 In line with recommended guidelines, for each comparison we made judgements for within study bias, reporting bias, indirectness, imprecision, heterogeneity, and incoherence. Similar to GRADE, we considered the evidence for comparisons to show high confidence then downgraded on the basis of concerns in each domain, as follows:

Within study bias —Comparisons were downgraded when most of the studies providing direct evidence for comparisons were unclear or high risk.

Reporting bias —Publication bias was assessed in three ways. For each comparison with at least 10 studies 51 we created funnel plots, including estimates of effect sizes after removing studies with statistically significant findings (ie, worst case estimates) 52 ; calculated an s value, representing how strong publication bias would need to be to nullify meta-analytical effects 52 ; and conducted a multilevel Egger’s regression test, indicative of small study bias. Given these tests are not recommended for comparisons with fewer than 10 studies, 51 those comparisons were considered to show “some concerns.”

Indirectness — Our primary population of interest was adults with major depression. Studies were considered to be indirect if they focused on one sex only (>90% male or female), participants with comorbidities (eg, heart disease), adolescents and young adults (14-20 years), or older adults (>60 years). We flagged these studies as showing some concerns if one of these factors was present, and as “major concerns” if two of these factors were present. Evidence from comparisons was classified as some concerns or major concerns using majority rating for studies directly informing the comparison.

Imprecision — As per CINeMA, we used the clinically important difference of Hedges’ g=0.2 to ascribe a zone of equivalence, where differences were not considered clinically significant (−0.2<g<0.2). Studies were flagged as some concerns for imprecision if the bounds of the 95% credible interval extended across that zone, and they were flagged as major concerns if the bounds extended to the other side of the zone of equivalence (such that effects could be harmful).

Heterogeneity — Prediction intervals account for heterogeneity differently from credible intervals. 35 As a result, CINeMA accounts for heterogeneity by assessing whether the prediction intervals and the credible intervals lead to different conclusions about clinical significance (using the same zone of equivalence from imprecision). Comparisons are flagged as some concerns if the prediction interval crosses into, or out of, the zone of equivalence once (eg, from helpful to no meaningful effect), and as major concerns if the prediction interval crosses the zone twice (eg, from helpful to harmful).

Incoherence — Incoherence assesses whether the network meta-analysis provides similar estimates when using direct evidence (eg, randomised controlled trials on strength training versus SSRI) compared with indirect evidence (eg, randomised controlled trials where either strength training or SSRI uses waitlist control). Incoherence provides some evidence the network may violate the assumption of transitivity: that the only systematic difference between arms is the treatment, not other confounders. We assessed incoherence using two methods: Firstly, a global design-by-treatment interaction to assess for incoherence across the whole network, 35 49 and, secondly, separating indirect and direct evidence (SIDE method) for each comparison through netsplitting to see whether differences between those effect estimates were statistically significant. We flagged comparisons as some concerns if either no direct comparisons were available or direct and indirect evidence gave different conclusions about clinical significance (eg, from helpful to no meaningful effect, as per imprecision and heterogeneity). Again, we classified comparisons as major concerns if the direct and indirect evidence changed the sign of the effect or changed both limits of the credible interval. 35 49
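The indirectness, imprecision, and heterogeneity rules above can be sketched as simple decision functions. The Python below is one possible reading of those rules, not the CINeMA tool itself: it assumes that a rating depends on how many edges of the zone of equivalence (−0.2 and 0.2) fall inside an interval, and that heterogeneity is judged by the extra edges crossed by the prediction interval relative to the credible interval. Function names and example intervals are hypothetical.

```python
ZONE = 0.2  # half-width of the zone of equivalence on the Hedges' g scale
LABELS = ("no concerns", "some concerns", "major concerns")


def boundaries_crossed(lower, upper, zone=ZONE):
    """Count how many edges of the zone (-zone, +zone) lie strictly inside the interval."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return sum(1 for edge in (-zone, zone) if lower < edge < upper)


def rate_indirectness(n_factors):
    """One indirectness factor gives some concerns; two or more give major concerns."""
    if n_factors < 0:
        raise ValueError("n_factors must be non-negative")
    return LABELS[min(n_factors, 2)]


def rate_imprecision(ci_lower, ci_upper, zone=ZONE):
    """Rate a 95% credible interval against the zone of equivalence."""
    return LABELS[boundaries_crossed(ci_lower, ci_upper, zone)]


def rate_heterogeneity(credible, prediction, zone=ZONE):
    """Rate by the extra zone edges crossed by the prediction interval."""
    extra = boundaries_crossed(*prediction, zone) - boundaries_crossed(*credible, zone)
    return LABELS[max(0, min(extra, 2))]


# Hypothetical intervals, for illustration only.
print(rate_indirectness(1))                             # some concerns
print(rate_imprecision(-0.6, -0.1))                     # some concerns
print(rate_imprecision(-0.6, 0.3))                      # major concerns
print(rate_heterogeneity((-0.9, -0.4), (-1.2, -0.1)))   # some concerns
```

An interval lying entirely inside the zone crosses no edge and is rated as no concerns in this sketch; that choice is an assumption, since the text does not address that case.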

Patient and public involvement

We discussed the aims and design of this study with members of the public, including those who had experienced depression. Several of our authors have experienced major depressive episodes, but beyond that we did not include patients in the conduct of this review.

Study selection

The PRISMA flow diagram outlines the study selection process ( fig 1 ). We used two previous reviews to identify potentially eligible studies for inclusion. 12 22 Database searches identified 18 658 possible studies. After 5505 duplicates had been removed, two reviewers independently screened 13 115 titles and abstracts. After screening, two reviewers independently reviewed 1738 full text articles. Supplementary file section S2 shows the consensus reasons for exclusion. A total of 218 unique studies described in 246 reports were included, totalling 495 arms and 14 170 participants. Supplementary file section S3 lists the references and characteristics of the included studies.

Fig 1

Flow of studies through review

Network geometry

As preregistered, we removed nodes with fewer than 100 participants. Using this filter, most interventions contained comparisons with at least four other nodes in the network geometry ( fig 2 ). The results of the global test design-by-treatment interaction model were not statistically significant, supporting the assumption of transitivity (χ²=94.92, df=75, P=0.06). When net-splitting was used on all possible combinations in the network, for two out of the 120 comparisons we found statistically significant incoherence between direct and indirect evidence (SSRI v waitlist control; cognitive behavioural therapy v tai chi or qigong). Overall, we found little statistical evidence that the model violated the assumption of transitivity. Qualitative differences were, however, found for participant characteristics between different arms (see supplementary file, section S4). For example, some interventions appeared to be prescribed more frequently among people with severe depression (eg, 7/16 studies using SSRIs) compared with other interventions (eg, 1/15 studies using aerobic exercise combined with therapy). Similarly, some interventions appeared more likely to be prescribed for older adults (eg, mean age, tai chi=59 v dance=31) or women (eg, per cent female: dance=88% v cycling=53%). Given that plausible mechanisms exist for these systematic differences (eg, the popularity of tai chi among older adults), 53 there are reasons to believe that allocation to treatment arms would be less than perfectly random. We have factored these biases in our certainty estimates through indirectness ratings.

Fig 2

Network geometry indicating number of participants in each arm (size of points) and number of comparisons between arms (thickness of lines). SSRI=selective serotonin reuptake inhibitor

Risk of bias within studies

Supplementary file section S5 provides the risk of bias ratings for each study. Few studies explicitly blinded participants and staff ( fig 3 ). As a result, overall risk of bias for most studies was unclear or high, and effect sizes could include expectancy effects, among other biases. However, sensitivity analyses suggested that effect sizes were not influenced by any risk of bias criteria owing to wide credible intervals (see supplementary file, section S6). Nevertheless, certainty ratings for all treatment arms were downgraded owing to high risk of bias in the studies informing the comparison.

Fig 3

Risk of bias summary plot showing percentage of included studies judged to be low, unclear, or high risk across Cochrane criteria for randomised trials

Synthesis of results

Supplementary file section S7 presents a forest plot of Hedges’ g values for each study. Figure 4 shows the predicted effects of each treatment compared with active controls. Compared with active controls, large reductions in depression were found for dance (n=107, κ=5, Hedges’ g −0.96, 95% credible interval −1.36 to −0.56) and moderate reductions for walking or jogging (n=1210, κ=51, g −0.63, −0.80 to −0.46), yoga (n=1047, κ=33, g=−0.55, −0.73 to −0.36), strength training (n=643, κ=22, g=−0.49, −0.69 to −0.29), mixed aerobic exercises (n=1286, κ=51, g=−0.43, −0.61 to −0.25), and tai chi or qigong (n=343, κ=12, g=−0.42, −0.65 to −0.21). Moderate, clinically meaningful effects were also present when exercise was combined with SSRIs (n=268, κ=11, g=−0.55, −0.86 to −0.23) or aerobic exercise was combined with psychotherapy (n=404, κ=15, g=−0.54, −0.76 to −0.32). All these treatments were significantly stronger than the standardised minimum clinically important difference compared with active control (g=−0.20), equating to an absolute g value of −1.16. Dance, exercise combined with SSRIs, and walking or jogging were the treatments most likely to perform best when modelling the surface under the cumulative ranking curve ( fig 4 ). For acceptability, the odds of participants dropping out of the study were lower for strength training (n=247, direct evidence κ=6, odds ratio 0.55, 95% credible interval 0.31 to 0.99) and yoga (n=264, κ=5, 0.57, 0.35 to 0.94) than for active control. The rate of dropouts was not significantly different from active control in any other arms (see supplementary file, section S8).

Fig 4

Predicted effects of different exercise modalities on major depression compared with active controls (eg, usual care), with 95% credible intervals. The estimate of effects for the active control condition was a before and after change of Hedges’ g of −0.95 (95% credible interval −1.10 to −0.79), n=3554, κ =113. Colour represents SUCRA from most likely to be helpful (dark purple) to least likely to be helpful (light purple). SSRI=selective serotonin reuptake inhibitor; SUCRA=surface under the cumulative ranking curve
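For readers unfamiliar with SUCRA, the following Python sketch shows how the statistic is conventionally computed from a treatment's rank probabilities: it is the mean of the cumulative probabilities of being among the top k ranks, for k from 1 to one less than the number of treatments. The rank probabilities below are hypothetical, and the sketch does not reproduce how the review derived them from its Bayesian model.

```python
def sucra(rank_probabilities):
    """Surface under the cumulative ranking curve for one treatment.

    rank_probabilities[k] is the probability that the treatment holds rank k + 1
    (rank 1 being best) among all treatments in the network.
    """
    n_treatments = len(rank_probabilities)
    if n_treatments < 2:
        raise ValueError("need at least two treatments")
    if abs(sum(rank_probabilities) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities for ranks 1 .. n_treatments - 1.
    for probability in rank_probabilities[:-1]:
        cumulative += probability
        total += cumulative
    return total / (n_treatments - 1)


# Hypothetical three-treatment network.
print(round(sucra([0.5, 0.3, 0.2]), 2))  # 0.65
print(sucra([1.0, 0.0, 0.0]))            # 1.0 (always ranked best)
print(sucra([0.0, 0.0, 1.0]))            # 0.0 (always ranked worst)
```

A SUCRA near 1 means the treatment is almost always ranked among the best, whereas a value near 0 means it is almost always ranked among the worst.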

Consistent with other meta-analyses, effects were moderate for cognitive behaviour therapy alone (n=712, κ=20, g=−0.55, −0.75 to −0.37) and small for SSRIs (n=432, κ=16, g=−0.26, −0.50 to −0.01) compared with active controls ( fig 4 ). These estimates are comparable to those of reviews that focused directly on psychotherapy (g=−0.67, −0.79 to −0.56) 7 or pharmacotherapy (g=−0.30, −0.34 to −0.26). 25 However, our review was not designed to find all studies of these treatments, so these estimates should not usurp these directly focused systematic reviews.

Despite the large number of studies in the network, confidence in the effects was low ( fig 5 ). This was largely due to the high within study bias described in the risk of bias summary plot. Reporting bias was also difficult to robustly assess because direct comparison with active control was often only provided in fewer than 10 studies. Many studies focused on one sex only, older adults, or those with comorbidities, so most arms had some concerns about indirect comparisons. Credible intervals were seldom wide enough to change decision making, so concerns about imprecision were few. Heterogeneity did plausibly change some conclusions around clinical significance. Few studies showed problematic incoherence, meaning direct and indirect evidence usually agreed. Overall, walking or jogging had low confidence, with other modalities being very low.

Fig 5

Summary table for credibility assessment using confidence in network meta-analysis (CINeMA). SSRI=selective serotonin reuptake inhibitor

Moderation by participant characteristics

The optimal modality appeared to be moderated by age and sex. Compared with models that only included exercise modality (R²=0.65), R² was higher for models that included interactions with sex (R²=0.71) and age (R²=0.69). R² showed no substantial increase for models including baseline depression (R²=0.67) or comorbidities (R²=0.66; see supplementary file, section S9).

Effects appeared larger for women than men for strength training and cycling ( fig 6 ). Effects appeared to be larger for men than women when prescribing yoga, tai chi, and aerobic exercise alongside psychotherapy. Yoga and aerobic exercise alongside psychotherapy appeared more effective for older participants than younger people ( fig 7 ). Strength training appeared more effective when prescribed to younger participants than older participants. Some estimates were associated with substantial uncertainty because some modalities were not well studied in some groups (eg, tai chi for younger adults), and mean age of the sample was only available for 71% of the studies.

Fig 6

Effects of interventions versus active control on depression (lower is better) by sex. Shading represents 95% credible intervals

Fig 7

Effects of interventions versus active control on depression (lower is better) by age. Shading represents 95% credible intervals

Moderation by intervention and design characteristics

Across modalities, a clear dose-response curve was observed for intensity of exercise prescribed ( fig 8 ). Although light physical activity (eg, walking, hatha yoga) still provided clinically meaningful effects (g=−0.58, −0.82 to −0.33), expected effects were stronger for vigorous exercise (eg, running, interval training; g=−0.74, −1.10 to −0.38). This finding did not appear to be due to increased weekly energy expenditure: credible intervals were wide, which meant that the dose-response curve for METs/min prescribed per week was unclear (see supplementary file, section S10). Weak evidence suggested that shorter interventions (eg, 10 weeks: g=−0.53, −0.71 to −0.35) worked somewhat better than longer ones (eg, 30 weeks: g=−0.37, −0.79 to 0.03), with wide credible intervals again indicating high uncertainty (see supplementary file, section S11). We also moderated for the lag between the end of treatment and the measurement of the outcome. We found no indication that participants were likely to relapse within the measurement period (see supplementary file, section S12); effects remained steady when measured either directly after the intervention (g=−0.59, −0.80 to −0.39) or up to six months later (g=−0.63, −0.87 to −0.40).

Fig 8

Dose-response curve for intensity (METs) across exercise modalities compared with active control. METs=metabolic equivalents of task

Supplementary file section S13 provides coding for the behaviour change techniques and autonomy for each exercise arm. None of the behaviour change techniques significantly moderated overall effects. Contrary to expectations, studies describing a level of participant autonomy (ie, choice over frequency, intensity, type, or time) tended to show weaker effects (g=−0.28, −0.78 to 0.23) than those that did not (g=−0.75, −1.17 to −0.33; see supplementary file, section S14). This effect was consistent whether or not we included studies that used physical activity counselling (usually high autonomy).

Use of group exercise appeared to moderate the effects: although the overall effects were similar for individual (g=−1.10, −1.57 to −0.64) and group exercise (g=−1.16, −1.61 to −0.73), some interventions were better delivered in groups (yoga) and some were better delivered individually (strength training, mixed aerobic exercise; see supplementary file, section S15).

As preregistered, we tested whether study funding moderated effects. Models that included whether a study was funded did explain more variance (R²=0.70) compared with models that included treatment alone (R²=0.65). Funded studies showed stronger effects (g=−1.01, −1.19 to −0.82) than unfunded studies (g=−0.77, −1.09 to −0.46). We also moderated for the type of measure (self-report v clinician report). This did not explain a substantial amount of variance in the outcome (R²=0.66).

Sensitivity analyses

Evidence of publication bias was found for overall estimates of exercise on depression compared with active controls, although not enough to nullify effects. The multilevel Egger’s test showed significance (F(1,98)=23.93, P<0.001). Funnel plots showed asymmetry, but the result of pooled effects remained statistically significant when only including non-significant studies (see supplementary file, section S16). No amount of publication bias would be sufficient to shrink effects to zero (s value=not possible). To reduce effects below clinical significance thresholds, studies with statistically significant results would need to be reported 58 times more frequently than studies with non-significant results.

Qualitative synthesis of mediation effects

Only a few of the studies used explicit mediation analyses to test hypothesised mechanisms of action. 54 55 56 57 58 59 One study found that both aerobic exercise and yoga led to decreased depression because participants ruminated less. 54 The study found that the effects of aerobic exercise (but not yoga) were mediated by increased acceptance. 54 “Perceived hassles” and awareness were not statistically significant mediators. 54 Another study found that the effects of yoga were mediated by increased self-compassion, but not rumination, self-criticism, tolerance of uncertainty, body awareness, body trust, mindfulness, and attentional biases. 55 One study found that the effects from an aerobic exercise intervention were not mediated by long term physical activity, but instead were mediated by exercise specific affect regulation (eg, self-control for exercise). 57 Another study found that neither exercise self-efficacy nor depression coping self-efficacy mediated effects of aerobic exercise. 56 Effects of aerobic exercise were not mediated by the N2 amplitude from electroencephalography, hypothesised as a neuro-correlate of cognitive control deficits. 58 Increased physical activity did not appear to mediate the effects of physical activity counselling on depression. 59 It is difficult to infer strong conclusions about mechanisms on the basis of this small number of studies with low power.

Summary of evidence

In this systematic review and meta-analysis of randomised controlled trials, exercise showed moderate effects on depression compared with active controls, either alone or in combination with other established treatments such as cognitive behaviour therapy. In isolation, the most effective exercise modalities were walking or jogging, yoga, strength training, and dancing. Although walking or jogging were effective for both men and women, strength training was more effective for women, and yoga or qigong was more effective for men. Yoga was somewhat more effective among older adults, and strength training was more effective among younger people. The benefits from exercise tended to be proportional to the intensity prescribed, with vigorous activity being better. Benefits were similar across different weekly doses, for people with different comorbidities, and for different baseline levels of depression. Although confidence in many of the results was low, treatment guidelines may be overly conservative by conditionally recommending exercise as complementary or alternative treatment for patients in whom psychotherapy or pharmacotherapy is either ineffective or unacceptable. 60 Instead, guidelines for depression ought to include prescriptions for exercise and consider adapting the modality to participants’ characteristics and recommending more vigorous intensity exercises.

Our review did not uncover clear causal mechanisms, but the trends in the data are useful for generating hypotheses. It is unlikely that any single causal mechanism explains all the findings in the review. Instead, we hypothesise that a combination of social interaction, 61 mindfulness or experiential acceptance, 62 increased self-efficacy, 33 immersion in green spaces, 63 neurobiological mechanisms, 64 and acute positive affect 65 combine to generate outcomes. Meta-analyses have found each of these factors to be associated with decreases in depressive symptoms, but no single treatment covers all mechanisms. Some may more directly promote mindfulness (eg, yoga), be more social (eg, group exercise), be conducted in green spaces (eg, walking), provide a more positive affect (eg, “runner’s high”), or be more conducive to acute adaptations that may increase self-efficacy (eg, strength). 66 Exercise modalities such as running may satisfy many of the mechanisms, but they are unlikely to directly promote the mindful self-awareness provided by yoga and qigong. Both these forms of exercise are often practised in groups with explicit mindfulness but seldom have fast and objective feedback loops that improve self-efficacy. Adequately powered studies testing multiple mediators may help to focus more on understanding why exercise helps depression and less on whether exercise helps. We argue that understanding these mechanisms of action is important for personalising prescriptions and better understanding effective treatments.

Our review included more studies than many existing reviews on exercise for depression. 13 22 27 28 As a result, we were able to combine the strengths of various approaches to exercise and to make more nuanced and precise conclusions. For example, even taking conservative estimates (ie, the least favourable end of the credible interval), practitioners can expect patients to experience clinically significant effects from walking, running, yoga, qigong, strength training, and mixed aerobic exercise. Because we simultaneously assessed more than 200 studies, credible intervals were narrower than those in most existing meta-analyses. 13 We were also able to explore non-linear relationships between outcomes and moderators, such as frequency, intensity, and time. These analyses supported some existing findings—for example, our study and the study by Heissel et al 22 found that shorter interventions had stronger effects, at least for six months; our study and the study by Singh et al 13 both found that effects were stronger with vigorous intensity exercise compared with light and moderate exercise. However, most existing reviews found various treatment modalities to be equally effective. 13 27 In our review, some types of exercise had stronger effect sizes than others. We attribute this to the study level data available in a network meta-analysis compared with an overview of reviews 24 and higher power compared with meta-analyses with smaller numbers of included studies. 22 28 Overviews of reviews have the ability to more easily cover a wider range of participants, interventions, and outcomes, but also risk double counting randomised trials that are included in separate meta-analyses. They often include heterogeneous studies without having as much control over moderation analyses (eg, Singh et al included studies covering both prevention and treatment 13 ). 
Some of those reviews grouped interventions such as yoga with heterogeneous interventions such as stretching and qigong. 13 This practice of combining different interventions makes it harder to interpret meta-analytical estimates. We used methods that enabled us to separately analyse the effects of these treatment modalities. In so doing, we found that these interventions do have different effects, with yoga being an intervention with strong effects and stretching being better described as an active control condition. Network meta-analyses revealed the same phenomenon with psychotherapy: researchers once concluded there was a dodo bird verdict, whereby “everybody has won, and all must have prizes,” 67 until network meta-analyses showed some interventions were robustly more effective than others. 6 26

Predictors of acceptability and outcomes

We found evidence to suggest good acceptability of yoga and strength training, although the measurement of study drop-out is an imperfect proxy of adherence. Participants may complete the study without doing any exercise or may continue exercising and drop out of the study for other reasons. Nevertheless, these are useful data when considering adherence.

Behaviour change techniques, which are designed to increase adherence, did not meaningfully moderate the effect sizes from exercise. This may be due to several factors. It may be that the modality explains most of the variance between effects, such that behaviour change techniques (eg, presence or absence of feedback) did not provide a meaningful contribution. Many forms of exercise potentially contain therapeutic benefits beyond just energy expenditure. These characteristics of a modality may be more influential than coexisting behaviour change techniques. Alternatively, researchers may have used behaviour change techniques such as feedback or goal setting without explicitly reporting them in the study methods. Given the inherent challenges of behaviour change among people with depression, 29 and the difficulty in forecasting which strategies are likely to be effective, 68 we see the identification of effective techniques as important.

We did find that autonomy, as provided in the methods of included studies, predicted effects, but in the opposite direction to our hypotheses: more autonomy was associated with weaker effects. Physical activity counselling, which usually provides a great deal of patient autonomy, was among the lowest effect sizes in our meta-analysis. Higher autonomy judgements were associated with weaker outcomes regardless of whether physical activity counselling was included in the model. One explanation for these data is that people with depression benefit from the clear direction and accountability of a standardised prescription. When provided with more freedom, the low self-efficacy that is symptomatic of depression may stop patients from setting an appropriate level of challenge (eg, they may be less likely to choose vigorous exercise). Alternatively, participants were likely autonomous when self-selecting into trials with exercise modalities they enjoyed, or those that fit their social circumstances. After choosing something value aligned, autonomy within the trial may not have been helpful. Either way, data should be interpreted with caution. Our judgement of the autonomy provided in the methods may not reflect how much autonomy support patients actually felt. The patient’s perceived autonomy is likely determined by a range of factors not described in the methods (eg, the social environment created by those delivering the programme, or their social identity), so other studies that rely on patient reports of the motivational climate are likely to be more reliable. 33 Our findings reiterate the importance of considering these patient reports in future research of exercise for depression.

Our findings suggest that practitioners could advocate for most patients to engage in exercise. Those patients may benefit from guidance on intensity (ie, vigorous) and types of exercise that appear to work well (eg, walking, running, mixed aerobic exercise, strength training, yoga, tai chi, qigong) and be well tolerated (eg, strength training and yoga). If social determinants permit, 66 engaging in group exercise or structured programmes could provide support and guidance to achieve better outcomes. Health services may consider offering these programmes as an alternative or adjuvant treatment for major depression. Specifically, although the confidence in the evidence for exercise is less strong than for cognitive behavioural therapy, the effect sizes seem comparable, so it may be an alternative for patients who prefer not to engage in psychotherapy. Previous reviews on those with mild-moderate depression have found similar effects for exercise or SSRIs, or the two combined. 13 14 In contrast, we found some forms of exercise to have stronger effects than SSRIs alone. Our findings are likely related to the larger power in our review (n=14 170) compared with previous reviews (eg, n=2551), 14 and our ability to better account for heterogeneity in exercise prescriptions. Exercise may therefore be considered a viable alternative to drug treatment. We also found evidence that exercise increases the effects of SSRIs, so offering exercise may act as an adjuvant for those already taking drugs. We agree with consensus statements that professionals should still account for patients’ values, preferences, and constraints, ensuring there is shared decision making around what best suits the patient. 66 Our review provides data to help inform that decision.

Strengths, limitations, and future directions

Based on our findings, dance appears to be a promising treatment for depression, with large effects found compared with other interventions in our review. But the small number of studies, low number of participants, and biases in the study designs prohibit us from recommending dance more strongly. Given most research for the intervention has been in young women (88% female participants, mean age 31 years), it is also important for future research to assess the generalisability of the effects to different populations, using robust experimental designs.

The studies we found may be subject to a range of experimental biases. In particular, researchers seldom blinded participants or staff delivering the intervention to the study’s hypotheses. Blinding for exercise interventions may be harder than for drugs 23 ; however, future studies could attempt to blind participants and staff to the study’s hypotheses to avoid expectancy effects. 69 Some of our ratings are for studies published before the proliferation of reporting checklists, so the ratings might be too critical. 23 For example, before CONSORT, few authors explicitly described how they generated a random sequence. 23 Therefore, our risk of bias judgements may be too conservative. Similarly, we planned to use the Cochrane risk of bias (RoB) 1 tool 40 so we could use the most recent Cochrane review of exercise and depression 12 to calibrate our raters, and because RoB 2 had not yet been published. 70 Although assessments of bias between the two tools are generally comparable, 71 the RoB 1 tool can be more conservative when assessing open label studies with subjective assessments (eg, unblinded studies with self-reported measures for depression). 71 As a result, future reviews should consider using the latest risk of bias tool, which may lead to different assessments of bias in included studies.

Most of the main findings in this review appear robust to risks from publication bias. Specifically, pooled effect sizes decreased when accounting for risk of publication bias, but no degree of publication bias could nullify effects. We did not exclude grey literature, but our search strategy was not designed to systematically search grey literature or trial registries. Doing so can detect additional eligible studies 72 and reveal the numbers of completed studies that remain unpublished. 73 Future reviews should consider more systematic searches for this kind of literature to better quantify and mitigate risk of publication bias.

Similarly, our review was able to integrate evidence that directly compared exercise with other treatment modalities such as SSRIs or psychotherapy, while also informing estimates using indirect evidence (eg, comparing the relative effects of strength training and SSRIs when tested against a waitlist control). Our review did not, however, include all possible sources of indirect evidence. Network meta-analyses exist that directly focus on psychotherapy 7 and pharmacotherapy, 25 and these combined for treating depression. 6 Those reviews include more than 500 studies comparing psychological or drug interventions with controls. Harmonising the findings of those reviews with ours would provide stronger data on indirect effects.
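The logic of indirect evidence can be illustrated with a simple adjusted indirect comparison (often called the Bucher method). This is a deliberately simpler technique than the network meta-analysis used in this review, and every number below is invented for illustration rather than taken from our results. The sketch assumes two treatments that were each tested against the same waitlist control, with effects expressed as standardised mean differences.

```python
import math


def indirect_comparison(d_a_vs_c, se_a_vs_c, d_b_vs_c, se_b_vs_c):
    """Adjusted indirect comparison of A versus B through a common comparator C.

    Inputs are effect estimates (eg, standardised mean differences) and their
    standard errors for A versus C and for B versus C.
    Returns the indirect estimate for A versus B, its standard error,
    and an approximate 95% confidence interval.
    """
    d_a_vs_b = d_a_vs_c - d_b_vs_c
    # Variances add because the two estimates come from independent trials
    se_a_vs_b = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    interval = (d_a_vs_b - 1.96 * se_a_vs_b, d_a_vs_b + 1.96 * se_a_vs_b)
    return d_a_vs_b, se_a_vs_b, interval


# Hypothetical inputs: (effect versus waitlist, standard error)
strength_vs_waitlist = (-0.50, 0.15)
ssri_vs_waitlist = (-0.30, 0.20)

estimate, se, (lower, upper) = indirect_comparison(
    *strength_vs_waitlist, *ssri_vs_waitlist
)
print(f"Strength training vs SSRI: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the variances of the two direct estimates add, an indirect estimate is always less precise than either of its inputs, which is one reason why harmonising larger bodies of indirect evidence would strengthen these comparisons.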

Our review found some interesting moderators by age and sex, but these were at the study level rather than individual level—that is, rather than being able to determine whether women engaging in a strength intervention benefit more than men, we could only conclude that studies with more women showed larger effects than studies with fewer women. These studies may have been tailored towards women, so effects may be subject to confounding, as both sex and intervention may have changed. The same finding applied to age, where studies on older adults were likely adapted specifically to this age group. These between study differences may explain the heterogeneity in the effects of interventions, and confounding means our moderators for age and sex should be interpreted cautiously. Future reviews should consider individual patient meta-analyses to allow for more detailed assessments of participant level moderators.

Finally, for many modalities, the evidence is derived from small trials (eg, the median number of walking or jogging arms was 17). In addition to reducing risks from bias, primary research may benefit from deconstruction designs or from larger, head-to-head analyses of exercise modalities to better identify what works best for each candidate.

Clinical and policy implications

Our findings support the inclusion of exercise as part of clinical practice guidelines for depression, particularly vigorous intensity exercise. Doing so may help bridge the gap in treatment coverage by increasing the range of first line options for patients and health systems. 9 Globally there has been an attempt to reduce stigma associated with seeking treatment for depression. 74 Exercise may support this effort by providing patients with treatment options that carry less stigma. In low resource or funding constrained settings, group exercise interventions may provide relatively low cost alternatives for patients with depression and for health systems. When possible, ideal treatment may involve individualised care with a multidisciplinary team, where exercise professionals could take responsibility for ensuring the prescription is safe, personalised, challenging, and supported. In addition, those delivering psychotherapy may want to direct some time towards tackling cognitive and behavioural barriers to exercise. Exercise professionals might need to be trained in the management of depression (eg, managing risk) and to be mindful of the scope of their practice while providing support to deal with this major cause of disability.

Conclusions

Depression imposes a considerable global burden. Many exercise modalities appear to be effective treatments, particularly walking or jogging, strength training, and yoga, but confidence in many of the findings was low. We found preliminary data that may help practitioners tailor interventions to individuals (eg, yoga for older men, strength training for younger women). The World Health Organization recommends physical activity for everyone, including those with chronic conditions and disabilities, 75 but not everyone can access treatment easily. Many patients may have physical, psychological, or social barriers to participation. Still, some interventions with few costs, side effects, or pragmatic barriers, such as walking and jogging, are effective across people with different personal characteristics, severity of depression, and comorbidities. Those who are able may want to choose more intense exercise in a structured environment to further decrease depression symptoms. Health systems may want to provide these treatments as alternatives or adjuvants to other established interventions (cognitive behaviour therapy, SSRIs), while also attenuating risks to physical health associated with depression. 3 Therefore, effective exercise modalities could be considered alongside those interventions as core treatments for depression.

What is already known on this topic

Depression is a leading cause of disability, and exercise is often recommended alongside first line treatments such as pharmacotherapy and psychotherapy

Treatment guidelines and previous reviews disagree on how to prescribe exercise to best treat depression

What this study adds

Various exercise modalities are effective (walking, jogging, mixed aerobic exercise, strength training, yoga, tai chi, qigong) and well tolerated (especially strength training and yoga)

Effects appeared proportional to the intensity of exercise prescribed and were stronger for group exercise and interventions with clear prescriptions

Preliminary evidence suggests interactions between types of exercise and patients’ personal characteristics

Ethics statements

Ethical approval.

Not required.

Acknowledgments

We thank Lachlan McKee for his assistance with data extraction. We also thank Juliette Grosvenor and another librarian (anonymous) for their review of our search strategy.

Contributors: MN led the project, drafted the manuscript, and is the guarantor. MN, TS, PT, MM, BdPC, PP, SB, and CL drafted the initial study protocol. MN, TS, PT, BdPC, DvdH, JS, MM, RP, LP, RV, HA, and BV conducted screening, extraction, and risk of bias assessment. MN, JS, and JM coded methods for behaviour change techniques. MN and DGG conducted statistical analyses. PP, SB, and CL provided supervision and mentorship. All authors reviewed and approved the final manuscript. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding: None received.

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Data sharing: Data and code for reproducing analyses are available on the Open Science Framework (https://osf.io/nzw6u/).

The lead author (MN) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

Dissemination to participants and related patient and public communities: We plan to disseminate the findings of this study to lay audiences through mainstream and social media.

Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

  • World Health Organization. Depression. 2020 [cited 2020 Mar 12]. https://www.who.int/news-room/fact-sheets/detail/depression
  • Birkjær M, Kaats M, Rubio A. Wellbeing adjusted life years: A universal metric to quantify the happiness return on investment. Happiness Research Institute; 2020. https://www.happinessresearchinstitute.com/waly-report
  • Jacobson NC ,
  • Pinquart M ,
  • Duberstein PR
  • Cuijpers P ,
  • Karyotaki E ,
  • Vinkers CH ,
  • Cipriani A ,
  • Furukawa TA
  • Strawbridge R ,
  • Marwood L ,
  • Santomauro D ,
  • Collins PY ,
  • Generaal E ,
  • Lawlor DA ,
  • Cooney GM ,
  • Recchia F ,
  • Miller CT ,
  • Mundell NL ,
  • Gallardo-Gómez D ,
  • Del Pozo-Cruz J ,
  • Álvarez-Barbosa F ,
  • Alfonso-Rosa RM ,
  • Del Pozo Cruz B
  • Salcher-Konrad M ,
  • National Collaborating Centre for Mental Health (UK). Depression: The Treatment and Management of Depression in Adults (Updated Edition). Leicester (UK): British Psychological Society; https://www.ncbi.nlm.nih.gov/pubmed/22132433
  • Bassett D ,
  • American Psychiatric Association. Practice Guideline for the Treatment of Patients with Major Depressive Disorder. Third Edition. Washington, DC: American Psychiatric Association; 2010. 87 p. https://psychiatryonline.org/pb/assets/raw/sitewide/practice_guidelines/guidelines/mdd-1410197717630.pdf
  • NICE. Depression in adults: treatment and management. [cited 2023 Mar 13]. National Institute for Health and Care Excellence; 2022 https://www.nice.org.uk/guidance/ng222/resources
  • Heissel A ,
  • Brokmeier LL ,
  • Ekkekakis P
  • Chaimani A, Caldwell DM, Li T, Higgins JPT, Salanti G. Undertaking network meta-analyses. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al., editors. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane; 2022. www.training.cochrane.org/handbook
  • Furukawa TA ,
  • Salanti G ,
  • Miller KJ ,
  • Gonçalves-Bradley DC ,
  • Areerob P ,
  • Hennessy D ,
  • Mesagno C ,
  • Glowacki K ,
  • Duncan MJ ,
  • Gainforth H ,
  • Richardson M ,
  • Johnston M ,
  • Abraham C ,
  • Whittington C ,
  • McAteer J ,
  • French DP ,
  • Olander EK ,
  • Chisholm A ,
  • Mc Sharry J
  • Ntoumanis N ,
  • Prestwich A ,
  • Caldwell DM ,
  • Nikolakopoulou A ,
  • Higgins JPT ,
  • Papakonstantinou T ,
  • Caspersen CJ ,
  • Powell KE ,
  • Christenson GM
  • Veritas Health Innovation. Covidence systematic review software. Melbourne, Australia; 2023. www.covidence.org
  • Ainsworth BE ,
  • Haskell WL ,
  • Herrmann SD ,
  • Altman DG ,
  • Gøtzsche PC ,
  • Cochrane Bias Methods Group ,
  • Cochrane Statistical Methods Group
  • Hodges JS ,
  • Dias S, Welton NJ, Sutton AJ, Ades AE. NICE DSU technical support document 2: a generalised linear modelling framework for pairwise and network meta-analysis of randomised controlled trials. In: National Institute for Health and Care Excellence (NICE), editor. NICE Decision Support Unit Technical Support Documents. London: Citeseer; 2011. https://www.ncbi.nlm.nih.gov/books/NBK310366/
  • Faltinsen E ,
  • Todorovac A ,
  • Staxen Bruun L ,
  • R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2022. https://www.R-project.org/
  • Hengartner MP ,
  • Balduzzi S ,
  • Dusseldorp E ,
  • Sterne JAC ,
  • Sutton AJ ,
  • Ioannidis JPA ,
  • Mathur MB ,
  • VanderWeele TJ
  • Leung LYL ,
  • La Rocque CL ,
  • Mazurka R ,
  • Stuckless TJR ,
  • Harkness KL
  • Vollbehr NK ,
  • Hoenders HJR ,
  • Bartels-Velthuis AA ,
  • Zeibig JM ,
  • Seiffer B ,
  • Ehmann PJ ,
  • Alderman BL
  • Bombardier CH ,
  • Gibbons LE ,
  • American Psychological Association. Clinical practice guideline for the treatment of depression across three age cohorts. American Psychological Association; 2019. https://www.apa.org/depression-guideline/
  • van Straten A ,
  • Reynolds CF 3rd .
  • Johannsen M ,
  • Nissen ER ,
  • Lundorff M ,
  • Coventry PA ,
  • Schuch FB ,
  • Deslandes AC ,
  • Gosmann NP ,
  • Fleck MP de A
  • Saunders DH ,
  • Phillips SM
  • Teychenne M ,
  • Hunsley J ,
  • Di Giulio G
  • Milkman KL ,
  • Hecksteden A ,
  • Savović J ,
  • Richter B, Hemmingsen B. Comparison of the Cochrane risk of bias tool 1 (RoB 1) with the updated Cochrane risk of bias tool 2 (RoB 2). Cochrane; 2021. Report No.: 1. https://community.cochrane.org/sites/default/files/uploads/inline-files/RoB1_2_project_220529_BR%20KK%20formatted.pdf
  • Chandler J ,
  • Lefebvre C ,
  • Glanville J ,
  • Briscoe S ,
  • Coronado-Montoya S ,
  • Kwakkenbos L ,
  • Steele RJ ,
  • Turner EH ,
  • Angermeyer MC ,
  • van der Auwera S ,
  • Schomerus G
  • Al-Ansari SS ,


Our next-generation model: Gemini 1.5

Feb 15, 2024

The model delivers dramatically enhanced performance, with a breakthrough in long-context understanding across modalities.


A note from Google and Alphabet CEO Sundar Pichai:

Last week, we rolled out our most capable model, Gemini 1.0 Ultra, and took a significant step forward in making Google products more helpful, starting with Gemini Advanced. Today, developers and Cloud customers can begin building with 1.0 Ultra too, with our Gemini API in AI Studio and in Vertex AI.

Our teams continue pushing the frontiers of our latest models with safety at the core. They are making rapid progress. In fact, we’re ready to introduce the next generation: Gemini 1.5. It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.

This new generation also delivers a breakthrough in long-context understanding. We’ve been able to significantly increase the amount of information our models can process — running up to 1 million tokens consistently, achieving the longest context window of any large-scale foundation model yet.

Longer context windows show us the promise of what is possible. They will enable entirely new capabilities and help developers build much more useful models and applications. We’re excited to offer a limited preview of this experimental feature to developers and enterprise customers. Demis shares more on capabilities, safety and availability below.

Introducing Gemini 1.5

By Demis Hassabis, CEO of Google DeepMind, on behalf of the Gemini team

This is an exciting time for AI. New advances in the field have the potential to make AI more helpful for billions of people over the coming years. Since introducing Gemini 1.0, we’ve been testing, refining and enhancing its capabilities.

Today, we’re announcing our next-generation model: Gemini 1.5.

Gemini 1.5 delivers dramatically enhanced performance. It represents a step change in our approach, building upon research and engineering innovations across nearly every part of our foundation model development and infrastructure. This includes making Gemini 1.5 more efficient to train and serve, with a new Mixture-of-Experts (MoE) architecture.

The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro. It’s a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.

Gemini 1.5 Pro comes with a standard 128,000 token context window. But starting today, a limited group of developers and enterprise customers can try it with a context window of up to 1 million tokens via AI Studio and Vertex AI in private preview.

As we roll out the full 1 million token context window, we’re actively working on optimizations to improve latency, reduce computational requirements and enhance the user experience. We’re excited for people to try this breakthrough capability, and we share more details on future availability below.

These continued advances in our next-generation models will open up new possibilities for people, developers and enterprises to create, discover and build using AI.

Context lengths of leading foundation models

Highly efficient architecture

Gemini 1.5 is built upon our leading research on Transformer and MoE architecture. While a traditional Transformer functions as one large neural network, MoE models are divided into smaller “expert” neural networks.

Depending on the type of input given, MoE models learn to selectively activate only the most relevant expert pathways in their neural network. This specialization massively enhances the model’s efficiency. Google has been an early adopter and pioneer of the MoE technique for deep learning through research such as Sparsely-Gated MoE, GShard-Transformer, Switch-Transformer, M4 and more.
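To make the idea of selective activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is not Gemini's architecture, which this post does not describe in detail. It is a toy version of the general gating idea, and the experts, gate scores and function names are all made up for illustration.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, gate_scores, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(weight * experts[i](x) for i, weight in route_top_k(gate_scores, k))


# Toy "experts": in a real model each would be a small neural network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]

# Hypothetical gate scores: in a real model a learned gating network
# would compute these from the input itself.
gate_scores = [0.1, 2.0, 2.0, -1.0]

chosen = route_top_k(gate_scores, k=2)
output = moe_layer(3.0, gate_scores, experts, k=2)
print(chosen, output)
```

Only two of the four experts run for this input, which is where the efficiency gain comes from: the cost of a forward pass depends on the number of experts selected rather than the total number available.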

Our latest innovations in model architecture allow Gemini 1.5 to learn complex tasks more quickly and maintain quality, while being more efficient to train and serve. These efficiencies are helping our teams iterate, train and deliver more advanced versions of Gemini faster than ever before, and we’re working on further optimizations.

Greater context, more helpful capabilities

An AI model’s “context window” is made up of tokens, which are the building blocks used for processing information. Tokens can be entire parts or subsections of words, images, videos, audio or code. The bigger a model’s context window, the more information it can take in and process in a given prompt — making its output more consistent, relevant and useful.

Through a series of machine learning innovations, we’ve increased 1.5 Pro’s context window capacity far beyond the original 32,000 tokens for Gemini 1.0. We can now run up to 1 million tokens in production.

This means 1.5 Pro can process vast amounts of information in one go — including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code or over 700,000 words. In our research, we’ve also successfully tested up to 10 million tokens.
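As a rough back-of-the-envelope illustration, the figures above imply about 0.7 words per token for English text. The sketch below uses that ratio to estimate whether a piece of text would fit in a given context window. The ratio is only an approximation derived from the numbers in this post, the function names are hypothetical, and an exact count would need the model's own tokenizer.

```python
# Reference figures taken from the text above: roughly 700,000 words
# correspond to roughly 1 million tokens.
REFERENCE_TOKENS = 1_000_000
REFERENCE_WORDS = 700_000


def estimate_tokens(text):
    """Estimate the token count of a text from its whitespace-separated words."""
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises
    return -(-words * REFERENCE_TOKENS // REFERENCE_WORDS)


def fits_in_context(text, window=REFERENCE_TOKENS):
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(text) <= window


sample = "one two three four five six seven"
print(estimate_tokens(sample), fits_in_context(sample, window=128_000))
```

Actual token counts vary with language, formatting and content type, so an estimate like this is best treated as a quick sanity check before sending a long prompt.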

Complex reasoning about vast amounts of information

1.5 Pro can seamlessly analyze, classify and summarize large amounts of content within a given prompt. For example, when given the 402-page transcripts from Apollo 11’s mission to the moon, it can reason about conversations, events and details found across the document.

Reasoning across a 402-page transcript: Gemini 1.5 Pro Demo

Gemini 1.5 Pro can understand, reason about and identify curious details in the 402-page transcripts from Apollo 11’s mission to the moon.

Better understanding and reasoning across modalities

1.5 Pro can perform highly sophisticated understanding and reasoning tasks for different modalities, including video. For instance, when given a 44-minute silent Buster Keaton movie, the model can accurately analyze various plot points and events, and even reason about small details in the movie that could easily be missed.

Multimodal prompting with a 44-minute movie: Gemini 1.5 Pro Demo

Gemini 1.5 Pro can identify a scene in a 44-minute silent Buster Keaton movie when given a simple line drawing as reference material for a real-life object.

Relevant problem-solving with longer blocks of code

1.5 Pro can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100,000 lines of code, it can better reason across examples, suggest helpful modifications and give explanations about how different parts of the code work.

Problem solving across 100,633 lines of code | Gemini 1.5 Pro Demo

Gemini 1.5 Pro can reason across 100,000 lines of code giving helpful solutions, modifications and explanations.

Enhanced performance

When tested on a comprehensive panel of text, code, image, audio and video evaluations, 1.5 Pro outperforms 1.0 Pro on 87% of the benchmarks used for developing our large language models (LLMs). And when compared to 1.0 Ultra on the same benchmarks, it performs at a broadly similar level.

Gemini 1.5 Pro maintains high levels of performance even as its context window increases. In the Needle In A Haystack (NIAH) evaluation, where a small piece of text containing a particular fact or statement is purposely placed within a long block of text, 1.5 Pro found the embedded text 99% of the time, in blocks of data as long as 1 million tokens.
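The shape of a needle-in-a-haystack test can be sketched in a few lines of Python. This is an illustrative harness, not the evaluation code used for Gemini 1.5 Pro. The filler text, the needle and the `stub_model` function are all hypothetical, and `stub_model` stands in for a real model call by doing a plain string search.

```python
FILLER = "The grass is green and the sky is blue."
NEEDLE = "The secret code is 4172."
QUESTION = "What is the secret code?"


def build_haystack(filler, needle, n_sentences, depth):
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    sentences = [filler] * n_sentences
    sentences.insert(round(depth * n_sentences), needle)
    return " ".join(sentences)


def stub_model(prompt):
    """Hypothetical stand-in for a model: finds the fact by string search."""
    marker = "The secret code is "
    start = prompt.find(marker)
    if start == -1:
        return "I don't know."
    end = prompt.find(".", start)
    return prompt[start + len(marker):end]


def recall_rate(ask_model, expected, n_sentences, depths):
    """Fraction of needle positions at which the answer is retrieved."""
    hits = 0
    for depth in depths:
        prompt = build_haystack(FILLER, NEEDLE, n_sentences, depth)
        prompt += "\n\n" + QUESTION
        if expected in ask_model(prompt):
            hits += 1
    return hits / len(depths)


depths = [0.0, 0.25, 0.5, 0.75, 1.0]
rate = recall_rate(stub_model, "4172", n_sentences=1000, depths=depths)
print(f"Recall across depths: {rate:.0%}")
```

A real evaluation would replace `stub_model` with calls to the model under test and would vary both the haystack length and the needle depth, since retrieval quality can differ depending on where in the context the fact sits.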

Gemini 1.5 Pro also shows impressive “in-context learning” skills, meaning that it can learn a new skill from information given in a long prompt, without needing additional fine-tuning. We tested this skill on the Machine Translation from One Book (MTOB) benchmark, which shows how well the model learns from information it’s never seen before. When given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person learning from the same content.

As 1.5 Pro’s long context window is the first of its kind among large-scale models, we’re continuously developing new evaluations and benchmarks for testing its novel capabilities.

For more details, see our Gemini 1.5 Pro technical report.

Extensive ethics and safety testing

In line with our AI Principles and robust safety policies, we’re ensuring our models undergo extensive ethics and safety tests. We then integrate these research learnings into our governance processes and model development and evaluations to continuously improve our AI systems.

Since introducing 1.0 Ultra in December, our teams have continued refining the model, making it safer for a wider release. We’ve also conducted novel research on safety risks and developed red-teaming techniques to test for a range of potential harms.

In advance of releasing 1.5 Pro, we've taken the same approach to responsible deployment as we did for our Gemini 1.0 models, conducting extensive evaluations across areas including content safety and representational harms, and will continue to expand this testing. Beyond this, we’re developing further tests that account for the novel long-context capabilities of 1.5 Pro.

Build and experiment with Gemini models

We’re committed to bringing each new generation of Gemini models to billions of people, developers and enterprises around the world responsibly.

Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on our Google for Developers blog and Google Cloud blog.

We’ll introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model.

Early testers can try the 1 million token context window at no cost during the testing period, though they should expect longer latency times with this experimental feature. Significant improvements in speed are also on the horizon.

Developers interested in testing 1.5 Pro can sign up now in AI Studio, while enterprise customers can reach out to their Vertex AI account team.

Learn more about Gemini’s capabilities and see how it works.


OpenAI teases an amazing new generative video model called Sora

The firm is sharing Sora with a small group of safety testers but the rest of us will have to wait to learn more.

By Will Douglas Heaven

OpenAI has built a striking new generative video model called Sora that can take a short text description and turn it into a detailed, high-definition film clip up to a minute long.

Based on four sample videos that OpenAI shared with MIT Technology Review ahead of today’s announcement, the San Francisco–based firm has pushed the envelope of what’s possible with text-to-video generation (a hot new research direction that we flagged as a trend to watch in 2024).

“We think building models that can understand video, and understand all these very complex interactions of our world, is an important step for all future AI systems,” says Tim Brooks, a scientist at OpenAI.

But there’s a disclaimer. OpenAI gave us a preview of Sora (which means sky in Japanese) under conditions of strict secrecy. In an unusual move, the firm would only share information about Sora if we agreed to wait until after news of the model was made public to seek the opinions of outside experts. [Editor’s note: We’ve updated this story with outside comment below.] OpenAI has not yet released a technical report or demonstrated the model actually working. And it says it won’t be releasing Sora anytime soon. [Update: OpenAI has now shared more technical details on its website.]

The first generative models that could produce video from snippets of text appeared in late 2022. But early examples from Meta, Google, and a startup called Runway were glitchy and grainy. Since then, the tech has been getting better fast. Runway’s Gen-2 model, released last year, can produce short clips that come close to matching big-studio animation in their quality. But most of these examples are still only a few seconds long.

The sample videos from OpenAI’s Sora are high-definition and full of detail. OpenAI also says it can generate videos up to a minute long. One video of a Tokyo street scene shows that Sora has learned how objects fit together in 3D: the camera swoops into the scene to follow a couple as they walk past a row of shops.

OpenAI also claims that Sora handles occlusion well. One problem with existing models is that they can fail to keep track of objects when they drop out of view. For example, if a truck passes in front of a street sign, the sign might not reappear afterward.  

In a video of a papercraft underwater scene, Sora has added what look like cuts between different pieces of footage, and the model has maintained a consistent style between them.

It’s not perfect. In the Tokyo video, cars to the left look smaller than the people walking beside them. They also pop in and out between the tree branches. “There’s definitely some work to be done in terms of long-term coherence,” says Brooks. “For example, if someone goes out of view for a long time, they won’t come back. The model kind of forgets that they were supposed to be there.”

Impressive as they are, the sample videos shown here were no doubt cherry-picked to show Sora at its best. Without more information, it is hard to know how representative they are of the model’s typical output.   

It may be some time before we find out. OpenAI’s announcement of Sora today is a tech tease, and the company says it has no current plans to release it to the public. Instead, OpenAI will today begin sharing the model with third-party safety testers for the first time.

In particular, the firm is worried about the potential misuses of fake but photorealistic video. “We’re being careful about deployment here and making sure we have all our bases covered before we put this in the hands of the general public,” says Aditya Ramesh, a scientist at OpenAI, who created the firm’s text-to-image model DALL-E.

But OpenAI is eyeing a product launch sometime in the future. As well as safety testers, the company is also sharing the model with a select group of video makers and artists to get feedback on how to make Sora as useful as possible to creative professionals. “The other goal is to show everyone what is on the horizon, to give a preview of what these models will be capable of,” says Ramesh.

To build Sora, the team adapted the tech behind DALL-E 3, the latest version of OpenAI’s flagship text-to-image model. Like most text-to-image models, DALL-E 3 uses what’s known as a diffusion model. These are trained to turn a fuzz of random pixels into a picture.
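
As a rough illustration of that idea, the sketch below applies the standard noise-blending formula from the diffusion literature to a four-pixel toy image. It is not OpenAI's code: the function names, the `keep` parameter, and the use of the true noise in place of a trained network's prediction are all assumptions made for illustration.

```python
import math
import random


def add_noise(pixels, noise, keep):
    """Blend clean pixels with noise; `keep` (0 < keep <= 1) is the share of signal retained."""
    return [math.sqrt(keep) * p + math.sqrt(1 - keep) * n
            for p, n in zip(pixels, noise)]


def remove_noise(noisy, predicted_noise, keep):
    """Invert add_noise, given a prediction of the noise that was mixed in."""
    return [(x - math.sqrt(1 - keep) * n) / math.sqrt(keep)
            for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)                      # fixed seed so the run is repeatable
image = [0.0, 0.25, 0.5, 1.0]               # hypothetical 4-pixel "picture"
noise = [rng.gauss(0, 1) for _ in image]    # the "fuzz of random pixels"

noisy = add_noise(image, noise, keep=0.1)   # mostly noise, a little signal

# A real diffusion model trains a neural network to predict `noise` from `noisy`.
# Here the true noise stands in for a perfect prediction (an assumption).
restored = remove_noise(noisy, noise, keep=0.1)
print(restored)
```

With a perfect noise prediction the original pixels come back up to floating-point error; a trained model only approximates that prediction, and typically removes the noise over many small steps rather than one.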

Sora takes this approach and applies it to videos rather than still images. But the researchers also added another technique to the mix. Unlike DALL-E or most other generative video models, Sora combines its diffusion model with a type of neural network called a transformer.

Transformers are great at processing long sequences of data, like words. That has made them the special sauce inside large language models like OpenAI’s GPT-4 and Google DeepMind’s Gemini. But videos are not made of words. Instead, the researchers had to find a way to cut videos into chunks that could be treated as if they were. The approach they came up with was to dice videos up across both space and time. “It’s like if you were to have a stack of all the video frames and you cut little cubes from it,” says Brooks.

The transformer inside Sora can then process these chunks of video data in much the same way that the transformer inside a large language model processes words in a block of text. The researchers say that this let them train Sora on many more types of video than other text-to-video models, varied in terms of resolution, duration, aspect ratio, and orientation. “It really helps the model,” says Brooks. “That is something that we’re not aware of any existing work on.”
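
The "little cubes" idea can be sketched in a few lines. The example below is only an illustration of dicing a stack of frames across space and time, not Sora's actual preprocessing: the helper name `spacetime_patches`, the patch sizes, and the toy grayscale video are all assumptions.

```python
from typing import List

Video = List[List[List[float]]]  # indexed as video[frame][row][column]


def spacetime_patches(video: Video, pt: int, ph: int, pw: int) -> List[List[float]]:
    """Cut a video into non-overlapping pt x ph x pw cubes and flatten each one.

    Hypothetical helper for illustration; the patch sizes are assumptions.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):          # step through time
        for y0 in range(0, height, ph):      # step down the frame
            for x0 in range(0, width, pw):   # step across the frame
                cube = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(cube)
    return patches


# A toy 4-frame, 4x4 grayscale "video" whose pixel values encode their position.
toy_video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)]
             for t in range(4)]
tokens = spacetime_patches(toy_video, pt=2, ph=2, pw=2)
print(len(tokens), len(tokens[0]))  # 8 cubes of 8 values each
```

Each flattened cube plays the role that a word token plays in a language model, so a clip of any length or aspect ratio becomes a sequence of same-sized items.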

“From a technical perspective it seems like a very significant leap forward,” says Sam Gregory, executive director at Witness, a human rights organization that specializes in the use and misuse of video technology. “But there are two sides to the coin,” he says. “The expressive capabilities offer the potential for many more people to be storytellers using video. And there are also real potential avenues for misuse.” 

OpenAI is well aware of the risks that come with a generative video model. We are already seeing the large-scale misuse of deepfake images. Photorealistic video takes this to another level.

Gregory notes that you could use technology like this to misinform people about conflict zones or protests. The range of styles is also interesting, he says. If you could generate shaky footage that looked like something shot with a phone, it would come across as more authentic.

The tech is not there yet, but generative video has gone from zero to Sora in just 18 months. “We’re going to be entering a universe where there will be fully synthetic content, human-generated content and a mix of the two,” says Gregory.

The OpenAI team plans to draw on the safety testing it did last year for DALL-E 3. Sora already includes a filter that runs on all prompts sent to the model that will block requests for violent, sexual, or hateful images, as well as images of known people. Another filter will look at frames of generated videos and block material that violates OpenAI’s safety policies.

OpenAI says it is also adapting a fake-image detector developed for DALL-E 3 to use with Sora. And the company will embed industry-standard C2PA tags, metadata that states how an image was generated, into all of Sora’s output. But these steps are far from foolproof. Fake-image detectors are hit-or-miss. Metadata is easy to remove, and most social media sites strip it from uploaded images by default.

“We’ll definitely need to get more feedback and learn more about the types of risks that need to be addressed with video before it would make sense for us to release this,” says Ramesh.

Brooks agrees. “Part of the reason that we’re talking about this research now is so that we can start getting the input that we need to do the work necessary to figure out how it could be safely deployed,” he says.

Update 2/15: Comments from Sam Gregory were added.




COMMENTS

  1. How to write a review paper

    How to write a review paper to our readers, but it will also enhance its scientific impact on environmental science. Mastering the skills needed to write a good scientific review also pays dividends when writing up the literature review featured in the introduction of primary-research papers.

  2. (PDF) Writing a Literature Review Research Paper: A step-by-step approach

    Writing a literature review in the pre or post-qualification, will be required to undertake a literature review, either as part of a course of study, as a key step in the research process. A ...

  3. PDF How to Write a Literature Review

    A literature review is a review or discussion of the current published material available on a particular topic. It attempts to synthesize and evaluate the material and information according to the research question(s), thesis, and central theme(s). In other words, instead of supporting an argument, or simply making a list of summarized research ...

  4. PDF LITERATURE REVIEWS

    WHAT IS A LITERATURE REVIEW? PURPOSES OF A LITERATURE REVIEW orient your reader by defining key concepts (theoretical) and/or providing relevant background (empirical) "motivate" your research, i.e. demonstrating the relevance of your project

  5. How to review a paper

    Writing a good review requires expertise in the field, an intimate knowledge of research methods, a critical mind, the ability to give fair and constructive feedback, and sensitivity to the feelings of authors on the receiving end.

  6. (PDF) Writing a review article in 7 steps

    (PDF) Writing a review article in 7 steps April 2015 Authors: Eric Lichtfouse Abstract This short note provides step-by-step guidelines to write a review article or a book chapter. I explain...

  7. PDF Literature Review and Focusing the Research

    the topic of the research and to build a rationale for the problem that is studied and the need for additional research. Boote and Beile (2005) eloquently explain the purpose of a literature review in planning primary research: As the foundation of any research project, the literature review should accomplish several important objectives.

  8. PDF What is a Literature Review?

    erature review is a synopsis of other research. Moreover, it is a critical appraisal of other research on a given topic that helps to put that topic in context (Machi and McEvoy, 2009). A comprehensive review should provide the reader with a succinct, objective and logical summary of the current knowledge on a particular topic.

  9. How to write a superb literature review

    One of my favourite review-style articles 3 presents a plot bringing together data from multiple research papers (many of which directly contradict each other). This is then used to identify broad ...

  10. PDF Literature Reviews What is a literature review? summary synthesis

    The goal of a research paper is to develop a new argument, and typically includes some form of data collection and analysis. A research paper usually includes a literature review as one of its components (often labeled as the "Background" or "Theoretical Background" section). Why are literature reviews necessary?

  11. How to Write a Literature Review

    What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research that you can later apply to your paper, thesis, or dissertation topic.

  12. PDF A Guide to Peer Reviewing Journal Articles

    The benefits include: Learning more about the editorial process. By reviewing a paper and liaising with the editorial office, you will gain first-hand experience of the key considerations that go into the publication decision, as well as commonly recommended revisions. Keeping up to date with novel research in your field.

  13. (PDF) Writing Critical Reviews: A Step-by-Step Guide

    Martin Davies Written for students - I overview the different kinds of report writing (from my book *Study Skills for International Postgraduates* (2nd Edition) Bloomsbury. Chapter February 2022...

  14. PDF The Science of Literature Reviews: Searching, Identifying, Selecting

    desktop or secondary research since it has to do with reading, summarising, compiling, analysing and interpreting published materials in a specific research domain [13]. For example, a desk-based review of existing literature or data can be conducted following a qualitative Activities in a literature review process (authors' illustration).

  15. PDF Writing a Literature Review Paper

    select and read your sources; write your review. Perhaps the most important step in this process is selecting your research topic. A good research topic focuses on a subject that has been well explored. That is, one where you can find articles that reflect growth and change in an area of research. Your topic needs to be narrow and focused.

  16. How to write a good scientific review article

    Literature reviews are valuable resources for the scientific community. With research accelerating at an unprecedented speed in recent years and more and more original papers being published, review articles have become increasingly important as a means to keep up-to-date with developments in a particular area of research.

  17. PDF How to [read, present, review] a research paper

    1. read abstract carefully. 2. read introduction quickly. 3. read conclusions quickly. 4. look at references. 5. skim rest of paper. and then go back and start again if I do want to read it. Can help to articulate explicitly what questions you're trying to answer in your reading. How to read a research paper.

  18. (PDF) Article review writing format, steps, examples and illustration

    Review articles in academic journals that analyze or discuss researches previously published by others, rather than reporting new research results or findings. Summaries and critiques are two ways to write a review of a scientific journal article.

  19. Step by Step Guide to Reviewing a Manuscript

    Briefly summarize what the paper is about and what the findings are. Try to put the findings of the paper into the context of the existing literature and current knowledge. Indicate the significance of the work and if it is novel or mainly confirmatory. Indicate the work's strengths, its quality and completeness.

  20. (PDF) How to Review a Research Paper

    Peer review in health sciences. Second edition. London: BMJ Books, 2003:45-61. PDF | workshop at University of Diyala , Dec. 4, 2016 for the academic staff and post graduate students | Find, read ...

  21. PDF Guidelines for Writing a Paper Review

    Guidelines for Writing a Paper Review. Throughout the semester, we will have paper discussions that dive into more advanced research related to course material. If you are in the 4-credit graduate section of 433, you will be required to submit four paper reviews, choosing any four from the schedule (you cannot, however, submit a review for

  22. PDF The Impact of Infrastructure on Development Outcomes

    papers written in other languages were also reviewed, if they were cited in the selected literature review papers. Finally, to ensure that more recent (as yet unpublished) research was captured by the review, a global call for new papers on this theme was conducted in preparation for the 2022

  23. Effect of exercise for depression: systematic review and network meta

    Objective To identify the optimal dose and modality of exercise for treating major depressive disorder, compared with psychotherapy, antidepressants, and control conditions. Design Systematic review and network meta-analysis. Methods Screening, data extraction, coding, and risk of bias assessment were performed independently and in duplicate. Bayesian arm based, multilevel network meta ...

  24. (PDF) How to review a paper

    How to review a paper October 2009 Journal of Health Services Research & Policy 14 (4):255-256 Authors: Melissa Harden The University of York Kath Wright The University of York Kate Misso...

  25. Introducing Gemini 1.5, Google's next-generation AI model

    Gemini 1.5 delivers dramatically enhanced performance. It represents a step change in our approach, building upon research and engineering innovations across nearly every part of our foundation model development and infrastructure. This includes making Gemini 1.5 more efficient to train and serve, with a new Mixture-of-Experts (MoE) architecture.

  26. OpenAI teases an amazing new generative video model called Sora

    The firm is sharing Sora with a small group of safety testers but the rest of us will have to wait to learn more. OpenAI has built a striking new generative video model called Sora that can take a ...

  27. 110553 PDFs

    Jul 2024 Iglesias-Sánchez Patricia P. Carmen Jambrino Arvind Choubey Vaibhav Soni R K Lal Bhise Rushikesh Nanasaheb Alok Kumar Krishna Sonia Leva Veerandra singh Matsaniya Explore the latest...

  28. Concrete 3D printing technology in sustainable construction: A review

    This paper reviews recent developments and proposes perspectives for future research on three-dimensional printing concrete (3DPC). This review originally analyses the 3DP applications combined ...