
How to use Google Scholar: the ultimate guide

Contents

  • What is Google Scholar?
  • Why is Google Scholar better than Google for finding research papers?
  • The Google Scholar search results page
  • The first two lines: core bibliographic information
  • Quick full-text access options
  • "Cited by" count and other useful links
  • Tips for searching Google Scholar
  • 1. Google Scholar searches are not case sensitive
  • 2. Use keywords instead of full sentences
  • 3. Use quotes to search for an exact match
  • 4. Add the year to the search phrase to get articles published in a particular year
  • 5. Use the side bar controls to adjust your search result
  • 6. Use Boolean operators to better control your searches
  • Google Scholar advanced search interface
  • Customizing search preferences and options
  • Using the "My Library" feature in Google Scholar
  • The scope and limitations of Google Scholar
  • Alternatives to Google Scholar
  • Country-specific Google Scholar sites
  • Frequently asked questions about Google Scholar
  • Related articles

What is Google Scholar?

Google Scholar (GS) is a free academic search engine that can be thought of as the academic version of Google. Rather than searching all of the indexed information on the web, it searches repositories of:

  • universities
  • scholarly websites

This is generally a smaller subset of the pool that Google searches. It's all done automatically, but most of the search results tend to be reliable scholarly sources.

However, Google Scholar is typically less careful about what it includes in search results than more curated, subscription-based academic databases like Scopus and Web of Science. As a result, it is important to take some time to assess the credibility of the resources linked through Google Scholar.

➡️ Take a look at our guide on the best academic databases.

Google Scholar home page

One advantage of using Google Scholar is that the interface is comforting and familiar to anyone who uses Google. This lowers the learning curve of finding scholarly information.

There are a number of useful differences from a regular Google search. Google Scholar allows you to:

  • copy a formatted citation in different styles including MLA and APA
  • export bibliographic data (BibTeX, RIS) to use with reference management software
  • explore other works that have cited the listed work
  • easily find full text versions of the article

Although it is free to search in Google Scholar, most of the content is not freely available. Google does its best to find copies of restricted articles in public repositories. If you are at an academic or research institution, you can also set up a library connection that allows you to see items that are available through your institution.

The Google Scholar results page looks much like the regular Google results page. It does differ in a few key ways, however, and it is worth being familiar with the different pieces of information that are shown. Let's have a look at the results for the search term "machine learning".

Google Scholar search results page

  • The first line of each result provides the title of the document (e.g. of an article, book, chapter, or report).
  • The second line provides the bibliographic information about the document, in order: the author(s), the journal or book it appears in, the year of publication, and the publisher.

Clicking on the title link will bring you to the publisher’s page where you may be able to access more information about the document. This includes the abstract and options to download the PDF.

Google Scholar quick link to PDF

To the far right of the entry are more direct options for obtaining the full text of the document. In this example, Google has also located a publicly available PDF of the document hosted at umich.edu. Note that this is not guaranteed to be the version of the article that was finally published in the journal.

Google Scholar: more action links

Below the text snippet/abstract you can find a number of useful links.

  • Cited by: the "Cited by" link shows other articles that have cited this resource. This is a very useful feature for two reasons. First, it is a good way to track more recent research that has referenced the article. Second, the fact that other researchers cited the document lends it greater credibility. Be aware, however, that there is a lag in publication time. An article published very recently will not yet have an extensive number of "Cited by" results: it takes a minimum of 6 months for most articles to get published, so even if a newer article uses the source, that newer article may not have been published yet.
  • Versions : this link will display other versions of the article or other databases where the article may be found, some of which may offer free access to the article.
  • Quotation mark icon : this will display a popup with commonly used citation formats such as MLA, APA, Chicago, Harvard, and Vancouver that may be copied and pasted. Note, however, that the Google Scholar citation data is sometimes incomplete and so it is often a good idea to check this data at the source. The "cite" popup also includes links for exporting the citation data as BibTeX or RIS files that any major reference manager can import.
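Because exported citation data can be incomplete, a quick automated check can flag records that need manual correction. The following is a minimal sketch, assuming a simple single-entry BibTeX string made of `field={value}` pairs. The sample entry and the list of required fields are hypothetical, and a real project would more likely rely on a dedicated BibTeX parser.

```python
import re

# Fields we assume an article citation should carry (an illustrative choice,
# not a requirement set by Google Scholar or by BibTeX itself).
REQUIRED_FIELDS = ("author", "title", "journal", "year", "volume", "pages")


def parse_bibtex_fields(entry: str) -> dict[str, str]:
    """Extract `name={value}` pairs from a single, simple BibTeX entry."""
    pairs = re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", entry)
    return {name.lower(): value.strip() for name, value in pairs}


def missing_fields(entry: str, required=REQUIRED_FIELDS) -> list[str]:
    """Return the required fields that are absent or empty in the entry."""
    fields = parse_bibtex_fields(entry)
    return [name for name in required if not fields.get(name)]


# Hypothetical export that lacks volume and page numbers.
sample = """@article{doe2015example,
  title={An example article},
  author={Doe, Jane},
  journal={Journal of Examples},
  year={2015}
}"""

print(missing_fields(sample))  # ['volume', 'pages']
```

Any field reported as missing is a prompt to look up the record at the publisher's page before citing it.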

Google Scholar citation panel

Although Google Scholar limits each search to a maximum of 1,000 results, that is still too many to explore, and you need an effective way of locating the relevant articles. Here’s a list of pro tips that will help you save time and search more effectively.

You don’t need to worry about case sensitivity when you’re using Google Scholar. In other words, a search for "Machine Learning" will produce the same results as a search for "machine learning".

Let's say your research topic is about self-driving cars. For a regular Google search you might enter something like "what is the current state of the technology used for self driving cars". In Google Scholar, you will see less than ideal results for this query.

The trick is to build a list of keywords and perform searches for them like self-driving cars, autonomous vehicles, or driverless cars. Google Scholar will assist you on that: if you start typing in the search field you will see related queries suggested by Scholar!

If you put your search phrase into quotes you can search for exact matches of that phrase in the title and the body text of the document. Without quotes, Google Scholar will treat each word separately.

This means that if you search national parks , the words will not necessarily appear together. Grouped words and exact phrases should be enclosed in quotation marks.
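The difference between the two behaviours can be illustrated locally. This is a toy sketch and not Google Scholar's actual matching logic: the function names and sample titles are made up, and a real search engine also applies stemming, ranking, and other processing.

```python
def matches_all_words(query: str, text: str) -> bool:
    """Unquoted-style match: every query word appears somewhere in the text."""
    words = text.lower().split()
    return all(term in words for term in query.lower().split())


def matches_exact_phrase(phrase: str, text: str) -> bool:
    """Quoted-style match: the query words appear together and in order."""
    terms = phrase.lower().split()
    words = text.lower().split()
    return any(
        words[i:i + len(terms)] == terms
        for i in range(len(words) - len(terms) + 1)
    )


# Hypothetical titles used only for illustration.
titles = [
    "Visitor trends in national parks",
    "National funding for urban parks",
]

for title in titles:
    print(
        title,
        matches_all_words("national parks", title),
        matches_exact_phrase("national parks", title),
    )
```

Both titles satisfy the word-by-word match, but only the first contains the two words side by side, which is what the quoted search requires.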

A search using “self-driving cars 2015,” for example, will return articles or books published in 2015.

Using the options in the left-hand panel, you can further restrict the search results by limiting the years covered by the search and by including or excluding patents. You can also sort the results by relevance or by date.
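A year restriction can also be expressed directly in a search URL. The sketch below assumes the `q`, `as_ylo`, and `as_yhi` query parameters that commonly appear in Google Scholar result URLs when a custom range is set. These are observed conventions rather than a documented API, so they may change, and the function name `scholar_url` is hypothetical.

```python
from urllib.parse import urlencode

SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def scholar_url(query, year_from=None, year_to=None):
    """Build a Google Scholar search URL with an optional year range.

    Assumption: `as_ylo` / `as_yhi` carry the lower and upper year bounds,
    as seen in URLs produced by the sidebar's custom range control.
    """
    params = {"q": query}
    if year_from is not None:
        params["as_ylo"] = year_from
    if year_to is not None:
        params["as_yhi"] = year_to
    return f"{SCHOLAR_SEARCH_URL}?{urlencode(params)}"


print(scholar_url('"self-driving cars"', year_from=2015, year_to=2015))
```

The printed URL can be opened in a browser and bookmarked, which is convenient for searches you repeat often.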

Searches are not case sensitive. However, there are a number of Boolean operators you can use to control the search, and these must be capitalized.

  • AND requires both of the words or phrases on either side to be somewhere in the record.
  • NOT can be placed in front of a word or phrase to exclude results which include it.
  • OR will give equal weight to results which match just one of the words or phrases on either side.

➡️ Read more about how to efficiently search online databases for academic research.

In case you got overwhelmed by the above options, here are some illustrative examples:
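As a minimal sketch, the operators can be combined into query strings programmatically. The helper names below are hypothetical, the use of parentheses to group the OR alternatives is an assumption made for readability, and the resulting string is meant to be pasted into the Google Scholar search box.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def any_of(*terms: str) -> str:
    """Join alternatives with the capitalized OR operator.

    Assumption: parentheses are added only to make the grouping explicit.
    """
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def all_of(*parts: str) -> str:
    """Join parts with the capitalized AND operator."""
    return " AND ".join(parts)


def exclude(term: str) -> str:
    """Prefix a term with NOT, as described in the operator list."""
    return f"NOT {quote(term)}"


query = all_of(
    any_of("self-driving cars", "autonomous vehicles", "driverless cars"),
    quote("safety"),
    exclude("patent"),
)
print(query)
# ("self-driving cars" OR "autonomous vehicles" OR "driverless cars") AND safety AND NOT patent
```

Keeping the operators in small helpers makes it harder to forget the required capitalization when queries get long.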

Tip: Use the advanced search features in Google Scholar to narrow down your search results.

You can gain even more fine-grained control over your search by using the advanced search feature. This feature is available by clicking on the hamburger menu in the upper left and selecting the "Advanced search" menu item.

Google Scholar advanced search

Adjusting the Google Scholar settings is not necessary for getting good results, but offers some additional customization, including the ability to enable the above-mentioned library integrations.

The settings menu is found in the hamburger menu located in the top left of the Google Scholar page. The settings are divided into five sections:

  • Collections to search: by default Google Scholar searches articles and includes patents, but this default can be changed if you are not interested in patents or if you wish to search case law instead.
  • Bibliography manager: you can export relevant citation data via the “Bibliography manager” subsection.
  • Languages: if you want results to include only articles written in a specific subset of languages, you can define that here.
  • Library links: as noted, Google Scholar allows you to get the full text of articles through your institution’s subscriptions, where available. Search for, and add, your institution here to have the relevant link included in your search results.
  • Button: the Scholar Button is a Chrome extension which adds a dropdown search box to your toolbar. This allows you to search Google Scholar from any website. Moreover, if you have any text selected on the page when you click the button, it will display results from a search on those words.

When signed in, Google Scholar adds some simple tools for keeping track of and organizing the articles you find. These can be useful if you are not using a full academic reference manager.

All the search results include a “save” button at the end of the bottom row of links. Clicking this will add the result to your "My Library".

To help you add some structure, you can create and apply labels to the items in your library. Appended labels will appear at the end of the article titles. For example, the following article has been assigned an “RNA” label:

Google Scholar  my library entry with label

Within your Google Scholar library, you can also edit the metadata associated with titles. This will often be necessary, as Google Scholar citation data is frequently faulty.

There is no official statement about how big the Scholar search index is, but unofficial estimates are in the range of about 160 million documents, and the index is expected to continue to grow by several million each year.

Yet Google Scholar does not return all the resources that you may get from a search in your local library catalog. For example, a library database could return podcasts, videos, articles, statistics, or special collections. For now, Google Scholar has only the following publication types:

  • Journal articles: articles published in journals. It's a mixture of articles from peer-reviewed journals, predatory journals, and pre-print archives.
  • Books: links to the Google limited version of the text, when possible.
  • Book chapters: chapters within a book; sometimes they are also electronically available.
  • Book reviews: reviews of books, although it is not always apparent from the search result that the item is a review.
  • Conference proceedings: papers written as part of a conference, typically used as part of a presentation at the conference.
  • Court opinions.
  • Patents: Google Scholar only searches patents if the option is selected in the search settings described above.

The information in Google Scholar is not cataloged by professionals. The quality of the metadata will depend heavily on the source that Google Scholar is pulling the information from. This is very different from how information is collected and indexed in scholarly databases such as Scopus or Web of Science.

➡️ Visit our list of the best academic databases.

Google Scholar is by far the most frequently used academic search engine, but it is not the only one. Other academic search engines include:

  • Science.gov
  • Semantic Scholar

Country-specific Google Scholar sites include:

  • scholar.google.fr: Sur les épaules d'un géant
  • scholar.google.es (Google Académico): A hombros de gigantes
  • scholar.google.pt (Google Académico): Sobre os ombros de gigantes
  • scholar.google.de: Auf den Schultern von Riesen

➡️ Once you’ve found some research, it’s time to read it. Take a look at our guide on how to read a scientific paper.

Google Scholar is not a database. It is a bibliographic search engine rather than a bibliographic database. In order to qualify as a database, Google Scholar would need to have stable identifiers for its records.

Google Scholar is not itself a scholarly source. It is an academic search engine, but the records found in Google Scholar are scholarly sources.

Not everything in Google Scholar is peer reviewed. Google Scholar collects research papers from all over the web, including grey literature and non-peer-reviewed papers and reports.

Google Scholar does not provide any full-text content itself, but links to the full-text article on the publisher page, which can be either open access or paywalled content. Google Scholar tries to provide links to free versions, when possible.

The easiest way to access Google Scholar is by using the Google Scholar Button. This is a browser extension that allows you to easily access Google Scholar from any web page. You can install it from the Chrome Web Store.


Stand on the shoulders of giants

Google Scholar provides a simple way to broadly search for scholarly literature. From one place, you can search across many disciplines and sources: articles, theses, books, abstracts and court opinions, from academic publishers, professional societies, online repositories, universities and other web sites. Google Scholar helps you find relevant work across the world of scholarly research.


How are documents ranked?

Google Scholar aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature.

Features of Google Scholar

  • Search all scholarly literature from one convenient place
  • Explore related works, citations, authors, and publications
  • Locate the complete document through your library or on the web
  • Keep up with recent developments in any area of research
  • Check who's citing your publications, create a public author profile


Disclaimer: Legal opinions in Google Scholar are provided for informational purposes only and should not be relied on as a substitute for legal advice from a licensed lawyer. Google does not warrant that the information is complete or accurate.


Using Google for Research


What is Google Scholar?

Google Scholar searches for scholarly literature in a simple, familiar way. You can search across many disciplines and sources at once to find articles, books, theses, court opinions, and content from academic publishers, professional societies, some academic web sites, and more. See the Google Scholar inclusion guidelines for more about what’s in Google Scholar.

Advanced Search Tips

For more precise searching, use Google's Advanced Scholar Search Page, or try these tips:

Find content by an author:

  • Add the author's name to the search, or
  • Use the "author:" operator (e.g., aphasia author:jones finds articles about aphasia written by people named Jones)

Search for a phrase:

  • Use "quotation marks" to find phrases (e.g., "allegory of the cave" plato republic finds articles about Plato's cave allegory in The Republic)

Search by words in the title:

  • Use the "intitle:" operator (e.g., intitle:fellini finds articles with Fellini in the title)
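The tips above can be composed into a single query string. This is a small sketch: the function name and its parameters are hypothetical, and it relies only on quotation marks and the `author:` and `intitle:` operators described in this section.

```python
def build_query(keywords="", author=None, title_word=None, phrase=None):
    """Assemble a Google Scholar query from optional pieces.

    phrase     -> wrapped in quotation marks for an exact-phrase search
    keywords   -> added as plain search terms
    author     -> added with the author: operator
    title_word -> added with the intitle: operator
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')
    if keywords:
        parts.append(keywords)
    if author:
        parts.append(f"author:{author}")
    if title_word:
        parts.append(f"intitle:{title_word}")
    return " ".join(parts)


print(build_query("aphasia", author="jones"))
# aphasia author:jones
print(build_query("plato republic", phrase="allegory of the cave"))
# "allegory of the cave" plato republic
print(build_query(title_word="fellini"))
# intitle:fellini
```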

Setting "Library Links" preferences in Google Scholar

1. Go to scholar.google.com, and click on the menu button (3 horizontal bars) in the upper left-hand corner of the screen.

Screenshot of Google Scholar search interface showing location of menu button.

2. In the menu that appears, click "Settings"

Screenshot of Google Scholar menu showing location of Settings link.

3. Click "Library links" in the left-hand menu. 

Screenshot of Google Scholar Settings showing location of Library Links link.

4. Search for NYU, and select only "New York University Libraries - GetIt@NYU", then click "Save".

Screenshot of Library Links search box showing a search for NYU, and only the box next to "New York University Libraries Getit@NYU" is checked.

5. Conduct a new search in Google Scholar. Click the "GetIt@NYU" link next to each search result to get NYU Libraries-subscribed access to the article. If you are off campus, you will be prompted to log in with your NetID and password before being granted access to the full-text.

Screenshot of Google Scholar search results page showing that Getit@NYU links now appear next to each result.

6. If you encounter a search result without a "GetIt@NYU" link next to it, try clicking on the "double arrow" button below it, and the link should appear.

Screenshot of a single Google Scholar search result showing location of double-arrow button.

  • Last Updated: Sep 12, 2023 4:08 PM
  • URL: https://guides.nyu.edu/google

18 Google Scholar tips all students should know

Dec 13, 2022

Think of this guide as your personal research assistant.

Molly McHugh-Johnson

“It’s hard to pick your favorite kid,” Anurag Acharya says when I ask him to talk about a favorite Google Scholar feature he’s worked on. “I work on product, engineering, operations, partnerships,” he says. He’s been doing it for 18 years, which as of this month, happens to be how long Google Scholar has been around.

Google Scholar is also one of Google’s longest-running services. The comprehensive database of research papers, legal cases and other scholarly publications was the fourth Search service Google launched, Anurag says. In honor of this very important tool’s 18th anniversary, I asked Anurag to share 18 things you can do in Google Scholar that you might have missed.

1. Copy article citations in the style of your choice.

With a simple click of the cite button (which sits below an article entry), Google Scholar will give you a ready-to-use citation for the article in five styles, including APA, MLA and Chicago. You can select and copy the one you prefer.

2. Dig deeper with related searches.

Google Scholar’s related searches can help you pinpoint your research; you’ll see them show up on a page in between article results. Anurag describes it like this: You start with a big topic — like “cancer” — and follow up with a related search like “lung cancer” or “colon cancer” to explore specific kinds of cancer.

A Google Scholar search results page for “cancer.” After four search results, there is a section of Related searches, including breast cancer, lung cancer, prostate cancer, colorectal cancer, cervical cancer, colon cancer, cancer chemotherapy and ovarian cancer.

Related searches can help you find what you’re looking for.

3. And don’t miss the related articles.

This is another great way to find more papers similar to one you found helpful — you can find this link right below an entry.

4. Read the papers you find.

Scholarly articles have long been available only by subscription. To keep you from having to log in every time you see a paper you’re interested in, Scholar works with libraries and publishers worldwide to integrate their subscriptions directly into its search results. Look for a link marked [PDF] or [HTML]. This also includes preprints and other free-to-read versions of papers.

5. Access Google Scholar tools from anywhere on the web with the Scholar Button browser extension.

The Scholar Button browser extension is sort of like a mini version of Scholar that can move around the web with you. If you’re searching for something, hitting the extension icon will show you studies about that topic, and if you’re reading a study, you can hit that same button to find a version you can read, create a citation, or save it to your Scholar library.

A screenshot of a Google Search results landing page, with the Scholar Button extension clicked. The user has searched for “breast cancer” within Google Search; that term is also searched in the Google Scholar extension. The extension shows three relevant articles from Google Scholar.

Install the Scholar Button Chrome browser extension to access Google Scholar from anywhere on the web.

6. Learn more about authors through Scholar profiles.

There are many times when you’ll want to know more about the researchers behind the ideas you’re looking into. You can do this by clicking on an author’s name when it’s hyperlinked in a search result. You’ll find all of their work as well as co-authors, articles they’re cited in and so on. You can also follow authors from their Scholar profile to get email updates about their work, or about when and where their work is cited.

7. Easily find topic experts.

One last thing about author profiles: If there are topics listed below an author’s name on their profile, you can click on these areas of expertise and you’ll see a page of more authors who are researching and publishing on these topics, too.

8. Search for court opinions with the “Case law” button.

Scholar is the largest free database of U.S. court opinions. When you search for something using Google Scholar, you can select the “Case law” button below the search box to see legal cases your keywords are referenced in. You can read the opinions and a summary of what they established.

9. See how those court opinions have been cited.

If you want to better understand the impact of a particular piece of case law, you can select “How Cited,” which is below an entry, to see how and where the document has been cited. For example, here is the How Cited page for Marbury v. Madison , a landmark U.S. Supreme Court ruling that established that courts can strike down unconstitutional laws or statutes.

10. Understand how a legal opinion depends on another.

When you’re looking at how case laws are cited within Google Scholar, click on “Cited by” and check out the horizontal bars next to the different results. They indicate how relevant the cited opinion is in the court decision it’s cited within. You will see zero, one, two or three bars before each result. Those bars indicate the extent to which the new opinion depends on and refers to the cited case.

A screenshot of the “Cited by” page for U.S. Supreme Court case New York Times Company v. Sullivan. The Cited by page shows four different cases; two of them have three bars filled in, indicating they rely heavily on New York Times Company v. Sullivan; the other two cases only have one bar filled in, indicating less reliance on New York Times Company v. Sullivan.

In the Cited by page for New York Times Company v. Sullivan, court cases with three bars next to their name heavily reference the original case. One bar indicates less reliance.

11. Sign up for Google Scholar alerts.

Want to stay up to date on a specific topic? Create an alert for a Google Scholar search for your topics and you’ll get email updates similar to Google Search alerts. Another way to keep up with research in your area is to follow new articles by leading researchers. Go to their profiles and click “Follow.” If you’re a junior grad student, you may consider following articles related to your advisor’s research topics, for instance.

12. Save interesting articles to your library.

It’s easy to go down fascinating rabbit hole after rabbit hole in Google Scholar. Don’t lose track of your research and use the save option that pops up under search results so articles will be in your library for later reading.

13. Keep your library organized with labels.

Labels aren’t only for Gmail! You can create labels within your Google Scholar library so you can keep your research organized. Click on “My library,” and then the “Manage labels…” option to create a new label.

14. If you’re a researcher, share your research with all your colleagues.

Many research funding agencies around the world now mandate that funded articles should become publicly free to read within a year of publication — or sooner. Scholar profiles list such articles to help researchers keep track of them and open up access to ones that are still locked down. That means you can immediately see what is currently available from researchers you’re interested in and how many of their papers will soon be publicly free to read.

15. Look through Scholar’s annual top publications and papers.

Every year, Google Scholar releases the top publications based on the most-cited papers. That list (available in 11 languages) will also take you to each publication’s top papers. The ranking takes into account the “h-index,” a citation-based measure of how much impact a publication’s articles have had. It’s an excellent place to start a research journey as well as get an idea about the ideas and discoveries researchers are currently focused on.
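For reference, the h-index of a set of papers is the largest number h such that h of the papers have at least h citations each. A minimal sketch of that calculation, using made-up citation counts:

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for six papers.
print(h_index([25, 8, 5, 3, 3, 1]))  # 3
```

Here three papers have at least three citations each, but there are not four papers with at least four, so the h-index is 3.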

16. Get even more specific with Advanced Search.

Click on the hamburger icon on the upper left-hand corner and select Advanced Search to fine-tune your queries. For example, articles with exact words or a particular phrase in the title or articles from a particular journal and so on.

17. Find extra help on Google Scholar’s help page.

It might sound obvious, but there’s a wealth of useful information to be found here — like how often the database is updated, tips on formatting searches and how you can use your library subscriptions when you’re off-campus (looking at you, college students!). Oh, and you’ll even learn the origin of that quote on Google Scholar’s home page.

The Google Scholar home page. The quote at the bottom reads: “Stand on the shoulders of giants.”

18. Keep up with Google Scholar news.

Don’t forget to check out the Google Scholar blog for updates on new features and tips for using this tool even better.


Western Carolina University

Find Journal Articles: Google Scholar


Google Scholar

  • Google Scholar: a web search engine that finds scholarly literature, including papers, theses, books, and reports. By searching Google Scholar from the library’s webpage, you will have free linked access to the library’s subscription holdings. Other links from Google Scholar may prompt you to pay for articles, but DO NOT PAY for articles. We will help you get the articles you need.
  • How to set up Google Scholar

  • Last Updated: Feb 6, 2023 2:37 PM
  • URL: https://researchguides.wcu.edu/FindArticles


Full-Text Articles: Articles at Google Scholar

Google Scholar

Find scholarly content on the web with Google Scholar. It's useful for conducting comprehensive literature reviews beyond Walden Library.

Learn more from this guide:

  • Google Scholar by Jon Allinder (last updated Aug 16, 2023)

Find an article at Google Scholar

If Walden doesn't have an article you want, check Google Scholar. You may find a free copy online.


If there is no link on the right:

  • Click the article title. Though rare, you may get it free from the publisher. You might also see how much it costs if you're interested in buying it.
  • Try searching regular Google.
  • Buy the article.
  • Use the Document Delivery Service. Remember, it can take 7-10 business days to get an article from DDS.

Connect Google Scholar to the Walden Library

Option 1: Search using Google Scholar pre-connected to the Walden Library

Access Google Scholar directly through the Library's website to use a pre-connected version.

Option 2: Manually connect Google Scholar to Walden Library

Follow these steps to manually link Google Scholar to the Walden Library collection:

  • Go to Google Scholar (scholar.google.com).
  • In the search box, type in Walden and click the Search button.
  • Click Save. Google Scholar will remember this setting until you clear your browser cookies. Now when you search Google Scholar, you will see Find @ Walden links to the right of articles available in the Library.
  • When you click on Find @ Walden you will be asked to log in with your Walden username and password.
  • You may see a list of databases that contain the article; you will need to click on one of these database links to be taken to the article.
  • Pay attention to the years listed by the database links, as databases may have different publication years available. Click on the database you want to try and it should take you to the article.

Publications

Google publishes hundreds of research papers each year. Publishing is important to us; it enables us to collaborate and share ideas with, as well as learn from, the broader scientific community. Submissions are often made stronger by the fact that ideas have been tested through real product implementation by the time of publication.

We believe the formal structures of publishing today are changing - in computer science especially, there are multiple ways of disseminating information.  We encourage publication both in conventional scientific venues, and through other venues such as industry forums, standards bodies, and open source software and product feature releases.

Open Source

We understand the value of a collaborative ecosystem and love open source software.

Product and Feature Launches

With every launch, we're publishing progress and pushing functionality.

Industry Standards

Our researchers are often helping to define not just today's products but also tomorrow's.

"Resources" doesn't just mean tangible assets but also intellectual. Incredible datasets and a great team of colleagues foster a rich and collaborative research environment.

Couple big challenges with big resources and Google offers unprecedented research opportunities.

22 Research Areas

  • Algorithms and Theory 608 Publications
  • Data Management 116 Publications
  • Data Mining and Modeling 214 Publications
  • Distributed Systems and Parallel Computing 208 Publications
  • Economics and Electronic Commerce 209 Publications
  • Education Innovation 30 Publications
  • General Science 158 Publications
  • Hardware and Architecture 67 Publications
  • Human-Computer Interaction and Visualization 444 Publications
  • Information Retrieval and the Web 213 Publications
  • Machine Intelligence 1019 Publications
  • Machine Perception 454 Publications
  • Machine Translation 48 Publications
  • Mobile Systems 72 Publications
  • Natural Language Processing 395 Publications
  • Networking 210 Publications
  • Quantum A.I. 30 Publications
  • Robotics 37 Publications
  • Security, Privacy and Abuse Prevention 289 Publications
  • Software Engineering 100 Publications
  • Software Systems 250 Publications
  • Speech Processing 264 Publications

3 Collections

  • Google AI Residency 60 Publications
  • Google Brain Team 305 Publications
  • Data Infrastructure and Analysis 10 Publications

  • Published: 07 November 2014

Google Scholar pioneer on search engine’s future

  • Richard Van Noorden  

Nature (2014)

  • Communication
  • Research management

As the search engine approaches its 10th birthday, Nature speaks to the co-creator of Google Scholar.

Google Scholar, the free search engine for scholarly literature, turns ten years old on 18 November. By 'crawling' over the text of millions of academic papers, including those behind publishers' paywalls, it has transformed the way that researchers consult the literature online. In a Nature survey this year, some 60% of scientists said that they use the service regularly. Nature spoke with Anurag Acharya, who co-created the service and still runs it, about Google Scholar's history and what he sees for its future.

How do you know what literature to index?

'Scholarly' is what everybody else in the scholarly field considers scholarly. It sounds like a recursive definition but it does settle down. We crawl the whole web, and for a new blog, for example, you see what the connections are to the rest of scholarship that you already know about. If many people cite it, or if it cites many people, it is probably scholarly. There is no one magic formula: you bring evidence to bear from many features.

Where did the idea for Google Scholar come from?

I came to Google in 2000, as a year off from my academic job at the University of California, Santa Barbara. It was pretty clear that I was unlikely to have a larger impact [in academia] than at Google — making it possible for people everywhere to be able to find information. So I gave up on academia and ran Google’s web-indexing team for four years. It was a very hectic time, and basically, I burnt out.

Alex Verstak [Acharya’s colleague on the web-indexing team] and I decided to take a six-month sabbatical to try to make finding scholarly articles easier and faster. The idea wasn’t to produce Google Scholar, it was to improve our ranking of scholarly documents in web search. But the problem with trying to do that is figuring out the intent of the searcher. Do they want scholarly results or are they a layperson? We said, “Suppose you didn’t have to solve that hard a problem; suppose you knew the searcher had a scholarly intent.” We built an internal prototype, and people said: “Hey, this is good by itself. You don’t have to solve another problem — let’s go!” Then Scholar clearly seemed to be very useful and very important, so I ended up staying with it.

Was it an instant success?

It was very popular. Once we launched it, usage grew exponentially. One big difference was that we were relevance-ranking [sorting results by relevance to the user’s request], which scholarly search services had not done previously. They were reverse-chronological [providing the newest results first]. And we crawled the full text of research articles, though we did not include the full text from all the publishers when we started.

It took years in some cases to convince publishers to let you crawl their full text. Was that hard?

It depends. You have to think back to a decade ago, when web search was considered lightweight — what people would use to find pictures of Britney Spears, not scholarly articles. But we knew people were sending us purely academic queries. We just had to persuade publishers that our service would be used and would bring them more traffic. We were working with many of them already before Google Scholar launched, of course.

In 2012 Google Scholar was removed from the drop-down menu of search options on Google’s home page. Do you worry that Google Scholar might be downgraded or killed?

No. Our team is continually growing, from two people at the start to nine now. People may have treated that menu removal as a demotion, but it wasn’t really. Those menu links are to help users get from the home page to another service, so they emphasize the most-used transitions. If users already know to start with Google Scholar, they don’t need that transition. That’s all it was.

How does Google Scholar make money?

Google Scholar does not currently make money. There are many Google services that do not make a significant amount of money. The primary role of Scholar is to give back to the research community, and we are able to do so because it is not very expensive, from Google’s point of view. In terms of volume of queries, Google Scholar is small compared to many Google services, so opportunities for advertisement monetization are relatively small. There’s not been pressure to monetize. The benefits that Scholar provides, given the number of people who are working on it, are very significant. People like it internally — we are all, in part, ex-academics.

How many queries does Google Scholar get every day, and how much literature does the service track? (Estimates place it anywhere from 100 million to 160 million scholarly items).

I’m unable to tell you, beyond a very, very large number. The same answer for the literature, except that the number of items indexed has grown about an order of magnitude since we launched. A lot of people wonder about the size. But this kind of discussion is not useful — it’s just 'bike-shedding'. Our challenge is to see how often people are able to find the articles they need. The index size might be a concern here if it was too small. But we are clearly large enough.

Google Scholar has introduced extra services: author profile pages and a recommendations engine, for instance. Is this changing it from a search engine to something closer to a bibliometrics tool?

Yes and no. A significant purpose of profiles is to help you to find the articles you need. Often you don’t remember exactly how to find an article, but you might pivot from a paper you do remember to an author and to their other papers. And you can follow other people’s work — another crucial way of finding articles. Profiles have other uses, of course. Once we know your papers, we can track how your discipline has evolved over time, the other people in the scholarly world that you are linked to, and can even recommend other topics that people in your field are interested in. This helps the recommendations engine, which is a step beyond [a search engine].

Are you worried about the practice known as gaming — people creating fake papers, getting them indexed by Google, and gaining fake citations?

Not really. Yes, you can add any papers you want. But everything is completely visible — articles in your profiles, articles citing yours, where they are hosted, and so on. Anyone in the world can call you on it, basically killing your career. We don’t see spam for that very reason. I have a lot of experience dealing with spam because I used to work on web search. Spam is easier when people are anonymous. If I am trying to build a publication history for my public reputation, I will be relatively cautious. 

What features would you like to see in the future?

We are very good at helping people to find the articles they are looking for and can describe. But the next big thing we would like to do is to get you the articles that you need, but that you don’t know to search for. Can we make serendipity easier? How can we help everyone to operate at the research frontier without them having to scan over hundreds of papers — a very inefficient way of finding things — and do nothing else all day long?

I don’t know how we will make this happen. We have some initial efforts on this (such as the recommendations engine), but it is far from what it needs to be. There is an inherent problem to giving you information that you weren’t actively searching for. It has to be relevant — so that we are not wasting your time — but not too relevant, because you already know about those articles. And it has to avoid short-term interests that come and go: you look up something but you don’t want to get spammed about it for the rest of your life. I don’t think getting our users to ‘train’ a recommendations model will work — that is too much effort.

(For more on recommendation services, see 'How to tame the flood of literature', in Nature's Toolbox section.)

What about helping people search directly for scientific data, not papers?

That is an interesting idea. It is feasible to crawl over data buried inside paywalled papers, as we do with full text. But then if we link the user to the paywalled article, they don’t see this data — just the paper’s abstract. For indexing full-text articles, we depend on that abstract to let users estimate the probable utility of the article. For data we don't have anything similar. So as a field of scholarly communication, we haven’t yet developed a model that would allow for a useful data-search service.

Many people would like to have an API (Application Programming Interface) in Google Scholar, so that they could write programs that automatically make searches or retrieve profile information, and build services on top of the tool. Is that possible?

I can’t do that. Our indexing arrangements with publishers preclude it. We are allowed to scan all the articles, but not to distribute this information to others in bulk. It is important to be able to work with publishers so we can continue to build a comprehensive search service that is free to everybody. That is our primary function, and everything else is in addition to this.

Do you see yourself working at Google Scholar for the next decade?

I didn’t expect to work on Google Scholar for ten years in the first place! My wife reminds me it was supposed to be five, then seven years — and now I’m still not leaving. But this is the most important thing I know I can do. We are basically making the smartest people on the planet more effective. That’s a very attractive proposition, and I don’t foresee moving away from Google Scholar any time soon, or any time easily.

Does your desire for a free, effective search engine go back to your time as a student at the Indian Institute of Technology Kharagpur?

It influenced the problems that appealed to me. For example, there is no other service that indexes the full texts of papers even when the user can see only the abstract. The reason I thought this was an important direction to go in was that I realised users needed to know the information was there. If you know the information is in a paywalled paper, and it is important to you, you will find a way in: you can write to the author, for instance. I did that in Kharagpur — it was really ineffective and slow! So my experiences informed the approach I took. But at this point, Google Scholar has a life of its own.

Should people who use Google Scholar have concerns about data privacy?

We use the standard Google data-collection policies — there is nothing different for Scholar. My role at Google is focused on Google Scholar. So I am not going to be able to say more about broader issues.

Related links

The top 100 papers 2014-Oct-29

How to tame the flood of literature 2014-Sep-03

Online collaboration: Scientists and the social network 2014-Aug-13

Computing giants launch free science metrics 2011-Aug-02

Scientists get their own Google 2004-Nov-18

Nature blog: the decline and fall of Microsoft Academic Search

Van Noorden, R. Google Scholar pioneer on search engine’s future. Nature (2014). https://doi.org/10.1038/nature.2014.16269

This article is cited by

Turing award elites revisited: patterns of productivity, collaboration, authorship and impact.

Scientometrics (2021)

Shalosh B. Ekhad: a computer credit for mathematicians

  • Jacqueline Eviston-Putsch

Scientometrics (2020)

Science behind AI: the evolution of trend, mobility, and collaboration

Benedictine University Library

Research Basics: Find Articles Using Google Scholar

Connect Google Scholar to the BenU Library's Collection

1. Starting in Google Scholar, choose Settings.

Google Scholar Settings

2. Choose Library Links. Search “Benedictine” and check the boxes. Search "Worldcat" and check the box. Click Save.

Google Scholar Library Links

You're done! Now when you search in Google Scholar, your results page will include BenU Library links along the right.

Google Scholar BenU Library Links

Search Google Scholar

Google Scholar promotes itself as a resource that provides one-stop shopping for scholarly literature. It searches across many disciplines and covers a wide variety of resources, including journal articles, theses, books, abstracts, and more. Although Google Scholar is aimed at the academic community, it uses a very broad definition of "scholarly literature." 

It is important to realize that not everything in Google Scholar is peer reviewed.

Try a search:

Google Scholar Search

Tutorial: Using Google Scholar

Remember to evaluate websites for reliability and accuracy before you use them in your research assignments.

  • Last Updated: Feb 2, 2024 11:54 AM
  • URL: https://researchguides.ben.edu/research-basics

The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching

Neal Robert Haddaway

1 MISTRA EviEM, Royal Swedish Academy of Sciences, Stockholm, Sweden

Alexandra Mary Collins

2 Centre for Environmental Policy, Imperial College, London, United Kingdom

3 Department for Environmental, Food and Rural Affairs, London, United Kingdom

Deborah Coughlin

4 Department for Civil and Environmental Engineering, Imperial College, London, United Kingdom

Stuart Kirk

5 Environment Agency, London, United Kingdom

Conceived and designed the experiments: NH. Performed the experiments: NH AC. Analyzed the data: NH. Contributed reagents/materials/analysis tools: NH. Wrote the paper: NH AC DC SK.

Associated Data

All relevant data are within the paper and its Supporting Information files.

Google Scholar (GS), a commonly used web-based academic search engine, catalogues between 2 and 100 million records of both academic and grey literature (articles not formally published by commercial academic publishers). Google Scholar collates results from across the internet and is free to use. As a result it has received considerable attention as a method for searching for literature, particularly in searches for grey literature, as required by systematic reviews. The reliance on GS as a standalone resource has been greatly debated, however, and its efficacy in grey literature searching has not yet been investigated. Using systematic review case studies from environmental science, we investigated the utility of GS in systematic reviews and in searches for grey literature. Our findings show that GS results contain moderate amounts of grey literature, with the majority found on average at page 80. We also found that, when searched for specifically, the majority of literature identified using Web of Science was also found using GS. However, our findings showed moderate/poor overlap in results when similar search strings were used in Web of Science and GS (10–67%), and that GS missed some important literature in five of six case studies. Furthermore, a general GS search failed to find any grey literature from a case study that involved manual searching of organisations’ websites. If used in systematic reviews for grey literature, we recommend that searches of article titles focus on the first 200 to 300 results. We conclude that whilst Google Scholar can find much grey literature and specific, known studies, it should not be used alone for systematic review searches. Rather, it forms a powerful addition to other traditional search methods. In addition, we advocate the use of tools to transparently document and catalogue GS search results to maintain high levels of transparency and the ability to be updated, critical to systematic reviews.

Introduction

Searching for information is an integral part of research. Over 11,500 journals are catalogued by Journal Citation Reports ( http://thomsonreuters.com/journal-citation-reports/ ), and the volume of published scientific research is growing at an ever-increasing rate [ 1 , 2 ]. Scientists must sift through this information to find relevant research, and do so today most commonly by using online citation databases (e.g. Web of Science) and search engines (e.g. Google Scholar). Just as the number of academic articles and journals is steadily increasing, so too is the number of citation databases.

A citation database is a set of citations that can be searched using an online tool, for example Web of Science ( https://webofknowledge.com/ ). These databases typically charge subscription fees for access to the database, and these fees do not cover the cost of access to the full text of the research articles themselves. Generally these databases selectively catalogue citations according to a predefined list of journals, publishers or subject areas. Several free-to-use services have recently appeared that search for citations on the internet, most notably Google Scholar and Microsoft Academic Search. These search engines do not store citations within a specific database; instead, they regularly ‘crawl’ the internet for information that appears to be a citation. Some key characteristics of databases and search engines are compared in Table 1 .

According to Thomson Reuters, the Web of Science Core Collections citation database contains almost 50 million research records ( http://wokinfo.com/citationconnection/realfacts/ ; February 2015), with Microsoft Academic Search reporting to catalogue in excess of 45 million records as of January 2013 ( http://academic.research.microsoft.com/About/help.htm#9 ). Google Scholar does not report the volume of citations identifiable via their search facility, although attempts have been made to estimate this that suggest between 1.8 million [ 3 ] and 100 million records [ 4 ] are identifiable.

“Grey literature” is the term given to describe documents not published by commercial publishers, and it may form a vital component of evidence reviews such as systematic reviews and systematic maps [ 5 ], rapid evidence assessments [ 6 ] and synopses [ 7 ]. Grey literature includes academic theses, organisation reports, government papers, etc. and may prove highly influential in syntheses, despite not being formally published in the same way as traditional academic literature e.g. [ 8 ]. Considerable efforts are typically required within systematic reviews to search for grey literature in an attempt to include practitioner-held data and also account for possible publication bias [ 5 , 9 ]. Publication bias is the tendency for significant, positive research to be more likely to be published than non-significant or negative research, leading to an increased likelihood of overestimating effect sizes in meta-analyses and other syntheses [ 10 ]. The inclusion of grey literature is a central tenet of systematic review methodology, which aims to include all available documented evidence and reduce susceptibility to bias.

Academic citation databases are often the first port of call for researchers looking for information. However, access to databases is often expensive; some costing c. £100,000 per annum for organisations of up to 100 employees. Increasingly, researchers are using academic citation search engines to find information (Haddaway, unpublished data). Academic citation search engines appear to represent an attractive alternative to costly citation databases, cataloguing research almost immediately and not restricting results to certain journals, publishers or subject categories. Search engines are particularly attractive to systematic reviewers, since they have the potential to be used to search for grey literature quickly and simply using one search facility rather than a plethora of individual websites [ 5 ].

There is on-going debate regarding the utility of Google Scholar as an academic resource e.g. [ 11 , 12 ], but also as a replacement for traditional academic citation databases and in searches for grey literature in systematic reviews [ 13 , 14 ]. Google Scholar represents an attractive resource for researchers, since it is free-to-use, appears to catalogue vast numbers of academic articles, allows citations to be exported individually, and also provides citation tracking (although see criticism of citation tracking by Delgado Lopez-Cozar et al. [ 15 ]). Google Scholar is also potentially useful in systematic reviews, since reliance on just one such platform for searches would: i) offer resource efficiency, ii) offer cost efficiency, iii) allow rapid linking to full texts, iv) provide access to a substantial body of grey literature as well as academic literature, and v) be compatible with new methods for downloading citations in bulk that would allow for a very transparent approach to searching [ 16 ].

Previous research has shown that articles identified within systematic reviews are identifiable using Google Scholar [ 13 ]. However, other authors have suggested that this does not make Google Scholar an appropriate replacement for academic citation databases, as, in practice, there are considerable limitations in the search facility relative to those of academic databases [ 11 ], and there is on-going debate about Google Scholar’s place in research [ 12 ]. Shultz [ 17 ] listed many limitations that have been attributed to Google Scholar, including that the service permits use of only basic Boolean operators in search strings, which are limited to 256 characters, and that users cannot sort results (although some of the other cited disadvantages have been corrected in recent updates). Two further limitations to the use of Google Scholar in academic searches are the inability to directly export results in bulk as citations (although a limited number of individual citations can be extracted within a set time period) and the display of only the first 1,000 search records with no details of the means by which they are ordered.

Web-based academic search engines, such as Google Scholar, are often used within secondary syntheses (i.e. literature reviews, meta-analyses and systematic reviews). Systematic reviews typically screen the first 50 to 100 search records within Google Scholar e.g. [ 18 , 19 , 20 ], sometimes restricting searches to title rather than full-text searches e.g. [ 21 ]. Such activities are not themselves evidence-based, however. Little is known about how these results are ordered, or what proportion of search results are traditional academic relative to grey literature. Furthermore, this degree of screening (50 to 100 records) covers a very small proportion of the volume of literature found through other sources (often tens of thousands of records).

Google Scholar has improved greatly in recent iterations, as is evident from early critiques of the service relative to academic citation databases that cite problems that no longer exist e.g. [ 22 , 23 ]. Whilst the debate on the usefulness of Google Scholar in academic activities has continued in recent years, some improvements to the service offer unequivocal utility; for example, Shariff et al. [ 24 ] found that Google Scholar provided access to almost three times as many articles free of charge as PubMed (14 and 5%, respectively).

Any recommendations in systematic review guidance that are made regarding the allocation of greater resources to the use of academic search engines, such as Google Scholar, should be based on knowledge that such resources are worthwhile, and that academic search engines provide meaningful sources of evidence, and do not correspond to wasted effort.

Here, we describe a study investigating the use of Google Scholar as a source of research literature to help answer the following questions:

  • What proportion of Google Scholar search results is academic literature and what proportion grey literature, and how does this vary between different topics?
  • How much overlap is there between the results obtained from Google Scholar and those obtained from Web of Science?
  • What proportion of Google Scholar and Web of Science search results are duplicates and what causes this duplication?
  • Are articles included in previous environmental systematic reviews identifiable by using Google Scholar alone?
  • Is Google Scholar an effective means of finding grey literature relative to that identified from hand searches of organisational websites?

Seven published systematic reviews were used as case studies [ 20 , 25 , 26 , 27 , 28 , 29 , 30 ] (see Table 2 ). These reviews were chosen as they covered a diverse range of topics in environmental management and conservation, and included interdisciplinary elements relevant to public health, social sciences and molecular biology. The importance and types of grey literature vary between subjects, and a diversity of topics is necessary for any assessment of the utility of a grey literature search tool. The search strings used herein were either taken directly from the string used in Google Scholar in each systematic review’s methods or were based on the review’s academic search string where Google Scholar was not originally searched. Searches in Google Scholar were performed both at “full text” (i.e. the entire full text of each document was searched for the specified terms) and “title” (i.e. only the title of each document was searched for the specified terms) level using the advanced search facility (see https://scholar.google.se/intl/en/scholar/help.html#searching for further details). Searches included patents and citations. Since Google Scholar displays a maximum of 1,000 search results, this was the maximum number of citations that could be extracted using the specially developed method described below.

Searches were performed on 06/02/15. Web of Science includes the following databases as part of the MISTRA EviEM subscription: KCI-Korean Journal Database, SciELO Citation Index and Web of Science Core Collection.

1. What proportion of Google Scholar search results is grey literature?

A download manager (DownThemAll!; http://www.downthemall.net ) and web-scraping programme (Import.io; http://www.import.io ) were used to download each page of search results (to a maximum of 100 pages; 1000 results) and then extract citations as patterned data from the locally stored HTML files into a database. Two databases (one for the title only search and one for the full text search) for each of the 7 systematic reviews were created, each holding up to 1,000 Google Scholar citations (see S1 File ).
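The extraction step described above (pulling citations as patterned data out of locally stored HTML files) can be illustrated with a minimal sketch. It does not reproduce the Import.io workflow used in the study. The `h3` tag and the `result-title` class name are hypothetical markup chosen for illustration, and the function and class names are likewise assumptions; real saved pages would need their own patterns.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text of every <h3 class="result-title"> element.

    The tag and class name are hypothetical; they stand in for whatever
    pattern marks a citation title in the saved result pages.
    """

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "h3" and ("class", "result-title") in attrs:
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._inside:
            # Collapse runs of whitespace before storing the title
            self.titles.append(" ".join("".join(self._buffer).split()))
            self._inside = False


def extract_titles(html_pages):
    """Return titles from a list of HTML strings, preserving page order."""
    titles = []
    for page in html_pages:
        parser = TitleExtractor()
        parser.feed(page)
        parser.close()
        titles.extend(parser.titles)
    return titles
```

Each element of `html_pages` would be the contents of one saved results page; the returned list keeps the rank order of the results, which later per-page summaries depend on.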

Exported citations were assessed and categorised by NRH and AMC as one of the following types of literature:

  • ‘Black’–peer-reviewed articles published in academic journals
  • ‘Book’–monographs or complete books produced by commercial publishers
  • ‘Book chapter’–chapters within books produced by commercial publishers
  • ‘Patent’–registered patents and patent applications with the United States Patent and Trademark Office (USPTO)
  • ‘Thesis’–dissertations from postgraduate degrees (master’s and doctorates)
  • ‘Conference’–presentations, abstracts, posters and proceedings from conferences, workshops, meetings, congresses, symposia and colloquia
  • ‘Other’–all other literature that may or may not be peer-reviewed, including reports, working papers, self-published books, etc.
  • ‘Unclear’–any search record that could not be categorised according to the above classification (ambiguous citations were discussed by the reviewers and classed as ‘unclear’ if no consensus could be reached due to limited information).

Book chapters are a subcategory of books but have been separated for additional clarity. These categories have been chosen because they reflect the type of information returned by Web of Science (‘black’ literature) and Google Scholar (all literature). The categories also reflect the emergent classifications that were possible based on information in the citations and any associated descriptions.

For each search type (title or full text) the proportion of literature types across the search results was summarised per page of results to assess the relative location of the types within the results.
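A per-page summary of this kind could be computed along the following lines. This is a sketch only: the set of labels treated as grey literature is an assumption made for illustration, and the page size of 10 follows from the 100 pages per 1,000 results mentioned above.

```python
from collections import Counter

PAGE_SIZE = 10  # 100 pages of results correspond to 1,000 records
GREY = {"Thesis", "Conference", "Other"}  # assumption: labels counted as grey


def grey_share_by_page(categories, page_size=PAGE_SIZE, grey=GREY):
    """Return (page_number, grey_share) tuples for records in rank order."""
    shares = []
    for start in range(0, len(categories), page_size):
        page = categories[start:start + page_size]
        counts = Counter(page)
        grey_count = sum(counts[label] for label in grey)
        shares.append((start // page_size + 1, grey_count / len(page)))
    return shares


def peak_page(shares):
    """Page with the highest grey share (the earliest page on ties)."""
    return max(shares, key=lambda item: item[1])[0]
```

Here `categories` is the list of category labels assigned to one exported search, in the order the results were returned.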

2. How much overlap is there between Google Scholar and Web of Science?

For each of the 7 systematic review case studies title and full text searches were performed in Google Scholar and Web of Science (25/01/2015) and citation records extracted (all records for Web of Science or the first 1,000 for Google Scholar). Full text search results were not extracted for SR4 since over 47,000 records were returned, which was deemed too expansive for this assessment. The search results were then compared using the fuzzy duplicate identification add-in for Excel described below to investigate the degree of overlap between Web of Science and the first 1,000 Google Scholar search results.
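The overlap measure, expressed as the share of Web of Science records that also appear in the Google Scholar results, can be sketched as below. Note that this version matches titles exactly after a simple normalisation, which is a simplification of the fuzzy matching used in the study; the function names are hypothetical.

```python
import re


def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def overlap_percentage(wos_titles, gs_titles):
    """Percentage of distinct Web of Science titles also found in Google Scholar."""
    wos = {normalise_title(t) for t in wos_titles}
    gs = {normalise_title(t) for t in gs_titles}
    if not wos:
        return 0.0
    return 100.0 * len(wos & gs) / len(wos)
```

Because the normalisation discards every character outside a-z and 0-9, titles containing accented or non-Latin characters would need a more careful treatment.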

3. What proportion of Google Scholar and Web of Science search results are duplicates and what causes this duplication?

Duplicate records are multiple citations that refer to the same article. They are disadvantageous in search results since they do not represent truly unique records and require time and resources for processing. Duplicates also lead to a false estimation of the size of search results: depending on the level of duplication there may be a significant deviation from the true size of search results. The fourteen databases from the 7 case study systematic reviews described above were screened for Google Scholar duplicates using the Excel Fuzzy Duplicate Finder add-in ( https://www.ablebits.com/excel-find-similar/ ) set to find up to 10 character differences between record titles. Potential duplicates were then manually assessed and reasons for duplication (e.g. spelling mistakes or grammatical differences) were recorded.
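The internal algorithm of the Excel add-in is not described here, so the following sketch assumes that "character differences" can be approximated by Levenshtein edit distance. It flags pairs of titles that differ by at most 10 edits as candidates for the manual check.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def candidate_duplicates(titles, max_diff=10):
    """Return index pairs (i, j), i < j, whose titles differ by <= max_diff edits."""
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if abs(len(titles[i]) - len(titles[j])) > max_diff:
                continue  # the length gap alone already exceeds the threshold
            if edit_distance(titles[i], titles[j]) <= max_diff:
                pairs.append((i, j))
    return pairs
```

With a fixed threshold of 10, very short titles will match each other regardless of content, which is one reason a manual assessment of the candidates remains necessary. The pairwise comparison is quadratic in the number of titles, which is acceptable for sets of around 1,000 records.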

Searches were performed using Web of Science (using Bangor University’s subscription consisting of Biological Abstracts, MEDLINE, SciELO, Web of Science Core Collections and Zoological Record) using the same 7 search strings used with the above case studies in Google Scholar for topic words. The first 1,000 search results were extracted and assessed for duplicates on title using the Fuzzy Duplicate Finder as described above. Search results were extracted for records ordered both by relevance and by publication date (newest first), with the exception of SR2, SR5 and SR7, where totals of 230, 1,058 and 1,071 records respectively (all returned) were obtained and extracted in full.

4. Are articles included in previous environmental systematic reviews identifiable using Google Scholar?

In order to examine the coverage of Google Scholar in relation to studies included in environmental management systematic reviews, the lists of included articles following full text assessment were extracted from six reviews (four SRs described in Table 2 : SR1, SR4, SR5 and SR6; and two additional reviews [ 8 , 31 ]) and each record’s title was searched for using Google Scholar. The option in Google Scholar to include citations was selected. Where titles were not found immediately, quotation marks were used, followed by partial removal of the title where possible typographical errors or punctuation variations might cause a record not to be found. Where records were identified as citations (i.e. Google Scholar found a reference within the reference list of another article) this was also recorded. In addition, references from the final lists of included articles for three systematic reviews (SR1, SR4, SR6) were searched for in Web of Science as described for Google Scholar, above.

5. Is Google Scholar an effective means of finding grey literature identified from hand searches of organisational websites?

For another systematic review search string (SR5, Table 2 ) the 84 articles that were identified during searches for grey literature in the published review [ 28 ] from 16 organisational web sites (see S1 Table ) were used to test the ability of Google Scholar to find relevant grey literature using a single search string. The 84 articles were checked against the exported search results for both title and full text searches in Google Scholar (see Methods Section 1 above). The 84 articles were then screened in Google Scholar individually to assess whether they were included in the search engine’s coverage.

1. What proportion of Google Scholar search results is grey literature?

Between 8 and 39% of full text search results from Google Scholar were classed as grey literature (mean ± SD: 19% ± 11), and between 8 and 64% of title search results (40% ± 17). Fig 1 displays search results by grey literature category, showing a greater percentage of grey literature in title search results (43.0%) than in full text results (18.9%). Conference proceedings, theses and “other” grey literature (i.e. reports and white papers) accounted for the increase in the proportion of grey literature in title searches relative to full text searches. Theses formed a particularly small proportion of the full text search results across all case studies (1.3%), but formed a larger proportion of title search results (6.4%). Similarly, conference proceedings were less common in full text search results (3.2%) than title search results (15.3%). The proportion of patents, book chapters and books was similar in full text and title searches (0.2 and 0.3; 1.7 and 2.5; 4.2 and 2.8% respectively).


When examining the location of literature categories across search results (see S1 Fig ) several patterns emerge. “Peak” grey literature content (i.e. the point at which the volume of grey literature per page of search results was at its highest and where the bulk of grey literature is found) occurred on average at page 80 (±15 (SD)) for full text results, whilst it occurred at page 35 (± 25 (SD)) for title results. Before these points in the search results grey literature content was low in relative terms. For the majority of the case studies it was not until page 20 to 30 that grey literature formed a majority of each page of search results.

Google Scholar demonstrated modest overlap with Web of Science title searches: this overlap ranged from 10 to 67% of the total results in Web of Science ( Table 3 ). The overlap was highly variable between subjects, with reviews on marine protected area efficacy and terrestrial protected area socioeconomic impacts demonstrating the lowest overlap (17.1 and 10.3% respectively). Two case study title searches returned more than the viewable limit of 1,000 search results in Google Scholar (SR1 and SR4) and so only the first 1,000 could be extracted.

See Table 2 for case study explanations.

Full text search results from Google Scholar demonstrated low overlap with Web of Science results ( Table 4 ), ranging from 0.2 to 19.8% of the total Web of Science results.

n/a corresponds to search results that were too voluminous to download in full. See Table 2 for case study explanations.

3. What proportion of Google Scholar and Web of Science search results are duplicates and how do these duplicates come about?

Duplication rates (i.e. the percentage of total results that are duplicate records) for Google Scholar and Web of Science are shown in Table 5 and range from 0.00 to 2.93%. Rates of duplication are substantially higher within Google Scholar than Web of Science, and rates are far higher in title searches within Google Scholar than full text searches ( Table 6 ), although this is quite variable between the 7 case studies (1.0 to 4.8%).

Numbers in parentheses correspond to the standard deviations of the individual case study duplication rates. Sample size refers to the number of search records in total, followed by the number of independent search strings (i.e. the number of case studies investigated).

Duplication rates are assessed for up to 1,000 search records (or the total number where less than c. 1,300). For Web of Science the full text results were ordered by publication date (newest first) and relevance where more than 1,000 results were returned. Numbers are duplication rate (%) followed by total search records in parentheses.

Duplicates appear to have arisen for a range of reasons. First, typographical errors introduced by manual transcription were found in both Google Scholar (15% of title records) and Web of Science. For example, the sole example of a duplicate from Web of Science is that of the two records that differ only in the spelling of the word ‘Goukamma’ (or Goukarmma) in the following title: “A change of the seaward boundary of Goukamma Marine Protected Area could increase conservation and fishery benefits”. Differences in formatting and punctuation are a subset of typographical errors and corresponded to 18% of title level duplicates. Second, capitalisation causes duplication in Google Scholar, and was responsible for 36% of title level duplicates. Third, incomplete titles (i.e. some missing words) were responsible for 15% of title level duplicates. Fourth, automated text detection (i.e. when scanning documents digitally) was responsible for 3% of title level duplicates. Fifth, Google Scholar also scans for citations within references of selected included literature, and the presence of both these citations and the original articles themselves was responsible for 13% of title level duplication.

Many of the included articles from the six published systematic review case studies were identified when searching for those articles specifically in Google Scholar ( Table 7 ). However, a significant proportion of studies in one review [ 31 ] were not found at all using Google Scholar (31.5%). Other reviews were better represented by Google Scholar coverage (94.3 to 100% of studies). Only one review had an included article list that was fully covered by Google Scholar, the review with the smallest evidence base of only 37 studies [ 31 ]. For those reviews where studies were not identified by Google Scholar, a further search was performed for these missing studies in Web of Science ( Table 7 ), which demonstrated that some of these studies (6 studies from 2 case study reviews) were catalogued by Web of Science.

Records identified as citations are found only within reference lists of other articles (their existence is not verified by the presence of a publisher version or full text article, unlike hyperlinked citations).

1 For those articles not found using Google Scholar, Web of Science searches were carried out using Bangor University subscription (Biological Abstracts, MEDLINE, SciELO Citation Index, Web of Science Core Collections, Zoological Record).

Google Scholar search results that were available only as citations (i.e. obtained from the reference lists of other search results) constituted between 0 and 15.2% of identified results. Citations typically do not lead to web pages that provide additional information and cannot therefore be verified manually by users.

When searching specifically for individual articles, Google Scholar catalogued a larger proportion of articles than Web of Science (% of total in Google Scholar / % of total in Web of Science: SR1, 98.3/96.7; SR4, 94.3/83.9; SR6, 99.4/89.7).

None of the 84 grey literature articles identified by SR5 [ 28 ] were found within the exported Google Scholar search results (68 total records from title searches and 1,000 of a total 49,700 records from full text searches). However, when searched for specifically 61 of the 84 articles were identified by Google Scholar.

This paper set out to investigate the role of Google Scholar in searches for academic and grey literature in systematic and other literature reviews. There is much interest in Google Scholar due to its free-to-use interface, apparent comprehensiveness e.g. [ 11 , 12 , 13 , 14 ], and application within systematic reviews [ 16 ]. However, previous studies have disagreed on whether the service could be used as a standalone resource e.g. [ 11 , 12 ]. Our study enables recommendations to be made for the use of Google Scholar in systematic searches for academic and grey literature, particularly in systematic reviews.

Our results show that Google Scholar is indeed a useful platform for searching for environmental science grey literature that would benefit researchers such as systematic reviewers, agreeing with previous research in medicine [ 32 , 33 ]. Our investigations also demonstrate that more grey literature is returned in title searches than full text searches (43% relative to 19%, respectively), slightly more than previously found in an investigation of full text searching alone in an early version of Google Scholar (13% of total results; [ 17 ]). The grey literature returned by Google Scholar may be seen by some as disadvantageous given its perceived lack of verification (through formal academic peer-review), particularly where researchers are looking for purely traditional academic evidence. However, this may be particularly useful for those seeking evidence from across academic and grey literature domains; for example, those wishing to minimise the risk of publication bias (the over-representation of significant research in academic publications [ 34 ]).

We found that the greatest volume of grey literature in searches occurs at around page 35 for title searches. This finding indicates that researchers, including systematic reviewers, using Google Scholar as a source of grey literature should revise the current common practice of searching the first 50–100 results (5–10 pages) in favour of a more extensive search that looks further into the records returned. Conversely, those wishing to use title searching for purely academic literature should focus on the first 300 results to reduce the proportion of grey literature in their search results.

The grey literature returned in the 7 systematic review case studies examined herein mostly consisted of conference proceedings and “other” grey literature (i.e. white papers and organisational reports). Reports and white papers may prove particularly useful for secondary syntheses, since they may often represent resources that are commissioned by policy and practice decision-makers. Conference proceedings typically represent academic works that have not been formally published in commercial academic journals: such articles may also provide useful evidence for reviewers, particularly systematic reviewers. Academic theses were more common in title searches in Google Scholar, whilst books were more common in full text searches. Theses can provide a vital source of grey literature [ 35 ], research that never makes it into the public domain through academic publications. It is worth noting that whilst academic peer-review is not a guarantee of rigour, research that has not been through formal academic peer-review should be carefully appraised before being integrated into syntheses such as systematic reviews [ 5 ]. Google Scholar may thus prove to be a useful resource in addition to dedicated databases of theses (e.g. DART-Europe; http://www.dart-europe.eu/basic-search.php ) and other grey literature repositories (e.g. ProceedingsFirst; https://www.oclc.org/support/services/firstsearch/documentation/dbdetails/details/Proceeding.en.html ).

Surprisingly, we found relatively little overlap between Google Scholar and Web of Science (10–67% of WoS results were returned using searches in Google Scholar using title searches). For the largest set of results (SR4) only 17% of WoS records were returned in the viewable results in Google Scholar (restricted to the first 1,000 records). However, the actual number of returned results in Google Scholar was 4,310, with only the first 1,000 being viewable due to the limitations of Google Scholar. Assuming an even distribution of overlapping studies across these results we might expect a modest 73% coverage in total (calculated by applying a consistent rate of 17% from the first 1,000 to the full set of 4,310 search records). The limitations of viewable results in Google Scholar make an assessment of overlap impossible when the number of results is greater than 1,000. The case study SR1 only slightly exceeded the viewable limit of 1,000 studies and identified an overlap of 38%, however.
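The extrapolation for SR4 can be reproduced with a few lines of arithmetic. The sketch below assumes, as the text does, that overlapping records are spread evenly across all returned results; the function name and the cap at 100% are illustrative additions.

```python
def extrapolated_coverage(observed_pct, viewable, total_returned):
    """Scale the overlap seen in the viewable records up to the full result set.

    Assumes overlapping records are evenly distributed across all results.
    The value is capped at 100 because coverage cannot exceed the whole set.
    """
    return min(100.0, observed_pct * total_returned / viewable)


# SR4 figures from the text: 17% overlap in the first 1,000 of 4,310 records
print(round(extrapolated_coverage(17, 1000, 4310)))  # prints 73
```

The even-distribution assumption is a strong one, since the ordering of Google Scholar results is not disclosed and overlap may well decline further down the list.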

The relatively low overlap between the two services demonstrates that Google Scholar is not a suitable replacement for traditional academic searches: although its results are greater than those in Web of Science, the majority of Web of Science search results are not returned by Google Scholar. However, Google Scholar is a useful addition to traditional database searching, since a large body of search records was returned for each case study that did not overlap, potentially increasing the coverage of any multi-database search, such as those carried out in systematic reviews.

Duplicates within citation databases are disadvantageous because they represent false records. Although the individual reference may be correct, its presence in the database contributes to the number of results. Where large numbers of references must be screened manually, as in systematic reviews, duplicates may also represent a waste of resources where they are not automatically detectable. Duplication rates in Web of Science were very low (0–0.05%), but notably higher in Google Scholar (1–5%). Duplication in Google Scholar occurred as a result of differences in formatting, punctuation, capitalisation, incomplete records, and mistakes during automated scanning and population of the search records. The sensitivity of Google Scholar searches comes at a cost, since identical records are identified as unique references. This may not be a significant problem for small-scale searches, but a 5% duplication rate represents a substantial waste of resources in a systematic review where tens of thousands of titles must be screened manually.

Gehanno et al. [ 13 ] found that Google Scholar was able to identify all 738 articles from across 29 systematic reviews in medicine, and concluded that it could be used as a standalone resource in systematic reviews, stating that “if the authors of the 29 systematic reviews had used only GS, no reference would have been missed”. As pointed out by other researchers e.g. [ 14 ], this conclusion is incorrect, since the ability to find specific, known references does not equate to an ability to return these references using a search strategy as might be conducted within a systematic review: most importantly, the relevant articles may be returned outside of the viewable 1,000 records. Giustini and Boulos [ 14 ] found that 5% of studies from a systematic review could not be identified using specific searches in Google Scholar, whilst Boeker et al. [ 11 ] found that up to 34% of studies from 14 systematic reviews were missed.

Google Scholar was able to find much of the existing literature included within the systematic review case studies in our investigations, and indeed found more than Web of Science in the three case studies examined. As such, Google Scholar provides a powerful tool for identifying articles that are already known to exist (for example, when looking for a citation or access to a full text document). In addition, the search engine was also able to identify large amounts of potentially relevant grey literature. However, some important evidence was not identified at all by Google Scholar (31.5% in one case study), meaning that the review may have come to a very different conclusion if it had relied solely on Google Scholar. Similarly, Web of Science alone is insufficient to identify all relevant literature. As described above, Google Scholar may provide a useful source of evidence in addition to traditional academic databases, but it should not be used as a standalone resource in evidence-gathering exercises such as systematic reviews.

Google Scholar was able to identify a large proportion of the grey literature found in one case study through hand searching of organisational websites (61 of 84 articles). However, 23 articles could not be found using the search engine. Furthermore, the 61 articles found were not returned when using a typical systematic review-style search string. Together, these factors demonstrate that Google Scholar is a useful resource in addition to hand searching of organisational websites, returning a large volume of potentially relevant information, but that it should not be used as a standalone resource for grey literature searching, since some vital information is missed. Hand searching, as recommended by the Collaboration for Environmental Evidence Guidelines in Systematic Reviews [ 5 ], is restricted only to those websites included in an a priori protocol. Google Scholar exhaustively searches the internet for studies, however, and whilst it may be more coarse than fine-level hand searching (i.e. missing studies), the addition of a Google Scholar search targeting grey literature would increase comprehensiveness without giving cause for concern with relation to any systematic bias. However, since the algorithms that order search results are not disclosed, a substantial proportion of search results should be examined.

Other Considerations

As mentioned above, only the first 1,000 search results can be viewed in Google Scholar, and the order in which results are returned is not disclosed. Furthermore, the ‘advanced’ search facility supports only very basic Boolean logic, accepting only one set of ‘OR’ or ‘AND’ arguments, not both. In addition, variations in the way that subscript and superscript text, for example with chemical symbols, are displayed and recognised mean that poor matching occurs during searches where these characters form part of article titles. Finally, Google Scholar has a low threshold for repetitive activity that triggers an automated block to a user’s IP address (in our experience the export of approximately 180 citations or 180 individual searches). Thankfully this can be readily circumvented with the use of IP-mirroring software such as Hola ( https://hola.org/ ), although care should be taken when systematically accessing Google Scholar to ensure the terms of use are not violated.

Conclusions

We have provided evidence that Google Scholar is a powerful tool for finding specific literature, but that it cannot be a replacement for traditional academic citation databases, nor can it replace hand-searching for grey literature. The limitations of the number of search results displayed, the incomplete Boolean operation of the advanced search facility, and the non-disclosure of the algorithm by which search results are ordered mean that Google Scholar is not a transparent search facility. Moreover, the high proportion of grey literature that is missed by Google Scholar means that it is not a viable alternative to hand searching for grey literature as a stand-alone tool. Despite this, Google Scholar is able to identify a large body of additional grey literature in excess of that found by either traditional academic citation databases or grey literature identification methods. These factors make Google Scholar an attractive supplement to hand searching, further increasing comprehensiveness of searches for evidence.

We also note that the development of tools to take snapshots of search results from Google Scholar and extract these results as citations can significantly increase the efficiency and transparency of using Google Scholar (i.e. beyond the arbitrary first 50 search results currently favoured in many systematic reviews).

Several recommendations can be made based on our findings for those wishing to use Google Scholar as a resource for research evidence:

  • 1. Finding: Google Scholar is capable of identifying the majority of evidence in the systematic review case studies examined when searching specifically for known articles.
  • Recommendation: Google Scholar is a powerful, free-to-use tool that can be recommended if looking for specific research studies.
  • 2. Finding: Google Scholar is not capable of identifying all relevant evidence identified in the systematic review case studies examined, missing some vital information (as did Web of Science).
  • Recommendation: Google Scholar (and Web of Science) should not be used as standalone resources for finding evidence as part of comprehensive searching activities, such as systematic reviews.
  • 3. Finding: Substantially more grey literature is found using title searches in Google Scholar than full text searches.
  • Recommendation: If looking for grey literature, reviewers should consider using title searches. If looking for academic literature, title searches will yield a great deal of unsuitable information.
  • 4. Finding: Title level searches yield more conference proceedings, theses and ‘other’ grey literature.
  • Recommendation: Title level searches may be particularly useful in identifying as yet unpublished academic research grey literature as well as organisational reports and government papers [ 9 ].
  • 5. Finding: The majority of grey literature begins to appear after approximately 20 to 30 pages of results.
  • Recommendation: If looking for grey literature, the results should be screened well beyond the 20th page.

In summary, we find Google Scholar to be a useful supplement in searches for evidence, particularly grey literature, so long as its limitations are recognised. We recommend that the arbitrary assessment of the first 50 search results from Google Scholar, frequently undertaken in systematic reviews, should be replaced with the practice of recording snapshots of all viewable search results: i.e. the first 1,000 records. This change in practice could significantly improve both the transparency and coverage of systematic reviews, especially with respect to their grey literature components.

Supporting Information

S1 Fig. Search results by page for 7 case studies (see Table 2 for descriptions), for a) full text and b) title searches. Results displayed are for the total number of extractable records in Google Scholar.

S1 File. Database of Google Scholar full text and title searches for 7 case study systematic reviews.

S1 Table. List of organisations yielding potentially relevant evidence for a systematic review on the human wellbeing impacts of terrestrial protected areas.

Acknowledgments

The authors wish to thank Helen Bayliss and Beth Hall for discussion of the topic. AMC acknowledges a Policy Placement Fellowship funded by the Natural Environment Research Council, the UK Department for Environment Food and Rural Affairs and the Environment Agency. Some ideas for this project were prompted by a forthcoming Defra research project (WT1552).

Funding Statement

AMC acknowledges a Policy Placement Fellowship funded by the Natural Environment Research Council, the UK Department for Environment Food and Rural Affairs and the Environment Agency. Some ideas for this project were prompted by a forthcoming Defra research project (WT1552). NH was hosted at Bangor University ( http://www.bangor.ac.uk/ ).

Data Availability

Gender differences in Google Scholar representation and impact: an empirical analysis of political communication, journalism, health communication, and media psychology

  • Open access
  • Published: 16 February 2024


  • Manuel Goyanes 1 ,
  • Tamás Tóth 2 &
  • Gergő Háló   ORCID: orcid.org/0000-0002-7656-4043 2  


Improving gender equality in top-tier scholars and addressing gender bias in research impact are among the significant challenges in academia. However, extant research has observed that lingering gender differences still undermine female scholars. This study examines the recognition of female scholars through Google Scholar data in four different subfields of communication, focusing on two pressing issues: (1) gender representation among the most cited scholars and (2) gender differences in citations. Our findings demonstrate significant differences in gender proportions among the most cited scholars across all subfields, but especially in Political Communication and Journalism. The regression analysis revealed significant differences in citation scores in Political Communication, Journalism, and the pooled sample. However, results revealed that gender differences in research impact were not statistically significant in Health Communication and Media Psychology. Our study advocates for shifts in the citing behavior of communication scholars, emphasizing the importance of actively recognizing and citing studies conducted by female researchers to drive advancements in communication research.


Research evaluation frameworks play a crucial role in “objectively” measuring scientific meritocracy (Kamdem et al., 2019 ; Khan et al., 2022 ), especially since the number of open academic positions is not keeping pace with the growing number of PhD graduates (Cyranoski et al., 2011 ). Although there are great concerns about evaluation processes’ fairness and procedures, scholars have been positively or negatively evaluated by these institutions in many countries (Lawrence et al., 2014 ; Park & Gordon, 1996 ). However, decades of field research have shown that beyond personality traits, such as talent or curiosity, individual and structural factors may also significantly influence different dimensions of scientific performance, such as productivity and impact (Cameron et al., 2016 ; Dion et al., 2018 ). In this study, we focus on one of the most important structural factors that might affect scholars’ recognition, namely gender roles .

Extant research has systematically examined gender roles and their possible effects on sex bias in scientific productivity and impact (Knobloch-Westerwick & Glynn, 2013 ). One of the most important theoretical pillars of studying academic gender bias is the Matilda effect , which posits that female scholars suffer lingering structural inequalities that constrain their career prospects (Rossiter, 1993 ). Scholars have explored the Matilda effect from different research angles, such as females’ and males’ research performance, impact, gender ratios of authors in academic publications, decisions on tenure track positions, or the likelihood of being funded (Dion et al., 2018 ; Freelon et al., 2023 ; Huang et al., 2020 ; Knobloch-Westerwick & Glynn, 2013 ). This article complements this research tradition and focuses on citations, one of the most important factors in heuristically estimating scientific impact (Judge et al., 2007 ). Citation counts are ultimate elements that might affect hiring, promotion, and grant decisions (Cameron, 2005 ; Feeley & Yang, 2022 ; Holden et al., 2005 ; Toutkoushian, 1994 ).

Although several studies have examined the Matilda effect in the field of communication (Feeley & Yang, 2022 ; Freelon et al., 2023 ; Knobloch-Westerwick & Glynn, 2013 ; Knobloch-Westerwick et al., 2013 ), little is known about how gender differences unfold in Google Scholar, one of the most important platforms to openly disclose scholars’ research impact across fields and research topics (Marsicano et al., 2022 ). Accordingly, focusing on four different subfields of communication research (Political Communication, Journalism, Health Communication, and Media Psychology) the aim of this paper is twofold: (1) to examine gender proportions among the most cited scholars within and across these subfields and (2) to explore gender differences in their citation counts.

Analyzing these research fields is important because of the particularly marked struggle in social sciences to achieve a dominant position in knowledge production (de Sousa Santos, 2018; Wallerstein, 1999), an aspect that, according to extant research, harms the recognition of females’ scientific contributions in communication research (Knobloch-Westerwick & Glynn, 2013; Knobloch-Westerwick et al., 2013). We analyze the fields of Political Communication, Journalism, Health Communication, and Media Psychology because the first two subfields are closer to “masculine” fields, while the latter two are closer to “feminine” topics in the sense of the role congruity theory (Knobloch-Westerwick & Glynn, 2013; Knobloch-Westerwick et al., 2013). Therefore, we aim to find out whether females are under-recognized in “masculine” and “feminine” subfields considering their presence (e.g., the number of women scientists) and impact (citation counts) among the top-cited researchers in Google Scholar.

Google Scholar: A novel star

Generally, scholars use various academic search engines for research purposes (Gusenbauer, 2019 ). Google Scholar, launched in 2004, is interesting in particular because it is estimated to be the most comprehensive scientific search engine, with more than 389 million records (Gusenbauer, 2019 ). In addition, Google Scholar provides metadata for and/or the full text of scientific literature, and tracks citations, including self-citations, h-, and i-10 indexes (Singh et al., 2022 ).

Due to its growing popularity, researchers compared Google Scholar’s citation counts to other databases, such as Web of Science and Scopus (Amara & Landry, 2012 ; Etxebarria & Gomez-Uranga, 2010 ; Franceschet, 2010 ; Harzing & Alakangas, 2016 ; Mikki, 2010 ; Mingers & Lipitakis, 2010 ; Wildgaard, 2015 ). Franceschet ( 2010 ) outlined that Google Scholar detects a significantly higher number of citations and h-indexes than Web of Science, probably due to Google Scholar’s more inclusive crawling methods. Mingers and Lipitakis ( 2010 ) compared citation numbers in Google Scholar to the same metrics in Web of Science in the research fields of Business and Management. They suggested that Web of Science should not be taken into account in citation-based evaluations in social sciences because it covers less than half of the journals, papers, and citations detected by Google Scholar (Mingers & Lipitakis, 2010 ).

In line with the above results, Wildgaard (2015) found that Web of Science and Google Scholar provided remarkably different numbers of citations and publications and detected diverging numbers of co-authors in Astronomy, Environmental Science, Philosophy, and Public Health. Consequently, Wildgaard (2015) emphasized that extreme caution is needed when relying on only one of the aforementioned databases to evaluate scientific impact because “the same indicators calculated for the same scholar, but in two different databases, might provide a different picture of the scholar’s impact” (Wildgaard, 2015, pp. 897–898). In contrast, another study (Mikki, 2010) revealed that Google Scholar detected 85% of the Earth Science documents that appeared in the Web of Science and showed that the number of citations and h-indexes were very similar in the two databases. Harzing and Alakangas’s (2016) longitudinal and cross-disciplinary analysis also found that Web of Science, Scopus, and Google Scholar provided stable and consistent growth in publication and citation metrics, suggesting that all of these databases have the stability of coverage that is necessary for more in-depth cross-disciplinary comparisons.

Regarding recent studies, Thelwall and Kousha ( 2017 ) revealed that Google Scholar collects more citations than ResearchGate, Web of Science, and Scopus. They also argued that Google Scholar and ResearchGate might not utilize different data sources for indexing citations because their citation counts strongly correlate with each other’s metrics (Thelwall & Kousha, 2017 ). Singh and colleagues ( 2022 ) found that Google Scholar outperformed ResearchGate in citation metrics when they analyzed highly cited authors. They outlined the possible reasons why Google Scholar is “more successful” in crawling citations. Two of these reasons are crucial. First, Google Scholar has a more universal and less stringent indexing policy that collects a wide range of electronic documents: it crawls both peer-reviewed articles and the grey literature. Second, while Google Scholar automatically assigns a publication to a researcher, ResearchGate “sometimes fails to automatically attribute publications to the correct author” (Singh et al., 2022 , p. 1535). Finally, researchers also found that scientists have more impressive bibliometric results in Google Scholar than in Scopus (Marsicano et al., 2022 ). The explanation relied again on the extensive search methods that Google Scholar implements (Marsicano et al., 2022 ).

Even though Google Scholar’s popularity is perceived and acknowledged, it also has some pitfalls (Marsicano et al., 2022 ). First and foremost, Google Scholar was criticized for containing specific types of “errors,” such as including non-scholarly documents (Jacsó, 2012a ). In addition, researchers observed that Google Scholar might duplicate documents, thus potentially inflating citation scores (Doğan et al., 2016 ; Jacsó, 2006b ). Consequently, Jacsó ( 2006a ) suggests that Google Scholar is “good for locating relevant items, leading users some of the time to an open access version of a document, but it is not an appropriate tool for bibliometric studies” (p. 307) because it “ plays fast and loose, (make that too fast and too loose), with its hit counts and citation counts to allow fair comparisons without tiresome verification” (p. 307). However, scholars found that double citations originating from redundant versions of the same paper occur in less than 2% of the observed cases on this platform (Moed et al., 2016 ). Finally, bibliometric information may overlap in Google Scholar if the imported data is incorrectly added to research profiles where scholars have identical names or surnames.

Even though researchers highlight that Google Scholar utilizes questionable and opaque indexing methods (Jacsó, 2005, 2012b), the relevance and magnitude of this academic search engine are difficult to disregard. Therefore, we argue that analyzing the Matilda effect in representation and citations within Google Scholar is an important step toward a better understanding of potential gender bias in the subfields of communication studies. To the best of our knowledge, no previous analysis has addressed the Matilda effect within Google Scholar with regard to communication science. By doing so, we offer fresh insights into the representation and citation patterns of the field on one of the most important platforms for research evaluation. In the subsequent sections, we introduce the significance of analyzing top-cited scholars, as well as the Matilda effect among these researchers and in citations, before we outline our research questions.

The significance of analyzing top-cited scholars

The examination of the most cited scholars within a specific field plays a pivotal role in understanding the development of sciences, shedding light on the overall state of knowledge production. Examining the top-cited scholars allows for the identification of individuals who wield significant influence in steering the direction of a field. Their work is often at the forefront of new (methodological and/or theoretical) developments within their respective disciplines, shaping the intellectual evolution of scientific fields (Kwiek, 2018 ). By focusing on the most cited scholars, this study, while recognizing the broader complexities and potential limitations associated with this approach, offers insightful findings on the gender representation and gender differences in citations in one of the most important collectives in shaping the course of science (Bolkan et al., 2012 ; Cucari et al., 2023 ).

Matilda effect in authorship and citations

In 1968, Merton introduced the Matthew effect, which focuses on two intertwined phenomena: the over-recognition of top scholars and the under-recognition of lesser-known scientists. The Matthew effect outlines that acknowledged scientists gain enhanced visibility while their less recognized peers’ contributions fade away (Merton, 1968). This paper’s primary theoretical background is the Matilda effect, a term coined in relation to the Matthew effect, which presumes that female scientists are less recognized than their male colleagues (Rossiter, 1993). For instance, studies have shown that the number of female scholars decreases steadily at progressively higher academic positions (European Commission, 2012; National Academy of Sciences, 2007; van den Besselaar & Sandström, 2017). Research also showed that female scholars win smaller grants than their male colleagues (RAND Corporation, 2005) and receive scholarships considerably less frequently than male scientists (Bornmann et al., 2007; Lerchenmueller & Sorenson, 2018; Liao & Lian, 2022; van den Besselaar & Leydesdorff, 2009). It is important to note, however, that many studies found no gender bias in publishing, hiring, and funding (Ceci & Williams, 2011; Ley & Hamilton, 2008; Liao & Lian, 2022).

A vital question emerges at this point: what factors might fuel the Matilda effect? The answer relies primarily on socially constructed, structural reasons. Considering the literature on gender bias in science, the relevant theoretical background is rooted in social role theory, whereby scholars argue that gender is socially constructed via gender roles (Eagly, 1987 ). These roles implement normative expectations from males and females and suggest the desirable behavior for men and women (Eagly, 1987 ). The social role theory suggests that communal characteristics mostly suit women while agentic ones are generally desirable for men (Eagly, 1987 ). Specifically, communal characteristics imply helpful, caring, and sympathetic attitudes towards other people’s well-being, while agentic characteristics are typical of competitive, ambitious, self-confident individuals with strong leadership skills (Knobloch-Westerwick et al., 2013 ). At this point, an important segment of the theory, the role congruity theory , kicks in.

The role congruity theory helps scholars analyze the congruity between gender roles and other roles, such as the scientific one (Eagly & Karau, 2002 ). Role congruity theory suggests that scientific roles are agentic, and therefore are closer to “male” characteristics, implying ambition, leadership, and self-confidence (Knobloch-Westerwick & Glynn, 2013 ). On the other hand, role congruity theory highlights that communal roles—such as taking care of children and ill people—are not compatible with the scientific role. Consequently, beliefs about the scientist and female roles are not compatible, which leads to competition between these role-based expectations. Role incongruity might harm female scientists by causing them to be judged negatively in academia. As a result, the social–psychological incongruity might attract negative evaluations or reduce the willingness to invite female scientists to research networks (Knobloch-Westerwick et al., 2013 ). These structural circumstances can reduce the duration of females’ careers and harm their productivity, because structural factors such as negative stereotypes towards women, exclusion from informal networks of communication, and the lack of professional mentors might be due to role incongruity (Cech & Blair-Loy, 2014 ; Huang et al., 2020 ).

Beyond the well-known structural reasons, other explanations might also be relevant in examining the Matilda effect. One of the most comprehensive papers on sex differences analyzed 1.5 million authors and found that women account for 27% of authorship in the research fields of science, technology, engineering, and mathematics (Huang et al., 2020 ). Researchers explained the above difference with different dropout rates for females at every stage of their careers (Huang et al., 2020 ). Dropout rates might be higher for women than men because females report exclusion from colleagues, aggressive behavior from students, and sexual harassment during their faculty work more often than males (Bronstein & Farnsworth, 1998 ). Another research (Leahey, 2006 ) suggested that specialization supports productivity, but that female sociologists tend not to focus on a single research field because they feel that narrowing down their research scope would harm their competitiveness when they try to move to other institutions or departments. Duch et al. ( 2012 ) argue that female scholars’ lower publication rates are possibly due to the fact that women gain less institutional support in research resource amounts than their male peers.

Extant research also found that women participate significantly less in international research collaboration than men (Uhly et al., 2017 ). Importantly, the above study also revealed that family status can create an invisible “glass fence” that harms females’ academic careers if women have partners who do not work in academia (Uhly et al., 2017 ). Jadidi et al. ( 2018 ) argue that female scholars are less prolific than men because they work with a smaller fraction of senior authors than males, narrowing women’s research networks. The study revealed that successful male and female scholars had the same collaborative behavior: both groups work with “highly-connected scientists” (Jadidi et al., 2018 , p. 18) who produce many peer-reviewed papers with high quality. Van den Besselaar and Sandström ( 2017 ) argue that the Matilda effect in production is explained by the facts that (1) male scholars are older in general and have more time to publish and (2) men have higher academic positions. The above study suggests that the higher academic position scholars have, the more prolific they are, and women are in a disadvantaged position in that competition.

As for the field of communication, a recent study has found that the number of female first authors grew significantly between 2009 and 2019, but their proportion among the top-cited authors did not grow at a similar pace (Author et al., 2022). More specifically, even though the share of female scholars in communication research (57%) was larger than that of their male counterparts in 2019, males held a larger share (58%) of first authorships among the top-cited researchers (Author et al., 2022). Even though another study revealed that gender imbalance among the most cited communication scholars has decreased in the last two decades, almost three-quarters (74.3%) of them are still (white) men (Freelon et al., 2023). Although the above research explored how the Matilda effect prevails in gender ratios in authorship and among leading scholars in the prominent segments of communication studies, we still do not have information on gender proportions among the most cited authors in Google Scholar. Therefore, we formulate the following research question:

RQ1) Are there equal gender proportions in Google Scholar among the most cited scholars in (a) Political Communication, (b) Journalism, (c) Health Communication, (d) Media Psychology, and (e) the pooled sample?

Ample evidence suggests that a gendered citation gap persists in sciences and that male scholars receive more citations than their female peers (Dion et al., 2018). Again, what might cause the Matilda effect in receiving citations? Dion and colleagues (2018) consider two important factors: productivity gaps and differences in self-citations. First, males tend to be more prolific than women because they occupy higher positions, work in larger research networks, win more funds, have lower dropout rates during their careers, spend less time on caregiving, and possibly have fewer or no career breaks while they work in academia (Huang et al., 2020). Second, men tend to cite their own papers more frequently than women do, which is theoretically labelled as gender homophily in citations (Hutson, 2006; Maliniak et al., 2013; Potthoff & Zimmermann, 2017; Zigerell, 2015).

Nevertheless, regarding gender bias in citations, Dion and colleagues emphasize that it is “difficult to know if this occurs simply because men publish and cite themselves more than women or if scholars systematically fail to cite relevant work by women in their field (or both)” ( 2018 , p. 315). What is more important, however, is that the Matilda effect in citations is detrimental because it disregards many women’s works and findings that should be introduced in papers, monographs, book chapters, textbooks, and courses at academic institutions (Colgan, 2017 ; Hardt et al., 2017 ). If many female scholars’ findings are marginalized, a large part of the scientific work might fade away, and inequalities will be maintained in academia, where diverse knowledge production should be essential, if not paramount.

Several studies have analyzed the possible gender gaps within the citation patterns of published papers to investigate the prevalence of the Matilda effect, but their outcomes are contradictory. On the one hand, the Matilda effect emerges in the research fields of Ecology (Cameron et al., 2016 ), Economics (Ferber, 1988 ; Ferber & Brün, 2011 ), Library and Information Sciences (Håkanson, 2005 ), Mathematics (Aksnes et al., 2011 ), and Political Science (Maliniak et al., 2013 ). On the other hand, there was no Matilda effect in citations in Biochemistry (Long, 1992 ), Construction Studies (Powell et al., 2009 ), Criminal Justice (Stack, 2002 ), Economic History (Di Vaio et al., 2012 ), Geography (Slyder et al., 2011 ), International Relations (Østby et al., 2013 ), Public Administration (Corley & Sabharwal, 2010 ), and Sociology (Ward, 1992 ).

In communication research, important analyses considering citations were conducted on gender gaps. In line with the role congruity theory (Eagly & Karau, 2002 ), scholars found that male researchers are cited more than females in Communication Research and the Journal of Communication (Knobloch-Westerwick & Glynn, 2013 ). Another study found that male scholars cite their male peers more often than they cite female researchers, and vice versa, thus proving gender homophily in citations in two leading German communication studies journals (Potthoff & Zimmermann, 2017 ). This gender homophily in citations is partly due to differences in male and female communication scholars’ research interests (Potthoff & Zimmermann, 2017 ). Based on the results of the structural equation modeling in the aforementioned study, male authors tend to be cited more than female authors. This conclusion was drawn from the model which demonstrated that the gender composition of authors (higher values indicating higher impact of male authors) and “masculine” / “feminine” research subjects affect the proportion of female authors cited. The gender composition of authors had a negative effect on the choice of female-typed research subjects, and a positive effect on male-typed research subjects, which in turn affects the proportion of female authors cited (Potthoff & Zimmermann, 2017 ). Recent research also highlighted that even though female communication scholars’ publications are viewed more than the work of their male colleagues, women’s papers are cited less than male authors’ publications (Author, 2022b). In contrast, Feeley and Yang ( 2022 ) analyzed the number of (self-)citations in eight communication journals and found that the Matilda effect emerged “only” in Health Communication and Political Communication and that the effect was minor. However, they also argue that males were more likely to self-cite their own papers in six journals than females. 
Against this backdrop, we outline the following research question:

RQ2) How does gender affect citation counts in a) Political Communication, b) Journalism, c) Health Communication, d) Media Psychology, and e) the pooled sample?

Google Scholar is a growing platform that measures researchers’ publications and citation counts across years (Marsicano et al., 2022 ). Its use has grown in recent years, even in research evaluation processes (Hayashi, 2019 ). The platform allows users to summarize their research production by linking each research item to a given citation score provided by Google Scholar’s search algorithm. The platform also allows users to outline their individual research fields via research labels and ranks the most cited scholars according to their citation scores. Although research output and citation counts might be occasionally misreported by Google Scholar or researchers, it is a platform that can be used to assess impact and productivity in several evaluation processes.

Data for this study were retrieved directly from Google Scholar. To gather individual-level data from the four subfields, the top 100 most cited scholars were examined by selecting each discipline in Google Scholar (n = 400). We coded all data for every scholar across subfields on the same day (22/06/2022) to avoid discrepancies in citation counts and research output, as highly cited scholars’ bibliometrics may increase from one day to another. We rely on citation counts, productivity, and years of experience (measured as the total number of years since the first citation) as reported directly in Google Scholar. If coders detected inconsistencies in individual-level data, records were manually corrected: false positives (i.e., fake or irrelevant profiles) were removed from the dataset and replaced with the subsequent profiles in the subfield list. In such cases, scholars’ production was manually reviewed to detect mismatches between research interest and research output (for instance, scholars interested in Journalism and publishing in Aeronautics).

However, in most cases, the most cited scholars in the four subfields under scrutiny had accurate profiles, so such corrections were minimal. For the pooled sample, subfields were merged and duplicates were removed (i.e., scholars cross-listed in two or more subfields, n = 25). Regarding intercoder agreement, the first author independently coded a random selection of 20% of observations on the same day as the original data collection, and no disagreements were found (100% agreement for gender, 100% for citation counts, 100% for research output, and 100% for years since first citation). The variables of interest are explained below.
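The pooling step described above can be illustrated with a minimal sketch. It assumes that each profile carries a unique identifier and that the first listing of a cross-listed scholar is the one retained; neither detail is stated in the study, and the records and the `pool_unique` helper are hypothetical placeholders.

```python
# Hypothetical records: (profile_id, subfield). The IDs are placeholders, not real profiles.
records = [
    ("scholar_a", "Political Communication"),
    ("scholar_b", "Political Communication"),
    ("scholar_b", "Journalism"),        # cross-listed in two subfields
    ("scholar_c", "Health Communication"),
    ("scholar_c", "Media Psychology"),  # cross-listed in two subfields
    ("scholar_d", "Media Psychology"),
]


def pool_unique(records):
    """Merge subfield lists, keeping the first record for each profile ID."""
    seen = set()
    pooled = []
    for profile_id, subfield in records:
        if profile_id not in seen:
            seen.add(profile_id)
            pooled.append((profile_id, subfield))
    return pooled


pooled = pool_unique(records)
duplicates_removed = len(records) - len(pooled)
print(f"{len(records)} records, {duplicates_removed} duplicates removed, {len(pooled)} unique scholars")
```

Applied to the reported figures, removing 25 duplicate records from the 400 subfield records leaves 375 unique scholars, which matches the sum of the gender counts reported for the Gender variable (269 + 106).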

Dependent and Independent Variables

Subfields. This variable covers four subfields plus the pooled sample (collapsing the four subfields into one value): Political Communication, Journalism, Health Communication, and Media Psychology. We chose the categories included in this study based on the size, thematic patterns, influence, and diversity of the given subfield within communication studies. Furthermore, the chosen subfields are also represented in ICA divisions, indicating their relevance and magnitude within the wider field.

Gender. This variable deals with the gender of the author under review. We consider the typical divide in scientometric analysis (male vs. female) by manually checking the name reported in Google Scholar and the personal photograph. In case of uncertainty, coders made Google searches to clarify the gender of the scholar. This variable is considered the main independent variable in the regression models (males = 269; females = 106).

Citation count. Total number of citations reported on each individual Google Scholar profile. This variable is the dependent variable in the regression models. Pooled sample (range = 63,198; mean = 8085; SD = 8249.32; skewness = 2.92, SD = 0.12; Kurtosis = 11.79, SD = 0.24), political communication (range = 42,269; mean = 10,269.20; SD = 8042.19; skewness = 2.47, SD = 0.24; Kurtosis = 7.06, SD = 0.47), journalism (range = 44,652; mean = 7709.18; SD = 6841.63; skewness = 2.92, SD = 0.24; Kurtosis = 11.54, SD = 0.47), media psychology (range = 57,826; mean = 4736.24; SD = 7811.64; skewness = 4.43, SD = 0.24; Kurtosis = 24.43, SD = 0.47), health communication (range = 60,985; mean = 9627.65; SD = 9114.18; skewness = 3, SD = 0.24; Kurtosis = 12.72, SD = 0.47).

As citation counts may be affected by both the levels of productivity and years of experience in academia (Li et al., 2017 ), our regression models controlled for both. Research suggests that levels of research productivity significantly and positively boost citation records (Li et al., 2017 ). Therefore, scholarly overproduction is likely to increase impact and visibility (Li et al., 2017 ). Likewise, scholars’ total citation records are significantly influenced by the years of experience since the first citation: the more years a researcher spends publishing, the better chance they have at accumulating high citation statistics.

Research output. This variable considers different types of research, such as papers, books, book chapters, conference proceedings, editorials, and all potential material subject to being cited by the scientific community and that has been manually or algorithmically uploaded by researchers or Google Scholar to the individual profiles. Pooled sample (range = 1116; mean = 151.01; SD = 116.03; skewness = 2.50, SD = 0.12; Kurtosis = 13.02, SD = 0.24), political communication (range = 512; mean = 156.17; SD = 93.03; skewness = 1.31, SD = 0.24; Kurtosis = 2.41, SD = 0.47), journalism (range = 625; mean = 163.95; SD = 108.82; skewness = 1.74, SD = 0.24; Kurtosis = 4.05, SD = 0.47), media psychology (range = 519; mean = 101.52; SD = 92.70; skewness = 2.04, SD = 0.24; Kurtosis = 4.56, SD = 0.47), health communication (range = 1092; mean = 182.42; SD = 146.65; skewness = 3.11, SD = 0.24; Kurtosis = 16.16, SD = 0.47).

Years since first citation. We compute the years since first citation by counting the number of years in a scholar’s Google Scholar profile (min = 5; max = 40; mean = 20.51; SD = 7.47).

Analysis strategy

To answer the research questions, we relied on two different statistical tests. First, to answer RQ1, we ran a series of χ² goodness-of-fit tests, one for each subfield of study and one for the pooled sample, which collapses all subfields. The minimum expected frequency required for this test was met. Second, to answer RQ2, we ran a series of bootstrap OLS regression models. As assessed by a visual inspection of distributions, citation counts across subfields were not normally distributed. Accordingly, to provide reliable findings, the models used robust standard errors based on bootstrapping with 1,000 resamples and bias-corrected confidence intervals to assess statistical significance.
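The bootstrap regression strategy can be sketched as follows. This is a minimal, standard-library illustration and not the authors' procedure: the study does not report its software or resampling scheme, so case resampling and the simple bias-corrected interval are assumptions. The data are synthetic, and the coefficients used to generate them (including the negative coefficient on the female indicator) are arbitrary and unrelated to the study's results.

```python
import random
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Return OLS coefficients via the normal equations; rows include the intercept term."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def bootstrap_bc_ci(rows, y, index, n_resamples=1000, alpha=0.05, seed=0):
    """Bias-corrected bootstrap confidence interval for one coefficient (case resampling)."""
    rng = random.Random(seed)
    n = len(y)
    estimate = ols(rows, y)[index]
    boot = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        boot.append(ols([rows[i] for i in idx], [y[i] for i in idx])[index])
    boot.sort()
    normal = NormalDist()
    # Bias correction: share of bootstrap estimates below the original estimate,
    # clamped away from 0 and 1 so the inverse normal CDF is defined.
    share = sum(value < estimate for value in boot) / n_resamples
    share = min(max(share, 1 / n_resamples), 1 - 1 / n_resamples)
    z0 = normal.inv_cdf(share)
    bounds = []
    for q in (alpha / 2, 1 - alpha / 2):
        adjusted = normal.cdf(2 * z0 + normal.inv_cdf(q))
        position = min(max(int(adjusted * n_resamples), 0), n_resamples - 1)
        bounds.append(boot[position])
    return estimate, bounds[0], bounds[1]


# Synthetic data only: columns are intercept, research output, years since first citation, female.
data_rng = random.Random(42)
rows, citations = [], []
for _ in range(200):
    output = data_rng.uniform(20, 400)
    years = data_rng.uniform(5, 40)
    female = 1.0 if data_rng.random() < 0.3 else 0.0
    rows.append([1.0, output, years, female])
    citations.append(100 + 50 * output + 200 * years - 1500 * female + data_rng.gauss(0, 500))

estimate, lower, upper = bootstrap_bc_ci(rows, citations, index=3)
print(f"female coefficient = {estimate:.1f}, 95% bias-corrected CI [{lower:.1f}, {upper:.1f}]")
```

A coefficient is treated as statistically significant at the 5% level when its interval excludes zero. Resampling whole cases avoids relying on normally distributed residuals, which is the concern raised above for skewed citation counts.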

The first research question inquires about the gender representation among the most cited Google Scholar researchers across different subfields of communication (see Table 1). The χ² goodness-of-fit tests showed that there were statistically significant differences between the number of male and female scholars across every subfield and in the pooled sample. In other words, assuming equal proportions, there is a prominent male majority in the category of the most cited researchers. At the subfield level, the starkest underrepresentation is in Political Communication, with female scholars accounting for only 15% of the sample, followed by Journalism (21%), Media Psychology (33%), and Health Communication (41%). In the pooled sample, the situation is also quite unbalanced, as females make up 28.26% of the most cited category.
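For the pooled sample, the goodness-of-fit test can be reproduced from the counts reported for the Gender variable (269 males, 106 females). The sketch below assumes a two-category test against equal expected frequencies (one degree of freedom); the function name is illustrative and does not come from the study.

```python
import math


def chi_square_gof_two_groups(observed_a, observed_b):
    """Chi-square goodness-of-fit test against equal expected proportions (df = 1)."""
    total = observed_a + observed_b
    expected = total / 2  # null hypothesis: both groups are equally represented
    statistic = ((observed_a - expected) ** 2 + (observed_b - expected) ** 2) / expected
    # With one degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Pooled gender counts as reported for the Gender variable.
males, females = 269, 106
statistic, p_value = chi_square_gof_two_groups(males, females)
print(f"chi2(1) = {statistic:.2f}, p = {p_value:.3g}")
```

The closed-form p-value is valid only for the one-degree-of-freedom case; a test with more than two categories would need a general χ² distribution function.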

A series of OLS-regressions were run to answer RQ2 for each subfield of study and the pooled sample. In Political Communication (see Table  2 ), after controlling for productivity (β = 0.51; p < 0.001) and years since first citation, female scholars are significantly less cited than their male peers (β = -0.09; p < 0.05).

Similarly, in Journalism (see Table  3 ) results of the regression analysis revealed that after controlling for productivity (β = 0.34; p < 0.05) and years since first citation, female scholars are significantly less cited than their male peers (β = -0.17; p < 0.05).

However, in Health Communication (see Table  4 ), the most balanced subfield in terms of gender representation among the most cited scholars (see RQ1 above), after controlling for productivity (β = 0.50; p < 0.01) and years since first citation (β = 0.18; p < 0.05), we found no statistically significant differences between male and female scholars’ citation scores.

In Media Psychology (see Table  5 ), the second most balanced subfield in terms of gender representation among the most cited scholars, after controlling for productivity (β = 0.54; p < 0.01) and years since first citation, we found no statistically significant differences between male and female scholars’ citation scores.

Finally, collapsing all subfields in the pooled sample (see Table  6 ), the regression analysis revealed that after controlling for productivity (β = 0.48; p < 0.001) and years since first citation (β = 0.19; p < 0.01), female scholars are significantly less cited than their male peers (β = -0.10; p < 0.05).

Discussion and conclusion

Extant research has investigated the Matilda effect in sciences, indicating significant gender biases in productivity, performance, and career paths (Dion et al., 2018; Huang et al., 2020). In the field of communication, studies explored gender-based citation disparities (Feeley & Yang, 2022; Freelon et al., 2023; Knobloch-Westerwick & Glynn, 2013; Knobloch-Westerwick et al., 2013). Although the overall inequalities are apparent, subfield-level analyses are still scarce, prompting a need for a deeper, high-resolution exploration. Therefore, we examined the gender proportions among the most cited scholars in Political Communication, Journalism, Health Communication, and Media Psychology, as well as the gender-based differences in their citation counts. Given the absence of prior analyses of the Matilda effect within Google Scholar (the recent similar analysis by Freelon et al. (2023) applies WoS and Scopus data), we provide novel insights into the representation and citation patterns of top-cited researchers.

With regard to our first research question, we found that, compared to their male peers, highly cited female authors are underrepresented in all subfields and in the pooled sample, regardless of whether the field is one traditionally considered masculine or feminine. These striking results are aligned with previous findings indicating a lack of balanced female representation among the top-performing communication scholars (Freelon et al., 2023; Knobloch-Westerwick & Glynn, 2013). Importantly, disparities are substantial in the pooled sample, Political Communication, Journalism, and Media Psychology. While the overall picture in Health Communication is more gender balanced, it is still significantly skewed in favor of men.

The struggle to dominate academic knowledge production and impact in social sciences, including communication studies, is obvious. Although many researchers have focused on regional and economic aspects, suggesting that rich Western institutions dominate the poor, non-Western academia in publications and citations, gender is another structure that must be considered in academic inequalities (de Sousa Santos, 2018 ; Rossiter, 1993 ; Wallerstein, 1999 ). Gender bias in social sciences is interesting in particular because it also emerges within core institutions and not only in academia embedded in the periphery (Author, 2020). But how should this specific type of inequality be understood in this case? The role congruity theory supports the interpretation of why there was a significantly lower female presence among the top-cited communication researchers on Google Scholar (Eagly & Karau, 2002 ; Garcia-Retamero & López-Zafra, 2006 ; Knobloch-Westerwick & Glynn, 2013 ). Agentic features characterize the role of scientists, who are considered to be ambitious and career-oriented (Knobloch-Westerwick et al., 2013 ). These characteristics are aligned with male social roles rather than female roles, which are assumed to be community-oriented instead (Eagly, 1987 ).

To acquire citations, scholars must publish papers in highly prestigious journals, participate in large and prolific research networks, win grants, hold high academic positions, win scholarly awards and promotions, spend much time on research, attend international conferences, and share their publications via (academic) social sites (Demeter, 2020). These factors are closer to the career-oriented, agentic roles than to the “caring” communal ones. On top of that, these efforts require time. Time is crucial because earlier studies have observed that there is a higher proportion of female than male scholars on the lower rungs of academia, indicating that women typically teach more than men and thus have less time to participate in research and publish outstanding papers in prestigious journals (Author, 2020).

Additionally, the Matilda effect was outlined in 1993, and the empirical analysis of this research tradition harks back only a few decades. We take this into account because many scholars among the top-cited communication researchers published before the 1990s, when less attention was paid to citing females and males equally. Even though the Matilda effect theory is three decades old, it still needs time to gain prominence in scholarly analysis. Thus, studies with results similar to ours can highlight the importance of acknowledging – via citation records – women’s academic publications more frequently (Freelon et al., 2023 ).

Taking a closer look at our results on female underrepresentation among the top-cited scholars, we can interpret them from another angle. Our findings indicated that Political Communication and Journalism, which are typically portrayed as masculine research fields (Knobloch-Westerwick & Glynn, 2013), show the most serious female underrepresentation (15% and 21%) among the most cited scholars. In turn, Health Communication, the field most connected to the notion of “care,” was the most balanced (41%). Prior studies (Holman et al., 2018; Larivière et al., 2013) have indicated a stronger male dominance in fields connected to policy making and social power, while politically less involved fields, generally associated with “care,” tend to be more balanced. This explanation is relevant for understanding the gender bias, which seems to be stratified among communication scholars on Google Scholar. In other words, even though each subfield analyzed is significantly male-dominated in terms of presence, the difference is smaller in Health Communication, possibly due to its proximity to the female social role that implies care (Knobloch-Westerwick & Glynn, 2013).

Although most scientific fields are becoming more gender balanced over time (Elsevier, 2017), citation biases still prevail because citations accumulate over a relatively long time. Consequently, in our second research question, we measured gender-based citations, controlling for academic experience (i.e., the years since their first citation) and productivity. After controlling for these measures, our study revealed female scholars to be significantly under-cited in the pooled sample as well as in the subfields of Political Communication and Journalism. In Health Communication and Media Psychology, however, we found no significant differences in gender-based citations. That is, while in Health Communication and Media Psychology the general under-recognition of female scholars can be attributed to the aforementioned slow adaptation of citation measures to progress in gender equality, the same cannot be said for Political Communication and Journalism, which points to currently existing socio-cultural citation biases against female scholars.

Our analysis also indicated that fields with similar proportions of females and males among the most cited scholars are also those in which gender differences in citation counts seem to disappear. Specifically, the regression analysis revealed significant differences in citation counts in Political Communication, Journalism, and the pooled sample. On the other hand, the more diverse fields of Health Communication and Media Psychology showed no significant biases concerning research impact based on gender after controlling for productivity and academic experience.
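
To illustrate what controlling for productivity and academic experience means in practice, the following minimal sketch fits an ordinary least squares model to synthetic data. Everything in it is an assumption made for illustration: the data are randomly generated rather than drawn from the Google Scholar sample analyzed here, the log-citation outcome, the variable names, and the `ols` helper are hypothetical, and the built-in gap of -0.30 is arbitrary. It is not a restatement of the model specification used in the analysis.

```python
import random


def ols(rows, y):
    """Estimate coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic, illustrative data only (NOT the study's dataset).
random.seed(0)
TRUE_GAP = -0.30  # assumed gender gap in log citations, chosen arbitrarily

rows, y = [], []
for _ in range(200):
    female = random.randint(0, 1)   # 1 = female, 0 = male (hypothetical coding)
    years = random.randint(5, 40)   # years since first citation (experience)
    pubs = random.randint(20, 300)  # number of publications (productivity)
    rows.append([1.0, female, years, pubs])
    y.append(6.0 + TRUE_GAP * female + 0.05 * years + 0.004 * pubs
             + random.gauss(0, 0.1))

intercept, gender_gap, per_year, per_paper = ols(rows, y)
print(f"estimated gender coefficient: {gender_gap:.2f}")
```

Because experience and productivity enter the model as separate predictors, the gender coefficient reflects the citation difference between scholars with comparable career length and output, which is the sense in which the gap is "controlled".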

The outcomes on the gender differences in receiving citations support the following conclusions. Even though many female scholars acquired positions among the top-cited communication researchers, the scientific community acknowledges their work equally only if their research field is considered to be aligned with their “expected,” communal social characteristics. Media Psychology and Health Communication are closer to the communal roles than the other two subfields (Knobloch-Westerwick & Glynn, 2013). In other words, the underlying communal characteristics in Health Communication and Media Psychology “allow” female researchers to acquire an unbiased level of recognition if they are first able to make their way into the elite league of the most cited scholars. Put differently, agentic characteristics are sufficient to be among the most recognized scholars but, for women, their subfield should contain communal social characteristics for them to acquire the same level of recognition as males, as in Health Communication and Media Psychology. In turn, “masculine” fields such as Political Communication and Journalism seem to resist females’ equal recognition because (1) the effort to get into the elite league of the most cited scholars needs agentic characteristics and (2) these subfields are “masculine” as they deal with power and influence on larger communities. As a result, the two intertwined structural factors attached to Political Communication and Journalism are too strong to let female scholars receive the same recognition as males. Nevertheless, we contend that these subfields ought to adopt a more inclusive stance towards women, acknowledging and valuing their scientific contributions to prevent overlooking their impact. Fostering a mindful approach to citations is essential for advancing towards a more diverse and inclusive body of scientific knowledge.

Limitations and future research

As mentioned above, Google Scholar indexes grey literature that may inflate citation counts. Consequently, our results should be interpreted with caution, especially if compared with certain other databases such as Web of Science or Scopus. In addition, due to the unsupervised crawling methods of Google Scholar, fake profiles, fake papers, or non-curated material can introduce important bias in measuring citation counts. Our analysis, fully aware of these potential biases of Google Scholar, tried to reduce the measurement error by consciously implementing a content analysis and by collecting data for all subfields and academics on the same day to prevent variations in their citation or production records. As a consequence of these limitations, we measured four diverse subfields within communication, yet future studies may also consider extending this analysis to other subfields of communication or other fields of science.

We also need to note that examining the top-cited scholars has limitations. Firstly, it concentrates on a narrow group of scholars, offering a limited perspective on general patterns of the fields under examination. Secondly, biases related to factors like gender and institutional affiliations may distort the analysis within this subset (Kwiek, 2018 ). To address these challenges, future research should broaden its focus to a more representative sample, ensuring a broader understanding of gender representation and research impact.

It is also important to note that there are certain limitations to the gender-based categorization of communication scientific subfields. As Knobloch-Westerwick and Glynn ( 2013 , pp. 12–13) pointed out: “Regarding gender-typed topics, 48 pieces (4.7%) fell into the “female-typed” category, 236 (23.1%) fell into the “male-typed” category, and 17 pieces (1.7%) were categorized to fall into both categories based on featuring strings associated with stereotypes for both genders. The vast majority of 711 (70.5%) emerged as “gender-neutral” based on the categorizations of gender-typed research topics”. Therefore, it is highly important to exercise caution when categorizing communication research topics based on gender stereotypes, as well as to avoid distinct dichotomous categories.

We utilize capital letters in the terms “Political Communication, Journalism, Health Communication, and Media Psychology” because we refer to the research fields and not the related phenomena.

Aksnes, D. W., Rorstad, K., Piro, F., & Sivertsen, G. (2011). Are female researchers less cited? A large-scale study of Norwegian scientists. Journal of the American Society for Information Science and Technology, 62 (4), 628–636. https://doi.org/10.1002/asi.21486

Amara, N., & Landry, R. (2012). Counting citations in the field of business and management: Why use Google Scholar rather than the Web of Science. Scientometrics, 93 (3), 553–581. https://doi.org/10.1007/s11192-012-0729-2

Bolkan, S., Griffin, D. J., Holmgren, J. L., & Hickson, M. (2012). Prolific scholarship in communication studies: Five years in review. Communication Education, 61 (4), 380–394. https://doi.org/10.1080/03634523.2012.699080

Bornmann, L., Mutz, R., & Daniel, H.-D. (2007). Gender differences in grant peer review: A meta-analysis. Journal of Informetrics, 1 (3), 226–238. https://doi.org/10.1016/j.joi.2007.03.001

Bronstein, P., & Farnsworth, L. (1998). Gender differences in faculty experiences of interpersonal climate and processes for advancement. Research in Higher Education, 39 (5), 557–585. https://doi.org/10.1023/A:1018701722855

Cameron, B. D. (2005). Trends in the usage of ISI bibliometric data: Uses, abuses, and implications. Portal: Libraries and the Academy, 5 (1), 105–125.

Cameron, E. Z., White, A. M., & Gray, M. E. (2016). Solving the productivity and impact puzzle: Do Men outperform Women, or are metrics biased? BioScience, 66 (3), 245–252. https://doi.org/10.1093/biosci/biv173

Cech, E. A., & Blair-Loy, M. (2014). Perceiving glass ceilings? Meritocratic versus structural explanations of gender inequality among women in science and technology. Social Problems, 57 (3), 371–397. https://doi.org/10.1525/sp.2010.57.3.371

Colgan, J. (2017). Gender bias in international relations graduate education? New evidence from syllabi. PS: Political Science & Politics, 50 (2), 456–460. https://doi.org/10.1017/S1049096516002997

European Commission. (2012). Meta-analysis of gender and science research: synthesis report . https://op.europa.eu/en/publication-detail/-/publication/3516275d-c56d-4097-abc3-602863bcefc8

Corley, E. A., & Sabharwal, M. (2010). Scholarly collaboration and productivity patterns in public administration: analysing recent trends. Public Administration, 88 (3), 627–648. https://doi.org/10.1111/j.1467-9299.2010.01830.x

Cucari, N., Tutore, I., Montera, R., & Profita, S. (2023). A bibliometric performance analysis of publication productivity in the corporate social responsibility field: Outcomes of SCIVAL analytics. Corporate Social Responsibility and Environmental Management, 30 (1), 1–16. https://doi.org/10.1002/csr.2346

Cyranoski, D., Gilbert, N., Ledford, H., Nayar, A., & Yahia, M. (2011). Education: The PhD factory. Nature, 472 , 276–279.

Demeter, M. (2020). Academic Knowledge Production and the Global South . Palgrave Macmillan. https://doi.org/10.1007/978-3-030-52701-3

de Sousa Santos, B. (2018). The end of the cognitive empire: The coming of age of epistemologies of the South . Duke University Press.

Di Vaio, G., Waldenström, D., & Weisdorf, J. (2012). Citation success: Evidence from economic history journal publications. Explorations in Economic History, 49 (1), 92–104. https://doi.org/10.1016/j.eeh.2011.10.002

Dion, M. L., Sumner, J. L., & Mitchell, S. M. (2018). Gendered citation patterns across political science and social science methodology fields. Political Analysis, 26 (3), 312–327. https://doi.org/10.1017/pan.2018.12

Doğan, G., Şencan, İ, & Tonta, Y. (2016). Does dirty data affect google scholar citations? Proceedings of the Association for Information Science and Technology, 53 (1), 1–4. https://doi.org/10.1002/pra2.2016.14505301098

Duch, J., Zeng, X. H. T., Sales-Pardo, M., Radicchi, F., Otis, S., Woodruff, T. K., & Nunes Amaral, L. A. (2012). The possible role of resource requirements and academic career-choice risk on gender differences in publication rate and impact. PLoS ONE, 7 (12), e51332.

Eagly, A. H. (1987). Sex differences in social behavior: A social-role interpretation. Psychology Press . https://doi.org/10.4324/9780203781906

Eagly, A. H., & Karau, S. J. (2002). Role congruity theory of prejudice toward female leaders. Psychological Review, 109 (3), 573.

Elsevier. (2017). Gender in the Global Research Landscape . The Netherlands: Elsevier.

Etxebarria, G., & Gomez-Uranga, M. (2010). Use of Scopus and Google Scholar to measure social sciences production in four major Spanish universities. Scientometrics, 82 (2), 333–349. https://doi.org/10.1007/s11192-009-0043-9

Feeley, T. H., & Yang, Z. (2022). Is there a matilda effect in communication journals? Communication Reports, 35 (1), 1–11. https://doi.org/10.1080/08934215.2021.1974505

Ferber, M. A. (1988). Citations and networking. Gender & Society, 2 (1), 82–89. https://doi.org/10.1177/089124388002001006

Ferber, M. A., & Brün, M. (2011). The gender gap in citations: Does it persist? Feminist Economics, 17 (1), 151–158. https://doi.org/10.1080/13545701.2010.541857

Franceschet, M. (2010). A comparison of bibliometric indicators for computer science scholars and journals on web of science and google scholar. Scientometrics, 83 (1), 243–258. https://doi.org/10.1007/s11192-009-0021-2

Freelon, D., Pruden, M. L., Eddy, K. A., & Kuo, R. (2023). Inequities of race, place, and gender among the communication citation elite, 2000–2019. Journal of Communication . https://doi.org/10.1093/joc/jqad002

Garcia-Retamero, R., & López-Zafra, E. (2006). Prejudice against Women in Male-congenial environments: Perceptions of gender role congruity in leadership. Sex Roles, 55 (1), 51–61. https://doi.org/10.1007/s11199-006-9068-1

Gusenbauer, M. (2019). Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics, 118 (1), 177–214. https://doi.org/10.1007/s11192-018-2958-5

Håkanson, M. (2005). The impact of gender on citations: An analysis of college & research libraries, journal of academic librarianship, and library quarterly. College & Research Libraries, 66 (4), 312–323.

Hardt, H., Kim, H., Meister, P., & Smith, A. E. (2017). Diversity by the book: Gender representation in political science graduate training . Annual Convention of the Midwest Political Science Association.

Harzing, A.-W., & Alakangas, S. (2016). Google Scholar, Scopus and the Web of Science: A longitudinal and cross-disciplinary comparison. Scientometrics, 106 (2), 787–804. https://doi.org/10.1007/s11192-015-1798-9

Hayashi, A. T., & Gregory, M. (2019). Maintaining scholarly integrity in the age of bibliometrics. J. Legal Educ, 69 , 138.

Holden, G., Rosenberg, G., & Barker, K. (2005). Bibliometrics. A potential decision making aid in hiring, reappointment, tenure and promotion decisions. Social Work in Health Care, 41 (3–4), 67–92. https://doi.org/10.1300/J010v41n03_03

Holman, L., Stuart-Fox, D., & Hauser, C. E. (2018). The gender gap in science: How long until women are equally represented? PLOS Biology, 16 (4), e2004956. https://doi.org/10.1371/journal.pbio.2004956

Huang, J., Gates, A. J., Sinatra, R., & Barabási, A. L. (2020). Historical comparison of gender inequality in scientific careers across countries and disciplines. Proceedings of the National Academy of Sciences . https://doi.org/10.1073/pnas.1914221117

Hutson, S. R. (2006). Self-citation in archaeology: age, gender, prestige, and the self. Journal of Archaeological Method and Theory, 13 (1), 1–18. https://doi.org/10.1007/s10816-006-9001-5

Jacsó, P. (2005). As we may search—Comparison of major features of the web of science, Scopus, and Google Scholar citation-based and citation-enhanced databases. Current Science, 89 (9), 1537–1547.

Jacsó, P. (2006a). Deflated, inflated and phantom citation counts. Online Information Review, 30 (3), 297–309. https://doi.org/10.1108/14684520610675816

Jacsó, P. (2006b). Dubious hit counts and cuckoo’s eggs. Online Information Review, 30 (2), 188–193. https://doi.org/10.1108/14684520610659201

Jacsó, P. (2012a). Google scholar author citation tracker: Is it too little, too late? Online Information Review, 36 (1), 126–141. https://doi.org/10.1108/14684521211209581

Jacsó, P. (2012b). Google Scholar Metrics for Publications. Online Information Review, 36 (4), 604–619. https://doi.org/10.1108/14684521211254121

Jadidi, M., Karimi, F., Lietz, H., & Wagner, C. (2018). Gender disparities in science? Dropout, productivity, collaborations and success of male and female computer scientists. Advances in Complex Systems, 21 (03n04), 1750011. https://doi.org/10.1142/s0219525917500114

Judge, T. A., Cable, D. M., Colbert, A. E., & Rynes, S. L. (2007). What causes a management article to be cited—article, author, or journal? Academy of Management Journal, 50 (3), 491–506. https://doi.org/10.5465/amj.2007.25525577

Kamdem, J. P., Roos, D. H., Sanmi, A. A., Calabró, L., Abolaji, A. O., de Oliveira, C. S., Barros, L. M., Duarte, A. E., Barbosa, N. V., Souza, D. O., & Rocha, J. B. T. (2019). Productivity of CNPq researchers from different fields in biomedical sciences: The need for objective bibliometric parameters—A report from Brazil. Science and Engineering Ethics, 25 (4), 1037–1055. https://doi.org/10.1007/s11948-018-0025-5

Khan, T. A., Jabeen, N., & Christensen, T. (2022). Rewarding academics: Experiences of the tenure track system in Pakistan. Higher Education Quarterly . https://doi.org/10.1111/hequ.12410

Knobloch-Westerwick, S., & Glynn, C. J. (2013). The matilda effect—role congruity effects on scholarly communication: A citation analysis of communication research and journal of communication articles. Communication Research, 40 (1), 3–26. https://doi.org/10.1177/0093650211418339

Knobloch-Westerwick, S., Glynn, C. J., & Huge, M. (2013). The matilda effect in science communication: An experiment on gender bias in publication quality perceptions and collaboration interest. Science Communication, 35 (5), 603–625. https://doi.org/10.1177/1075547012472684

Kwiek, M. (2018). High research productivity in vertically undifferentiated higher education systems: Who are the top performers? Scientometrics, 115 (1), 415–462. https://doi.org/10.1007/s11192-018-2644-7

Larivière, V., Ni, C., Gingras, Y., Cronin, B., & Sugimoto, C. R. (2013). Bibliometrics: Global gender disparities in science. Nature, 504 (7479), 211–213. https://doi.org/10.1038/504211a

Lawrence, J. H., Celis, S., & Ott, M. (2014). Is the Tenure Process Fair? What Faculty Think. The Journal of Higher Education, 85 (2), 155–192. https://doi.org/10.1080/00221546.2014.11777323

Leahey, E. (2006). Gender differences in productivity: Research specialization as a missing link. Gender & Society, 20 (6), 754–780. https://doi.org/10.1177/0891243206293030

Lerchenmueller, M. J., & Sorenson, O. (2018). The gender gap in early career transitions in the life sciences. Research Policy, 47 (6), 1007–1017. https://doi.org/10.1016/j.respol.2018.02.009

Ley, T. J., & Hamilton, B. H. (2008). The gender gap in NIH grant applications. Science, 322 (5907), 1472–1474. https://doi.org/10.1126/science.1165878

Li, X., Wu, Q., & Liu, Y. (2017). A quantitative analysis of researcher citation personal display considering disciplinary differences and influence factors. Scientometrics, 113 (2), 1093–1112. https://doi.org/10.1007/s11192-017-2501-0

Liao, C. H., & Lian, J. W. (2022). Gender inequality in applying research project and funding. Journal of Information Science . https://doi.org/10.1177/01655515221097861

Long, J. S. (1992). Measures of sex differences in scientific productivity*. Social Forces, 71 (1), 159–178. https://doi.org/10.1093/sf/71.1.159

Maliniak, D., Powers, R., & Walter, B. F. (2013). The gender citation gap in international relations. International Organization, 67 (4), 889–922. https://doi.org/10.1017/S0020818313000209

Marsicano, C. R., Braxton, J. M., & Nichols, A. R. K. (2022). The use of google scholar for tenure and promotion decisions. Innovative Higher Education, 47 (4), 639–660. https://doi.org/10.1007/s10755-022-09592-y

Merton, R. K. (1968). The matthew effect in science. Science, 159 (3810), 56–63. https://doi.org/10.1126/science.159.3810.56

Mikki, S. (2010). Comparing google scholar and ISI web of science for earth sciences. Scientometrics, 82 (2), 321–331. https://doi.org/10.1007/s11192-009-0038-6

Mingers, J., & Lipitakis, E. (2010). Counting the citations: A comparison of web of science and google scholar in the field of business and management. Scientometrics, 85 (2), 613–625. https://doi.org/10.1007/s11192-010-0270-0

Moed, H. F., Bar-Ilan, J., & Halevi, G. (2016). A new methodology for comparing google scholar and scopus. Journal of Informetrics, 10 (2), 533–551. https://doi.org/10.1016/j.joi.2016.04.017

National Academy of Sciences, National Academy of Engineering, & Institute of Medicine. (2007). Beyond Bias and Barriers: Fulfilling the Potential of Women in Academic Science and Engineering. The National Academies Press.

Østby, G., Strand, H., Nordås, R., & Gleditsch, N. P. (2013). Gender gap or gender bias in peace research? Publication patterns and citation rates for Journal of Peace Research, 1983–2008. International Studies Perspectives, 14 (4), 493–506. https://doi.org/10.1111/insp.12025

Park, S. H., & Gordon, M. E. (1996). Publication records and tenure decisions in the field of strategic management. Strategic Management Journal, 17 (2), 109–128.

Potthoff, M., & Zimmermann, F. (2017). Is there a gender-based fragmentation of communication science? An investigation of the reasons for the apparent gender homophily in citations. Scientometrics, 112 (2), 1047–1063. https://doi.org/10.1007/s11192-017-2392-0

Powell, A., Hassan, T. M., Dainty, A. R. J., & Carter, C. (2009). Note: Exploring gender differences in construction research: A European perspective. Construction Management and Economics, 27 (9), 803–807. https://doi.org/10.1080/01446190903179736

RAND Corporation. (2005). Is there gender bias in federal grant programs?

Rossiter, M. W. (1993). The matthew matilda effect in science. Social Studies of Science, 23 (2), 325–341.

Singh, V. K., Srichandan, S. S., & Lathabai, H. H. (2022). ResearchGate and google scholar: How much do they differ in publications, citations and different metrics and why? Scientometrics, 127 (3), 1515–1542. https://doi.org/10.1007/s11192-022-04264-2

Slyder, J. B., Stein, B. R., Sams, B. S., Walker, D. M., Jacob Beale, B., Feldhaus, J. J., & Copenheaver, C. A. (2011). Citation pattern and lifespan: A comparison of discipline, institution, and individual. Scientometrics, 89 (3), 955–966.

Stack, S. (2002). Gender and scholarly productivity: The case of criminal justice. Journal of Criminal Justice, 30 (3), 175–182. https://doi.org/10.1016/S0047-2352(01)00134-9

Thelwall, M., & Kousha, K. (2017). ResearchGate versus google scholar: Which finds more early citations? Scientometrics, 112 (2), 1125–1131. https://doi.org/10.1007/s11192-017-2400-4

Toutkoushian, R. K. (1994). Using citations to measure sex discrimination in faculty salaries. The Review of Higher Education, 18 (1), 61–82.

Uhly, K. M., Visser, L. M., & Zippel, K. S. (2017). Gendered patterns in international research collaborations in academia. Studies in Higher Education, 42 (4), 760–782. https://doi.org/10.1080/03075079.2015.1072151

van den Besselaar, P., & Leydesdorff, L. (2009). Past performance, peer review and project selection: A case study in the social and behavioral sciences. Research Evaluation, 18 (4), 273–288. https://doi.org/10.3152/095820209x475360

van den Besselaar, P., & Sandström, U. (2017). Vicious circles of gender bias, lower positions, and lower performance: Gender differences in scholarly productivity and impact. PLoS ONE, 12 (8), e0183301.

Wallerstein, I. (1999). The end of the world as we know it: Social science for the twenty-first century. University of Minnesota Press.

Ward, K. B. G., Grant, J., & Linda. (1992). Visibility and dissemination of Women’s and Men’s sociological scholarship four papers on inequality. Social Problems, 39 , 291–298.

Wildgaard, L. (2015). A comparison of 17 author-level bibliometric indicators for researchers in astronomy, environmental science, philosophy and public health in web of science and google scholar. Scientometrics, 104 (3), 873–906. https://doi.org/10.1007/s11192-015-1608-4

Zigerell, L. (2015). Is the gender citation gap in international relations driven by elite papers? Research & Politics, 2 (2), 2053168015585192. https://doi.org/10.1177/2053168015585192

Open access funding provided by National University of Public Service.

Author information

Authors and affiliations.

University Carlos III of Madrid, Madrid, Spain

Manuel Goyanes

Nemzeti Kozszolgalati Egyetem, Budapest, Hungary

Tamás Tóth & Gergő Háló

Corresponding author

Correspondence to Gergő Háló .

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Goyanes, M., Tóth, T. & Háló, G. Gender differences in google scholar representation and impact: an empirical analysis of political communication, journalism, health communication, and media psychology. Scientometrics (2024). https://doi.org/10.1007/s11192-024-04945-0

Received : 01 June 2023

Accepted : 17 January 2024

Published : 16 February 2024

DOI : https://doi.org/10.1007/s11192-024-04945-0

  • Google Scholar
  • Matilda effect
  • Gender bias
  • Productivity

University Libraries

Search for Articles with Google Scholar

The Catholic University library catalog and many of the article databases Catholic University subscribes to are accessible through Google Scholar .

On-campus access

Visit https://scholar.google.com and begin searching. You're good to go!

Off-campus access

If you are off campus, you will need to set your preferences so that Google Scholar shows you the resources that Catholic University provides.

  • Go to https://scholar.google.com
  • Click the menu icon in the left corner, then click Settings in the menu.
  • Click on Library Links from the navbar along the side of the page.
  • Enter CUA in the text field next to Library Links then click on the Search button.
  • Check the box in front of our university name, then click Save in the lower right corner.

Searching with Google Scholar

Within Google Scholar you may conduct searches by keyword, author, and article title. There is also an advanced search with more options. In the result list, ViewIt@CatholicU next to an item means that we have access to the electronic copy of the article. Click ViewIt@CatholicU, and the next page will show that item in our SearchBox with a link to the full text.

Google Scholar is good for conducting simple searches across a large number of databases. For complex or in-depth searching, we recommend that you search individual subject databases.

Google Scholar™ is a trademark of Google Inc.

How to Find Free Articles on Google Scholar

There are tons of resources for students to find articles online, and Google Scholar is a top pick. Here's how to use it.

Are you working on a research project or simply looking for credible information? Google Scholar can help you find free and credible research articles.

Instead of looking for scholarly articles with a standard Google search, you can use a simpler method. Google Scholar is a Google search service that focuses on scholarly literature, so you can easily find the articles you need for your research.

Find Free Articles on Google Scholar

You might enjoy reading insanely weird articles on Wikipedia. However, maybe it's time to read what the world's academics have to offer.

It can be frustrating to search the internet for articles and find nothing that doesn't require payment. Google Scholar offers a wide variety of research articles, many of which are available for free.

Here's how to find free articles on Google Scholar:

  • Head to Google Scholar .
  • Type out a keyword search in the search bar.
  • When the results are displayed, look only for articles that have a PDF text link.
  • Click on the link for your desired article.
  • Check if the article has a free downloadable link, or if you can read it for free online.
  • Once you have found a free article, save the PDF document onto your device or read it online.

Typically, free articles on Google Scholar have a visible PDF text link next to the article title. If you are unlucky, the link will lead you to the publisher's website, where you would have to purchase the article.

However, when the article is free, you can save the document or read it online.

Finding Recently Published Articles

Google Scholar allows you to filter your search to a specific time frame. This way, you can find articles that were recently published, or that were published over 5 to 10 years ago.

To find articles by publication year, click Since Year in Google Scholar's left sidebar. This shows only articles published from the specified year onward. You can also choose whether the results page sorts articles by date or by relevance.

Click Sort by date to show just the new additions. If you are not too concerned about when the articles were published, you may click on Any time , which you will find in the left sidebar.
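
If you run the same filtered search often, you can also build the search link yourself. The short Python sketch below is only an illustration: the `as_ylo` and `scisbd` parameter names are taken from what typically appears in the browser's address bar after using the sidebar filters, not from any documented Google API, so treat them as assumptions that may change. The function name `scholar_search_url` is made up for this example.

```python
from urllib.parse import urlencode


def scholar_search_url(query, since_year=None, sort_by_date=False):
    """Build a Google Scholar search link (hypothetical helper)."""
    params = {"q": query}
    if since_year is not None:
        # Assumed to correspond to the "Since Year" sidebar filter.
        params["as_ylo"] = since_year
    if sort_by_date:
        # Assumed to correspond to the "Sort by date" sidebar option.
        params["scisbd"] = 1
    return "https://scholar.google.com/scholar?" + urlencode(params)


url = scholar_search_url("climate change", since_year=2020)
print(url)
```

Paste the printed link into your browser to open the results page. If a parameter stops working, fall back to the sidebar controls described above.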

If you are looking for a less academic platform to gather information, you can use LinkedIn as a research tool instead.

Improve Your Research Skills

Knowing how to research effectively is not an easy skill to have. Luckily, the internet makes life easier in so many ways. One of those ways is helping you research better.

If you are a research student, you might want to find out how to make the most of your search browser. That way you can get the best results from your research.

Finding Scholarly Articles: Home

What's a Scholarly Article?

Your professor has specified that you are to use scholarly (or primary research or peer-reviewed or refereed or academic) articles only in your paper. What does that mean?

Scholarly or primary research articles are peer-reviewed, which means that they have gone through the process of being read by reviewers or referees before being accepted for publication. When a scholar submits an article to a scholarly journal, the manuscript is sent to experts in that field to read and decide if the research is valid and the article should be published. Typically the reviewers indicate to the journal editors whether they think the article should be accepted, sent back for revisions, or rejected.

To decide whether an article is a primary research article, look for the following:

  • The author’s (or authors') credentials and academic affiliation(s) should be given;
  • There should be an abstract summarizing the research;
  • The methods and materials used should be given, often in a separate section;
  • There are citations within the text or footnotes referencing sources used;
  • Results of the research are given;
  • There should be a discussion and a conclusion;
  • There should be a bibliography or list of references at the end.

Caution: even though a journal may be peer-reviewed, not all the items in it will be. For instance, there might be editorials, book reviews, news reports, etc. Check for the parts of the article to be sure.   

You can limit your search results to primary research, peer-reviewed or refereed articles in many databases. To search for scholarly articles in HOLLIS, type your keywords in the box at the top, and select Catalog & Articles from the choices that appear next. On the search results screen, look for the Show Only section on the right and click on Peer-reviewed articles. (Make sure to log in with your HarvardKey to get the full text of the articles that Harvard has purchased.)

Many of the databases that Harvard offers have similar features to limit to peer-reviewed or scholarly articles. For example, in Academic Search Premier, click on the box for Scholarly (Peer Reviewed) Journals on the search screen.

Review articles are another great way to find scholarly primary research articles. Review articles are not considered "primary research", but they pull together primary research articles on a topic and summarize and analyze them. In Google Scholar, click on Review Articles at the left of the search results screen. Ask your professor whether review articles can be cited for an assignment.

A note about Google searching. A regular Google search turns up a broad variety of results, which can include scholarly articles. However, Google results also contain commercial and popular sources, which may be misleading, outdated, etc. Use Google Scholar through the Harvard Library instead.

About Wikipedia. Wikipedia is not considered scholarly and should not be cited, but it frequently includes references to scholarly articles. Before using those references for an assignment, double check by finding them in HOLLIS or a more specific subject database.

Still not sure about a source? Consult the course syllabus for guidance, contact your professor or teaching fellow, or use the Ask A Librarian service.

  • Last Updated: Oct 3, 2023 3:37 PM
  • URL: https://guides.library.harvard.edu/FindingScholarlyArticles

Creating and Managing Google Scholar Profiles

These instructions were adapted under a CC-BY 4.0 license from UC Davis College of Biological Sciences' Create and Manage a Google Scholar Profile.

Step 1: Create your basic profile

  • Log on to  scholar.google.com  and click the “My Profile” link at the top of the page to get your account setup started.
  • On the first screen, add your affiliation information and university email address so Google Scholar can confirm your account. Add keywords that are relevant to your research interests so others can find you when browsing a subject area. Provide a link to your faculty page.
  • Click “Next Step,” and--that’s it! Your basic profile is done. Now, let’s add some publications to it.

Step 2: Add publications

Google has likely already been indexing your work for some time now as part of their mission as a scholarly search engine. However, keep in mind that  Google Scholar does not index  everything .

Google Scholar will provide you with a list of publications they think belong to you. You’ll need to read through the list of publications that it suggests as yours and select which ones you want to add to your profile.

Beware--if you have a common name, it’s likely there are some publications in this list that don’t belong to you. There may also be content that you don’t want on your profile because it’s not a scholarly article, or is not representative of your current research path, and so on.

Read through the publications list and deselect any that you do not want to add to your profile (like the below newsletter item that Google Scholar thinks is a scholarly article).

Click the grey “Add” button at the top of your profile.

A screenshot of a list of papers that Google Scholar has associated with an author. An arrow points to the checkbox next to one result, with text reading "Deselect any publications you don't want added to your profile"

Confirm you want Google to automatically add new publications to your profile in the future. If you’ve got a very common name, note that this might add publications you didn’t author to your profile. But if you’re a prolific author, it can be worth it for the time it saves you approving new articles every month.

Your profile is now almost complete! Two more steps: add a photo by clicking the “Change Photo” link on your profile homepage, and set your private profile to “Public.”

Step 3: Make your profile public

Your profile is private if you’ve just created it. Change your profile visibility by clicking the link to "Make it public" under your name and title. You can also make your profile public by clicking the Edit button and selecting the box next to the words "Make my profile public."

Step 4: Add missing articles

You might have articles that Google Scholar didn’t automatically add to your profile. If that’s the case, you’ll need to add them manually.

Click the “Add” icon button (looks like a plus) in the grey toolbar within your profile.

Select "Add articles manually."

Complete the form for each new paper to add to your profile. Include as much information as possible so that Google Scholar can find citations to your work.

Click "Save" after filling out the form and repeat as needed.

Step 5: Clean up your Google Scholar Profile data

Thanks to Google Scholar Profiles’ “auto add” functionality, your Profile might include some articles you didn’t author.

If that’s the case, you can remove them in one of two ways:

  • Click on the title of each offending article to get to the article’s page, and then click the “Delete” button at the top of the page.
  • From the main Profile page, tick the boxes next to each incorrect article and click the “Delete” button in the top grey bar.

Google Scholar will automatically populate your profile (to the best of its abilities).

  • Last Updated: Feb 13, 2024 3:54 PM
  • URL: https://guides.libraries.wm.edu/GSprofiles

Google Scholar Metrics

Google Scholar Metrics provide an easy way for authors to quickly gauge the visibility and influence of recent articles in scholarly publications. Scholar Metrics summarize recent citations to many publications, to help authors as they consider where to publish their new research.

To get started, you can browse the top 100 publications in several languages, ordered by their five-year h-index and h-median metrics. To see which articles in a publication were cited the most and who cited them, click on its h-index number to view the articles as well as the citations underlying the metrics.

You can also explore publications in research areas of your interest. To browse publications in a broad area of research, select one of the areas in the left column. For example: Engineering & Computer Science or Health & Medical Sciences.

To explore specific research areas, select one of the broad areas, click on the "Subcategories" link and then select one of the options. For example: Databases & Information Systems or Development Economics.

Browsing by research area is, as yet, available only for English publications. You can, of course, search for specific publications in all languages by words in their titles.

Scholar Metrics are currently based on our index as it was in July 2023.

Available Metrics

The h-index of a publication is the largest number h such that at least h articles in that publication were cited at least h times each. For example, a publication with five articles cited by, respectively, 17, 9, 6, 3, and 2, has the h-index of 3.

The h-core of a publication is a set of top cited h articles from the publication. These are the articles that the h-index is based on. For example, the publication above has the h-core with three articles, those cited by 17, 9, and 6.

The h-median of a publication is the median of the citation counts in its h-core. For example, the h-median of the publication above is 9. The h-median is a measure of the distribution of citations to the articles in the h-core.

Finally, the h5-index, h5-core, and h5-median of a publication are, respectively, the h-index, h-core, and h-median of only those of its articles that were published in the last five complete calendar years.
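The definitions above can be checked with a few lines of Python. This is a minimal sketch that works on a plain list of citation counts; the function names are illustrative and are not part of any Google Scholar interface. The h5 variants would apply the same functions to only the articles from the last five complete calendar years.

```python
from statistics import median

def h_index(citations):
    """Largest h such that at least h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def h_core(citations):
    """Citation counts of the top-cited h articles."""
    ranked = sorted(citations, reverse=True)
    return ranked[:h_index(citations)]

def h_median(citations):
    """Median citation count within the h-core (0 if the core is empty)."""
    core = h_core(citations)
    return median(core) if core else 0

# The worked example from the text: five articles cited 17, 9, 6, 3 and 2 times.
counts = [17, 9, 6, 3, 2]
print(h_index(counts))   # 3
print(h_core(counts))    # [17, 9, 6]
print(h_median(counts))  # 9
```

The fourth article has only 3 citations, which is fewer than its rank of 4, so the count stops at 3.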

We display the h5-index and the h5-median for each included publication. We also display an entire h5-core of its articles, along with their citation counts, so that you can see which articles contribute to the h5-index. And there's more! Click on the citation count for any article in the h5-core to see who cited it.

Coverage of Publications

Scholar Metrics currently cover articles published between 2018 and 2022, both inclusive. The metrics are based on citations from all articles that were indexed in Google Scholar in July 2023. This also includes citations from articles that are not themselves covered by Scholar Metrics.

Since Google Scholar indexes articles from a large number of websites, we can't always tell in which journal a particular article has been published. To avoid misidentification of publications, we have included only the following items:

  • journal articles from websites that follow our inclusion guidelines;
  • selected conference articles in Engineering and Computer Science.

Furthermore, we have specifically excluded the following items:

  • court opinions, patents, books, and dissertations;
  • publications with fewer than 100 articles published between 2018 and 2022;
  • publications that received no citations to articles published between 2018 and 2022.

Overall, Scholar Metrics cover a substantial fraction of scholarly articles published in the last five years. However, they don't currently cover a large number of articles from smaller publications.

Inclusion and Corrections

If you can't find the journal you're looking for, try searching by its abbreviated title or alternate title. There're sometimes several ways to refer to the same publication. (Fun fact: we've seen 959 ways to refer to PNAS.)

If you're wondering why your journal is not included, or why it has fewer citations than it surely deserves, that is often a matter of configuring your website for indexing in Google Scholar. Please refer to the inclusion manual. Also, keep in mind that Scholar Metrics only include publications with at least a hundred articles in the last five years.

How to Find Articles

You can find articles on education in several ways:

ORU subscribed

  • Google Scholar

EagleSearch searches the ORU Library catalog and several of the databases to which ORU Library subscribes. It is a good place to start if you aren't sure where to look. You can enter a phrase or question in this search box, similar to the way you would search Google.

Go to the Library's home page and on the EagleSearch tab, enter your search terms in the search box.

ORU databases, including full text articles, journals and ebooks, are fee-based resources and, as such, are generally restricted to current ORU students, faculty and staff. For off-campus access it is necessary to login with your ORU Network username and password (Single Sign-on) when prompted.

If you do not know your login, the IT Help Desk can assist you ([email protected] or 918-495-6321). More information is available in the ORU IT Student Guide .

If you know your Single Sign-on and it works for other ORU applications (D2L, ePortfolio, email, etc.) but not for remote library access, the troubleshooting guide may help you discover a technical issue that is interfering with your access.

Subject specific databases are the best way to find articles on a topic.

When searching databases, you need to construct a search (break your topic into key concepts) rather than entering a phrase or question. See Finding Books or How to Find Articles on a Topic for tips on constructing a search.

database button link on the library's home page

On the Databases listing page, select Education in the first dropdown, Choose a Subject.

Education Research Complete is a good database to start your searching. Click on the linked database title, Education Research Complete.

Once you are in Education Research Complete, you will want to add all of the Education databases available from EBSCO. This way you can avoid searching each of these separately.

add more databases in Education Source

Click on the linked Choose Databases.

Add more Education databases by clicking on the checkboxes next to the databases you want to add in the Choose Databases pop-up window. Some you may want to add include: Educational Administration Abstracts, ERIC, Professional Development Collection, and Teacher Reference Center. Then click on the yellow OK button.

Choose more databases in EBSCO interface popup

Enter your search terms in the search boxes.

example EBSCO Education database search

As you can see in the image to the left, I have entered "government fund*" in the first search box, "k-12 education" in the second search box, and US OR U.S. OR "United States" OR America in the third search box.
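The same search can be written out as a single Boolean string, which is handy for keeping a record of what you searched. The sketch below assumes that the three search boxes are combined with AND (the usual default in the EBSCO interface) and that alternatives within one box are joined with OR; the helper name `build_query` is hypothetical.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND.

    Assumes the search boxes are combined with AND; change the joiner if
    you pick a different operator in the dropdown between the boxes.
    """
    groups = []
    for synonyms in concepts:
        joined = " OR ".join(synonyms)
        # Parentheses keep an OR group together when it is combined with AND.
        groups.append(f"({joined})" if len(synonyms) > 1 else joined)
    return " AND ".join(groups)

concepts = [
    ['"government fund*"'],                        # first search box
    ['"k-12 education"'],                          # second search box
    ["US", "U.S.", '"United States"', "America"],  # third search box
]
query = build_query(concepts)
print(query)
# "government fund*" AND "k-12 education" AND (US OR U.S. OR "United States" OR America)
```

Each inner list holds one key concept, so adding a synonym broadens the search and adding a new list narrows it.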

You can change the dropdown menu if desired. Sometimes it can be a good idea to limit one of your search terms to the title if you find you are getting a lot of results. Click the Search button to perform the search.

At the time of writing, my search returned 6 results. In this case, I would change my second search box to k-12 (take education and the quotation marks out of the second search box). By doing that, my search results went up to 20.

Education database limits

To ensure you are looking at scholarly sources, click on the limits on the left, including Scholarly (Peer Reviewed) Journals, and change the earliest publication date to the date range you need. Scroll down a little to set Language limits and click on the box next to Academic Journals.

set up Alert

Don't forget to set up an alert in EBSCO to have the databases email you any newly added articles that fit your search. Setting up an alert saves you time because you don't have to come back to the database to redo your search.

Click on the Search dropdown that appears to the top right of your results. Click on Email Alert under Create an alert. Follow the steps as prompted.

Don't forget to go back to the Databases list of Education databases and search the others that are not provided by EBSCO (e.g., SAGE Premier).

Google Scholar is a good place to search for articles when:

  • You are having trouble finding articles in the databases or EagleSearch.
  • You have a known citation and want to see where it is available (freely or at ORU). (If you live off-campus, see Setting up Google Scholar to set up your Google Scholar search so that E-resources@ORU links will appear on the results list of items in our databases.)
  • You want to make sure you have searched comprehensively (e.g., for your Master's thesis or Doctoral dissertation).

  • Last Updated: Feb 9, 2024 8:55 AM
  • URL: https://library.oru.edu/educationResearch
  • Open access
  • Published: 14 February 2024

123VCF: an intuitive and efficient tool for filtering VCF files

  • Milad Eidi 1 ,
  • Samaneh Abdolalizadeh 2 ,
  • Soheila Moeini 3 , 4 ,
  • Masoud Garshasbi 1 &
  • Javad Zahiri 5  

BMC Bioinformatics volume 25, Article number: 68 (2024)

The advent of Next-Generation Sequencing (NGS) has catalyzed a paradigm shift in medical genetics, enabling the identification of disease-associated variants. However, the vast quantum of data produced by NGS necessitates a robust and dependable mechanism for filtering irrelevant variants. Annotation-based variant filtering, a pivotal step in this process, demands a profound understanding of the case-specific conditions and the relevant annotation instruments. To tackle this complex task, we sought to design an accessible, efficient and more importantly easy to understand variant filtering tool.

Our efforts culminated in the creation of 123VCF, a tool capable of processing both compressed and uncompressed Variant Calling Format (VCF) files. Built on a Java framework, the tool employs a disk-streaming real-time filtering algorithm, allowing it to manage sizable variant files on conventional desktop computers. 123VCF filters input variants according to a predefined sequence of filters. Users are provided the flexibility to define various filtering parameters, such as quality, coverage depth, and variant frequency within the populations. Additionally, 123VCF accommodates user-defined filters tailored to specific case requirements, affording users enhanced control over the filtering process. We evaluated the performance of 123VCF by analyzing different types of variant files and comparing its runtimes to the most similar algorithms like BCFtools filter and GATK VariantFiltration. The results indicated that 123VCF performs relatively well. The tool's intuitive interface and potential for reproducibility make it a valuable asset for both researchers and clinicians.

The 123VCF filtering tool provides an effective, dependable approach for filtering variants in both research and clinical settings. As an open-source tool available at https://project123vcf.sourceforge.io, it is accessible to the global scientific and clinical community, paving the way for the discovery of disease-causing variants and facilitating the advancement of personalized medicine.

The advent of next-generation sequencing (NGS) technologies has revolutionized the field of genomics, enabling the analysis of large-scale genomic datasets with unprecedented accuracy and resolution. However, the sheer volume of data generated by NGS requires efficient and reliable tools for variant analysis. This analysis typically involves the identification of disease-causing variants by filtering out irrelevant variants using annotation-based filtering, a critical step in the analysis pipeline that requires an understanding of both the case's conditions and available annotations [1, 2].

Several standalone and web-based tools, such as ANNOVAR, wANNOVAR, VEP, and SnpEff, are available to annotate variants [3, 4, 5, 6]. However, variant filtration, the subsequent step in the analysis pipeline, requires specialized, flexible, and user-friendly tools. Graphical User Interface (GUI) based tools, such as VCF.Filter, VCF-Miner, and BrowseVCF, enable users to filter on any desired annotation, while others, like GEMINI, have predefined annotations that restrict the user [7, 8, 9, 10]. Command Line Interface (CLI) based tools, such as GATK VariantFiltration, VCFtools, BCFtools filter, and Exomiser, require advanced bioinformatics and programming skills, limiting their accessibility to a broader user base [11, 12, 13, 14]. A comprehensive comparison is provided in Table 1.

This study aimed to develop 123VCF, a user-friendly and efficient GUI-based filtering tool that enables researchers and clinicians to define filters easily through a text file. 123VCF employs a disk-streaming real-time filtering algorithm, efficiently processing variant files without the need to load them into the computer's memory.

Implementation

Effective variant filtering is a pivotal stage in Next-Generation Sequencing (NGS) data analysis, involving variant annotation and subsequent filtering based on user-defined criteria. However, traditional variant filtering tools often suffer from memory-intensive processes, especially when dealing with extensive datasets, as they load the entire input VCF file into memory before applying filters [13]. To address this challenge, we introduce 123VCF, an innovative tool that employs a memory-efficient algorithm for variant filtering, eliminating the need to load the input VCF file into memory. This breakthrough not only ensures faster processing but also enables seamless handling of large datasets.

123VCF is a freely available, versatile, and cross-platform tool developed using Java Swing, and it is distributed under the MIT license. The tool provides users with a user-friendly graphical interface enabling them to filter VCF files based on annotations within the "INFO" and "FORMAT" fields. Additionally, researchers can easily isolate de novo variants in multi-sample VCF files by specifying genotypes for each sample. To ensure simplicity and independence from third-party codes, all components of 123VCF were entirely developed by the authors, resulting in a straightforward and lightweight tool.
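To illustrate what filtering a multi-sample VCF by per-sample genotypes can look like, here is a minimal Python sketch. It is not taken from 123VCF, which is written in Java; the function names, the sample names, and the de novo pattern used (heterozygous child, homozygous-reference parents) are assumptions made for illustration only.

```python
def genotype(sample_field, format_field):
    """Return the GT value of one sample column, normalized to unphased form."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    gt = dict(zip(keys, values)).get("GT", "./.")
    # Treat phased (0|1) and unphased (0/1) calls alike and sort the alleles,
    # so "1|0" and "0/1" compare equal.
    return "/".join(sorted(gt.replace("|", "/").split("/")))

def matches_genotypes(line, sample_names, wanted):
    """True if every sample named in `wanted` has one of its allowed genotypes."""
    fields = line.rstrip("\n").split("\t")
    format_field, samples = fields[8], fields[9:]
    for name, sample in zip(sample_names, samples):
        if name in wanted and genotype(sample, format_field) not in wanted[name]:
            return False
    return True

# Hypothetical trio; sample order follows the #CHROM header line of the VCF.
sample_names = ["child", "father", "mother"]
de_novo = {"child": {"0/1"}, "father": {"0/0"}, "mother": {"0/0"}}

candidate = "1\t100\t.\tA\tG\t50\tPASS\tDP=35\tGT:DP\t1|0:30\t0|0:28\t0/0:31"
inherited = "1\t200\t.\tC\tT\t50\tPASS\tDP=35\tGT:DP\t0/1:30\t0/1:28\t0/0:31"

print(matches_genotypes(candidate, sample_names, de_novo))  # True
print(matches_genotypes(inherited, sample_names, de_novo))  # False
```

The second record is rejected because the father also carries the alternate allele, so the variant is not new in the child.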

The filtering process begins by checking the filtering order file against the header section of the submitted VCF file. Subsequently, each filter is systematically applied to every variant, using regular-expression rules tailored to string-based and numerical filters. Only those variants that meet all specified criteria, both in terms of string matching and numerical operations, are selected and written to the designated output file(s). The underlying algorithm's core concept is visualized in Fig. 1, providing a clear representation of the methodology employed by 123VCF for efficient variant filtering. With its ease of use and powerful filtering capabilities, 123VCF emerges as a valuable tool for researchers and bioinformaticians in diverse genomic analyses.

Fig. 1 123VCF algorithm's steps
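The streaming idea described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' Java implementation: the filter tuples, operator names, and the annotation keys `DP`, `AF`, and `Func` are hypothetical, and the real tool reads its filters from a filtering order file whose format is described in its user manual. The point of the sketch is that each record is read, tested, and written or discarded before the next one is read, so memory use does not grow with file size.

```python
import gzip
import io
import operator
import re

NUMERIC_OPS = {">": operator.gt, ">=": operator.ge,
               "<": operator.lt, "<=": operator.le, "==": operator.eq}

def open_vcf(path):
    """Open a plain or gzip-compressed VCF as a text stream."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")

def parse_info(info_field):
    """Turn 'DP=35;AF=0.01;DB' into {'DP': '35', 'AF': '0.01', 'DB': ''}."""
    info = {}
    for item in info_field.split(";"):
        if item and item != ".":
            key, _, value = item.partition("=")
            info[key] = value
    return info

def passes(info, filters):
    """A variant passes only if it satisfies every filter in the list."""
    for key, op, wanted in filters:
        if key not in info:
            return False
        raw = info[key]
        if op == "matches":                      # string filter via regex
            if not re.search(wanted, raw):
                return False
        else:                                    # numeric comparison
            try:
                if not NUMERIC_OPS[op](float(raw), wanted):
                    return False
            except ValueError:                   # e.g. flags or multi-value fields
                return False
    return True

def stream_filter(lines, filters):
    """Yield header lines and passing records one at a time."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 8 and passes(parse_info(fields[7]), filters):
            yield line

demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t50\tPASS\tDP=35;AF=0.001;Func=exonic\n"
    "1\t200\t.\tC\tT\t50\tPASS\tDP=8;AF=0.001;Func=exonic\n"
    "1\t300\t.\tG\tA\t50\tPASS\tDP=40;AF=0.2;Func=intronic\n"
)
filters = [("DP", ">=", 20), ("AF", "<", 0.01), ("Func", "matches", "exonic")]
kept = [l for l in stream_filter(demo, filters) if not l.startswith("#")]
print(len(kept))  # 1
```

For a file on disk, `stream_filter(open_vcf(path), filters)` would be iterated the same way, writing each yielded line to the output as it arrives.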

123VCF offers users the flexibility to include or exclude heterozygous and homozygous variants from the sample, allowing for precise and customized filtering. The tool can generate a Tab-Separated Values (TSV) file containing all passed variants, which can be easily imported into spreadsheet-based programs for further analysis. Additionally, 123VCF can generate another TSV file specifically for variants that overlap with a user-provided BED file, allowing researchers and clinicians to identify possible compound heterozygous variants. These TSV files provide a convenient and customizable way to prioritize and analyze variants of interest. The efficiency of 123VCF was evaluated using a set of variant files and compared to the most similar algorithms, demonstrating its ability to handle large datasets without compromising performance. The tool's disk-streaming real-time filtering algorithm was found to be efficient, providing accurate filtering results in a short amount of time.
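As a rough illustration of the BED overlap step and the TSV output, the following Python sketch keeps only variants that fall inside user-supplied regions and writes them as tab-separated rows. It is not the tool's own code; the function names and the chosen columns are hypothetical. One detail worth getting right is the coordinate convention: BED intervals are 0-based and half-open, while VCF positions are 1-based.

```python
import csv
import io
from collections import defaultdict

def load_bed(lines):
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return regions

def in_bed(regions, chrom, pos):
    """True if a 1-based VCF position lies inside any BED interval."""
    # A 1-based position p corresponds to the 0-based base p - 1,
    # so it is inside [start, end) when start < p <= end.
    return any(start < pos <= end for start, end in regions.get(chrom, ()))

bed = load_bed(io.StringIO("1\t99\t150\tGENE_A\n"))
variants = [("1", 99, "A", "G"), ("1", 100, "C", "T"),
            ("1", 150, "G", "A"), ("1", 151, "T", "C")]

out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["CHROM", "POS", "REF", "ALT"])
for chrom, pos, ref, alt in variants:
    if in_bed(bed, chrom, pos):
        writer.writerow([chrom, pos, ref, alt])

print(out.getvalue())
```

Only the variants at positions 100 and 150 are written, because the interval 99 to 150 in BED coordinates covers 1-based positions 100 through 150.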

123VCF provides a robust functionality that allows users to define and apply custom filtration orders using plain text files, as outlined in the user manual. This feature offers a high level of convenience, enabling users to utilize their laboratory-specific filters repeatedly without limitations. By incorporating this feature, users can streamline their workflow and enhance reproducibility, ultimately improving the efficiency and accuracy of their analysis. Furthermore, to facilitate the use of this feature, we have provided several filtering order files along with the tool, providing users with a starting point for customizing their own filtering orders.

In order to demonstrate the efficacy of 123VCF, a thorough benchmark analysis was conducted using a diverse collection of VCF files from prominent projects [10, 15, 16, 17]. To ensure consistency in annotations, ANNOVAR with identical databases was employed for all six VCF files [5]. The benchmark comprised VCF files with varying numbers of variants and samples, and the condensed results are presented in Table 2, providing information on variant and sample counts, annotated VCF file sizes, applied filters, and run time of 123VCF, BCFtools filter and GATK VariantFiltration in seconds.

Table 2 clearly shows that 123VCF is an expeditious and effective filtering tool capable of processing large VCF files within seconds. The algorithm of 123VCF demonstrated precision in filtering variants in large VCF files while maintaining optimal performance, providing a significant tool for variant analysis to researchers and clinicians. It is crucial to highlight that 123VCF adopts a distinct filtering strategy compared to other available tools, making direct comparisons challenging. Nevertheless, our rigorous benchmark analysis demonstrates that 123VCF is an exceptionally efficient tool, particularly when multiple impactful filters are employed. In this benchmark, we chose to compare 123VCF with the most similar algorithms, BCFtools filter and GATK VariantFiltration tools. The runtimes of the similar tools are included in the rightmost columns of Table 2. It is important to highlight that we utilized identical uncompressed non-indexed VCF files for this benchmark.

A notable factor affecting 123VCF's performance is the I/O speed of the storage drive. Using Solid-State Drives (SSDs) can significantly enhance its efficiency. To optimize runtimes, we introduced an option to remove filtered-out variants from the output files, as organizing variants in the output files was identified as the most time-intensive operation in our algorithm. Additionally, 123VCF's ability to handle varying file sizes with little impact on performance makes it an invaluable resource for researchers dealing with different scales of data in NGS data analysis.

In conclusion, the development of 123VCF has yielded a highly efficient VCF file filtering tool with notable advantages over existing filtering tools. The tool's versatility in allowing users to define filters based on any desired annotation, and its filtering algorithm contribute to its efficacy in genetic analysis.

Another significant advantage of 123VCF is its standalone architecture, which allows users to run the tool on a local computer without requiring an internet connection. This ensures the privacy of submitted information, making it a highly secure tool for genetic analysis.

In addition, we added a command line interface to 123VCF to make it even more user-friendly and reproducible. This will allow users to easily automate their analyses and integrate 123VCF into their existing workflows. We believe that this new feature will further increase the accessibility of 123VCF and streamline the analysis process. Our team is dedicated to providing the best possible user experience, and we are excited to continue innovating and improving the tool in the future.

Availability and requirements

Project name: 123VCF.

Project home page: https://project123vcf.sourceforge.io .

Operating system(s): Platform independent.

Programming language: Java.

Other requirements: Java 1.8.

License: MIT.

Any restrictions to use by non-academics: None.

Availability of data and materials

The compressed annotated VCF files utilized in our benchmark analysis are accessible through the project's SourceForge page: https://sourceforge.net/projects/project123vcf/files/Benchmark_Data/.

1. Schutz S, Monod-Broca C, Bourneuf L, Marijon P, Montier T. Cutevariant: a standalone GUI-based desktop application to explore genetic variations from an annotated VCF file. Bioinform Adv. 2022;2(1):vbab028.

2. Eidi M, Garshasbi M. A novel ISCA2 variant responsible for an early-onset neurodegenerative mitochondrial disorder: a case report of multiple mitochondrial dysfunctions syndrome 4. BMC Neurol. 2019;19(1):1–7.

3. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The ensembl variant effect predictor. Genome Biol. 2016;17(1):1–4.

4. Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat Protoc. 2015;10(10):1556.

5. Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.

6. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012;6(2):80–92.

7. Salatino S, Ramraj V. BrowseVCF: a web-based application and workflow to quickly prioritize disease-causative variants in VCF files. Brief Bioinform. 2017;18(5):774.

8. Müller H, Jimenez-Heredia R, Krolo A, Hirschmugl T, Dmytrus J, Boztug K, Bock C. VCF.Filter: interactive prioritization of disease-linked genetic variants from sequencing data. Nucleic Acids Res. 2017;45(W1):W567–72.

9. Paila U, Chapman BA, Kirchner R, Quinlan AR. GEMINI: integrative exploration of genetic variation and genome annotations. PLoS Comput Biol. 2013;9(7):e1003153.

10. Hart SN, Duffy P, Quest DJ, Hossain A, Meiners MA, Kocher JP. VCF-Miner: GUI-based application for mining variants and annotations stored in VCF files. Brief Bioinform. 2016;17(2):346–51.

11. Smedley D, Jacobsen JOB, Jäger M, Köhler S, Holtgrewe M, Schubach M, et al. Next-generation diagnostics and disease-gene discovery with the exomiser. Nat Protoc. 2015;10(12):2004.

12. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156.

13. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987.

14. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303.

15. Corpas M, Valdivia-Granda W, Torres N, Greshake B, Coletta A, Knaus A, et al. Crowdsourced direct-to-consumer genomic analysis of a family quartet. BMC Genomics. 2015;16(1):1–16. https://doi.org/10.1186/s12864-015-1973-7.

16. Zook JM, Catoe D, McDaniel J, Vang L, Spies N, Sidow A, et al. Extensive sequencing of seven human genomes to characterize benchmark reference materials. Sci Data. 2016;3(1):1–26.

17. Auton A, Abecasis GR, Altshuler DM, Durbin RM, Bentley DR, Chakravarti A, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74.

Acknowledgements

Not applicable.

Funding

The authors declare no funding or financial support for this research.

Author information

Authors and affiliations.

Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran

Milad Eidi & Masoud Garshasbi

Department of Genetics and Molecular Medicine, School of Medicine, Zanjan University of Medical Sciences (ZUMS), Zanjan, Iran

Samaneh Abdolalizadeh

Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montreal, QC, Canada

Soheila Moeini

Research Centre, Montreal Heart Institute, Montreal, QC, Canada

Department of Neuroscience, University of California San Diego, San Diego, CA, USA

Javad Zahiri


Contributions

ME conceived the idea, developed the software, designed the benchmark, wrote the manuscript, and coordinated the team. SA contributed to the coding and execution of the benchmark. SM wrote the manuscript. MG provided clinical supervision, and JZ provided computational supervision. MG and JZ provided feedback on the implementation, benchmark, and manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Masoud Garshasbi or Javad Zahiri.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no financial or personal relationships with individuals or organizations that could inappropriately influence or bias the content of this paper. However, it should be noted that some of the authors are developers of the software being presented in this paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Eidi, M., Abdolalizadeh, S., Moeini, S. et al. 123VCF: an intuitive and efficient tool for filtering VCF files. BMC Bioinformatics 25, 68 (2024). https://doi.org/10.1186/s12859-024-05661-5


Received: 29 May 2023

Accepted: 17 January 2024

Published: 14 February 2024

DOI: https://doi.org/10.1186/s12859-024-05661-5


  • Next generation sequencing
  • VCF filtering
  • Variant analysis
  • Variant filtering
  • Exome sequencing
  • Genome sequencing

BMC Bioinformatics

ISSN: 1471-2105


Using Article Databases

Article databases allow you to access scholarly journal articles and often other resources like books, reference materials, and popular magazine and newspaper articles. Database search results are more focused and precise than those you find while web searching. Databases are also more trustworthy than the web. All material in databases is evaluated for accuracy and credibility by subject area specialists.

InfoKat Discovery - find books, research articles, and more.

Not restricted to UK or on-campus users

Top Research Databases

UK or on-campus users only

Other Useful Links

As the official digital dissertations archive for the Library of Congress, ProQuest Dissertations and Theses includes millions of searchable citations to dissertations and theses from 1861 to the present day with over a million full-text dissertations. The database offers full text for most dissertations added since 1997 and retrospective full-text coverage for older graduate works.

  • Business Resources A guide on how to do research in the field of business.
  • Marketing Resources Library guide to finding resources related to marketing.
  • Google Scholar
  • Last Updated: Feb 15, 2024 11:34 AM
  • URL: https://libguides.uky.edu/644KHP

IMAGES

  1. Finding scholarly articles in Google Scholar

  2. How to use and find Research Papers on Google Scholar? 10 Tips for

  3. Google Scholar

  4. Where To Find Journal Articles For PhD Research: A Beginner's Guide

  5. Google Scholar Article Search

  6. How to use Google scholar: the ultimate guide

VIDEO

  1. How To Add Research Articles to the Library Using Google Scholar

  2. Avoid Google, start your research on Google Scholar

  3. Google Scholar research paper search

  4. TIPS TO FIND ARTICLES SOURCES FOR RESEARCH SCIENTIFIC

  5. How to search articles from Google Scholar

  6. How to Google Scholar plus pros and cons!

COMMENTS

  1. Google Scholar

    Stand on the shoulders of giants. Google Scholar provides a simple way to broadly search for scholarly literature. Search across a wide variety of disciplines and sources: articles, theses,...

  2. Google Scholar

    Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes peer-reviewed online academic journals and books, conference papers, theses and dissertations, preprints, abstracts, technical reports, and other ...

  3. How to use Google Scholar: the ultimate guide

    Learn how to use Google Scholar, a free academic search engine that can help you find scholarly sources of research papers. Find out how to search effectively, customize your preferences, access full text, export citations, and more.

  4. About Google Scholar

    Stand on the shoulders of giants. Google Scholar provides a simple way to broadly search for scholarly literature. From one place, you can search across many disciplines and sources: articles ...

  5. Google Scholar

    Google Scholar searches for scholarly literature in a simple, familiar way. You can search across many disciplines and sources at once to find articles, books, theses, court opinions, and content from academic publishers, professional societies, some academic web sites, and more.

  6. 18 Google Scholar tips all students should know

    1. Copy article citations in the style of your choice. With a simple click of the cite button (which sits below an article entry), Google Scholar will give you a ready-to-use citation for the article in five styles, including APA, MLA and Chicago. You can select and copy the one you prefer. 2. Dig deeper with related searches.

  7. Google Scholar

    Learn how to use Google Scholar with your HarvardKey to access full text of articles from Harvard Library subscriptions. Find tips and tricks for searching, connecting, and exploring Google Scholar with your HarvardKey.

  8. Research Guides: Find Journal Articles: Google Scholar

    Google Scholar is a web search engine that finds scholarly literature, including papers, theses, books, and reports. By searching Google Scholar from the library's webpage, you will have free linked access to the library's subscription holdings. Other links from Google Scholar may prompt you to pay for articles, but DO NOT PAY for articles.

  9. The Use of Google Scholar for Research and Research Dissemination

    In Google Scholar, when a research topic is searched, a list of publications is created. The default setting lists the most relevant publications first and can be changed to list the most recent publications first. For each article, a "Cited by" link is provided. Clicking on that link takes the reader to an article's "cited by" list.

  10. Google Scholar reveals its most influential papers for 2021

    The 2021 Google Scholar Metrics ranking tracks papers published between 2016 and 2020, and includes citations from all articles that were indexed in Google Scholar as of July 2020....

  11. Academic Guides: Full-Text Articles: Articles at Google Scholar

    Follow these steps to manually link Google Scholar to the Walden Library collection: Go to Google Scholar (scholar.google.com). On the upper left side of your screen, click on the three lines icon. Click the Settings link or gear icon. Depending on your screen size, the link or icon may be at the top or the bottom of that section.

  12. Research at Google

    Research at Google Publications Google publishes hundreds of research papers each year. Publishing is important to us; it enables us to collaborate and share ideas with, as well as learn...

  13. Google Scholar pioneer on search engine's future

    Google Scholar, the free search engine for scholarly literature, turns ten years old on 18 November. By 'crawling' over the text of millions of academic papers, including those behind publishers ...

  14. Research Basics: Find Articles Using Google Scholar

    It searches across many disciplines and covers a wide variety of resources, including journal articles, theses, books, abstracts, and more. Although Google Scholar is aimed at the academic community, it uses a very broad definition of "scholarly literature." It is important to realize that not everything in Google Scholar is peer reviewed.

  15. The Role of Google Scholar in Evidence Reviews and Its Applicability to

    Google Scholar (GS), a commonly used web-based academic search engine, catalogues between 2 and 100 million records of both academic and grey literature (articles not formally published by commercial academic publishers). Google Scholar collates results from across the internet and is free to use.

  16. Gender differences in google scholar representation and ...

    Improving gender equality in top-tier scholars and addressing gender bias in research impact are among the significant challenges in academia. However, extant research has observed that lingering gender differences still undermine female scholars. This study examines the recognition of female scholars through Google Scholar data in four different subfields of communication, focusing on two ...

  17. Publications

    1999: 181; 1998: 148; Algorithms and Theory: 1313; Data Management: 166; Data Mining and Modeling: 353; Distributed Systems and Parallel Computing: 340; Economics and Electronic Commerce: 339; Education Innovation: 68; General Science: 326; Hardware and Architecture: 145; Health & Bioscience: 349; Human-Computer Interaction and Visualization: 802

  18. Search for Articles with Google Scholar

    Learn how to access the Catholic University library catalog and article databases through Google Scholar on-campus or off-campus. Find out how to search by keyword, author and title, and how to view the full text of the articles in ViewIt@CatholicU.

  19. How to Find Free Articles on Google Scholar

    Learn how to use Google Scholar to find free and credible research articles for your projects or studies. Follow the simple steps to search by keyword, check for PDF links, and filter by date or relevance.

  20. Research Guides: Finding Scholarly Articles: Home

    Review articles are another great way to find scholarly primary research articles. Review articles are not considered "primary research", but they pull together primary research articles on a topic, summarize and analyze them. In Google Scholar, click on Review Articles at the left of the search results screen. Ask your professor whether review ...

  21. The Numbers Don't Speak for Themselves: Racial Disparities and the

    Research article. First published online May 3, 2018. The Numbers Don't Speak for Themselves: Racial Disparities and the Persistence of Inequality in the Criminal Justice System ... Google Scholar. Adams G., Tormala T. T., O'Brien L. T. (2006). The effect of self-affirmation on perception of racism. Journal of Experimental Social Psychology ...

  22. Home

    Log on to scholar.google.com and click the "My Profile" link at the top of the page to get your account setup started. On the first screen, add your affiliation information and university email address so Google Scholar can confirm your account. Add keywords that are relevant to your research interests so others can find you when browsing a subject area.

  23. Google Scholar Metrics Help

    Google Scholar Metrics provide an easy way for authors to quickly gauge the visibility and influence of recent articles in scholarly publications. Scholar Metrics summarize recent citations to many publications, to help authors as they consider where to publish their new research. To get started, you can browse the top 100 publications in ...

  24. Articles

    You are having trouble finding articles in the databases or EagleSearch. You have a known citation and want to see where it is available (freely or at ORU). (If you live off-campus, see Setting up Google Scholar to set up your Google Scholar search so that E-resources@ORU links will appear on the results list of items in our databases.)

  25. 123VCF: an intuitive and efficient tool for filtering VCF files

    McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The ensembl variant effect predictor. Genome Biol. 2016;17(1):1-4. Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR.

  26. Research Guides: KHP 644: Find Books & Articles

    Article databases allow you to access scholarly journal articles and often other resources like books, reference materials, and popular magazine and newspaper articles. Database search results are more focused and precise than those you find while web searching. Databases are also more trustworthy than the web.

  27. Finding Articles

    For a known article title, Google Scholar can be an easy way to find out if the CSUSM Library has access to the full text. This search box is connected to the library's proxy server, meaning that the results in a Google Scholar search will include links to full-text articles in the CSUSM collection.

  28. Corporate Social Responsibility Research: An Ongoing and Worthwhile

    We "tell the story" of corporate social responsibility (CSR) research by presenting a curated Collection of 19 articles published from 1973 through 2022 in all Academy of Management journals: Academy of Management Annals, Academy of Management Discoveries, Academy of Management Journal, Academy of Management Learning and Education, Academy of Management Perspectives, and Academy of ...