REVIEW article

Challenges and future directions of big data and artificial intelligence in education.

Hui Luan

  • 1 Institute for Research Excellence in Learning Sciences, National Taiwan Normal University, Taipei, Taiwan
  • 2 National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan
  • 3 School of Dentistry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
  • 4 Graduate School of Education, Rutgers – The State University of New Jersey, New Brunswick, NJ, United States
  • 5 Apprendis, LLC, Berlin, MA, United States
  • 6 Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science, National Central University, Taoyuan City, Taiwan
  • 7 Graduate School of Informatics, Kyoto University, Kyoto, Japan
  • 8 Department of Electrical Engineering, College of Technology and Engineering, National Taiwan Normal University, Taipei, Taiwan
  • 9 Centro de Tecnologia, Universidade Federal de Santa Maria, Santa Maria, Brazil
  • 10 Department of Chinese and Bilingual Studies, Faculty of Humanities, The Hong Kong Polytechnic University, Kowloon, Hong Kong
  • 11 Program of Learning Sciences, National Taiwan Normal University, Taipei, Taiwan

We discuss the new challenges and directions facing the use of big data and artificial intelligence (AI) in education research, policy-making, and industry. In recent years, applications of big data and AI in education have made significant headway. This highlights a novel trend in leading-edge educational research. The convenience and embeddedness of data collection within educational technologies, paired with computational techniques, have made the analysis of big data a reality. We are moving beyond proof-of-concept demonstrations and applications of techniques, and are beginning to see substantial adoption in many areas of education. The key research trends in the domains of big data and AI are associated with assessment, individualized learning, and precision education. Model-driven data analytics approaches will grow quickly to guide the development, interpretation, and validation of algorithms. However, conclusions from educational analytics should, of course, be applied with caution. At the education policy level, governments should be devoted to supporting lifelong learning, offering teacher education programs, and protecting personal data. With regard to the education industry, reciprocal and mutually beneficial relationships should be developed in order to enhance academia-industry collaboration. Furthermore, it is important to ensure that technologies are guided by relevant theoretical frameworks and are empirically tested. Lastly, in this paper we advocate an in-depth dialog between supporters of “cold” technology and “warm” humanity, so as to foster greater understanding among teachers and students of how technology, and specifically the big data explosion and AI revolution, can bring new opportunities (and challenges) that can best be leveraged for pedagogical practices and learning.

Introduction

The purpose of this position paper is to present the current status, opportunities, and challenges of big data and AI in education. The work originated from the opinions and panel discussion minutes of an international conference on big data and AI in education ( The International Learning Sciences Forum, 2019 ), where prominent researchers and experts from disciplines such as education, psychology, data science, AI, and cognitive neuroscience exchanged their knowledge and ideas. This article is organized as follows: we start with an overview of recent progress of big data and AI in education. We then present the major challenges and emerging trends. Finally, based on our discussions of big data and AI in education, we draw conclusions and suggest future directions.

Rapid advancements in big data and artificial intelligence (AI) technologies have had a profound impact on all areas of human society, including the economy, politics, science, and education. Thanks in large part to these developments, we are able to continue many of our social activities under the COVID-19 pandemic. Digital tools, platforms, applications, and the communications among people have generated vast amounts of data (‘big data’) across disparate locations. Big data technologies aim at harnessing the power of extensive data, in real-time or otherwise ( Daniel, 2019 ). The characteristic attributes of big data are often referred to as the four V’s: volume (the amount of data), variety (the diversity of sources and types of data), velocity (the speed of data generation and transmission), and veracity (the accuracy and trustworthiness of data) ( Laney, 2001 ; Schroeck et al., 2012 ; Geczy, 2014 ). Recently, a fifth V was added, namely value (i.e., that data can be monetized; Dijcks, 2013 ). Because of these intrinsic characteristics (the five Vs), large and complex datasets are impossible to process and utilize with traditional data management techniques. Hence, novel and innovative computational technologies are required for the acquisition, storage, distribution, analysis, and management of big data ( Lazer et al., 2014 ; Geczy, 2015 ). Big data analytics commonly encompasses the processes of gathering, analyzing, and evaluating large datasets. Extraction of actionable knowledge and viable patterns from data is often viewed as the core benefit of the big data revolution ( Mayer-Schönberger and Cukier, 2013 ; Jagadish et al., 2014 ). Big data analytics employs a variety of technologies and tools, such as statistical analysis, data mining, data visualization, text analytics, social network analysis, signal processing, and machine learning ( Chen and Zhang, 2014 ).

As a subset of AI, machine learning focuses on building computer systems that can learn from and adapt to data automatically, without explicit programming ( Jordan and Mitchell, 2015 ). Machine learning algorithms can provide new insights, predictions, and solutions tailored to the needs and circumstances of each individual. With the availability of large quantities of high-quality input training data, machine learning processes can achieve accurate results and facilitate informed decision making ( Manyika et al., 2011 ; Gobert et al., 2012 , 2013 ; Gobert and Sao Pedro, 2017 ). These data-intensive machine learning methods are positioned at the intersection of big data and AI, and are capable of improving the services and productivity of education, as well as many other fields including commerce, science, and government.
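To make the idea of "learning from data without explicit programming" concrete, the following is a deliberately minimal, hypothetical sketch (the data and the threshold-learning rule are ours, not drawn from the cited studies): instead of hand-coding a pass/fail rule, the program infers a study-hours cutoff from labeled examples.

```python
# Minimal illustration of learning from data rather than explicit programming:
# find the study-hours cutoff that best separates pass from fail in a
# (hypothetical) training set, then use it to predict a new case.

def fit_threshold(hours, passed):
    """Pick the cutoff on hours that maximizes training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(hours)):
        preds = [h >= cut for h in hours]
        acc = sum(p == y for p, y in zip(preds, passed)) / len(passed)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut

# Hypothetical training data: weekly study hours and pass/fail outcomes.
hours = [1, 2, 3, 5, 6, 8, 9, 10]
passed = [False, False, False, True, True, True, True, True]

cut = fit_threshold(hours, passed)
print(cut)        # the learned rule: predict "pass" if hours >= cut
print(7 >= cut)   # prediction for a new student who studies 7 hours
```

The rule here is trivially simple, but the workflow (labeled data in, decision rule out) is the same one that underlies the far richer models discussed in this paper.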

Regarding education, our main area of interest here, the application of AI technologies can be traced back to approximately 50 years ago. The first Intelligent Tutoring System “SCHOLAR” was designed to support geography learning, and was capable of generating interactive responses to student statements ( Carbonell, 1970 ). While the amount of data was relatively small at that time, it was comparable to the amount of data collected in other traditional educational and psychological studies. Research on AI in education over the past few decades has been dedicated to advancing intelligent computing technologies such as intelligent tutoring systems ( Graesser et al., 2005 ; Gobert et al., 2013 ; Nye, 2015 ), robotic systems ( Toh et al., 2016 ; Anwar et al., 2019 ), and chatbots ( Smutny and Schreiberova, 2020 ). With the breakthroughs in information technologies in the last decade, educational psychologists have had greater access to big data. Concretely speaking, social media (e.g., Facebook, Twitter), online learning environments [e.g., Massive Open Online Courses (MOOCs)], intelligent tutoring systems (e.g., AutoTutor), learning management systems (LMSs), sensors, and mobile devices are generating ever-growing amounts of dynamic and complex data containing students’ personal records, physiological data, learning logs and activities, as well as their learning performance and outcomes ( Daniel, 2015 ). Learning analytics, described as “the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs” ( Long and Siemens, 2011 , p. 34), are often implemented to analyze these huge amounts of data ( Aldowah et al., 2019 ). Machine learning and AI techniques further expand the capabilities of learning analytics ( Zawacki-Richter et al., 2019 ). 
The essential information extracted from big data could be utilized to optimize learning, teaching, and administration ( Daniel, 2015 ). Hence, research on big data and AI is gaining increasing significance in education ( Johnson et al., 2011 ; Becker et al., 2017 ; Hwang et al., 2018 ) and psychology ( Harlow and Oswald, 2016 ; Yarkoni and Westfall, 2017 ; Adjerid and Kelley, 2018 ; Cheung and Jak, 2018 ). Recently, the adoption of big data and AI in the psychology of learning and teaching has been trending as a novel method in cutting-edge educational research ( Daniel, 2015 ; Starcic, 2019 ).

The Position Formulation

A growing body of literature has attempted to uncover the value of big data at different education levels, from preschool to higher education ( Chen N.-S. et al., 2020 ). Several journal articles and book chapters have presented retrospective descriptions and the latest advances in the rapidly expanding research area from different angles, including systematic literature review ( Zawacki-Richter et al., 2019 ; Quadir et al., 2020 ), bibliometric study ( Hinojo-Lucena et al., 2019 ), qualitative analysis ( Malik et al., 2019 ; Chen L. et al., 2020 ), and social network analysis ( Goksel and Bozkurt, 2019 ). More details can be found in the previously mentioned reviews. In this paper, we aim at presenting the current progress of the application of big data and AI in education. By and large, the research on the learner side is devoted to identifying students’ learning and affective behavior patterns and profiles, improving methods of assessment and evaluation, predicting individual students’ learning performance or dropouts, and providing adaptive systems for personalized support ( Papamitsiou and Economides, 2014 ; Zawacki-Richter et al., 2019 ). On the teacher side, numerous studies have attempted to enhance course planning and curriculum development, evaluation of teaching, and teaching support ( Zawacki-Richter et al., 2019 ; Quadir et al., 2020 ). Additionally, teacher dashboards, such as Inq-Blotter, driven by big data techniques are being used to inform teachers’ instruction in real time while students simultaneously work in Inq-ITS ( Gobert and Sao Pedro, 2017 ; Mislevy et al., 2020 ). Big data technologies employing learning analytics and machine learning have demonstrated high predictive accuracy of students’ academic performance ( Huang et al., 2020 ). Only a small number of studies have focused on the effectiveness of learning analytics programs and AI applications. 
However, recent findings have revealed encouraging results in terms of improving students’ academic performance and retention, as well as supporting teachers in learning design and teaching strategy refinement ( Viberg et al., 2018 ; Li et al., 2019 ; Sonderlund et al., 2019 ; Mislevy et al., 2020 ).
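As a toy illustration of such predictive modeling (a hypothetical sketch with invented data and features, not the actual models used in the studies cited above), the following pure-Python logistic regression, trained by stochastic gradient descent, flags an at-risk student from simple engagement features of the kind an LMS might log:

```python
import math

# Hypothetical sketch of performance prediction from learning-log data.
# Features per student: (logins per week, fraction of assignments completed).

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return the predicted probability of passing."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Invented training data: [logins/week, assignments done], label 1 = passed.
X = [[1, 0.2], [2, 0.1], [1, 0.4], [6, 0.9], [5, 0.8], [7, 1.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logreg(X, y)
at_risk = predict(w, b, [1, 0.3]) < 0.5  # flag a low-engagement student
print(at_risk)
```

Production learning-analytics systems use richer features and models, but the core loop is the same: historical records train a model whose predictions can trigger timely intervention.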

Despite the growing number of reports and methods outlining implementations of big data and AI technologies in educational environments, we see a notable gap between contemporary technological capabilities and their utilization for education. The fast-growing education industry has developed numerous data processing techniques and AI applications, which may not be guided by current theoretical frameworks and research findings from psychology of learning and teaching. The rapid pace of technological progress and relatively slow educational adoption have contributed to the widening gap between technology readiness and its application in education ( Macfadyen, 2017 ). There is a pressing need to reduce this gap and stimulate technological adoption in education. This work presents varying viewpoints and their controversial issues, contemporary research, and prospective future developments in adoption of big data and AI in education. We advocate an interdisciplinary approach that encompasses educational, technological, and governmental spheres of influence. In the educational domain, there is a relative lack of knowledge and skills in AI and big data applications. On the technological side, few data scientists and AI developers are familiar with the advancements in education psychology, though this is changing with the advent of graduate programs at the intersection of Learning Sciences and Computer Science. Finally, in terms of government policies, the main challenges faced are the regulatory and ethical dilemmas between support of educational reforms and restrictions on adoptions of data-oriented technologies.

An Interdisciplinary Approach to Educational Adoption of Big Data and AI

In response to the new opportunities and challenges that the big data explosion and AI revolution are bringing, academics, educators, policy-makers, and professionals need to engage in productive collaboration. They must work together to cultivate in our learners the competencies and essential skills important for 21st-century work, driven by the knowledge economy ( Bereiter, 2002 ). Collaboration across diverse disciplines and sectors is a demanding task, particularly when the individual sides lack a clear vision of their mutually beneficial interests and the knowledge and skills necessary to realize that vision. We highlight several overlapping spheres of interest at the intersection of research, policy-making, and industry engagements. Researchers and industry would benefit from targeted educational technology development and its efficient transfer to commercial products. Businesses and governments would benefit from legislation that stimulates technology markets while suitably protecting data and users’ privacy. Academics and policy-makers would benefit from prioritizing educational reforms that enable greater adoption of technology-enhanced curricula. The recent developments and evolving future trends at the intersections between researchers, policy-makers, and industry stakeholders, arising from advancements and deployments of big data and AI technologies in education, are illustrated in Figure 1 .


Figure 1. Contemporary developments and future trends at the intersections between research, policy, and industry driven by big data and AI advances in education.

The constructive domains among stakeholders progressively evolve along with scientific and technological developments. Therefore, it is important to reflect on longer-term projections and challenges. The following sections highlight the novel challenges and future directions of big data and AI technologies at the intersection of education research, policy-making, and industry.

Big Data and AI in Education: Research

An understanding of individual differences is critical for developing pedagogical tools to target specific students and to tailor education to individual needs at different stages. Intelligent educational systems employing big data and AI techniques are capable of collecting accurate and rich personal data. Data analytics can reveal students’ learning patterns and identify their specific needs ( Gobert and Sao Pedro, 2017 ; Mislevy et al., 2020 ). Hence, big data and AI have the potential to realize individualized learning to achieve precision education ( Lu et al., 2018 ). We see the following emerging trends, research gaps, and controversies in integrating big data and AI into education research so that there is a deep and rigorous understanding of individual differences that can be used to personalize learning in real time and at scale.

(1) Education is progressively moving from a one-size-fits-all approach to precision education or personalized learning ( Lu et al., 2018 ; Tsai et al., 2020 ). The one-size-fits-all approach was designed for average students, whereas precision education takes into consideration the individual differences of learners in their learning environments, along with their learning strategies. The main idea of precision education is analogous to “precision medicine,” where researchers harvest big data to identify patterns relevant to specific patients such that prevention and treatment can be customized. Based on the analysis of student learning profiles and patterns, precision education predicts students’ performance and provides timely interventions to optimize learning. The goal of precision education is to improve the diagnosis, prediction, treatment, and prevention of learning outcomes ( Lu et al., 2018 ). Contemporary research gaps related to adaptive tools and personalized educational experiences are impeding the transition to precision education. Adaptive educational tools and flexible learning systems are needed to accommodate individual learners’ interaction, pace, and learning progress, and to fit the specific needs of the individual learners, such as students with learning disabilities ( Xie et al., 2019 ; Zawacki-Richter et al., 2019 ). Hence, as personalized learning is customized for different people, researchers are able to focus on individualized learning that is adaptive to individual needs in real time ( Gobert and Sao Pedro, 2017 ; Lu et al., 2018 ).

(2) The research focus on deploying AI in education is gradually shifting from a computational focus that demonstrates use cases of new technology to a cognitive focus that incorporates cognition in its design, such as perception ( VanRullen, 2017 ), emotion ( Song et al., 2016 ), and cognitive thinking ( Bramley et al., 2017 ). Moreover, it is also shifting from a single domain (e.g., domain expertise, or expert systems) to a cross-disciplinary approach through collaboration ( Spikol et al., 2018 ; Krouska et al., 2019 ) and domain transfers ( L’heureux et al., 2017 ). These controversial shifts are facilitating transitions from the known unknown (gaining insights through reasoning) to the unknown unknown (uncovering hidden values and unexpected results through algorithms) ( Abed Ibrahim and Fekete, 2019 ; Cutumisu and Guo, 2019 ). In other words, deterministic learning, aimed at deductive/inductive reasoning and inference engines, predominated in traditional expert systems and old AI, whereas today dynamic and stochastic learning, whose outcomes involve some randomness and uncertainty, is gradually becoming the trend in modern machine learning techniques.

(3) The format of machine-generated data and the purpose of machine learning algorithms should be carefully designed. There is a notable gap between theoretical design and its applicability. A theoretical model is needed to guide the development, interpretation, and validation of algorithms ( Gobert et al., 2013 ; Hew et al., 2019 ). The outcomes of data analytics and algorithmically generated evidence must be shared with educators and applied with caution. For instance, efforts to algorithmically detect mental states such as boredom, frustration, and confusion ( Baker et al., 2010 ) must be supported by operational definitions and constructs that have been prudently evaluated. Additionally, the affective data collected by AI systems should take into account cultural differences, combined with contextual factors, teachers’ observations, and students’ opinions ( Yadegaridehkordi et al., 2019 ). Data need to be informatively and qualitatively balanced in order to avoid implicit biases that may propagate into algorithms trained on such data ( Staats, 2016 ).

(4) There are ethical and algorithmic challenges in balancing human-provided learning and machine-assisted learning. The significant influence of AI and contemporary technologies is a double-edged sword ( Khechine and Lakhal, 2018 ). On the one hand, it facilitates better usability and drives progress. On the other, it might lead to algorithmic bias and the loss of certain essential skills among students who rely extensively on technology. For instance, in creativity- or experience-based learning, technology may even become an obstacle to learning, since it may hinder students from attaining first-hand experiences and participating in the learning activities ( Cuthbertson et al., 2004 ). Appropriately balancing technology adoption and human involvement in various educational contexts will be a challenge for the foreseeable future. Nonetheless, the convergence of human and machine learning has the potential for highly effective teaching and learning beyond the simple “sum of the parts of human and artificial intelligence” ( Topol, 2019 ).

(5) Algorithmic bias is another controversial issue ( Obermeyer et al., 2019 ). Since modern AI algorithms extensively rely on data, their performance is governed largely by the data they are trained on. Algorithms adapt to the inherent qualitative and quantitative characteristics of their training data. For example, if a dataset is unbalanced and contains disproportionately better information on students from the general population than on minorities, the resulting algorithms may produce systematic and repeatable errors that disadvantage minority students. These issues need to be addressed before wide implementation in educational practice, since every single student is precious. More rigorous studies and validation in real learning environments are required, though work along these lines is being done ( Sao Pedro et al., 2013 ).
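The mechanics of this failure mode can be shown in a few lines (a hypothetical sketch with invented numbers): on an unbalanced dataset, a trivial model that optimizes overall accuracy alone can look strong while being systematically wrong for the under-represented group.

```python
# Hypothetical sketch of how unbalanced data produces systematic errors:
# with a 95/5 split, a "model" that simply predicts the majority outcome
# scores 95% overall accuracy yet is wrong for every minority-group student.

majority = [1] * 95  # outcomes observed for the well-represented group
minority = [0] * 5   # outcomes observed for the under-represented group
labels = majority + minority

# Trivial accuracy-maximizing model: always predict the most common label.
prediction = max(set(labels), key=labels.count)

overall_acc = sum(prediction == y for y in labels) / len(labels)
minority_acc = sum(prediction == y for y in minority) / len(minority)
print(overall_acc, minority_acc)  # 0.95 0.0
```

This is why aggregate accuracy alone is a misleading validation metric; performance must be reported per subgroup, and imbalanced training data must be rebalanced or reweighted before deployment.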

(6) The fast expansion of technology and inequalities of learning opportunities have aroused great controversy. Due to the exponential nature of technological progress, particularly the big data and AI revolution, a fresh paradigm and a new learning landscape are on the horizon. For instance, the elite smartphone 10 years ago, in 2010, was the BlackBerry. Today, even in sub-Saharan Africa, 75% of the population has mobile phones several generations more advanced ( GSMA Intelligence, 2020 ). Hence, the entry barriers are shifting from technical requirements to the willingness of and/or need for adoption. This has been clearly demonstrated during the COVID-19 pandemic. The need for social distancing and continuing education has led to online/e-learning deployments within months ( United Nations, 2020 ), generating huge amounts of learning data. The extraction of meaningful patterns and the discovery of knowledge from these data are expected to be carried out through learning analytics and AI techniques. Inevitably, current learning cultures, learning experiences, and classroom dynamics are changing as “we live algorithmic lives” ( Bucher, 2018 ). Thus, there is a critical need to adopt proper learning theories of educational psychology and to encourage our learners to be active participants rather than passive recipients or merely tracked objects ( Loftus and Madden, 2020 ). For example, under the constructionist framework ( Tsai, 2000 ), technology-enhanced or AI-powered education may empower students to understand their learning activities and patterns, predict their possible learning outcomes, and strategically regulate their learning behavior ( Koh et al., 2014 ; Loftus and Madden, 2020 ). On the other hand, in the era of the information explosion and AI revolution, disadvantaged students and developing countries are facing an ever wider digital divide. To reduce these inequalities and create more opportunities, cultivating young people’s competencies seems to be one of the most promising means ( UNESCO, 2015 ). Meanwhile, support from international organizations such as the World Bank and UNESCO is imperative for developing countries in establishing communication infrastructure (e.g., hardware, software, connectivity, electricity). Naturally, technology will not replace or hinder human learning; rather, the smart use of new technologies will facilitate the transfer and acquisition of knowledge ( Azevedo et al., 2019 ).

An overarching theme from the above research trends is that we need theories of cognitive and educational psychology to guide our understanding of the individual learner (and individual differences), in order to develop the best tools, algorithms, and practices for personalized learning. Take, for example, virtual reality (VR) or augmented reality (AR) as a fast-developing technology for education. The industry has developed many different types of VR/AR applications (e.g., Google Expeditions with over 100 virtual field trips), but these have typically been developed from the perspective of the industry (see further discussion below) and may not be informed by theories and data from educational psychology about how students actually learn. To make VR/AR effective learning tools, we must separate the technological features from the human experiences and abilities (e.g., the cognitive, linguistic, and spatial abilities of the learner; see Li et al., 2020 ). For example, VR provides a high-fidelity 3D real-life virtual environment, and the technological tools are built on the assumption that 3D realism enables the learner to gain ‘perceptual grounding’ during learning (e.g., having access to visual, auditory, and tactile experiences as in the real world). Following ‘embodied cognition’ theory ( Barsalou, 2008 ), we should expect VR learning to yield better learning outcomes than traditional classroom learning. However, empirical data suggest that there are significant individual differences, in that some students benefit more than others from VR learning. It may be that individuals with higher cognitive and perceptual abilities need no additional visuospatial information (provided in VR) to succeed in learning. In any case, we need to understand how embodied experiences (provided by the technology) interact with different learners’ inherent abilities (as well as their prior knowledge and background) for the best application of the relevant technology in education.

Big Data and AI in Education: Policy-Making

Following the revolution triggered by breakthroughs in big data and AI technology, policy-makers have attempted to formulate strategies and policies regarding how to incorporate AI and emerging technologies into primary, secondary, and tertiary education ( Pedró et al., 2019 ). Major challenges must be overcome in order to suitably integrate big data and AI into educational practice. The following three segments highlight pertinent policy-oriented challenges, gaps, and evolving trends.

(1) In digitally-driven knowledge economies, traditional formal education systems are undergoing drastic changes, or even a paradigm shift ( Peters, 2018 ). Lifelong learning is quickly being adopted and implemented through online or project-based learning schemes that incorporate multiple ways of teaching ( Lenschow, 1998 ; Sharples, 2000 ; Field, 2001 ; Koper and Tattersall, 2004 ). This new concept of continual education will require micro-credits or micro-degrees to sustain learners’ efforts ( Manuel Moreno-Marcos et al., 2019 ). The need to change the scope and role of education will become evident in the near future ( Williams, 2019 ). For example, in the next few years, new methods of instruction, engagement, and assessment, built on such micro-credits or micro-degrees, will need to be developed in formal education to support lifelong learning.

(2) Solutions for integrating cutting-edge research findings, innovative theory-driven curricula, and emerging technologies into students’ learning are evidently beneficial, and perhaps even ready for adoption. However, there is an apparent divergence between pre-service and in-service teachers in their willingness to support and adopt these emerging technologies ( Pedró et al., 2019 ). Pre-service teachers have greater exposure to modern technologies and, in general, are more willing to adopt them. In-service teachers have greater practical experience and tend to rely more on it. To bridge the gap, effective teacher education programs and continuing education programs have to be developed and offered to support the adoption of these new technologies so that they can be implemented with fidelity ( O’Donnell, 2008 ). This issue could become even more pressing in light of the extended period of the COVID-19 pandemic.

(3) A suitable legislative framework is needed to protect personal data from unscrupulous collection, unauthorized disclosure, commercial exploitation, and other abuses ( Boyd and Crawford, 2012 ; Pardo and Siemens, 2014 ). Education records and personal data are highly sensitive. There are significant risks associated with students’ educational profiles, records, and other personal data. Appropriate security measures must be adopted by educational institutions. Commercial educational system providers are actively exploiting both legislative gaps and concealed data acquisition channels. Increasing numbers of industry players are implementing data-oriented business models ( Geczy, 2018 ). There is a vital role to play for legislative, regulatory, and enforcing bodies at both the national and local levels. It is pertinent that governments enact, implement, and enforce privacy and personal data protection legislation and measures. In doing so, there is a need to strike a proper balance between desirable use of personal data for educational purposes and undesirable commercial monetization and abuse of personal data.

Big Data and AI in Education: Industry

As scientific and academic aspects of big data and AI in education have their unique challenges, so does the commercialization of educational tools and systems ( Renz et al., 2020 ). Numerous countries have attempted to stimulate innovation-based growth through enhancing technology transfer and fostering academia-industry collaboration ( Huggins and Thompson, 2015 ). In the United States, this was initiated by the Bayh-Dole Act ( Mowery et al., 2001 ). Building a reciprocal and sustained partnership is strongly encouraged. It facilitates technology transfers and strengthens the links between academia and the education industry. There are several points to be considered when approaching academia-industry collaboration. It is important that collaboration is mutually beneficial. The following points highlight the overlapping spheres of benefits for both educational commerce and academia. They also expose existing gaps and future prospects.

(1) Commercializing intelligent educational tools and systems that include the latest scientific and technological advances can provide educators with tools for developing more effective curricula, pedagogical frameworks, assessments, and programs. Timely release of educational research advances onto commercial platforms is desirable by vendors from development, marketing, and revenue perspectives ( Renz and Hilbig, 2020 ). Implementation of the latest research enables progressive development of commercial products and distinctive differentiation for marketing purposes. This could also potentially solve the significant gap between what the industry knows and develops and what the academic research says with regard to student learning. Novel features may also be suitably monetized—hence, expanding revenue streams. The gaps between availability of the latest research and its practical adoption are slowing progress and negatively impacting commercial vendors. A viable solution is a closer alignment and/or direct collaboration between academia and industry.

(2) A greater spectrum of commercially and freely available tools helps maintain healthy market competition. It also helps to avoid monopolies and oligopolies that stifle innovation, limit choices, and damage markets for educational tools. Some well-established or free-of-charge platforms (e.g., Moodle) showed such oligopolistic potential during the COVID-19 pandemic. With more tools available on the market, educators and academics may explore novel avenues for improving education and research. New and more effective forms of education may be devised. For instance, multimodal virtual educational environments have high future potential. These are environments that would otherwise be impossible in conventional physical settings (see the previous discussion of VR/AR). Expanding educational markets and commerce should inevitably lead to expanding resources for research and development funding ( Popenici and Kerr, 2017 ). Collaborative research projects sponsored by industry should provide support and opportunities for academics to advance educational research. Paradoxically, in numerous regions there is a decreasing trend in collaborative research. To reverse this trend, it is desirable that academic researchers and industry practitioners increase their engagement via mutual presentations, education, and even government initiatives. All three stakeholders (i.e., academia, industry, and government) should play more active roles.

(3) Vocational and practical education provides numerous opportunities for fruitful academia-industry collaboration. With the changing nature of work and growing technology adoption, there is an increasing demand for radical changes in vocational education, for both teachers and students (World Development Report, 2019). Domain knowledge provided by teachers is beneficially supplemented by AI-assisted learning environments in academia, while practical skills are enhanced in industrial environments through hands-on experience and feedback from both trainers and technology tools. Hence, students benefit from acquiring domain knowledge via interactions with human teachers and trainers, and equally from gaining practical skills via interactions with simulated and real-world technological environments. Effective vocational training demands teachers and trainers on the human-learning side, and AI environments and actual technology tools on the machine-learning side. Collaboration between academia and industry, as well as balanced human- and machine-learning approaches, is therefore pertinent for vocational education.

Discussion and Conclusion

Big data and AI have enormous potential for realizing highly effective learning and teaching. They stimulate new research questions and designs, exploit innovative technologies and tools for data collection and analysis, and are ultimately becoming a mainstream research paradigm (Daniel, 2019). Nonetheless, they are still fairly novel and unfamiliar to many researchers and educators. In this paper, we have described the general background, core concepts, and recent progress of this rapidly growing domain. Along with the arising opportunities, we have highlighted the crucial challenges and emerging trends of big data and AI in education as reflected in educational research, policy-making, and industry. Table 1 concisely summarizes the major challenges and possible solutions of big data and AI in education. In summary, future studies should aim at theory-based precision education, incorporate cross-disciplinary applications, and use educational technologies appropriately. Governments should be devoted to supporting lifelong learning, offering teacher education programs, and protecting personal data. With regard to the education industry, reciprocal and mutually beneficial relationships should be developed to enhance academia-industry collaboration.


Table 1. Major challenges and possible solutions for integrating big data and AI into education.

Regarding the future development of big data and AI, we advocate an in-depth dialog between the supporters of “cold” technology and “warm” humanity so that users of technology can benefit from its capacity and not see it as a threat to their livelihood. An equally important issue is that overreliance on technology may lead to an underestimation of the role of humans in education. Remember the fundamental role of schooling: the school is a great equalizer as well as a central socialization agent. We need to better understand the role of social and affective processing (e.g., emotion, motivation) in addition to cognitive processing in student learning successes (or failures). After all, human learning is a social behavior, and a number of key regions in our brains are wired to be socially engaged (see Li and Jeong, 2020 for a discussion).

It has been estimated that approximately half of current routine jobs might be automated in the near future (Frey and Osborne, 2017; World Development Report, 2019). The teacher's job, however, cannot be replaced: the teacher-student relationship is indispensable in students' learning and inspirational for students' personal growth (Roorda et al., 2011; Cheng and Tsai, 2019). At the same time, new developments in technology will enable us to collect and analyze large-scale, multimodal, and continuous real-time data. Such data-intensive and technology-driven analysis of human behavior, in real-world and simulated environments, may assist teachers in identifying students' learning trajectories and patterns, developing corresponding lesson plans, and adopting effective teaching strategies (Klašnja-Milicevic et al., 2017; Gierl and Lai, 2018). It may also support teachers in tackling students' more complex problems and in cultivating students' higher-order thinking skills by freeing teachers from monotonous and routine tasks (Li, 2007; Belpaeme et al., 2018). Hence, it is now imperative to embrace AI and technology and to prepare our teachers and students for the future of AI-enhanced and technology-supported education.

The adoption of big data and AI in learning and teaching is still in its infancy and limited by technological and mindset challenges for now; however, the convergence of developments in psychology, data science, and computer science shows great promise in revolutionizing educational research, practice, and industry. We hope that the latest achievements and future directions presented in this paper will advance our shared goal of helping learners and teachers pursue sustainable development.

Author Contributions

HLu wrote the initial draft of the manuscript. PG, HLa, JG, and PL revised the drafts and provided theoretical background. SY, HO, JB, and RG contributed content for the original draft preparation of the manuscript. C-CT provided theoretical focus, design, draft feedback, and supervised throughout the research. All authors contributed to the article and approved the submitted version.

Funding

This work was financially supported by the Institute for Research Excellence in Learning Sciences of National Taiwan Normal University (NTNU) through the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

Conflict of Interest

JG was employed by the company Apprendis, LLC (Berlin, MA, United States).

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Abed Ibrahim, L., and Fekete, I. (2019). What machine learning can tell us about the role of language dominance in the diagnostic accuracy of German LITMUS nonword and sentence repetition tasks. Front. Psychol. 9:2757. doi: 10.3389/fpsyg.2018.02757

Adjerid, I., and Kelley, K. (2018). Big data in psychology: a framework for research advancement. Am. Psychol. 73, 899–917. doi: 10.1037/amp0000190

Aldowah, H., Al-Samarraie, H., and Fauzy, W. M. (2019). Educational data mining and learning analytics for 21st century higher education: a review and synthesis. Telemat. Inform. 37, 13–49. doi: 10.1016/j.tele.2019.01.007

Anwar, S., Bascou, N. A., Menekse, M., and Kardgar, A. (2019). A systematic review of studies on educational robotics. J. Pre-College Eng. Educ. Res. (J-PEER) 9, 19–42. doi: 10.7771/2157-9288.1223

Azevedo, J. P. W. D., Crawford, M. F., Nayar, R., Rogers, F. H., Barron Rodriguez, M. R., Ding, E. Y. Z., et al. (2019). Ending Learning Poverty: What Will It Take?. Washington, D.C: The World Bank.

Baker, R. S. J. D., D’Mello, S. K., Rodrigo, M. M. T., and Graesser, A. C. (2010). Better to be frustrated than bored: the incidence, persistence, and impact of learners’ cognitive-affective states during interactions with three different computer-based learning environments. Int. J. Human-Comp. Stud. 68, 223–241. doi: 10.1016/j.ijhcs.2009.12.003

Barsalou, L. W. (2008). “Grounding symbolic operations in the brain’s modal systems,” in Embodied Grounding: Social, Cognitive, Affective, and Neuroscientific Approaches , eds G. R. Semin and E. R. Smith (Cambridge: Cambridge University Press), 9–42. doi: 10.1017/cbo9780511805837.002

Becker, S. A., Cummins, M., Davis, A., Freeman, A., Hall, C. G., and Ananthanarayanan, V. (2017). NMC Horizon Report: 2017 Higher Education Edition. Austin, TX: The New Media Consortium.

Belpaeme, T., Kennedy, J., Ramachandran, A., Scassellati, B., and Tanaka, F. (2018). Social robots for education: a review. Sci. Robot. 3:eaat5954. doi: 10.1126/scirobotics.aat5954

Bereiter, C. (2002). Education and MIND in the Knowledge Age. Mahwah, NJ: LEA.

Boyd, D., and Crawford, K. (2012). Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inform. Commun. Soc. 15, 662–679. doi: 10.1080/1369118x.2012.678878

Bramley, N. R., Dayan, P., Griffiths, T. L., and Lagnado, D. A. (2017). Formalizing Neurath’s ship: approximate algorithms for online causal learning. Psychol. Rev. 124, 301–338. doi: 10.1037/rev0000061

Bucher, T. (2018). If Then: Algorithmic Power and Politics. New York, NY: Oxford University Press.

Carbonell, J. R. (1970). AI in CAI: an artificial-intelligence approach to computer-assisted instruction. IEEE Trans. Man-Machine Sys. 11, 190–202. doi: 10.1109/TMMS.1970.299942

Chen, C. P., and Zhang, C. Y. (2014). Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inform. Sci. 275, 314–347. doi: 10.1016/j.ins.2014.01.015

Chen, L., Chen, P., and Lin, Z. (2020). Artificial intelligence in education: a review. IEEE Access 8, 75264–75278. doi: 10.1109/ACCESS.2020.2988510

Chen, N.-S., Yin, C., Isaias, P., and Psotka, J. (2020). Educational big data: extracting meaning from data for smart education. Interact. Learn. Environ. 28, 142–147. doi: 10.1080/10494820.2019.1635395

Cheng, K.-H., and Tsai, C.-C. (2019). A case study of immersive virtual field trips in an elementary classroom: students’ learning experience and teacher-student interaction behaviors. Comp. Educ. 140:103600. doi: 10.1016/j.compedu.2019.103600

Cheung, M. W.-L., and Jak, S. (2018). Challenges of big data analyses and applications in psychology. Zeitschrift Fur Psychol. J. Psychol. 226, 209–211. doi: 10.1027/2151-2604/a000348

Cuthbertson, B., Socha, T. L., and Potter, T. G. (2004). The double-edged sword: critical reflections on traditional and modern technology in outdoor education. J. Adv. Educ. Outdoor Learn. 4, 133–144. doi: 10.1080/14729670485200491

Cutumisu, M., and Guo, Q. (2019). Using topic modeling to extract pre-service teachers’ understandings of computational thinking from their coding reflections. IEEE Trans. Educ. 62, 325–332. doi: 10.1109/te.2019.2925253

Daniel, B. (2015). Big data and analytics in higher education: opportunities and challenges. Br. J. Educ. Technol. 46, 904–920. doi: 10.1111/bjet.12230

Daniel, B. K. (2019). Big data and data science: a critical review of issues for educational research. Br. J. Educ. Technol. 50, 101–113. doi: 10.1111/bjet.12595

Dijcks, J. (2013). Oracle: Big data for the enterprise. Oracle White Paper . Redwood Shores, CA: Oracle Corporation.

Field, J. (2001). Lifelong education. Int. J. Lifelong Educ. 20, 3–15. doi: 10.1080/09638280010008291

Frey, C. B., and Osborne, M. A. (2017). The future of employment: how susceptible are jobs to computerisation? Technol. Forecast. Soc. Change 114, 254–280. doi: 10.1016/j.techfore.2016.08.019

Geczy, P. (2014). Big data characteristics. Macrotheme Rev. 3, 94–104.

Geczy, P. (2015). Big data management: relational framework. Rev. Bus. Finance Stud. 6, 21–30.

Geczy, P. (2018). Data-Oriented business models: gaining competitive advantage. Global J. Bus. Res. 12, 25–36.

Gierl, M. J., and Lai, H. (2018). Using automatic item generation to create solutions and rationales for computerized formative testing. Appl. Psychol. Measurement 42, 42–57. doi: 10.1177/0146621617726788

Gobert, J., Sao Pedro, M., Raziuddin, J., and Baker, R. S. (2013). From log files to assessment metrics for science inquiry using educational data mining. J. Learn. Sci. 22, 521–563. doi: 10.1080/10508406.2013.837391

Gobert, J. D., and Sao Pedro, M. A. (2017). “Digital assessment environments for scientific inquiry practices,” in The Wiley Handbook of Cognition and Assessment , eds A. A. Rupp and J. P. Leighton (West Sussex: Frameworks, Methodologies, and Applications), 508–534. doi: 10.1002/9781118956588.ch21

Gobert, J. D., Sao Pedro, M. A., Baker, R. S., Toto, E., and Montalvo, O. (2012). Leveraging educational data mining for real-time performance assessment of scientific inquiry skills within microworlds. J. Educ. Data Min. 4, 104–143. doi: 10.5281/zenodo.3554645

Goksel, N., and Bozkurt, A. (2019). “Artificial intelligence in education: current insights and future perspectives,” in Handbook of Research on Learning in the Age of Transhumanism , eds S. Sisman-Ugur and G. Kurubacak (Hershey, PA: IGI Global), 224–236 doi: 10.4018/978-1-5225-8431-5.ch014

Graesser, A. C., Chipman, P., Haynes, B. C., and Olney, A. (2005). AutoTutor: an intelligent tutoring system with mixed-initiative dialogue. IEEE Trans. Educ. 48, 612–618. doi: 10.1109/te.2005.856149

GSMA Intelligence (2020). The Mobile Economy 2020 . London: GSM Association.

Harlow, L. L., and Oswald, F. L. (2016). Big data in psychology: introduction to the special issue. Psychol. Methods 21, 447–457. doi: 10.1037/met0000120

Hew, K. F., Lan, M., Tang, Y., Jia, C., and Lo, C. K. (2019). Where is the “theory” within the field of educational technology research? Br. J. Educ. Technol. 50, 956–971. doi: 10.1111/bjet.12770

Hinojo-Lucena, F. J., Aznar-Díaz, I., Cáceres-Reche, M. P., and Romero-Rodríguez, J. M. (2019). Artificial intelligence in higher education: a bibliometric study on its impact in the scientific literature. Educ. Sci. 9:51. doi: 10.3390/educsci9010051

Huang, A. Y., Lu, O. H., Huang, J. C., Yin, C., and Yang, S. J. (2020). Predicting students’ academic performance by using educational big data and learning analytics: evaluation of classification methods and learning logs. Int. Learn. Environ. 28, 206–230. doi: 10.1080/10494820.2019.1636086

Huggins, R., and Thompson, P. (2015). Entrepreneurship, innovation and regional growth: a network theory. Small Bus. Econ. 45, 103–128. doi: 10.1007/s11187-015-9643-3

Hwang, G.-J., Spikol, D., and Li, K.-C. (2018). Guest editorial: trends and research issues of learning analytics and educational big data. Educ. Technol. Soc. 21, 134–136.

Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ramakrishnan, R., et al. (2014). Big data and its technical challenges. Commun. ACM. 57, 86–94. doi: 10.1145/2611567

Johnson, L., Smith, R., Willis, H., Levine, A., and Haywood, K. (2011). The 2011 Horizon Report. Austin, TX: The New Media Consortium.

Jordan, M. I., and Mitchell, T. M. (2015). Machine learning: trends, perspectives, and prospects. Science 349, 255–260. doi: 10.1126/science.aaa8415

Khechine, H., and Lakhal, S. (2018). Technology as a double-edged sword: from behavior prediction with UTAUT to students’ outcomes considering personal characteristics. J. Inform. Technol. Educ. Res. 17, 63–102. doi: 10.28945/4022

Klašnja-Milicevic, A., Ivanovic, M., and Budimac, Z. (2017). Data science in education: big data and learning analytics. Comput. Applicat. Eng. Educ. 25, 1066–1078. doi: 10.1002/cae.21844

Koh, J. H. L., Chai, C. S., and Tsai, C. C. (2014). Demographic factors, TPACK constructs, and teachers’ perceptions of constructivist-oriented TPACK. J. Educ. Technol. Soc. 17, 185–196.

Koper, R., and Tattersall, C. (2004). New directions for lifelong learning using network technologies. Br. J. Educ. Technol. 35, 689–700. doi: 10.1111/j.1467-8535.2004.00427.x

Krouska, A., Troussas, C., and Virvou, M. (2019). SN-Learning: an exploratory study beyond e-learning and evaluation of its applications using EV-SNL framework. J. Comp. Ass. Learn. 35, 168–177. doi: 10.1111/jcal.12330

Laney, D. (2001). 3D data management: controlling data volume, velocity and variety. META Group Res. Note 6, 70–73.

Lazer, D., Kennedy, R., King, G., and Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205. doi: 10.1126/science.1248506

Lenschow, R. J. (1998). From teaching to learning: a paradigm shift in engineering education and lifelong learning. Eur. J. Eng. Educ. 23, 155–161. doi: 10.1080/03043799808923494

L’heureux, A., Grolinger, K., Elyamany, H. F., and Capretz, M. A. (2017). Machine learning with big data: challenges and approaches. IEEE Access 5, 7776–7797. doi: 10.1109/ACCESS.2017.2696365

Li, H., Gobert, J., and Dickler, R. (2019). “Evaluating the transfer of scaffolded inquiry: what sticks and does it last?,” in Artificial Intelligence in Education , eds S. Isotani, E. Millán, A. Ogan, P. Hastings, B. McLaren, and R. Luckin (Cham: Springer), 163–168. doi: 10.1007/978-3-030-23207-8_31

Li, P., and Jeong, H. (2020). The social brain of language: grounding second language learning in social interaction. npj Sci. Learn. 5:8. doi: 10.1038/s41539-020-0068-7

Li, P., Legault, J., Klippel, A., and Zhao, J. (2020). Virtual reality for student learning: understanding individual differences. Hum. Behav. Brain 1, 28–36. doi: 10.37716/HBAB.2020010105

Li, X. (2007). Intelligent agent–supported online education. Dec. Sci. J. Innovat. Educ. 5, 311–331. doi: 10.1111/j.1540-4609.2007.00143.x

Loftus, M., and Madden, M. G. (2020). A pedagogy of data and Artificial intelligence for student subjectification. Teach. Higher Educ. 25, 456–475. doi: 10.1080/13562517.2020.1748593

Long, P., and Siemens, G. (2011). Penetrating the fog: analytics in learning and education. Educ. Rev. 46, 31–40. doi: 10.1007/978-3-319-38956-1_4

Lu, O. H. T., Huang, A. Y. Q., Huang, J. C. H., Lin, A. J. Q., Ogata, H., and Yang, S. J. H. (2018). Applying learning analytics for the early prediction of students’ academic performance in blended learning. Educ. Technol. Soc. 21, 220–232.

Macfadyen, L. P. (2017). Overcoming barriers to educational analytics: how systems thinking and pragmatism can help. Educ. Technol. 57, 31–39.

Malik, G., Tayal, D. K., and Vij, S. (2019). “An analysis of the role of artificial intelligence in education and teaching,” in Recent Findings in Intelligent Computing Techniques. Advances in Intelligent Systems and Computing , eds P. Sa, S. Bakshi, I. Hatzilygeroudis, and M. Sahoo (Singapore: Springer), 407–417.

Manuel Moreno-Marcos, P., Alario-Hoyos, C., Munoz-Merino, P. J., and Delgado Kloos, C. (2019). Prediction in MOOCs: a review and future research directions. IEEE Trans. Learn. Technol. 12, 384–401. doi: 10.1109/TLT.2018.2856808

Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., et al. (2011). Big data: The Next Frontier for Innovation, Competition and Productivity. New York, NY: McKinsey Global Institute.

Mayer-Schönberger, V., and Cukier, K. (2013). Big data: A Revolution That Will Transform How we live, Work, and Think. Boston, MA: Houghton Mifflin Harcourt.

Mislevy, R. J., Yan, D., Gobert, J., and Sao Pedro, M. (2020). “Automated scoring in intelligent tutoring systems,” in Handbook of Automated Scoring , eds D. Yan, A. A. Rupp, and P. W. Foltz (London: Chapman and Hall/CRC), 403–422. doi: 10.1201/9781351264808-22

Mowery, D. C., Nelson, R. R., Sampat, B. N., and Ziedonis, A. A. (2001). The growth of patenting and licensing by US universities: an assessment of the effects of the Bayh–Dole act of 1980. Res. Pol. 30, 99–119. doi: 10.1515/9780804796361-008

Nye, B. D. (2015). Intelligent tutoring systems by and for the developing world: a review of trends and approaches for educational technology in a global context. Int. J. Art. Intell. Educ. 25, 177–203. doi: 10.1007/s40593-014-0028-6

Obermeyer, Z., Powers, B., Vogeli, C., and Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453. doi: 10.1126/science.aax2342

O’Donnell, C. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K-12 curriculum intervention research. Rev. Educ. Res. 78, 33–84. doi: 10.3102/0034654307313793

Papamitsiou, Z., and Economides, A. A. (2014). Learning analytics and educational data mining in practice: a systematic literature review of empirical evidence. Educ. Technol. Soc. 17, 49–64.

Pardo, A., and Siemens, G. (2014). Ethical and privacy principles for learning analytics. Br. J. Educ. Technol. 45, 438–450. doi: 10.1111/bjet.12152

Pedró, F., Subosa, M., Rivas, A., and Valverde, P. (2019). Artificial Intelligence in Education: Challenges and Opportunities for Sustainable Development. Paris: UNESCO.

Peters, M. A. (2018). Deep learning, education and the final stage of automation. Educ. Phil. Theory 50, 549–553. doi: 10.1080/00131857.2017.1348928

Popenici, S. A., and Kerr, S. (2017). Exploring the impact of artificial intelligence on teaching and learning in higher education. Res. Pract. Technol. Enhanced Learn. 12:22. doi: 10.1186/s41039-017-0062-8

Quadir, B., Chen, N.-S., and Isaias, P. (2020). Analyzing the educational goals, problems and techniques used in educational big data research from 2010 to 2018. Int. Learn. Environ. 1–17. doi: 10.1080/10494820.2020.1712427

Renz, A., and Hilbig, R. (2020). Prerequisites for artificial intelligence in further education: identification of drivers, barriers, and business models of educational technology companies. Int. J. Educ. Technol. Higher Educ. 17:14. doi: 10.1186/s41239-020-00193-3

Renz, A., Krishnaraja, S., and Gronau, E. (2020). Demystification of artificial intelligence in education–how much ai is really in the educational technology? Int. J. Learn. Anal. Art. Intell. Educ. (IJAI). 2, 4–30. doi: 10.3991/ijai.v2i1.12675

Roorda, D. L., Koomen, H. M. Y., Spilt, J. L., and Oort, F. J. (2011). The influence of affective teacher-student relationships on students’ school engagement and achievement: a meta-analytic approach. Rev. Educ. Res. 81, 493–529. doi: 10.3102/0034654311421793

Sao Pedro, M., Baker, R., and Gobert, J. (2013). “What different kinds of stratification can reveal about the generalizability of data-mined skill assessment models,” in Proceedings of the 3rd Conference on Learning Analytics and Knowledge (Leuven), 190–194.

Schroeck, M., Shockley, R., Smart, J., Romero-Morales, D., and Tufano, P. (2012). Analytics: the real-world use of big data. IBM Global Bus. Serv. 12, 1–20. doi: 10.1002/9781119204183.ch1

Sharples, M. (2000). The design of personal mobile technologies for lifelong learning. Comp. Educ. 34, 177–193. doi: 10.1016/s0360-1315(99)00044-5

Smutny, P., and Schreiberova, P. (2020). Chatbots for learning: a review of educational chatbots for the facebook messenger. Comp. Educ. 151:103862. doi: 10.1016/j.compedu.2020.103862

Sonderlund, A. L., Hughes, E., and Smith, J. (2019). The efficacy of learning analytics interventions in higher education: a systematic review. Br. J. Educ. Technol. 50, 2594–2618. doi: 10.1111/bjet.12720

Song, Y., Dai, X.-Y., and Wang, J. (2016). Not all emotions are created equal: expressive behavior of the networked public on China’s social media site. Comp. Hum. Behav. 60, 525–533. doi: 10.1016/j.chb.2016.02.086

Spikol, D., Ruffaldi, E., Dabisias, G., and Cukurova, M. (2018). Supervised machine learning in multimodal learning analytics for estimating success in project-based learning. J. Comp. Ass. Learn. 34, 366–377. doi: 10.1111/jcal.12263

Staats, C. (2016). Understanding implicit bias: what educators should know. Am. Educ. 39, 29–33. doi: 10.2307/3396655

Starcic, A. I. (2019). Human learning and learning analytics in the age of artificial intelligence. Br. J. Educ. Technol. 50, 2974–2976. doi: 10.1111/bjet.12879

The International Learning Sciences Forum (2019). The International Learning Sciences Forum: International Trends for Ai and Big Data in Learning Sciences. Taipei: National Taiwan Normal University.

Toh, L. P. E., Causo, A., Tzuo, P. W., Chen, I. M., and Yeo, S. H. (2016). A review on the use of robots in education and young children. J. Educ. Technol. Soc. 19, 148–163.

Topol, E. J. (2019). High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56. doi: 10.1038/s41591-018-0300-7

Tsai, C. C. (2000). Relationships between student scientific epistemological beliefs and perceptions of constructivist learning environments. Educ. Res. 42, 193–205. doi: 10.1080/001318800363836

Tsai, S. C., Chen, C. H., Shiao, Y. T., Ciou, J. S., and Wu, T. N. (2020). Precision education with statistical learning and deep learning: a case study in Taiwan. Int. J. Educ. Technol. Higher Educ. 17, 1–13. doi: 10.1186/s41239-020-00186-2

UNESCO (2015). SDG4-Education 2030, Incheon Declaration (ID) and Framework for Action. For the Implementation of Sustainable Development Goal 4, Ensure Inclusive and Equitable Quality Education and Promote Lifelong Learning Opportunities for All, ED-2016/WS/28. London: UNESCO

United Nations (2020). Policy Brief: Education During Covid-19 and Beyond. New York, NY: United Nations

VanRullen, R. (2017). Perception science in the age of deep neural networks. Front. Psychol. 8:142. doi: 10.3389/fpsyg.2017.00142

Viberg, O., Hatakka, M., Bälter, O., and Mavroudi, A. (2018). The current landscape of learning analytics in higher education. Comput. Human Behav. 89, 98–110. doi: 10.1016/j.chb.2018.07.027

Williams, P. (2019). Does competency-based education with blockchain signal a new mission for universities? J. Higher Educ. Pol. Manag. 41, 104–117. doi: 10.1080/1360080x.2018.1520491

World Development Report (2019). The Changing Nature of Work. Washington, DC: The World Bank/International Bank for Reconstruction and Development.

Xie, H., Chu, H.-C., Hwang, G.-J., and Wang, C.-C. (2019). Trends and development in technology-enhanced adaptive/personalized learning: a systematic review of journal publications from 2007 to 2017. Comp. Educ. 140:103599. doi: 10.1016/j.compedu.2019.103599

Yadegaridehkordi, E., Noor, N. F. B. M., Ayub, M. N. B., Affal, H. B., and Hussin, N. B. (2019). Affective computing in education: a systematic review and future research. Comp. Educ. 142:103649. doi: 10.1016/j.compedu.2019.103649

Yarkoni, T., and Westfall, J. (2017). Choosing prediction over explanation in psychology: lessons from machine learning. Perspect. Psychol. Sci. 12, 1100–1122. doi: 10.1177/1745691617693393

Zawacki-Richter, O., Marín, V. I., Bond, M., and Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education–where are the educators? Int. J. Educ. Technol. Higher Educ. 16:39. doi: 10.1186/s41239-019-0171-0

Keywords : big data, artificial intelligence, education, learning, teaching

Citation: Luan H, Geczy P, Lai H, Gobert J, Yang SJH, Ogata H, Baltes J, Guerra R, Li P and Tsai C-C (2020) Challenges and Future Directions of Big Data and Artificial Intelligence in Education. Front. Psychol. 11:580820. doi: 10.3389/fpsyg.2020.580820

Received: 07 July 2020; Accepted: 22 September 2020; Published: 19 October 2020.

Copyright © 2020 Luan, Geczy, Lai, Gobert, Yang, Ogata, Baltes, Guerra, Li and Tsai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chin-Chung Tsai, [email protected]

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Big Data in Higher Education: Research Methods and Analytics Supporting the Learning Journey

  • Published: 24 July 2017
  • Volume 22, pages 237–241 (2017)

  • David Gibson


1 Introduction

One of the promises of mining big data for insights in higher education is to enable a new level of evidence-based research into learning and teaching. The broader term data science, which can be applied to many types and kinds of data big and small, captures a theoretical and methodological sea change occurring in educational and social science research methods, one situated apart from, or perhaps between, traditional qualitative and quantitative methods (Gibson and Ifenthaler 2017; Gibson and Webb 2015). Today, due to the fine-grained data captured during digital learning, it is possible to gain highly detailed insight into student performance and learning trajectories, as required for personalizing and adapting curriculum as well as assessment (Baker and Yacef 2009).
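To make the idea of mining fine-grained digital learning data concrete, the sketch below turns a clickstream-style event log into per-skill rolling success rates, one simple notion of a learning trajectory. The log format, skill names, and window size are illustrative assumptions only, not drawn from any study in this issue.

```python
from collections import defaultdict, deque

def learning_trajectories(events, window=3):
    """Compute a rolling success rate per (student, skill) from logged attempts.
    events: (student, skill, correct) tuples in time order (hypothetical format)."""
    recent = defaultdict(lambda: deque(maxlen=window))   # last `window` outcomes
    trajectories = defaultdict(list)
    for student, skill, correct in events:
        key = (student, skill)
        recent[key].append(1 if correct else 0)
        # Trajectory point = fraction correct over the recent window
        trajectories[key].append(sum(recent[key]) / len(recent[key]))
    return trajectories

# Toy log: one student practicing one skill, improving over five attempts
log = [("s1", "fractions", False), ("s1", "fractions", False),
       ("s1", "fractions", True), ("s1", "fractions", True),
       ("s1", "fractions", True)]
traj = learning_trajectories(log)
print(traj[("s1", "fractions")])  # [0.0, 0.0, 0.333..., 0.666..., 1.0]
```

The rising curve is the kind of signal an adaptive system could use to decide when a learner has reached mastery of a skill.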

In this new era of data-driven learning and teaching, researchers need to be equipped for the change with an advanced set of competencies that encompass areas needed for computationally intensive research (Buckingham Shum et al. 2013). For example, new data-management techniques are needed for big data, and new knowledge is needed for working with interdisciplinary teams whose members understand programming languages as well as the cognitive, behavioral, social, and emotional perspectives on learning. A new horizon of professional knowledge is needed, including new heuristics that incline a researcher or teaching-researcher toward computational modeling when tackling complex research problems (Gibson 2012).

This special issue on analytics in higher education learning and teaching focuses on some of the enabling computational approaches and challenges in research concerning the journey of a learner from pre-university experiences, recruitment, personalized learning, adaptive curriculum and assessment resources and effective teaching, to post-university life-long learning. The collection includes three original research articles, three work-in-progress reports, two articles on emerging technology and one integrative review.

2 Empirical Investigations

Empirical investigations report quantitative or qualitative research demonstrating advances in digital learning, gamification, automated assessment or learning analytics. In this section the three articles address understanding the college-going aspirations of students and their journey into higher education, the design of adaptive learning experiences, and building predictive early warning systems.

Research on factors leading to the college-going choices of middle school students has not yet utilized the extensive fine-grained data on learning and engagement now becoming available. This situation led the team of Maria Ofelia Z. San Pedro, Ryan S. Baker, and Neil T. Heffernan to collaborate on An Integrated Look at Middle School Engagement and Learning in Digital Environments as Precursors to College Attendance. The team used data mining methods on interaction-based assessments of student behavior, academic emotions, and knowledge gathered from a middle school online learning environment, and evaluated relationships among these factors for impact on outcomes in high school and college. The measures were used to develop a prediction model of college attendance, to examine relationships concerning intermediate outcomes on the journey to college, and to develop a path model of the educational experiences students have during middle school, high school, and college. The research provides a picture of the cognitive and non-cognitive mechanisms that students experience throughout the varied phases of their years in school, and how those mechanisms may relate to one another. Such understanding may provide educators with new information about students' trajectories within the college pipeline.
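A prediction model of the general kind described here can be illustrated, in deliberately minimal form, as a logistic regression trained by gradient descent on engagement features. The feature names, data values, and training setup below are invented for illustration; the actual study used far richer interaction-based detectors and a path-model analysis.

```python
import math

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by stochastic gradient descent.
    X: feature vectors; y: 1 = attended college, 0 = did not (all hypothetical)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))          # predicted probability
            err = p - t                          # gradient of log-loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def prob_attend(w, b, x):
    """Predicted probability of college attendance for feature vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# Hypothetical features: [engaged-concentration rate, boredom rate, knowledge]
X = [[0.8, 0.1, 0.9], [0.7, 0.2, 0.8], [0.3, 0.6, 0.4], [0.2, 0.7, 0.3]]
y = [1, 1, 0, 0]
w, b = train_logreg(X, y)
print(prob_attend(w, b, [0.75, 0.15, 0.85]))  # high probability
print(prob_attend(w, b, [0.25, 0.65, 0.35]))  # low probability
```

In practice such a model would be trained on thousands of students with cross-validation, and the learned coefficients would then feed the kind of path analysis the authors report.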

Once students are enrolled in higher education, delivering a personalized curriculum becomes of prime interest. In Using Data to Understand How to Better Design Adaptive Learning, Min Liu, Jina Kang, Wenting Zou, Hyeyeon Lee, Zilong Pan, and Stephanie Corliss investigate how the behavior patterns of learners with different characteristics interact with an adaptive learning environment. They collected data on the needs and interests of incoming first-year students in a pharmacy professional degree program who were engaged in an adaptive learning intervention that provided remedial instruction. The study found that affective factors such as motivation, as well as the alignment among system components, had an impact on how learners accessed the system and performed. Data visualizations revealed relationships that might otherwise have been missed. Their article contributes to the bigger picture of how exploratory data mining can help inform the design of adaptive learning environments.

Another issue of great concern is how to make predictions that trigger early interventions to help prevent attrition. The research project reported in Predicting Student Success: A Naïve Bayesian Application to Community College Data by Fermin Ornelas and Carlos Ordonez describes how the team developed and implemented a continuous Naïve Bayesian classifier for courses at a community college. The method improved the team's previous prediction accuracy from 70 to 90% for both at-risk and successful students while easing some of the challenges of interpreting results and implementing interventions. Predictive results were obtained across eleven courses, and cumulative gain charts show the potential for improvements made possible by focusing on high-risk students. The findings may be relevant for the implementation of early alert systems in other higher education contexts.
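To make the general technique concrete, the sketch below implements a from-scratch Gaussian Naive Bayes classifier on invented student features (grade so far, weekly logins); it is a minimal illustration of the family of method named above, not the authors' actual model, features, or data.

```python
import math

# Minimal Gaussian Naive Bayes sketch on hypothetical student records.
# Each record: (grade_so_far, logins_per_week); label 1 = successful, 0 = at-risk.

def fit(records, labels):
    """Estimate per-class priors and per-feature mean/variance."""
    model = {}
    for c in set(labels):
        rows = [r for r, y in zip(records, labels) if y == c]
        n, d = len(rows), len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(d)]
        variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n + 1e-6
                     for j in range(d)]
        model[c] = (len(rows) / len(records), means, variances)
    return model

def predict(model, x):
    """Pick the class with the highest log-posterior under Gaussian likelihoods."""
    def log_post(c):
        prior, means, variances = model[c]
        ll = math.log(prior)
        for xj, m, v in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        return ll
    return max(model, key=log_post)

# Toy training data: successful students have higher grades and activity.
X = [(85, 5), (90, 6), (78, 4), (55, 1), (60, 2), (48, 1)]
y = [1, 1, 1, 0, 0, 0]
nb = fit(X, y)
print(predict(nb, (88, 5)))  # → 1 (a high-performing profile)
print(predict(nb, (50, 1)))  # → 0 (an at-risk profile)
```

In practice, the predicted probabilities (not just the labels) would feed cumulative gain charts of the kind the article describes, letting an early alert system target the highest-risk students first.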

3 Work-in-Progress Studies

Work-in-progress studies provide early insights into leading projects or document the progression of strong research within the field of digital learning, gamification, automated assessment, or learning analytics. The three studies in this section focus on student perceptions of fine-grained analytics on dashboards, an analysis challenge concerning the grain size of analytic observations, and the impacts on students of an augmented reality serious game approach to onboarding into the university.

With increased analytics capability in higher education, more institutions are developing or implementing student dashboards, yet despite their emergence, students have had limited involvement in the development process. The ongoing research project reported in Give me a customizable dashboard: Personalized learning analytics dashboards in higher education by Lynne D. Roberts, Joel A. Howell, and Kristen Seaman examines student perceptions of and preferences concerning dashboards. Four focus group transcripts representing 41 students identified five key themes: 'provide everyone with the same learning opportunities,' 'to compare or not to compare,' 'dashboard privacy,' 'automate alerts,' and 'make it meaningful: give me a customizable dashboard.' A content analysis of students' drawings of desired dashboards demonstrates that students are interested in learning opportunities, comparisons to peers, and personally meaningful data. A survey of students highlights the tension between students' personal autonomy and the collective uniform activity required to ensure equity. The research suggests that giving students a level of control over their learning analytics may increase self-regulated learning and academic achievement. This evolving research aims to better understand students' emotional and behavioral responses to feedback and alerts on dashboards.

'Supplemental Instruction' is a voluntary, non-remedial, peer-facilitated, course-specific intervention that has been widely demonstrated to increase student success, yet concerns persist about the biasing effects of disproportionate participation by already higher-performing students. With a focus on maintaining access for all students, the research team of Maureen A. Guarcello, Richard A. Levine, Joshua Beemer, James P. Frazee, Mark A. Laumakis, and Stephen A. Schellenberg examined data from a large, public university in the western United States. In Balancing Student Success: Assessing Supplemental Instruction through Coarsened Exact Matching, the team used data including student demographics, performance, and participation in supplemental programs to evaluate the efficacy of supplemental instruction. The analysis was conducted in the first year of implementation within a traditionally high-challenge introductory psychology course. Findings indicate a statistically significant relationship between student participation in supplemental instruction and increased odds of successful course completion. Furthermore, the application of coarsened exact matching reduced concerns that increased course performance was attributable to an over-representation of higher-performing students electing to attend the voluntary sessions.
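The core idea of coarsened exact matching is simple to sketch: continuous covariates are coarsened into bins, and only strata that contain both treated and control units are retained for comparison. The toy example below uses hypothetical fields (gpa, gender, attended_si), not the study's actual covariates or data.

```python
from collections import defaultdict

def coarsen(record, gpa_bins=(2.0, 3.0, 4.0)):
    """Map a record to a stratum key by binning GPA and keeping gender as-is."""
    gpa_bin = sum(record["gpa"] > b for b in gpa_bins)
    return (gpa_bin, record["gender"])

def cem_match(records):
    """Keep only strata that contain both treated and control units."""
    strata = defaultdict(list)
    for r in records:
        strata[coarsen(r)].append(r)
    matched = []
    for units in strata.values():
        groups = {u["attended_si"] for u in units}
        if groups == {True, False}:  # common support within the stratum
            matched.extend(units)
    return matched

students = [
    {"gpa": 3.5, "gender": "F", "attended_si": True},
    {"gpa": 3.4, "gender": "F", "attended_si": False},
    {"gpa": 3.9, "gender": "M", "attended_si": True},   # no control in its stratum
    {"gpa": 1.8, "gender": "M", "attended_si": False},  # no treated in its stratum
]
print(len(cem_match(students)))  # → 2: only the two comparable records survive
```

By discarding the unmatched strata, outcome comparisons are restricted to treated and control students who are observably similar, which is how the technique addresses the self-selection concern the article raises.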

Students in Hong Kong are introduced to academic integrity and ethics issues through mobile Augmented Reality learning trails—Trails of Integrity and Ethics—which are accessed on smart devices. In Bringing abstract academic integrity and ethical concepts into real - life situations , Theresa Kwong, Eva Wong and Kevin Yue report on the exploratory analytics being conducted on the initial stages of their large-scale, government-funded project which inducts university students. The augmented reality trails immerse students in collaborative problem solving tasks centered on ethical dilemmas, addressed in real locations where various dilemmas might arise, and gives contextually appropriate digital advice and information on demand and as-needed. Students play out the consequences of their decisions, which help reinforce the links between the theoretical concept of academic integrity and ethics and the practical application in everyday contexts. To evaluate the effectiveness of the experiences, the analysis triangulates user experience surveys, qualitative feedback, clickstream data, and text mining of pre- and post- discussions. Preliminary analysis of thousands of student responses suggests that augmented reality learning trails can be adopted and applied to a wider scope of the academic curriculum and co-curriculum.

4 Emerging Technology Reports

The emerging technology reports section presents two views on developments in educational technology that address new potentials for digital learning environments. In nStudy: A System for Researching Information Problem Solving, Philip H. Winne, John C. Nesbit, and Fred Popowich discuss a new technology for tracing how students work on solving information problems. The platform and toolset gathers fine-grained data about what students do as they work on information problems, which information they work with, and how they adapt tactics and strategies in response to feedback. The technology is implemented as an extension to the Chrome web browser, supported by a server-side database that warehouses logged trace data. Trace data record the information learners operate on and the operations they apply to that information. Peripheral systems on the server extract and analyze the data and generate learning analytics for delivery to students and their instructors or researchers, on demand or when conditions are matched.

Stephanie Teasley, in Student Facing Dashboards: One Size Fits All?, reports that early implementations of dashboards have produced mixed results about the effects of their use. In particular, the 'one-size-fits-all' design of many existing systems is questioned on the basis of research on performance feedback and student motivation, which has shown that both internal and external student-level factors affect the impact of feedback interventions, especially those using social comparisons. She asserts that integrating data from student information systems into the underlying algorithms to produce personalized dashboards may mitigate the possible negative effects of feedback, especially comparative feedback, and support more consistent benefits from the use of such systems.

5 Integrative Review

Acknowledging that various disciplines attempt to infer learning from big data using different methodologies, the next authors provide a framework for moving transdisciplinary conversations forward in research collaborations. In their integrative review, Inferring learning from big data: The importance of a transdisciplinary and multidimensional approach, Jason M. Lodge, Sakinah S. J. Alhadad, Melinda J. Lewis, and Dragan Gašević discuss the need for systematic collaboration across different paradigms and disciplinary backgrounds in interpreting big data for enhancing learning.

I trust that you will find one or more of these projects and reports to be of interest and that they will lead you to follow these researchers as data science evolves in higher education learning and teaching.

David Gibson


Gibson, D. Big Data in Higher Education: Research Methods and Analytics Supporting the Learning Journey. Tech Know Learn 22 , 237–241 (2017). https://doi.org/10.1007/s10758-017-9331-2

Published: 24 July 2017. Issue date: October 2017.


Front Psychol

The Research Trend of Big Data in Education and the Impact of Teacher Psychology on Educational Development During COVID-19: A Systematic Review and Future Perspective

Associated Data

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

The COVID-19 outbreak, along with its post-pandemic impact, has prompted Internet Plus education to re-examine numerous facets of technology-oriented academic research, particularly Educational Big Data (EBD). The unexpected transition from face-to-face offline education to online lessons has urged teachers to introduce educational technology into teaching practice, which has had an overwhelming impact on teachers' professional and personal lives. The aim of the present work is to identify, in a comprehensive manner, which research foci constitute EBD, and how positive psychological indicators function in the technostress experienced by less agentic teachers. To this end, CiteSpace 5.7 and VOSviewer were applied in a longitudinal analysis of the literature from the Web of Science Core Collection, with the objective of uncovering explicit patterns and knowledge structures in scientific network knowledge maps. A total of 1,708 articles concerned with educational data met the criteria and were extracted and analyzed. The review of 15 years of research reveals that the knowledge base has accumulated dramatically since many governments' initiatives in 2012, with accelerating annual growth and decreasing geographic imbalance. The review also identified influential authors and journals whose effects will continue to have implications for future work. The authors identified several topical foci, such as data mining, student performance, learning environment and psychology, learning analytics, and application. More specifically, the authors identified a scientific shift from data mining application to data privacy and educational psychology, and from general scanning to specific investigation. Among the conclusions, the results highlight the important integration of educational psychology and technology during critical periods of educational development.

Introduction

Educational Big Data (EBD) is currently receiving unprecedented recognition within educational psychology, with technological platforms playing an increasingly vital role in the adaptation of current approaches toward technology-based programs. EBD has emerged as a vital area of study for both educators and researchers, reflecting the magnitude and impact of the data-related problems to be solved in educational practice, particularly with the application of innovative technologies. Nowadays, recreational desires, commercial insights, research needs, and government initiatives accelerate the utilization of technological devices, producing data on an unprecedented scale. For better and for worse, the accumulation and circulation of massive data in every form have become an integral part of the development of contemporary society. It is a topic that merits close attention from all walks of life, especially academic researchers. To analyze and further exploit the underlying value of big data for both public and private benefit, researchers from different domains have tried to unpack and define big data in increasingly powerful ways (Mikalef et al., 2018).

Back in 2012, the need to study the large accumulated volume of human experience, in order to improve working efficacy and well-being for future generations, stood out. This demand produced the idea of big data as massive quantities of information produced by humanity, its surroundings, and their interrelations (Boyd and Crawford, 2012). Big data has several characteristics known as the "5Vs": Volume, Velocity, Variety, Veracity, and Value (Demchenko et al., 2013). Volume indicates that the amount of data is huge and unpredictable across collection, storage, and computation. Velocity calls for fast processing for online or real-time data analyses, which requires data mining technology different from traditional approaches. Variety references the diversity of data sources and of data types and formats, breaking through the traditional limits of structured data to include semi-structured and unstructured data. Veracity refers to the quality of data: as sources become more complicated and diverse, truthfulness and reliability require further analysis. Finally, Value denotes the high return that laborious input can bring. Similarly, Saggi and Jain (2018) added two more characteristics, Valence and Variability. In any case, given large-volume, intricate, growing data assets from a variety of sources, analyzing big data in the traditional manner is challenging but fruitful work (Wu et al., 2013; Osman, 2019).

Currently, several scholars, such as Frizzo-Barker et al. (2016), have become increasingly involved in, and enthusiastic about, the feasibility of big data. Indeed, big data has retained its appeal across many realms, such as economics (Varian, 2014), business (McAfee et al., 2012), ecology (Hampton et al., 2013), physical geography (Li et al., 2016), medical care (Liao et al., 2018), and health care services (Bates et al., 2014). Moreover, in these research fields, systematic reviews have already been adopted to provide broader assessment (Connolly et al., 2012; Perez et al., 2013; Rose et al., 2018).

Additionally, despite the rapid adoption of science mapping in information science, social science, and medical research, the utilization of comprehensive visualization networks to better understand the evolution of Educational Big Data is quite novel (Eynon, 2013). Despite exponentially increasing growth and interest among participants and scholars, there is not enough research on big data in education, especially research applying systematic bibliometric analysis. The primary objective of our visual analysis is to apply data mining technology to extract high-quality, effective information from the data and to use informative visualizations to clearly display research achievements in the field of education, organized by Authors, Countries, Journals, Institutions, Key Words, and Research Topics. This allows changes across multiple data sets to be captured more quickly, without the need for sophisticated computer skills or mastery of clustering techniques (Van Eck and Waltman, 2017). Finally, it provides insights for future studies and highlights potential directions for big data in education. A macroscopic overview of the main characteristics, based on a bibliometric review, is therefore needed.

Literature Review

Initiated 15 years ago, educational big data has drawn many educational scholars' attention and yielded abundant academic achievements; the educational realm has never lost its crucial role with the advent of big data. The volume of data in the field of education has increased significantly since the rise of the Internet, and researchers can now study groups of subjects without necessarily depending on complicated measuring methods. Earlier, gStudy and learning kits were utilized as media through which learners construct knowledge and produce more informative data about knowledge construction in psychology (Winne, 2006). In the age of big data, which provides educational scholars with comprehensive ways to reconceptualize research questions and analyze educational data (Daniel, 2015), technological tools are applied to collect useful data in a short time and at relatively low cost (Mayer-Schönberger, 2016). Software technologies in education contribute greatly to big data, improve learning, and promote school reform based on three axioms of educational psychology: "Learners Construct Knowledge (Cognitive Operations), Learners Are Agents (the capacity to exercise choices with respect to preferences), and Data Include Randomness" (Winne, 2006). However, in the technology-based teaching environment, it is agentive teachers who play the central role in applying educational technology in teaching practice: they determine which methods are used to construct the classroom pattern and how to execute teaching plans more effectively. Hence the need to focus on teachers' psychology in educational development.

Online courses, instruction, and guidelines produce a considerable amount of educational data, which provides teachers with access to students' performance and learning patterns (Oi et al., 2017). Those data can help teachers further analyze students' learning routes and their own teaching pedagogy (Holland, 2019). Visualization techniques have been suggested to capture and identify available, fruitful patterns in educational data (Greer and Mark, 2016). For instance, a newly appointed math teacher can use visualization tools and test data to determine in which branch, statistics or geometry, a student performs better. Visualization outputs can thus help teachers with limited disciplinary knowledge interpret and unscramble student data (Ong, 2015). It is also necessary and important for educational scholars to realize what big data really means for education.

Research Design

Research Questions

Based on bibliometric research on big data in educational psychology, the main research questions can be addressed using statistics from the databases. In light of this systematic investigation, insights into educational reforms in schools and implications for teachers and teaching can be unveiled, and the need to understand how educational technology affects teachers psychologically in their teaching stands out.

  • What is the growth trajectory and geographic distribution of literature in the domain of EBD from 2006 to 2021?
  • What main research foci and trends have gained the greatest attention from the clustering analysis?
  • What implications and insights for teachers and teaching could be acquired from the literature review of EBD?

Materials and Methods

The data in this paper were obtained from Web of Science (WoS) on February 5, 2021. WoS is widely regarded as the world's largest comprehensive academic information research database, covering more than 8,700 core academic journals. In order to retrieve higher-quality articles, the authors selected the Core Collection as the research source. Details of the data collection are shown in Figure 1.

Figure 1. Stages of data collection.

In this study, related keywords were listed and used to construct the Boolean search query: TS = ("Big Data" OR "Learning Analytics" OR "Data Analytics" OR "Data Mining" OR "Big Data Era" OR "Data Models" OR "Data Management") AND TS = ("Education" OR "Language" OR "Learning" OR "Educational Psychology" OR "Educational Application"). According to the returned results, the first publication in education appeared in 2006. After running the search, 1,779 items met the selection criteria. Up to the date of analysis, the most frequent document type was article (1,643, 92.4%), followed by review (65, 3.7%) and editorial material (62, 3.5%); other types include book review (6, 0.3%), correction (2, 0.1%), and software review (1, 0.05%). After discarding duplicate records, the number of unique records was 1,708.
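The deduplication step above can be sketched as follows; this is a rough illustration, not the authors' actual pipeline, and the record fields and helper names are hypothetical. Records are keyed by DOI when present and by normalized title otherwise.

```python
def normalize(title):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())

def deduplicate(records):
    """Keep the first record for each DOI (or normalized title when DOI is missing)."""
    seen, unique = set(), []
    for rec in records:
        key = rec.get("doi") or normalize(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

raw = [
    {"doi": "10.1000/x1", "title": "Learning Analytics in MOOCs"},
    {"doi": "10.1000/x1", "title": "Learning analytics in MOOCs"},  # duplicate DOI
    {"doi": None, "title": "Big Data  in Education"},
    {"doi": None, "title": "big data in education"},                # duplicate title
]
print(len(deduplicate(raw)))  # → 2
```

Keeping the first occurrence preserves the original retrieval order, which matters when the downstream tools treat record order as meaningful.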

As one of the most frequently used approaches in bibliometrics, science mapping presents the current status of research and possible developmental directions. In this paper, the bibliometric software packages VOSviewer and CiteSpace (Chen, 2006) are utilized for data analysis. CiteSpace provides an effective methodology for systematic scientometric review (Chen and Song, 2019) and specializes in keyword timeline analysis for identifying possible research directions. During keyword visualization, the time span was set to 2006–2021, the time slice was one year, node types could be confined to the themes of interest (author, organization, reference, etc.), and other parameters were left at their default values. The results are presented in the discussion section. Apart from timeline analysis, VOSviewer is another bibliometric mapping tool, used here for co-occurrence and co-citation analysis (Van Eck and Waltman, 2010). Prior to processing, the filtered data were imported as a network dataset for visualizing and exploring maps. To satisfy different research objectives, network, overlay, and density visualizations are available; keywords are displayed as nodes in the three maps (via the open button on the file tab of the action panel). In the function panel, different outcomes can be scrutinized for different subjects, as shown in the discussion section.

Both CiteSpace and VOSviewer use nodes to represent keywords and lines to represent co-occurrence relationships, visualized as graph structures. However, the two packages prioritize different aspects of data processing owing to nuances in their underlying algorithms. CiteSpace better displays the development trend of a specific research realm and maps the frontier of research evolution, whereas VOSviewer concentrates on displaying the main information retrieved from the database. Hence, in this review, timeline and keyword-burst analyses were performed with CiteSpace, and co-citation displays of the different sections were produced with VOSviewer.

Research Results

In this part, the authors present a comprehensive bibliometric overview of publications on big data in education. The current state of the art is presented in the section A State of the Art in EBD Study. Keyword analysis of research foci, co-authorship analysis, and co-citation analysis are then presented in turn.

A State of the Art in EBD Study

In this section, the authors analyze the current status of the field from different aspects, including annual trends of publications, the distribution of institutions and journals, and citations.

The Annual Trends of Publications of EBD

Since the application of big data to education, 1,708 papers have been published in the Web of Science Core Collection from the field's inception in 2006 to February 5, 2021. The annual trend of these publications is shown in Figure 2. The graph shows three stages from 2006 to 2021. In the first stage, 2006 to 2010 (fewer than 10 publications per year), the study of big data in education was still in its initial phase. From 2011 (20) to 2014 (39), despite some subtle declines, the number of publications increased slightly compared to 2006 (6). After 2014 (39), the figure shows dramatic growth, rising to 415 by 2020. This significant growth signals that big data in education has drawn more and more scholars' attention and will probably continue to increase over the next two decades.

Figure 2. The annual trends of publications of big data in education based on the WoS core database.

The Distribution of Influential Institution on EBD

We used CiteSpace to obtain the knowledge map of the institution co-occurrence network (Figure 3). The results show that the network is densely distributed, with many connections between nodes, indicating that scholars studying big data in education cooperate closely and that most of the research is collaborative.

Figure 3. Institution co-occurrence network of big data in education.

Further, to analyze the prominent institutions in the domain, we selected the network summary table in CiteSpace and obtained the top results (see Table 1). The Open University (OU) has the greatest number of publications, with 41 papers in the field of educational big data. Unlike traditional face-to-face models, OU attaches great importance to e-learning for the purpose of flexibility; Lara et al. (2014) from OU applied educational data mining and other major strengths to meet the challenge of the spatial and temporal gap between students and teachers. Monash University (Australia) is in second position with 36 papers, followed by the University of Edinburgh (33), the University of Sydney (22), and the Carlos III University of Madrid (22). Among the ten organizations listed, each has fewer than 50 publications, which indicates that within education, big data is still a niche topic.

Table 1. Different analytical tools in data processing. TLS, total link strength; APY, average publication year; IF, impact factor.

The Analysis of Citation and H-Index

In order to achieve higher impact on the scientific community, scholars often seek to publish their findings in certain high-impact journals (Bhandari et al., 2007), and the number of citations has become the main indicator for assessing the quality of a paper (Tahamtan et al., 2016). Web of Science has its own analysis tools to create a citation report reflecting citations to source items. For the Core Collection between 2006 and 2021, the total number of citations is 15,944 (see Figure 4), or 12,277 without self-citations.

Figure 4. Sum of times cited each year on Web of Science.

Hirsch (2005) originally coined the h-index to assess an individual's academic achievement: a researcher has index h if h of their papers have at least h citations each and the remaining papers have fewer than h citations each. From the Web of Science citation report, the h-index of the retrieved results is 51 and the average number of citations per item is 8.93. Although the integration of big data with education is a relatively new topic compared with medicine and information science, these citation and h-index findings show that the field has attracted great attention.
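Hirsch's definition translates directly into code. The following minimal sketch computes the h-index of a publication list; the citation counts are invented for illustration and are not drawn from the reviewed corpus.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    # Walk down the ranked list while the (h+1)-th paper still has h+1 citations.
    while h < len(ranked) and ranked[h] >= h + 1:
        h += 1
    return h

print(h_index([10, 8, 5, 4, 3]))  # → 4 (four papers with ≥4 citations each)
print(h_index([25, 8, 5, 3, 3]))  # → 3
print(h_index([]))                # → 0
```

Note that one very highly cited paper barely moves the index, which is why the h-index is usually read as a measure of sustained rather than one-off impact.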

Keywords Analysis of EBD

This section provides the scientific landscape of keywords in educational big data. The keyword co-occurrence network map, density visualization map, and timeline map are exhibited using the VOSviewer bibliometric software; in addition, with the help of CiteSpace, the table of citation bursts is displayed.

Keywords Co-occurrence Network

How academic knowledge is stored and evolves over time is an intriguing question. New ideas and findings cannot be kept separate from existing principles and concepts (Palvia et al., 2002; Oh et al., 2005). The structure of knowledge and its variations are interrelated within the social community, which makes a network perspective valuable in this study. For convenience and effectiveness, keywords serve as indicators of the significance of research topics (Choi et al., 2011). Therefore, analysis of the keyword co-occurrence network can, to some extent, report the research hotspots and future trends of certain realms.

After importing the network data into VOSviewer, 5,229 keywords were obtained. The threshold of minimum occurrences was then set to 15, and the keywords with the greatest total link strength were selected to create a network visualization map (see Figure 5 ). According to the manual of VOSviewer 1.6.16, the size of a node represents the occurrences and weight of a keyword: the larger the circle, the larger the weight. The distance between two keywords reflects the intensity of their relation: the shorter the distance, the stronger their relatedness. Moreover, nodes of the same color belong to the same cluster (Van Eck and Waltman, 2010 ). From the distribution of keywords in the map, the biggest node is clearly "learning analytics," which appears 630 times, followed by "big data" (203), "education" (177), and "educational data mining" (160).

Figure 5. Keywords network visualization of big data in education.

Further, the whole network occurrence map can be divided into five clusters, each representing a distinct branch of big data in education. Specifically, in the red cluster (cluster 1, 31 items), keywords such as Educational Data Mining, Data Mining, Model, Machine Learning, MOOC (Massive Open Online Course), E-Learning, Learning Management Systems , etc., focus on the research and application of data mining in education. The green cluster (cluster 2, 30 items) includes keywords such as Learning Analytics, Big Data, Education, Analytics, Design, Tools, System, Ethics, University , etc., which concern learning analytics. The blue cluster (cluster 3) consists of 26 keywords, including Efficacy, Motivation, Online Learning, Belief, Support, Self-Regulated Learning, Student Engagement, Environment , etc., and unveils the importance and potential of psychological factors in language teaching and, more specifically, teachers' development. Next, in the yellow cluster (cluster 4, 17 items), keywords like Engagement, Pattern, Social Network Analysis, Participant, Blended Learning, Learning Design , etc., share the common theme of learning environments and patterns. There are four items in the last, purple cluster (cluster 5), Facebook, Networks, Science , and Social Media , which point to the sources of educational big data.

More specifically, the top 10 keywords with their occurrences, links, and total link strength are displayed in Table 2 . Link strength and total link strength are two further indexes that quantify the relatedness of keywords (Pinto et al., 2014 ). The former refers to the frequency of co-occurrence of a pair of keywords; the latter is the sum of the link strengths over all of a keyword's links. As Table 2 shows, apart from big data and education, keywords such as learning analytics, data mining, students, online, and performance co-occur frequently.
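These two indexes can be made concrete with a small sketch. Assuming each paper is represented by its list of author keywords (a simplification of VOSviewer's actual counting), link strength counts the papers in which a pair of keywords co-occurs, and total link strength sums a keyword's link strengths over all of its links:

```python
from collections import Counter
from itertools import combinations

def keyword_links(keyword_lists):
    """Count keyword occurrences and pairwise link strengths across papers.
    link_strength[(a, b)] = number of papers in which a and b co-occur."""
    occurrences = Counter()
    link_strength = Counter()
    for keywords in keyword_lists:
        keywords = sorted(set(keywords))  # ignore duplicates within a paper
        occurrences.update(keywords)
        link_strength.update(combinations(keywords, 2))
    return occurrences, link_strength

def total_link_strength(link_strength, keyword):
    """Sum of the link strengths over all links involving the keyword."""
    return sum(s for pair, s in link_strength.items() if keyword in pair)

# Three hypothetical papers and their keyword lists:
papers = [
    ["big data", "education"],
    ["big data", "learning analytics"],
    ["big data", "education", "learning analytics"],
]
occ, links = keyword_links(papers)
print(occ["big data"])                         # -> 3
print(links[("big data", "education")])        # -> 2
print(total_link_strength(links, "big data"))  # -> 4
```

In VOSviewer's map, node size corresponds to `occurrences` and the ranking in Table 2 to `total_link_strength`.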

The top institutes with big data in education publications.

VOSviewer can also export a density visualization map (see Figure 6 ). According to Van Eck and Waltman ( 2011 ), keywords are placed in the same way as in the network visualization, and each point is colored to indicate the density of keywords at that location. By default, blue, green, and yellow are the three main colors showing the distribution of density: the larger the number of items in the neighborhood of a point, and the more frequently those keywords appear, the closer the color is to yellow, and vice versa. The density visualization map shows that learning analytics, big data, education, educational data mining, and students have the most crucial impact in the field of educational big data.

Figure 6. Keywords density visualization of big data in education.

Keywords Timeline View of EBD

The software CiteSpace offers keyword analysis that unveils cutting-edge research by presenting the content and distribution of topics over time (Chen, 2006 ). Based on the keyword co-occurrence map, the authors selected the "time zone" option in the control panel to obtain the output ( Figure 7 ).

Figure 7. Keywords timeline view based on CiteSpace.

From the output of CiteSpace, Table 3 lists the main keywords that appeared during the different periods.

The details of the top 10 keywords by occurrence.

From the distribution of keyword nodes, three distinct time zones can be identified between 2006 and 2021. For convenience, the authors divide the timeline into three periods, namely the "incubation period," the "boom period," and a "new stage of incubation."

In the incubation period (2006–2011), studies on educational big data were few, and the resulting products were relatively immature. The keywords of this period focused on "data mining," "algorithm," "computer," and "educational environment." Recognizing the potential of big data, scholars in education made great efforts to apply technology to utilize and analyze massive amounts of valuable data in powerful ways, adapting themselves to the era of big data. The research in this period paved the way for further investigation of big data in education. For instance, in 2008, Romero, Ventura, and García surveyed applications of data mining tools in learning management systems and introduced them to potential administrators, opening the door to educational data mining.

After 2011, the field enjoyed a period of prosperity (2012–2016). More and more eminent scholars and experts treated data from educational contexts as a valuable means of tracing students' performance, and big data became a research focus in education. Publications began to accumulate, and keywords such as "learning analytics," "classroom," "blended learning," and "self-regulated learning" pointed the way toward the integration of big data and education. In their paper Translating Learning into Numbers: A Generic Framework for Learning Analytics , Greller and Drachsler ( 2012 ) investigated the main dimensions of learning analytics in order to design a practical framework supporting educational implementation and teaching efficiency.

After 2016, the investigation entered a new round of incubation. In this period, keywords such as "data analytics," "challenge," "data privacy," and "course design" account for the main proportion, and the products of educational big data are gradually maturing. Moreover, the keyword timeline map makes clear that sparks of new ideas are about to be kindled in two main directions: one concerns data analytics and the possibilities it opens for implementation, the other concerns individual privacy. On March 19, 2021, the State Council Information Office (SCIO) held a press conference in Beijing on the fourth Digital China Summit. Yang, vice minister of the Cyberspace Administration of China, stressed the great importance attached to data security and personal information protection, which underpins the practicability of nationwide network-security enforcement (Yang, 2021 ). Coordinated efforts across policy, law, and supervision have been made to formulate national laws that protect data security and personal privacy at the legal level.

The focus of the research has shifted from technology-based investigations and practices to curriculum design and the subjects themselves (the psychological dimension of teachers' professional development, or "growing up"). Early research paved the way for the analysis of educational practice in the school climate; then, as the educational-technology foundation developed and matured, the main focus of study moved to individual subjects, especially psychological factors. Recently, the combination of principles from cognitive psychology and education has shed light on previously murky psychological questions, such as students' attentiveness, teachers' agentic involvement, and the prevalence of mind wandering in educational settings (Szpunar et al., 2013 ).

Keyword Citation Bursts

The trail of scientific development can be traced through the keywords of research works (Yang et al., 2020 ). Keywords exhibiting transitions can unveil implicit information about trends, and the analysis of citation bursts singles out keywords that received special attention from the related community within a certain period of time (Su and Lee, 2010 ; Chen et al., 2019 ; Su et al., 2019 ). Using CiteSpace's burst detection sorted by burst start time, this paper produces the top 13 keywords with the strongest citation bursts (see Figure 8 ).
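CiteSpace implements Kleinberg's burst-detection algorithm; as a much simpler stand-in, one can flag the years in which a keyword's yearly frequency rises well above its long-run average. The sketch below is a crude illustration of the idea only, not CiteSpace's actual method, and the counts are hypothetical:

```python
def burst_years(yearly_counts, ratio=2.0):
    """Flag years whose frequency is at least `ratio` times the keyword's
    mean yearly frequency -- a crude proxy for a citation burst."""
    mean = sum(yearly_counts.values()) / len(yearly_counts)
    return sorted(y for y, n in yearly_counts.items() if n >= ratio * mean)

# A keyword that spikes in 2008 relative to its 2006-2010 baseline:
counts = {2006: 1, 2007: 1, 2008: 6, 2009: 1, 2010: 1}
print(burst_years(counts))  # -> [2008]
```

Kleinberg's algorithm additionally models burst duration as a state sequence, which is why Figure 8 reports a start year and length for each burst rather than isolated spike years.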

Figure 8. Results of the strongest citation bursts in educational big data.

The red segment in the figure marks the years during which each citation burst occurred. As Figure 8 shows, the keyword "data mining," whose burst started in 2006, was the first to appear in research on educational big data, together with "e-learning" and "algorithm," whose bursts lasted the longest. The figure charts the dynamic transition from 2011 onward through keywords such as "system," "education," "pattern," "number," and "network." Like Ozga's (Ozga, 2009 ) work on the prominent role of data in benchmarking, performance criteria, and monitoring within education, Grek and Ozga's (Grek and Ozga, 2010 ) investigation of the European educational environment pictured the inseparable relations between data and the education landscape, using data systems to track policy problems and develop policy solutions (Lingard et al., 2012 ). What is more, the Actor Network Approach is another policy-related perspective, concentrating on assemblages of human and non-human materials within any educational environment. Hence, the ordering of the keywords conveys a trend flowing from individual performance evaluation to educational policy influence beyond the national scale, driven by the "policy as numbers" phenomenon and neoliberalism (Selwyn, 2015 ).

Co-authorship Visualization Analysis

Academic research demands laborious engagement and investment, which often makes it difficult for a single individual to accomplish a research project alone. Co-authorship has accordingly been used as an index of research collaboration by science policy academics and evaluators (Bond et al., 2019 ).

In this section, VOSviewer was applied to investigate the collaborative patterns of authors, countries, and institutions in big data in education. After loading the 1,708 items into the software, the authors chose the unit "authors" under the analysis type "co-authorship" and obtained the cooperative network of authors in the field of educational big data ( Figure 9 ). Of the 1,708 papers published between 2006 and 2021 by 4,380 authors, 633 authors (14.45%) were credited on two publications, 246 (5.62%) on three, and 126 (2.88%) on at least four. To examine the prominent authors in the realm of educational big data, the authors set the threshold at three publications. However, 124 of the resulting items had no connections to other authors, leaving 122 items to be analyzed in the network.

Figure 9. Authors' cooperative network in the realm of big data in education. (A) Network visualization map based on link weights; (B) Overlay visualization map based on link weights.

According to the manual of VOSviewer, the lines between contributors unveil collaboration links, and the colors in the map represent distinct clusters within the domain of educational big data. For instance, in Figure 9 , authors such as "Yin, Cheng Jiu," "Shimada, Atsushi," "Ogata, Hiroaki," "Chu, Hui Chun," and "Hwang, Gwo Jen" are grouped in cluster 2 and highlighted in green.

In Figure 9B , the gradient colors disclose an interesting trend in the cooperation among contributors, moving from single authorship toward collaborative work. The most productive authors, "Gasevic, Dragan," "Rienties, Bart," "Dawson, Shane," and "Williamson, Ben" (in descending order), had no recent contributions, whereas authors such as "Broos, Tom" and "Gentili, Sheridan" have published recently. For instance, in April 2020, Broos, recognizing the potential of learning analytics (LA), proposed a coordination model to support fruitful interaction between LA policymaking and implementation in Latin America (Broos et al., 2020 ), which may guide future LA initiatives. Later, in June, Broos and colleagues conducted an empirical investigation of learning analytics to improve academic support in Latin America (Guerra et al., 2020 ).

To make Figure 9 statistically more reliable, Table 4 lists the most document-productive authors. The average publication year shows that the top 10 contributors published after 2016. Combined with the annual trend of publications in Figure 2 , the numbers in the table also indicate that the domain of educational big data maintains vigorous growth.

The keywords shown in the different periods.

Countries/Regions Analysis

Co-authorship Analysis of Geography

VOSviewer provides powerful functions to visualize country co-authorship bibliometrically. Setting the minimum number of documents at 10, 40 of the total 88 countries met the threshold (see Figure 10 ; links 302, total link strength 822).

Figure 10. Co-authorship analysis of countries/regions. (A) Network visualization map based on document weights; (B) Overlay visualization based on document weights; (C) Density visualization based on document weights.

In Figure 10A , the size of a node shows the number of documents, and different colors represent distinct scientific camps; there are seven clusters in total. For instance, the USA, Canada, Singapore, South Korea, Brazil, and Iran fall in the same cluster and share the same research direction. Lines between two nodes indicate link strength and cooperative relatedness: the link strength between the USA and China is 13, between the USA and Canada 40, and between England and Germany 7. This indicates that cooperation does not rest on geographical proximity alone. The overlay visualization map (Figure 10B) is identical to Figure 10A except for its colors, with a color bar in the bottom right corner running from blue through green to yellow; the bar shows how document output changed geographically across different periods. From the map, Ecuador, Chile, and Thailand contributed most recently, while highly productive countries such as the USA, Canada, and Australia have lately kept a relatively low profile in educational big data. Figure 10C , the density visualization network, shows that the USA, Austria, England, China, Spain, Germany, and the Netherlands are the pioneers and leaders in cooperation in the domain of educational big data.

Citation Analysis of Geography

Apart from the co-authorship analysis above, VOSviewer can also track geographical data by citation. Co-citation refers to the relatedness of two contributors whose literature is cited simultaneously by a third author (Zupic and Cater, 2015 ). Setting the threshold at 10, the authors obtained 40 of 88 countries in the co-citation visualization map (see Figure 11 ). The reference and journal co-citation analyses are presented in later sections.

Figure 11. Citation analysis of geography. (A) Network visualization map based on citation weights; (B) Density visualization based on citation weights.

As above, the size of a node represents the number of documents previously co-cited, and the distance between two nodes reflects their scientific relations.

Australia, England, Canada, and the USA kept strong cooperative links, while China, Turkey, Finland, and Japan collaborated only weakly with others; it would be wise for the latter to conduct more scientific work with other countries in the future. The density visualization network identifies the main countries as the USA, Australia, China, England, Canada, and Spain. Compared with Figure 10C , countries with strong co-authorship relations generally also hold large co-citation intensity. More specifically, the results of the geographical analysis are detailed in Figure 12 .

Weight of documents and citations of regions.

Range of different regions with different thresholds.

Institutions Analysis

Before using VOSviewer to map the institution network, note that different thresholds for the minimum number of documents generate distinct results ( Figure 12 ). In Figure 12 , there is a sudden drop between 1 and 5 documents, which suggests that the majority of scholars have only one or two publications even though the topic has attracted great attention. In other words, for future researchers in this domain, the investigation of big data in education still has a long way to go.

Table 1 lists the top 10 organizations. To complement the tabular results, Figure 13 presents the institution co-authorship network produced by VOSviewer in both network and density visualizations. From the network map, the authors conclude that institutions generally show a strong propensity for cooperation on educational big data and maintain tight academic relatedness. Figure 13B shows the density map based on total link weights. Judging from the colors, ranging from blue through green to yellow, collaboration among organizations in Europe, Oceania, and North America was much stronger than among Asian institutions; examples include Monash Univ (Australia, Oceania), Open Univ (England, Europe), Univ British Columbia (Canada, North America), Stanford Univ (USA, North America), and Univ Edinburgh (Scotland, Europe).

Figure 13. Co-authorship analysis of research organizations. (A) Network visualization map based on document weights; (B) Density visualization based on total link weights.

Co-citation Visualization Analysis of Reference

Co-citation describes the cooperative relatedness of two papers that are cited together by a third paper (Boyack and Klavans, 2010 ), and it is another index for surveying relevant literature bibliometrically. Unlike citation analysis, which focuses on the quality of subjects (documents, sources, authors, organizations, countries, etc.), co-citation can more scientifically illustrate the collaborative pattern of research themes. Of the 55,947 cited references, the authors set a minimum citation number to obtain the results ( Figure 14 ), with the authors' information detailed in Table 5 . In the map, the biggest node is Ferguson (2012), a theoretical contribution on learning analytics, its drivers, developments, and challenges, published in Int. J. Technology Enhanced Learning , which stressed the relatedness of learning analytics, academic analytics, and educational data mining.
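Co-citation counting mirrors keyword co-occurrence: two references are linked each time the same paper's reference list contains both. A minimal sketch, assuming each citing paper is represented by a list of cited-reference identifiers (the identifiers below are hypothetical):

```python
from collections import Counter
from itertools import combinations

def cocitations(reference_lists):
    """Count how often each pair of references is cited by the same paper."""
    pairs = Counter()
    for refs in reference_lists:
        # Sorting gives each pair a canonical order; set() drops duplicates.
        pairs.update(combinations(sorted(set(refs)), 2))
    return pairs

citing_papers = [
    ["Ferguson2012", "Siemens2013"],
    ["Ferguson2012", "Siemens2013", "Romero2008"],
    ["Ferguson2012", "Romero2008"],
]
cc = cocitations(citing_papers)
print(cc[("Ferguson2012", "Siemens2013")])  # -> 2
print(cc[("Ferguson2012", "Romero2008")])   # -> 2
```

Node size in Figure 14 corresponds to a reference's total co-citation weight, which is why the most frequently co-cited work (Ferguson, 2012) dominates the map.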

Figure 14. Network visualization map of cited references [items 41, links 762, total link strength 4,936].

The top 10 strongest co-authorships.

Co-citation Visualization Analysis of Journal

The authors set the minimum number of citations at 30 and visualized 289 of the 25,188 source journals in Figure 15 . The size of each node shows the number of citations and the contribution of that journal, and the distance between two nodes represents link strength and citation relatedness. The distribution of the nodes also shows that the different aspects of educational big data remain tightly connected; in other words, the successful implementation of big data in education cannot be separated from the application of computer science and data analysis. Figure 15 contains eight clusters, each representing a distinct research subject. "Computers & Education," "Lecture Notes in Computer Science," and "Expert Systems with Applications" belong to cluster 1, representing computer applications. Cluster 2 covers educational psychology, including the "Journal of Educational Psychology," "Educational Psychologist," and "Educational Psychology Review." "Educational Technology & Society," "ETR&D-Educational Technology Research and Development," "American Behavioral Scientist," and others in cluster 3 underpin the importance of educational technology.

Figure 15. Journal co-citation network visualization map.

Table 6 lists the top 10 most influential journals in the field of educational big data; Computers & Education ranks first by citations. For instance, Kizilcec et al. ( 2017 ) published a study of self-regulated learning (SRL) strategies in Computers & Education that praised digital learning environments for yielding rich, large-scale educational data for scientific research, and pointed out several promising directions for data analysis in education, namely the development of predictive models, feedback systems, and interventions with respect to SRL strategies. In their words, whoever addresses these challenges will raise current research to a higher level and generate strategies to support teachers and teaching.

Top five most co-cited papers.

Conclusion and Implications

This research review adopted a bibliometric method to document and analyze the WOS database over the past 15 years. Using science mapping analysis, the authors tracked 1,708 articles from 2006 to 2021. This concluding section interprets the review and offers implications for future research.

Interpretation of Results

The annual growth trajectory of publications shows that the number of papers fluctuated at a low level during the initial period between 2006 and 2014. After the critical period from 2012 to 2014, however, the trend showed exponential growth, probably because of governments' initiatives to exploit the potential of big data, for example the 2014 White House Big Data Report by the Office of the Press Secretary and the 2012 official report by the US Office of Educational Technology, Department of Education (Eynon, 2013 ). Apart from publication growth, the analysis of the annual trend of citations revealed a similar growth path. It is fair to state that big data in education receives, and merits, great attention from educational policymakers, administrators, and educators.

The analysis of institutions showed that collaboration among organizations in Europe, Oceania, and North America was stronger than among Asian institutions. However, the results also revealed that most research institutions lack output in quantity, which is why low thresholds for the minimum number of documents were needed. Sufficient scientific investigation of educational big data is still to come.

The topographical analysis of research papers and institutions from the WOS database uncovered a global geographic distribution with great contributions from the USA, Austria, England, China, Spain, Germany, and the Netherlands, which also cooperate closely in the field. Optimistically, this regional imbalance is being redressed by emerging countries in Asia (Thailand) and South America (Ecuador and Chile). Still, in some regions of Africa, owing to the economic situation and scarce opportunities for technology-assisted teaching or learning, there is little or no academic research in this realm, partly for lack of internationally accessible knowledge about cutting-edge technology and the application of educational learning analytics.

The co-citation analysis also identified the potential of integrating big data into educational practice. The journal analysis points to computer and information science, technology analysis, ubiquitous networks, and robust online education as the foundations that have transformed and improved how education itself functions; the resulting changes in educational settings have been rapid and drastic. In the era of big data, educators and educational administrators need to seize the initiative to extract and analyze data for predicting and improving students' performance. This review also identified the core authors who have made groundbreaking, fundamental contributions to the field, for instance Gasevic, Rienties, and Pardo.

Another contribution of this systematic review concerns the distribution of keywords and their timeline. The visualizations show that data mining, learning analytics, learning environments and psychology, educational applications, sources of educational big data, and users' (students') privacy are central to educational big data research. More specifically, the evolution highlights the shift from data mining to learning analytics to data analytics. Furthermore, beyond data-related analysis, researchers have also turned their attention from educational technology to individual educational psychology, specifically the psychological impacts on education, schoolteachers, students, school climate, and even society. Interestingly, the application of educational technology inevitably raises ethical issues, particularly privacy, which need to be considered carefully and properly (Eynon, 2013 ).

To sum up, data mining, learning analytics, and algorithms highlight the shift toward data analytics and educational psychology. Further, ethics, especially privacy, provides new perspectives from which to rethink the potential of big data in educational activities and practices. As language program administrators and language teachers direct their efforts toward applying big data technology in educational contexts, the psychological impacts on the participants in school teaching should not be overlooked.

Implications of the Results

The rise of modern technology and up-to-date social media contributes to information overload, a major source of stress in educational practice that may in turn cause serious psychological and mental disorders. It is high time to raise public awareness of these psychological problems and to bridge this information gap among teachers. Several implications follow from the findings of the visualization maps.

Objectivity as Criterion in Data-Driven Educational Policy and Technology-Based Educational Growth

New data-based technologies herald an era of objectivity in educational data application and scientific policy governance (Williamson and Piattoeva, 2019 ). Data play a central and vital role in the implementation of educational policy at local, national, and global scales. Several works have shown the close interrelatedness between the collection, circulation, and analysis of educational digital data and the dynamic sociotechnical networks of humans, technologies, and policies, providing new perspectives for evaluating education (Piattoeva, 2015 ; Hartong, 2016 ; Sellar, 2017 ). Meanwhile, psychological and behavioral insights have entered data-driven educational policy. Data cannot be treated as something entirely unified or sequential; instead they should be considered as discourses and practices (Graham and Shelton, 2013 ). Big data faces a dilemma in which individual privacy and information circulation cannot advance synergistically and simultaneously, and the challenges of data privacy and ethics remain unresolved. Some scholars have also stressed the need to use a politics-of-data perspective to rethink education in the era of big data (Halford et al., 2013 ). It is critical to make the social structure of big data visible, rather than treating it as neutral fact.

New Perspectives to Regard and Practice Data in Educational Settings

Statistically, samples comprising large portions of a population yield volumes of data that can be gathered to assess the effectiveness of school reform and its associated effects on teaching development. Despite the growing value of big data to education, many academic institutions are slow to implement big data projects (Macfadyen, 2017 ). Moreover, international collaboration on educational big data has suffered from great geographic imbalance over the past 10 years. Hence, we assert the necessity of geographical diversity in educational research. International educational administrators, educational philosophers, national policymakers, school educators, educational institutions, and researchers, especially the currently inactive participants, need to recognize the possibilities of introducing different technologies to extract and process information that underpins and improves student learning. However, in its most negative forms, the educational technology that generates big data in school practice contributes to teacher stress and anxiety disorders, while in less aggressive forms the application of technology can support adaptation and innovation under development-oriented conditions. Therefore, in step with curriculum innovation and the technology-oriented trend, educational practitioners, especially school educators, need to exercise their agency more actively, treating negative psychological conditions (anxiety, pressure, depression, fear, etc.) as signals of lacking skills and positive psychological states (confidence, passion, pride, trust, etc.) as indexes of competence (Doménech-Betoret et al., 2017 ).

(1) Realize the Importance of Teaching-Learning Inter-Relatedness Between Learners and Teachers

School life and relations with instructors significantly affect students' academic achievement. This close interpersonal relationship should be valued and promoted, for instance by showing empathy and respect and by being technologically available and psychologically approachable in discipline-coached proceedings, which also satisfies students' psychological need for achievement.

(2) Accomplish the Mission of Students' Inspection of Self-Ability

The big data era has challenged students and placed them in a dilemma of reliance on technical assistance; they need to re-perceive what skills they gained from their former education and what they have achieved after evaluative feedback and guidance. This self-recognition of capacity also relates to students' psychological need for achievement.

Teachers' Role in Shaping Educational Development in the Big Data Era

Through statistical surveys, pedagogues can readily identify the learning patterns and scientific rules of language learning, which can be used to improve pedagogy and teaching effectiveness. Beyond their own teaching experience, educators can also employ various educational tools (such as learning management systems, intelligent tutoring systems, e-books, MOOCs, etc.) in their teaching and educational contexts (such as blended learning, flipped learning, or distance learning in math, language, or programming courses) to meet the demands of professional development.

The COVID-19 outbreak, which disrupted the traditional face-to-face transmission of school knowledge, and recent educational reforms, which altered established schooling practice, have placed far greater challenges and pressure on educators and administrators. Educators, especially school-level decision makers, are called on to exercise their agency and develop professionally in step with the era of big data. The pandemic pushed teachers to learn and master methods for introducing educational technology into daily teaching practice, which burdened them to some extent: they had to apply data mining techniques to extract information from discussion participation and use data analytics to monitor students' learning states. These unexpected demands may cause teacher anxiety or stress and, ultimately, burnout, which calls for teachers to become more resilient and able to turn negative factors into positive outcomes. Teachers have long been regarded as central to shaping students' school experience and as a constant influence on students' knowledge-acquisition skills and well-being. The ripple effects of teachers' assessment practices, interactions with students, and psychological influence on pupils' growth are difficult to capture, especially in times of educational change. Previous studies have focused on psychological interventions for students, but students' passivity within their environment has shown such interventions to be insufficient. Teacher psychology, by contrast, serves the interests of educational development in the big data era more powerfully.

Teachers, as active and dynamic participants, together with the school environment and students, play crucial roles in constituting school climate. Regarding the critical role of the teacher in research, interactive ethnography paves the way for research design (Edwards, 2015), and discursive psychology is commonly nominated as an effective model for examining how agency is performed within a dynamic school climate. Psychologically, teacher agency is vital to the successful implementation of educational innovation in teaching practice (Tao and Gao, 2017). During educational development, educators are active, vigorous, and agentic contributors, and the realization of positive educational growth is closely tied to teacher agency. Consequently, compared with past practice, it would be helpful to give teachers more opportunities to develop their agentic abilities. Positive agency and the teacher satisfaction that flows from a pleasant school climate can, in turn, drive teacher development over the course of a career. Moreover, the growing prevalence of depression in education and daily life, the lack of productive cooperation at work, and stagnant job satisfaction all point to a pressing need for synergy between positive emotion and education, whether in school or lifelong. Put bluntly, positive education, aimed not only at knowledge and skill acquisition but also at a sense of happiness, can increase resilience, positive engagement, and personal accomplishment, and is thereby a direct route to educational development (the relations are shown in Figure 16).

Figure 16. Relations of teacher development.

Before elaborating the variables that shape teacher agency, it should be noted that many scholars have conducted empirical studies demonstrating the pivotal role of agency amid change and constraint (e.g., Yang and Clarke, 2018; Yang and Markauskaite, 2021). Given its contribution to students' academic involvement and teachers' professional advancement, teacher agency has been shaped by the affordances and constraints of educational change under the influence of educational big data. In practice, most teachers enact agency by reflecting on and adopting preferred teaching modes that blend traditional and emerging approaches, which brings a bundle of challenges but also opportunities (Bryson and Andres, 2020). In terms of enactment, teachers' instruction, the knowledge that guides their teaching behaviors, their epistemology, and their autonomy together shape teacher agency in ways that promote teaching quality and self-development and merit deeper exploration (Maclellan, 2017). Therefore, given teachers' professional, agentic responses to change in the era of big data, educational stakeholders are expected to prioritize teachers' digital competence and agentic participation in teaching practice, and to attend to the flexible interaction between agentic teaching practices and dynamic contextual resources (Gong et al., 2021).

The Necessity of a Bibliometric Approach

For those who have reviewed this field using traditional meta-analysis, we recommend the complementary value of a bibliometric approach. Science mapping extracts larger patterns, in visual form, from piles of disordered and voluminous literature, and it can untangle distinctive features of contributors (authorship, co-citation, co-referencing, etc.). We therefore advocate the use of bibliometric methods to display research trends and foci vividly.
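To make the bibliometric step concrete, the sketch below counts keyword co-occurrences, the raw material from which science mapping tools such as VOSviewer (Van Eck and Waltman, 2010) and CiteSpace (Chen, 2006) build their network visualizations. The article keyword sets here are invented for illustration.

```python
# A minimal sketch of keyword co-occurrence counting for science mapping.
# Each set stands in for the author keywords of one article (invented data).
from itertools import combinations
from collections import Counter

articles = [
    {"big data", "learning analytics", "higher education"},
    {"big data", "learning analytics", "MOOC"},
    {"learning analytics", "higher education"},
]

# Count how often each unordered keyword pair appears in the same article;
# the counts form the weighted edges of a co-occurrence network.
edges = Counter()
for keywords in articles:
    edges.update(combinations(sorted(keywords), 2))

print(edges.most_common(3))
```

A mapping tool would then lay out the keywords as nodes, with edge weights given by these counts, and cluster them to reveal research foci.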

Limitations

Although science mapping complements traditional bibliometric models, it cannot fully replace review methodologies that have contributed greatly to quality assessment. Our research strategy applied temporal analysis to particular databases to reveal how the field has evolved; inevitably, some specific research questions remain unanswered. Another limitation concerns the data sources: we examined the Web of Science Core Collection and included only articles in English, so we have not covered the entire literature in the field.

Data Availability Statement

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

This paper is supported by the research project "A Practical Study of the Instructional Mode of Integrating Reading and Writing Based on Thematic Reading," sponsored by the China English Reading Academy and the Foreign Language Teaching and Research Press (grant number CERA1351210), and by the 2021 Teaching Research and Educational Reform Project sponsored by Shanghai Normal University.

References

  • Arnold K. E., Pistilli M. D. (2012). Course signals at Purdue: using learning analytics to increase student success, in Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, 267–270. 10.1145/2330601.2330666
  • Bates D. W., Saria S., Ohno-Machado L., Shah A., Escobar G. (2014). Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Affairs 33, 1123–1131. 10.1377/hlthaff.2014.0041
  • Bhandari M., Busse J., Devereaux P. J., Montori V. M., Swiontkowski M., Tornetta III P., et al. (2007). Factors associated with citation rates in the orthopedic literature. Canad. J. Surg. 50:119.
  • Bond M., Zawacki-Richter O., Nichols M. (2019). Revisiting five decades of educational technology research: a content and authorship analysis of the British Journal of Educational Technology. Br. J. Educ. Technol. 50, 12–63. 10.1111/bjet.12730
  • Boyack K. W., Klavans R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: which citation approach represents the research front most accurately? J. Am. Soc. Inf. Sci. Technol. 61, 2389–2404. 10.1002/asi.21419
  • Boyd D., Crawford K. (2012). Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inf. Commun. Soc. 15, 662–679. 10.1080/1369118X.2012.678878
  • Broos T., Hilliger I., Pérez-Sanagustín M., Htun N. N., Millecamp M., Pesántez-Cabrera P., et al. (2020). Coordinating learning analytics policymaking and implementation at scale. Br. J. Educ. Technol. 51, 938–954. 10.1111/bjet.12934
  • Bryson J. R., Andres L. (2020). Covid-19 and rapid adoption and improvisation of online teaching: curating resources for extensive versus intensive online learning experiences. J. Geogr. High. Educ. 44, 608–623. 10.1080/03098265.2020.1807478
  • Chen C. (2006). CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J. Am. Soc. Inf. Sci. Technol. 57, 359–377. 10.1002/asi.20317
  • Chen C., Song M. (2019). Visualizing a field of research: a methodology of systematic scientometric reviews. PLoS ONE 14:e0223994. 10.1371/journal.pone.0223994
  • Chen J., Meng S., Zhou W. (2019). The exploration of fuzzy linguistic research: a scientometric review based on CiteSpace. J. Intell. Fuzzy Syst. 37, 3655–3669. 10.3233/JIFS-182737
  • Choi J., Yi S., Lee K. C. (2011). Analysis of keyword networks in MIS research and implications for predicting knowledge evolution. Inf. Manag. 48, 371–381. 10.1016/j.im.2011.09.004
  • Connolly T. M., Boyle E. A., MacArthur E., Hainey T., Boyle J. M. (2012). A systematic literature review of empirical evidence on computer games and serious games. Comput. Educ. 59, 661–686. 10.1016/j.compedu.2012.03.004
  • Daniel B. (2015). Big data and analytics in higher education: opportunities and challenges. Br. J. Educ. Technol. 46, 904–920. 10.1111/bjet.12230
  • Demchenko Y., Grosso P., De Laat C., Membrey P. (2013). Addressing big data issues in scientific data infrastructure, in 2013 International Conference on Collaboration Technologies and Systems (CTS) (San Diego, CA), 48–55. 10.1109/CTS.2013.6567203
  • Doménech-Betoret F., Abellán-Roselló L., Gómez-Artiga A. (2017). Self-efficacy, satisfaction, and academic achievement: the mediator role of students' expectancy-value beliefs. Front. Psychol. 8:1193. 10.3389/fpsyg.2017.01193
  • Edwards A. (2015). Recognizing and realizing teachers' professional agency. Teach. Teach. 21, 779–784. 10.1080/13540602.2015.1044333
  • Eynon R. (2013). The rise of big data: what does it mean for education, technology, and media research? Learn. Media Technol. 38, 237–240. 10.1080/17439884.2013.771783
  • Ferguson R. (2012). Learning analytics: drivers, developments and challenges. Int. J. Technol. Enhanc. Learn. 4, 304–317. 10.1504/IJTEL.2012.051816
  • Frizzo-Barker J., Chow-White P. A., Mozafari M., Ha D. (2016). An empirical study of the rise of big data in business scholarship. Int. J. Inf. Manag. 36, 403–413. 10.1016/j.ijinfomgt.2016.01.006
  • Gašević D., Dawson S., Siemens G. (2015). Let's not forget: learning analytics are about learning. TechTrends 59, 64–71. 10.1007/s11528-014-0822-x
  • Gong Y., Fan C. W., Wang C. (2021). Teacher agency in adapting to online teaching during COVID-19: a case study on teachers of Chinese as an additional language in Macau. J. Technol. Chin. Lang. Teach. 12, 82–101.
  • Graham M., Shelton T. (2013). Geography and the future of big data, big data and the future of geography. Dialogues Hum. Geogr. 3, 255–261. 10.1177/2043820613513121
  • Greer J., Mark M. (2016). Evaluation methods for intelligent tutoring systems revisited. Int. J. Artif. Intell. Educ. 26, 387–392. 10.1007/s40593-015-0043-2
  • Grek S., Ozga J. (2010). Governing education through data: Scotland, England and the European education policy space. Br. Educ. Res. J. 36, 937–952. 10.1080/01411920903275865
  • Greller W., Drachsler H. (2012). Translating learning into numbers: a generic framework for learning analytics. J. Educ. Technol. Soc. 15, 42–57.
  • Guerra J., Ortiz-Rojas M., Zúñiga-Prieto M. A., Scheihing E., Jiménez A., Broos T., et al. (2020). Adaptation and evaluation of a learning analytics dashboard to improve academic support at three Latin American universities. Br. J. Educ. Technol. 51, 973–1001. 10.1111/bjet.12950
  • Halford S., Pope C., Weal M. (2013). Digital futures? Sociological challenges and opportunities in the emergent semantic web. Sociology 47, 173–189. 10.1177/0038038512453798
  • Hampton S. E., Strasser C. A., Tewksbury J. J., Gram W. K., Budden A. E., Batcheller A. L., et al. (2013). Big data and the future of ecology. Front. Ecol. Environ. 11, 156–162. 10.1890/120103
  • Hartong S. (2016). Between assessments, digital technologies and big data: the growing influence of 'hidden' data mediators in education. Eur. Educ. Res. J. 15, 523–536. 10.1177/1474904116648966
  • Hirsch J. E. (2005). An index to quantify an individual's scientific research output. Proc. Natl. Acad. Sci. 102, 16569–16572. 10.1073/pnas.0507655102
  • Holland A. A. (2019). Effective principles of informal online learning design: a theory-building metasynthesis of qualitative research. Comput. Educ. 128, 214–226. 10.1016/j.compedu.2018.09.026
  • Kizilcec R. F., Pérez-Sanagustín M., Maldonado J. J. (2017). Self-regulated learning strategies predict learner behavior and goal attainment in massive open online courses. Comput. Educ. 104, 18–33. 10.1016/j.compedu.2016.10.001
  • Lara J. A., Lizcano D., Martínez M. A., Pazos J., Riera T. (2014). A system for knowledge discovery in e-learning environments within the European Higher Education Area: application to student data from Open University of Madrid, UDIMA. Comput. Educ. 72, 23–36. 10.1016/j.compedu.2013.10.009
  • Li S., Dragicevic S., Castro F. A., Sester M., Winter S., Coltekin A., et al. (2016). Geospatial big data handling theory and methods: a review and research challenges. ISPRS J. Photogramm. Remote Sens. 115, 119–133. 10.1016/j.isprsjprs.2015.10.012
  • Liao H., Tang M., Luo L., Li C., Chiclana F., Zeng X. J. (2018). A bibliometric analysis and visualization of medical big data research. Sustainability 10:166. 10.3390/su10010166
  • Lingard B., Creagh S., Vass G. (2012). Education policy as numbers: data categories and two Australian cases of misrecognition. J. Educ. Policy 27, 315–333. 10.1080/02680939.2011.605476
  • Macfadyen L. P. (2017). Overcoming barriers to educational analytics: how systems thinking and pragmatism can help. Educ. Technol. 57, 31–39. Available online at: http://www.jstor.org/stable/44430538
  • Maclellan E. (2017). Shaping agency through theorizing and practicing teaching in teacher education, in The SAGE Handbook of Research on Teacher Education, eds D. J. Clandinin and J. Husu (London: Sage Publications), 139–142. 10.4135/9781526402042.n14
  • Mayer-Schönberger V. (2016). Big data for cardiology: novel discovery? Eur. Heart J. 37, 996–1001. 10.1093/eurheartj/ehv648
  • McAfee A., Brynjolfsson E., Davenport T. H., Patil D. J., Barton D. (2012). Big data: the management revolution. Harvard Bus. Rev. 90, 60–68.
  • Mikalef P., Pappas I. O., Krogstie J., Giannakos M. (2018). Big data analytics capabilities: a systematic literature review and research agenda. Inf. Syst. e-Bus. Manag. 16, 547–578. 10.1007/s10257-017-0362-y
  • Oh W., Choi J. N., Kim K. (2005). Coauthorship dynamics and knowledge capital: the patterns of cross-disciplinary collaboration in information systems research. J. Manag. Inf. Syst. 22, 266–292. 10.2753/MIS0742-1222220309
  • Oi M., Yamada M., Okubo F., Shimada A., Ogata H. (2017). Reproducibility of findings from educational big data: a preliminary study, in Proceedings of the Seventh International Learning Analytics & Knowledge Conference (Vancouver, BC), 536–537. 10.1145/3027385.3029445
  • Ong V. K. (2015). Big data and its research implications for higher education: cases from UK higher education institutions, in 2015 IIAI 4th International Congress on Advanced Applied Informatics (Okayama), 487–491. 10.1109/IIAI-AAI.2015.178
  • Osman A. M. S. (2019). A novel big data analytics framework for smart cities. Future Gener. Comput. Syst. 91, 620–633. 10.1016/j.future.2018.06.046
  • Ozga J. (2009). Governing education through data in England: from regulation to self-evaluation. J. Educ. Policy 24, 149–162. 10.1080/02680930902733121
  • Palvia P. C., Palvia S. C. J., Whitworth J. E. (2002). Global information technology: a meta analysis of key issues. Inf. Manag. 39, 403–414. 10.1016/S0378-7206(01)00106-9
  • Perez M. M., Noortgate W., Desmet P. (2013). Captioned video for L2 listening and vocabulary learning: a meta-analysis. System 41, 720–739. 10.1016/j.system.2013.07.013
  • Piattoeva N. (2015). Elastic numbers: national examinations data as a technology of government. J. Educ. Policy 30, 316–334. 10.1080/02680939.2014.937830
  • Pinto M., Pulgarín A., Escalona M. I. (2014). Viewing information literacy concepts: a comparison of two branches of knowledge. Scientometrics 98, 2311–2329. 10.1007/s11192-013-1166-6
  • Romero C., Ventura S., García E. (2008). Data mining in course management systems: Moodle case study and tutorial. Comput. Educ. 51, 368–384. 10.1016/j.compedu.2007.05.016
  • Rose H., Briggs J. G., Boggs J. A., Sergio L., Ivanova-Slavianskaia N. (2018). A systematic review of language learner strategy research in the face of self-regulation. System 72, 151–163. 10.1016/j.system.2017.12.002
  • Saggi M. K., Jain S. (2018). A survey towards an integration of big data analytics to big insights for value-creation. Inf. Process. Manag. 54, 758–790. 10.1016/j.ipm.2018.01.010
  • Sellar S. (2017). Making network markets in education: the development of data infrastructure in Australian schooling. Global. Soc. Educ. 15, 341–351. 10.1080/14767724.2017.1330137
  • Selwyn N. (2015). Data entry: towards the critical study of digital data and education. Learn. Media Technol. 40, 64–82. 10.1080/17439884.2014.921628
  • Su H. N., Lee P. C. (2010). Mapping knowledge structure by keyword co-occurrence: a first look at journal papers in technology foresight. Scientometrics 85, 65–79. 10.1007/s11192-010-0259-8
  • Su X. W., Li X., Kang Y. X. (2019). A bibliometric analysis of research on intangible cultural heritage using CiteSpace. SAGE Open 9, 1–18. 10.1177/2158244019840119
  • Szpunar K. K., Moulton S. T., Schacter D. L. (2013). Mind wandering and education: from the classroom to online learning. Front. Psychol. 4:495. 10.3389/fpsyg.2013.00495
  • Tahamtan I., Afshar A. S., Ahamdzadeh K. (2016). Factors affecting number of citations: a comprehensive review of the literature. Scientometrics 107, 1195–1225. 10.1007/s11192-016-1889-2
  • Tao J., Gao X. (2017). Teacher agency and identity commitment in curricular reform. Teach. Teach. Educ. 63, 346–355. 10.1016/j.tate.2017.01.010
  • Van Eck N. J., Waltman L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84, 523–538. 10.1007/s11192-009-0146-3
  • Van Eck N. J., Waltman L. (2011). Text mining and visualization using VOSviewer. arXiv [preprint] arXiv:1109.2058.
  • Van Eck N. J., Waltman L. (2017). Citation-based clustering of publications using CitNetExplorer and VOSviewer. Scientometrics 111, 1053–1070. 10.1007/s11192-017-2300-7
  • Varian H. R. (2014). Big data: new tricks for econometrics. J. Econ. Perspect. 28, 3–28. 10.1257/jep.28.2.3
  • Williamson B., Piattoeva N. (2019). Objectivity as standardization in data-scientific education policy, technology and governance. Learn. Media Technol. 44, 64–76. 10.1080/17439884.2018.1556215
  • Winne P. H. (2006). How software technologies can improve research on learning and bolster school reform. Educ. Psychol. 41, 5–17. 10.1207/s15326985ep4101_3
  • Wu X., Zhu X., Wu G. Q., Ding W. (2013). Data mining with big data. IEEE Trans. Knowl. Data Eng. 26, 97–107. 10.1109/TKDE.2013.109
  • Yang H., Clarke M. (2018). Spaces of agency within contextual constraints: a case study of teacher's response to EFL reform in a Chinese university. Asia Pac. J. Educ. 38, 187–201. 10.1080/02188791.2018.1460252
  • Yang H., Markauskaite L. (2021). Preservice teachers' perezhivanie and epistemic agency during the practicum. Pedagogy Cult. Soc. 22, 1–22. 10.1080/14681366.2021.1946841
  • Yang R., Wong C. W., Miao X. (2020). Analysis of the trend in the knowledge of environmental responsibility research. J. Cleaner Prod. 278:123402. 10.1016/j.jclepro.2020.123402
  • Yang X. W. (2021). SCIO Briefing on the 4th Digital China Summit. Available online at: http://www.scio.gov.cn/xwfbh/xwbfbh/wqfbh/44687/45087/index.htm (accessed April 3, 2021).
  • Zupic I., Cater T. (2015). Bibliometric methods in management and organization. Organ. Res. Methods 18, 429–472. 10.1177/1094428114562629


  • Systematic review
  • Open access
  • Published: 19 February 2024

‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice

  • Annette Boaz (ORCID: 0000-0003-0557-1294),
  • Juan Baeza,
  • Alec Fraser (ORCID: 0000-0003-1121-1551) &
  • Erik Persson

Implementation Science, volume 19, Article number: 15 (2024)


The gap between research findings and clinical practice is well documented and a range of strategies have been developed to support the implementation of research into clinical practice. The objective of this study was to update and extend two previous reviews of systematic reviews of strategies designed to implement research evidence into clinical practice.

We developed a comprehensive systematic literature search strategy based on the terms used in the previous reviews to identify studies that looked explicitly at interventions designed to turn research evidence into practice. The search was performed in June 2022 in four electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched from January 2010 up to June 2022 and applied no language restrictions. Two independent reviewers appraised the quality of included studies using a quality assessment checklist. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Data were synthesised using descriptive and narrative techniques to identify themes and patterns linked to intervention strategies, targeted behaviours, study settings and study outcomes.

We identified 32 reviews conducted between 2010 and 2022. The reviews are mainly of multi-faceted interventions ( n  = 20) although there are reviews focusing on single strategies (ICT, educational, reminders, local opinion leaders, audit and feedback, social media and toolkits). The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Furthermore, a lot of nuance lies behind these headline findings, and this is increasingly commented upon in the reviews themselves.

Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been identified. We need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of research perspectives (including social science) in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed.


Contribution to the literature

Considerable time and money is invested in implementing and evaluating strategies to increase the implementation of research into clinical practice.

The growing body of evidence is not providing the anticipated clear lessons to support improved implementation.

What is needed instead is a better understanding of, and commitment to building, more situated, relational and organisational capability to support the use of research in clinical practice.

This would involve a more central role in implementation science for a wider range of perspectives, especially from the social, economic, political and behavioural sciences, and for greater use of different types of synthesis, such as realist synthesis.

Introduction

The gap between research findings and clinical practice is well documented and a range of interventions has been developed to increase the implementation of research into clinical practice [ 1 , 2 ]. In recent years researchers have worked to improve the consistency in the ways in which these interventions (often called strategies) are described to support their evaluation. One notable development has been the emergence of Implementation Science as a field focusing explicitly on “the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice” ([ 3 ] p. 1). The work of implementation science focuses on closing, or at least narrowing, the gap between research and practice. One contribution has been to map existing interventions, identifying 73 discrete strategies to support research implementation [ 4 ] which have been grouped into 9 clusters [ 5 ]. The authors note that they have not considered the evidence of effectiveness of the individual strategies and that a next step is to understand better which strategies perform best in which combinations and for what purposes [ 4 ]. Other authors have noted that there is also scope to learn more from other related fields of study such as policy implementation [ 6 ] and to draw on methods designed to support the evaluation of complex interventions [ 7 ].

The increase in activity designed to support the implementation of research into practice and improvements in reporting provided the impetus for an update of a review of systematic reviews of the effectiveness of interventions designed to support the use of research in clinical practice [ 8 ] which was itself an update of the review conducted by Grimshaw and colleagues in 2001. The 2001 review [ 9 ] identified 41 reviews considering a range of strategies, from educational interventions, audit and feedback and computerised decision support to financial incentives and combined interventions. The authors concluded that all the interventions had the potential to promote the uptake of evidence in practice, although no one intervention seemed to be more effective than the others in all settings. They concluded that combined interventions were more likely to be effective than single interventions. The 2011 review identified a further 13 systematic reviews containing 313 discrete primary studies. Consistent with the previous review, four main strategy types were identified: audit and feedback; computerised decision support; opinion leaders; and multi-faceted interventions (MFIs). Nine of the reviews reported on MFIs. The review highlighted the small effects of single interventions such as audit and feedback, computerised decision support and opinion leaders. MFIs claimed an improvement in effectiveness over single interventions, although effect sizes remained small to moderate, and this improvement in effectiveness relating to MFIs has been questioned in a subsequent review [ 10 ]. In updating the review, we anticipated a larger pool of reviews and an opportunity to consolidate learning from more recent systematic reviews of interventions.

This review updates and extends our previous review of systematic reviews of interventions designed to implement research evidence into clinical practice. To identify potentially relevant peer-reviewed research papers, we developed a comprehensive systematic literature search strategy based on the terms used in the Grimshaw et al. [ 9 ] and Boaz, Baeza and Fraser [ 8 ] overview articles. To ensure optimal retrieval, our search strategy was refined with support from an expert university librarian, considering the ongoing improvements in the development of search filters for systematic reviews since our first review [ 11 ]. We also wanted to include technology-related terms (e.g. apps, algorithms, machine learning, artificial intelligence) to find studies that explored interventions based on the use of technological innovations as mechanistic tools for increasing the use of evidence in practice (see Additional file 1 : Appendix A for full search strategy).

The search was performed in June 2022 in the following electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched for articles published since the 2011 review. We searched from January 2010 up to June 2022 and applied no language restrictions. Reference lists of relevant papers were also examined.
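The full search strategy is given in Additional file 1: Appendix A and is not reproduced here. Purely as an illustration of how a date-limited boolean query of this general shape can be assembled, the sketch below uses invented term groups and a PubMed-style date tag, none of which are taken from the actual strategy:

```python
# Illustrative only: these term groups are hypothetical stand-ins, not the
# actual strategy (which is given in Additional file 1: Appendix A).
intervention_terms = ["audit and feedback", "opinion leader*", "reminder*",
                      "educational intervention*"]
evidence_terms = ["evidence-based practice", "research implementation",
                  "knowledge translation"]

def or_group(terms):
    """OR together a term group, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Combine the groups and restrict to the review window (Jan 2010 - Jun 2022).
query = (or_group(intervention_terms) + " AND " + or_group(evidence_terms)
         + ' AND "systematic review" AND 2010/01:2022/06[dp]')
print(query)
```

The real strategy combines many more term groups, with database-specific syntax for each of Medline, Embase, Cochrane and Epistemonikos.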

We uploaded the results using EPPI-Reviewer, a web-based tool that facilitated semi-automation of the screening process and removal of duplicate studies. We made particular use of a priority screening function to reduce screening workload and avoid ‘data deluge’ [ 12 ]. Through machine learning, one reviewer screened a smaller number of records (n = 1200) to train the software to predict whether a given record was more likely to be relevant or irrelevant, thus pulling the relevant studies towards the beginning of the screening process. This automation did not replace manual work but helped the reviewer to identify eligible studies more quickly. During the selection process, we included studies that looked explicitly at interventions designed to turn research evidence into practice. Studies were included if they met the following pre-determined inclusion criteria:

The study was a systematic review

Search terms were included

Focused on the implementation of research evidence into practice

The methodological quality of the included studies was assessed as part of the review

Study populations included healthcare providers and patients

The EPOC taxonomy [ 13 ] was used to categorise the strategies. The EPOC taxonomy has four domains: delivery arrangements, financial arrangements, governance arrangements and implementation strategies. The implementation strategies domain includes 20 strategies targeted at healthcare workers. Numerous EPOC strategies were assessed in the review, including educational strategies, local opinion leaders, reminders, ICT-focused approaches and audit and feedback. Some strategies that did not fit easily within the EPOC categories were also included: social media strategies, toolkits and multi-faceted interventions (MFIs) (see Table  2 ). Some systematic reviews included comparisons of different interventions while other reviews compared one type of intervention against a control group. Outcomes related to improvements in health care processes or patient well-being. Numerous individual study types (RCT, CCT, BA, ITS) were included within the systematic reviews.
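EPPI-Reviewer's priority screening model described above is internal to the tool. As a minimal pure-Python sketch of the underlying idea only (train on an initial hand-screened batch, then rank the remaining records by predicted relevance), with the function name, scoring rule and toy records all invented for illustration:

```python
from collections import Counter

def prioritise(screened, labels, unscreened):
    """Rank unscreened records so likely-relevant ones surface first.

    screened:   title/abstract strings already judged by hand
    labels:     1 = relevant, 0 = irrelevant, one per screened record
    unscreened: title/abstract strings still awaiting screening
    """
    # Word frequencies in records judged relevant vs. irrelevant so far.
    relevant = Counter(w for doc, y in zip(screened, labels) if y == 1
                       for w in doc.lower().split())
    irrelevant = Counter(w for doc, y in zip(screened, labels) if y == 0
                         for w in doc.lower().split())

    def score(doc):
        # Fraction of a record's words that lean towards the relevant class.
        words = doc.lower().split()
        return sum(1 for w in words if relevant[w] > irrelevant[w]) / max(len(words), 1)

    # Highest-scoring records go to the front of the screening queue.
    return sorted(unscreened, key=score, reverse=True)

# Toy demonstration: two hand-screened relevant and two irrelevant records.
screened = [
    "implementation of research evidence in clinical practice",
    "strategies to implement evidence based practice",
    "surgical outcomes after knee replacement",
    "genetic markers of tumour growth",
]
labels = [1, 1, 0, 0]
queue = prioritise(screened, labels, [
    "tumour growth in mice",
    "implementation strategies for clinical practice",
])
```

In practice the trained model is periodically refreshed as more records are hand-screened, so likely-relevant studies keep migrating towards the front of the queue.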

We excluded papers that:

Focused on changing patient rather than provider behaviour

Had no demonstrable outcomes

Made unclear or no reference to research evidence

The last of these criteria was sometimes difficult to judge, and there was considerable discussion amongst the research team as to whether the link between research evidence and practice was sufficiently explicit in the interventions analysed. As we discussed in the previous review [ 8 ] in the field of healthcare, the principle of evidence-based practice is widely acknowledged and tools to change behaviour such as guidelines are often seen to be an implicit codification of evidence, despite the fact that this is not always the case.

Reviewers employed a two-stage process to select papers for inclusion. First, all titles and abstracts were screened by one reviewer to determine whether the study met the inclusion criteria. Two papers [ 14 , 15 ] were identified that fell just before the 2010 cut-off. As they were not identified in the searches for the first review [ 8 ] they were included and progressed to assessment. Each paper was rated as include, exclude or maybe. The full texts of 111 relevant papers were assessed independently by at least two authors. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Thirty-two papers met the inclusion criteria and proceeded to data extraction. The study selection procedure is documented in a PRISMA literature flow diagram (see Fig.  1 ). We were able to include French, Spanish and Portuguese papers in the selection reflecting the language skills in the study team, but none of the papers identified met the inclusion criteria. Other non-English language papers were excluded.

Figure 1

PRISMA flow diagram. Source: authors

One reviewer extracted data on strategy type, number of included studies, locale, target population, effectiveness and scope of impact from the included studies. Two reviewers then independently read each paper and noted key findings and broad themes of interest which were then discussed amongst the wider authorial team. Two independent reviewers appraised the quality of included studies using a Quality Assessment Checklist based on Oxman and Guyatt [ 16 ] and Francke et al. [ 17 ]. Each study was given a quality score ranging from 1 (extensive flaws) to 7 (minimal flaws) (see Additional file 2 : Appendix B). All disagreements were resolved through discussion. Studies were not excluded in this updated overview based on methodological quality as we aimed to reflect the full extent of current research into this topic.
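The checklist items themselves are in Additional file 2: Appendix B. Purely as a hypothetical sketch of the double-rating workflow (the function and study names are invented, and the actual reconciliation happened through discussion rather than code), independent 1-7 scores can be compared and disagreements flagged:

```python
def reconcile(scores_a, scores_b):
    """Compare two reviewers' 1-7 quality scores per study.

    Assumes both dicts cover the same studies. Returns (agreed, disputed):
    agreed maps study -> score where the reviewers matched; disputed lists
    studies whose scores differ and so need discussion.
    """
    agreed, disputed = {}, []
    for study in scores_a:
        a, b = scores_a[study], scores_b[study]
        if not (1 <= a <= 7 and 1 <= b <= 7):
            raise ValueError(f"scores must be 1-7, got {a} and {b} for {study}")
        if a == b:
            agreed[study] = a
        else:
            disputed.append(study)
    return agreed, disputed

# Hypothetical example: one disagreement to resolve through discussion.
scores_a = {"Study 1": 7, "Study 2": 6, "Study 3": 5}
scores_b = {"Study 1": 7, "Study 2": 5, "Study 3": 5}
agreed, disputed = reconcile(scores_a, scores_b)
```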

The extracted data were synthesised using descriptive and narrative techniques to identify themes and patterns in the data linked to intervention strategies, targeted behaviours, study settings and study outcomes.

Thirty-two studies were included in the systematic review. Table 1 provides a detailed overview of the included systematic reviews comprising reference, strategy type, quality score, number of included studies, locale, target population, effectiveness and scope of impact (see Table  1 at the end of the manuscript). Overall, the quality of the studies was high. Twenty-three studies scored 7, six studies scored 6, one study scored 5, one study scored 4 and one study scored 3. The primary focus of the review was on reviews of effectiveness studies, but a small number of reviews did include data from a wider range of methods including qualitative studies which added to the analysis in the papers [ 18 , 19 , 20 , 21 ]. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. In this section, we discuss the different EPOC-defined implementation strategies in turn. Interestingly, we found only two ‘new’ approaches in this review that did not fit into the existing EPOC approaches. These are a review focused on the use of social media and a review considering toolkits. In addition to single interventions, we also discuss multi-faceted interventions. These were the most common intervention approach overall. A summary is provided in Table  2 .

Educational strategies

The overview identified three systematic reviews focusing on educational strategies. Grudniewicz et al. [ 22 ] explored the effectiveness of printed educational materials on primary care physician knowledge, behaviour and patient outcomes and concluded they were not effective in any of these aspects. Koota, Kääriäinen and Melender [ 23 ] focused on educational interventions promoting evidence-based practice among emergency room/accident and emergency nurses and found that interventions involving face-to-face contact led to significant or highly significant effects on patient benefits and emergency nurses’ knowledge, skills and behaviour. Interventions using written self-directed learning materials also led to significant improvements in nurses’ knowledge of evidence-based practice. Although the quality of the studies was high, the review primarily included small studies with low response rates, and many of them relied on self-assessed outcomes; consequently, the strength of the evidence for these outcomes is modest. Wu et al. [ 20 ] asked whether educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes. Although based on evaluation projects and qualitative data, their results also suggest that positive changes in patient outcomes can be made following the implementation of specific evidence-based approaches (or projects). The differing positive outcomes for educational strategies aimed at nurses might indicate that the target audience is important.

Local opinion leaders

Flodgren et al. [ 24 ] was the only systematic review focusing solely on opinion leaders. The review found that local opinion leaders, alone or in combination with other interventions, probably improve healthcare professionals’ compliance with evidence-based practice, although this effect varies both within and between studies and the effect on patient outcomes is uncertain. However, how opinion leaders had an impact could not be determined because insufficient details were provided, illustrating that reporting specific details in published studies is important if effective methods of increasing evidence-based practice are to be diffused across a system. The usefulness of this review is limited because it cannot provide evidence of what makes an effective opinion leader, whether teams of opinion leaders or a single opinion leader are most effective, or which methods used by opinion leaders work best.

Reminders

Pantoja et al. [ 26 ] was the only systematic review included in the overview focusing solely on manually generated reminders delivered on paper. The review explored how these affected professional practice and patient outcomes. The review concluded that manually generated reminders delivered on paper as a single intervention probably led to small to moderate increases in adherence to clinical recommendations, and they could be used as a single quality improvement intervention. However, the authors indicated that this intervention would make little or no difference to patient outcomes. The authors state that such a low-tech intervention may be useful in low- and middle-income countries where paper records are more likely to be the norm.

ICT-focused approaches

The three ICT-focused reviews [ 14 , 27 , 28 ] showed mixed results. Jamal, McKenzie and Clark [ 14 ] explored the impact of health information technology on the quality of medical and health care, examining electronic health records, computerised provider order entry and decision support systems. These showed a positive improvement in adherence to evidence-based guidelines but not in patient outcomes. The number of studies included in the review was low, so a conclusive recommendation could not be reached on this basis. Similarly, Brown et al. [ 28 ] found that technology-enabled knowledge translation interventions may improve the knowledge of health professionals, but all eight included studies raised concerns about bias. The De Angelis et al. [ 27 ] review was more promising, reporting that ICT can be a good way of disseminating clinical practice guidelines, but concluded that it is unclear which type of ICT method is the most effective.

Audit and feedback

Sykes, McAnuff and Kolehmainen [ 29 ] examined whether audit and feedback were effective in dementia care and concluded that it remains unclear which ingredients of audit and feedback are successful, as the reviewed papers showed large variations in the effectiveness of interventions using this strategy.

Non-EPOC listed strategies: social media, toolkits

There were two new (non-EPOC listed) intervention types identified in this review compared to the 2011 review — fewer than anticipated. We categorised a third (‘care bundles’ [ 36 ]) as a multi-faceted intervention due to its description in practice, and a fourth (‘Technology Enhanced Knowledge Transfer’ [ 28 ]) as an ICT-focused approach. The first new strategy was identified in Bhatt et al.’s [ 30 ] systematic review of the use of social media for the dissemination of clinical practice guidelines. They reported that the use of social media resulted in a significant improvement in knowledge and compliance with evidence-based guidelines compared with more traditional methods. They noted that a wide selection of different healthcare professionals and patients engaged with this type of social media and its global reach may be significant for low- and middle-income countries. This review was also noteworthy for developing a simple stepwise method for using social media for the dissemination of clinical practice guidelines. However, it is debatable whether social media can be classified as an intervention or just a different way of delivering an intervention. For example, the review discussed involving opinion leaders and patient advocates through social media. However, this was a small review that included only five studies, so further research in this new area is needed. Yamada et al. [ 31 ] drew on 39 studies to explore the application of toolkits, 18 of which had toolkits embedded within larger KT interventions, and 21 of which evaluated toolkits as standalone interventions. The individual component strategies of the toolkits were highly variable, though the authors suggest that they align most closely with educational strategies. The authors conclude that toolkits, as either standalone strategies or as part of MFIs, hold some promise for facilitating evidence use in practice but caution that the quality of many of the included primary studies is considered weak, limiting these findings.

Multi-faceted interventions

The majority of the systematic reviews (n = 20) reported on more than one intervention type. Some of these systematic reviews focus exclusively on multi-faceted interventions, whilst others compare different single or combined interventions aimed at achieving similar outcomes in particular settings. While these two approaches are often described in a similar way, they are actually quite distinct from each other as the former report how multiple strategies may be strategically combined in pursuance of an agreed goal, whilst the latter report how different strategies may be incidentally used in sometimes contrasting settings in the pursuance of similar goals. Ariyo et al. [ 35 ] helpfully summarise five key elements often found in effective MFI strategies in LMICs — but which may also be transferable to HICs. First, effective MFIs encourage a multi-disciplinary approach acknowledging the roles played by different professional groups to collectively incorporate evidence-informed practice. Second, they utilise leadership drawing on a wide set of clinical and non-clinical actors including managers and even government officials. Third, multiple types of educational practices are utilised — including input from patients as stakeholders in some cases. Fourth, protocols, checklists and bundles are used — most effectively when local ownership is encouraged. Finally, most MFIs included an emphasis on monitoring and evaluation [ 35 ]. In contrast, other studies offer little information about the nature of the different MFI components of included studies which makes it difficult to extrapolate much learning from them in relation to why or how MFIs might affect practice (e.g. [ 28 , 38 ]). Ultimately, context matters, which some review authors argue makes it difficult to say with real certainty whether single or MFI strategies are superior (e.g. [ 21 , 27 ]). 
Taking all the systematic reviews together we may conclude that MFIs appear to be more likely to generate positive results than single interventions (e.g. [ 34 , 45 ]) though other reviews should make us cautious (e.g. [ 32 , 43 ]).

While multi-faceted interventions still seem to be more effective than single-strategy interventions, there were important distinctions between how the results of reviews of MFIs are interpreted in this review as compared to the previous reviews [ 8 , 9 ], reflecting greater nuance and debate in the literature. This was particularly noticeable where the effectiveness of MFIs was compared to single strategies, reflecting developments widely discussed in previous studies [ 10 ]. We found that most systematic reviews are bounded by their clinical, professional, spatial, system, or setting criteria and often seek to draw out implications for the implementation of evidence in their areas of specific interest (such as nursing or acute care). Frequently this means combining all relevant studies to explore the respective foci of each systematic review. Therefore, most reviews we categorised as MFIs actually include highly variable numbers and combinations of intervention strategies and highly heterogeneous original study designs. This makes statistical analyses of the type used by Squires et al. [ 10 ] on the three reviews in their paper not possible. Further, it also makes extrapolating findings and commenting on broad themes complex and difficult. This may suggest that future research should shift its focus from merely examining ‘what works’ to ‘what works where and what works for whom’ — perhaps pointing to the value of realist approaches to these complex review topics [ 48 , 49 ] and other more theory-informed approaches [ 50 ].

Some reviews have a relatively small number of studies (i.e. fewer than 10) and the authors are often understandably reluctant to engage with wider debates about the implications of their findings. Other larger studies do engage in deeper discussions about internal comparisons of findings across included studies and also contextualise these in wider debates. Some of the most informative studies (e.g. [ 35 , 40 ]) move beyond EPOC categories and contextualise MFIs within wider systems thinking and implementation theory. This distinction between MFIs and single interventions can actually be very useful as it offers lessons about the contexts in which individual interventions might have bounded effectiveness (i.e. educational interventions for individual change). Taken as a whole, this may also then help in terms of how and when to conjoin single interventions into effective MFIs.

In the two previous reviews, a consistent finding was that MFIs were more effective than single interventions [ 8 , 9 ]. However, like Squires et al. [ 10 ] this overview is more equivocal on this important issue. There are four points which may help account for the differences in findings in this regard. Firstly, the diversity of the systematic reviews in terms of clinical topic or setting is an important factor. Secondly, there is heterogeneity of the studies within the included systematic reviews themselves. Thirdly, there is a lack of consistency with regard to the definition of MFIs and the strategies included within them. Finally, there are epistemological differences across the papers and the reviews. This means that the results that are presented depend on the methods used to measure, report, and synthesise them. For instance, some reviews highlight that education strategies can be useful to improve provider understanding — but without wider organisational or system-level change, they may struggle to deliver sustained transformation [ 19 , 44 ].

It is also worth highlighting the importance of the theory of change underlying the different interventions. Where authors of the systematic reviews draw on theory, there is space to discuss and explain findings. We note a distinction between theoretical and atheoretical systematic review discussion sections. Atheoretical reviews tend to present acontextual findings (for instance, one study found very positive results for one intervention, and this gets highlighted in the abstract) whilst theoretically informed reviews attempt to contextualise and explain patterns within the included studies. Theory-informed systematic reviews seem more likely to offer more profound and useful insights (see [ 19 , 35 , 40 , 43 , 45 ]). We find that the most insightful systematic reviews of MFIs engage in theoretical generalisation — they attempt to go beyond the data of individual studies and discuss the wider implications of the findings of the studies within their reviews drawing on implementation theory. At the same time, they highlight the active role of context and the wider relational and system-wide issues linked to implementation. It is these types of investigations that can help providers further develop evidence-based practice.

This overview has identified a small, but insightful set of papers that interrogate and help theorise why, how, for whom, and in which circumstances it might be the case that MFIs are superior (see [ 19 , 35 , 40 ] once more). At the level of this overview — and in most of the systematic reviews included — it appears to be the case that MFIs struggle with the question of attribution. In addition, there are other important elements that are often unmeasured, or unreported (e.g. costs of the intervention — see [ 40 ]). Finally, the stronger systematic reviews [ 19 , 35 , 40 , 43 , 45 ] engage with systems issues, human agency and context [ 18 ] in a way that was not evident in the systematic reviews identified in the previous reviews [ 8 , 9 ]. The earlier reviews lacked any theory of change that might explain why MFIs might be more effective than single ones — whereas now some systematic reviews do this, which enables them to conclude that sometimes single interventions can still be more effective.

As Nilsen et al. ([ 6 ] p. 7) note ‘Study findings concerning the effectiveness of various approaches are continuously synthesized and assembled in systematic reviews’. We may have gone as far as we can in understanding the implementation of evidence through systematic reviews of single and multi-faceted interventions and the next step would be to conduct more research exploring the complex and situated nature of evidence used in clinical practice and by particular professional groups. This would further build on the nuanced discussion and conclusion sections in a subset of the papers we reviewed. This might also support the field to move away from isolating individual implementation strategies [ 6 ] to explore the complex processes involving a range of actors with differing capacities [ 51 ] working in diverse organisational cultures. Taxonomies of implementation strategies do not fully account for the complex process of implementation, which involves a range of different actors with different capacities and skills across multiple system levels. There is plenty of work to build on, particularly in the social sciences, which currently sits at the margins of debates about evidence implementation (see for example, Normalisation Process Theory [ 52 ]).

There are several changes that we have identified in this overview of systematic reviews in comparison to the review we published in 2011 [ 8 ]. A consistent and welcome finding is that the overall quality of the systematic reviews themselves appears to have improved between the two reviews, although this is not reflected upon in the papers. This is exhibited through better, clearer reporting mechanisms in relation to the mechanics of the reviews, alongside a greater attention to, and deeper description of, how potential biases in included papers are discussed. Additionally, there is an increased, but still limited, inclusion of original studies conducted in low- and middle-income countries as opposed to just high-income countries. Importantly, we found that many of these systematic reviews are attuned to, and comment upon, the contextual distinctions of pursuing evidence-informed interventions in health care settings in different economic settings. Furthermore, systematic reviews included in this updated article cover a wider set of clinical specialities (both within and beyond hospital settings) and have a focus on a wider set of healthcare professions — discussing similarities, differences and inter-professional challenges faced therein, compared to the earlier reviews. These wider ranges of studies highlight that a particular intervention or group of interventions may work well for one professional group but be ineffective for another. This diversity of study settings allows us to consider the important role context (in its many forms) plays on implementing evidence into practice. Examining the complex and varied context of health care will help us address what Nilsen et al. ([ 6 ] p. 1) described as, ‘society’s health problems [that] require research-based knowledge acted on by healthcare practitioners together with implementation of political measures from governmental agencies’. 
This will help us shift implementation science to move, ‘beyond a success or failure perspective towards improved analysis of variables that could explain the impact of the implementation process’ ([ 6 ] p. 2).

This review brings together 32 papers considering individual and multi-faceted interventions designed to support the use of evidence in clinical practice. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been conducted. As a whole, this substantial body of knowledge struggles to tell us more about the use of individual and MFIs than: ‘it depends’. To really move forwards in addressing the gap between research evidence and practice, we may need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of perspectives, especially from the social, economic, political and behavioural sciences in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed. Harvey et al. [ 53 ] suggest that when context is likely to be critical to implementation success there are a range of primary research approaches (participatory research, realist evaluation, developmental evaluation, ethnography, quality/rapid cycle improvement) that are likely to be appropriate and insightful. While these approaches often form part of implementation studies in the form of process evaluations, they are usually relatively small scale in relation to implementation research as a whole. As a result, the findings often do not make it into the subsequent systematic reviews. 
This review provides further evidence that we need to bring qualitative approaches in from the periphery to play a central role in many implementation studies and subsequent evidence syntheses. It would be helpful for systematic reviews, at the very least, to include more detail about the interventions and their implementation in terms of how and why they worked.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BA: Before and after study

CCT: Controlled clinical trial

EPOC: Effective Practice and Organisation of Care

HICs: High-income countries

ICT: Information and Communications Technology

ITS: Interrupted time series

KT: Knowledge translation

LMICs: Low- and middle-income countries

RCT: Randomised controlled trial

Grol R, Grimshaw J. From best evidence to best practice: effective implementation of change in patients’ care. Lancet. 2003;362:1225–30. https://doi.org/10.1016/S0140-6736(03)14546-1 .

Green LA, Seifert CM. Translation of research into practice: why we can’t “just do it.” J Am Board Fam Pract. 2005;18:541–5. https://doi.org/10.3122/jabfm.18.6.541 .

Eccles MP, Mittman BS. Welcome to Implementation Science. Implement Sci. 2006;1:1–3. https://doi.org/10.1186/1748-5908-1-1 .

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:2–14. https://doi.org/10.1186/s13012-015-0209-1 .

Waltz TJ, Powell BJ, Matthieu MM, Damschroder LJ, et al. Use of concept mapping to characterize relationships among implementation strategies and assess their feasibility and importance: results from the Expert Recommendations for Implementing Change (ERIC) study. Implement Sci. 2015;10:1–8. https://doi.org/10.1186/s13012-015-0295-0 .

Nilsen P, Ståhl C, Roback K, et al. Never the twain shall meet? - a comparison of implementation science and policy implementation research. Implementation Sci. 2013;8:2–12. https://doi.org/10.1186/1748-5908-8-63 .

Rycroft-Malone J, Seers K, Eldh AC, et al. A realist process evaluation within the Facilitating Implementation of Research Evidence (FIRE) cluster randomised controlled international trial: an exemplar. Implementation Sci. 2018;13:1–15. https://doi.org/10.1186/s13012-018-0811-0 .

Boaz A, Baeza J, Fraser A, European Implementation Score Collaborative Group (EIS). Effective implementation of research into practice: an overview of systematic reviews of the health literature. BMC Res Notes. 2011;4:212. https://doi.org/10.1186/1756-0500-4-212 .

Grimshaw JM, Shirran L, Thomas R, Mowatt G, Fraser C, Bero L, et al. Changing provider behavior – an overview of systematic reviews of interventions. Med Care. 2001;39(8 Suppl 2):II2–45.

Squires JE, Sullivan K, Eccles MP, et al. Are multifaceted interventions more effective than single-component interventions in changing health-care professionals’ behaviours? An overview of systematic reviews. Implement Sci. 2014;9:1–22. https://doi.org/10.1186/s13012-014-0152-6 .

Salvador-Oliván JA, Marco-Cuenca G, Arquero-Avilés R. Development of an efficient search filter to retrieve systematic reviews from PubMed. J Med Libr Assoc. 2021;109:561–74. https://doi.org/10.5195/jmla.2021.1223 .

Thomas JM. Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation? OA Evid Based Med. 2013;1:1–6.

Effective Practice and Organisation of Care (EPOC). The EPOC taxonomy of health systems interventions. EPOC Resources for review authors. Oslo: Norwegian Knowledge Centre for the Health Services; 2016. epoc.cochrane.org/epoc-taxonomy. Accessed 9 Oct 2023.

Jamal A, McKenzie K, Clark M. The impact of health information technology on the quality of medical and health care: a systematic review. Health Inf Manag. 2009;38:26–37. https://doi.org/10.1177/183335830903800305 .

Menon A, Korner-Bitensky N, Kastner M, et al. Strategies for rehabilitation professionals to move evidence-based knowledge into practice: a systematic review. J Rehabil Med. 2009;41:1024–32. https://doi.org/10.2340/16501977-0451 .

Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol. 1991;44:1271–8. https://doi.org/10.1016/0895-4356(91)90160-b .

Francke AL, Smit MC, de Veer AJ, et al. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med Inform Decis Mak. 2008;8:1–11. https://doi.org/10.1186/1472-6947-8-38 .

Jones CA, Roop SC, Pohar SL, et al. Translating knowledge in rehabilitation: systematic review. Phys Ther. 2015;95:663–77. https://doi.org/10.2522/ptj.20130512 .

Scott D, Albrecht L, O’Leary K, Ball GDC, et al. Systematic review of knowledge translation strategies in the allied health professions. Implement Sci. 2012;7:1–17. https://doi.org/10.1186/1748-5908-7-70 .

Wu Y, Brettle A, Zhou C, Ou J, et al. Do educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes? A systematic review. Nurse Educ Today. 2018;70:109–14. https://doi.org/10.1016/j.nedt.2018.08.026 .

Yost J, Ganann R, Thompson D, Aloweni F, et al. The effectiveness of knowledge translation interventions for promoting evidence-informed decision-making among nurses in tertiary care: a systematic review and meta-analysis. Implement Sci. 2015;10:1–15. https://doi.org/10.1186/s13012-015-0286-1 .

Grudniewicz A, Kealy R, Rodseth RN, Hamid J, et al. What is the effectiveness of printed educational materials on primary care physician knowledge, behaviour, and patient outcomes: a systematic review and meta-analyses. Implement Sci. 2015;10:2–12. https://doi.org/10.1186/s13012-015-0347-5 .

Koota E, Kääriäinen M, Melender HL. Educational interventions promoting evidence-based practice among emergency nurses: a systematic review. Int Emerg Nurs. 2018;41:51–8. https://doi.org/10.1016/j.ienj.2018.06.004 .

Flodgren G, O’Brien MA, Parmelli E, et al. Local opinion leaders: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD000125.pub5 .

Arditi C, Rège-Walther M, Durieux P, et al. Computer-generated reminders delivered on paper to healthcare professionals: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2017. https://doi.org/10.1002/14651858.CD001175.pub4 .

Pantoja T, Grimshaw JM, Colomer N, et al. Manually-generated reminders delivered on paper: effects on professional practice and patient outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD001174.pub4 .

De Angelis G, Davies B, King J, McEwan J, et al. Information and communication technologies for the dissemination of clinical practice guidelines to health professionals: a systematic review. JMIR Med Educ. 2016;2:e16. https://doi.org/10.2196/mededu.6288 .

Brown A, Barnes C, Byaruhanga J, McLaughlin M, et al. Effectiveness of technology-enabled knowledge translation strategies in improving the use of research in public health: systematic review. J Med Internet Res. 2020;22:e17274. https://doi.org/10.2196/17274 .

Sykes MJ, McAnuff J, Kolehmainen N. When is audit and feedback effective in dementia care? A systematic review. Int J Nurs Stud. 2018;79:27–35. https://doi.org/10.1016/j.ijnurstu.2017.10.013 .

Bhatt NR, Czarniecki SW, Borgmann H, et al. A systematic review of the use of social media for dissemination of clinical practice guidelines. Eur Urol Focus. 2021;7:1195–204. https://doi.org/10.1016/j.euf.2020.10.008 .

Yamada J, Shorkey A, Barwick M, Widger K, et al. The effectiveness of toolkits as knowledge translation strategies for integrating evidence into clinical care: a systematic review. BMJ Open. 2015;5:e006808. https://doi.org/10.1136/bmjopen-2014-006808 .

Afari-Asiedu S, Abdulai MA, Tostmann A, et al. Interventions to improve dispensing of antibiotics at the community level in low and middle income countries: a systematic review. J Glob Antimicrob Resist. 2022;29:259–74. https://doi.org/10.1016/j.jgar.2022.03.009 .

Boonacker CW, Hoes AW, Dikhoff MJ, Schilder AG, et al. Interventions in health care professionals to improve treatment in children with upper respiratory tract infections. Int J Pediatr Otorhinolaryngol. 2010;74:1113–21. https://doi.org/10.1016/j.ijporl.2010.07.008 .

Al Zoubi FM, Menon A, Mayo NE, et al. The effectiveness of interventions designed to increase the uptake of clinical practice guidelines and best practices among musculoskeletal professionals: a systematic review. BMC Health Serv Res. 2018;18:2–11. https://doi.org/10.1186/s12913-018-3253-0 .

Ariyo P, Zayed B, Riese V, Anton B, et al. Implementation strategies to reduce surgical site infections: a systematic review. Infect Control Hosp Epidemiol. 2019;3:287–300. https://doi.org/10.1017/ice.2018.355 .

Borgert MJ, Goossens A, Dongelmans DA. What are effective strategies for the implementation of care bundles on ICUs: a systematic review. Implement Sci. 2015;10:1–11. https://doi.org/10.1186/s13012-015-0306-1 .

Cahill LS, Carey LM, Lannin NA, et al. Implementation interventions to promote the uptake of evidence-based practices in stroke rehabilitation. Cochrane Database Syst Rev. 2020. https://doi.org/10.1002/14651858.CD012575.pub2 .

Pedersen ER, Rubenstein L, Kandrack R, Danz M, et al. Elusive search for effective provider interventions: a systematic review of provider interventions to increase adherence to evidence-based treatment for depression. Implement Sci. 2018;13:1–30. https://doi.org/10.1186/s13012-018-0788-8 .

Jenkins HJ, Hancock MJ, French SD, Maher CG, et al. Effectiveness of interventions designed to reduce the use of imaging for low-back pain: a systematic review. CMAJ. 2015;187:401–8. https://doi.org/10.1503/cmaj.141183 .

Bennett S, Laver K, MacAndrew M, Beattie E, et al. Implementation of evidence-based, non-pharmacological interventions addressing behavior and psychological symptoms of dementia: a systematic review focused on implementation strategies. Int Psychogeriatr. 2021;33:947–75. https://doi.org/10.1017/S1041610220001702 .

Noonan VK, Wolfe DL, Thorogood NP, et al. Knowledge translation and implementation in spinal cord injury: a systematic review. Spinal Cord. 2014;52:578–87. https://doi.org/10.1038/sc.2014.62 .

Albrecht L, Archibald M, Snelgrove-Clarke E, et al. Systematic review of knowledge translation strategies to promote research uptake in child health settings. J Pediatr Nurs. 2016;31:235–54. https://doi.org/10.1016/j.pedn.2015.12.002 .

Campbell A, Louie-Poon S, Slater L, et al. Knowledge translation strategies used by healthcare professionals in child health settings: an updated systematic review. J Pediatr Nurs. 2019;47:114–20. https://doi.org/10.1016/j.pedn.2019.04.026 .

Bird ML, Miller T, Connell LA, et al. Moving stroke rehabilitation evidence into practice: a systematic review of randomized controlled trials. Clin Rehabil. 2019;33:1586–95. https://doi.org/10.1177/0269215519847253 .

Goorts K, Dizon J, Milanese S. The effectiveness of implementation strategies for promoting evidence informed interventions in allied healthcare: a systematic review. BMC Health Serv Res. 2021;21:1–11. https://doi.org/10.1186/s12913-021-06190-0 .

Zadro JR, O’Keeffe M, Allison JL, Lembke KA, et al. Effectiveness of implementation strategies to improve adherence of physical therapist treatment choices to clinical practice guidelines for musculoskeletal conditions: systematic review. Phys Ther. 2020;100:1516–41. https://doi.org/10.1093/ptj/pzaa101 .

Van der Veer SN, Jager KJ, Nache AM, et al. Translating knowledge on best practice into improving quality of RRT care: a systematic review of implementation strategies. Kidney Int. 2011;80:1021–34. https://doi.org/10.1038/ki.2011.222 .

Pawson R, Greenhalgh T, Harvey G, et al. Realist review – a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy. 2005;10(Suppl 1):21–34. https://doi.org/10.1258/1355819054308530 .

Rycroft-Malone J, McCormack B, Hutchinson AM, et al. Realist synthesis: illustrating the method for implementation research. Implementation Sci. 2012;7:1–10. https://doi.org/10.1186/1748-5908-7-33 .

Johnson MJ, May CR. Promoting professional behaviour change in healthcare: what interventions work, and why? A theory-led overview of systematic reviews. BMJ Open. 2015;5:e008592. https://doi.org/10.1136/bmjopen-2015-008592 .

Metz A, Jensen T, Farley A, Boaz A, et al. Is implementation research out of step with implementation practice? Pathways to effective implementation support over the last decade. Implement Res Pract. 2022;3:1–11. https://doi.org/10.1177/26334895221105585 .

May CR, Finch TL, Cornford J, Exley C, et al. Integrating telecare for chronic disease management in the community: What needs to be done? BMC Health Serv Res. 2011;11:1–11. https://doi.org/10.1186/1472-6963-11-131 .

Harvey G, Rycroft-Malone J, Seers K, Wilson P, et al. Connecting the science and practice of implementation – applying the lens of context to inform study design in implementation research. Front Health Serv. 2023;3:1–15. https://doi.org/10.3389/frhs.2023.1162762 .

Acknowledgements

The authors would like to thank Professor Kathryn Oliver for her support in planning the review, Professor Steve Hanney for reading and commenting on the final manuscript, and the staff at the LSHTM library for their support in planning and conducting the literature search.

Funding

This study was supported by LSHTM’s Research England QR strategic priorities funding allocation and the National Institute for Health and Care Research (NIHR) Applied Research Collaboration South London (NIHR ARC South London) at King’s College Hospital NHS Foundation Trust. Grant number NIHR200152. The views expressed are those of the author(s) and not necessarily those of the NIHR, the Department of Health and Social Care or Research England.

Author information

Authors and affiliations

Health and Social Care Workforce Research Unit, The Policy Institute, King’s College London, Virginia Woolf Building, 22 Kingsway, London, WC2B 6LE, UK

Annette Boaz

King’s Business School, King’s College London, 30 Aldwych, London, WC2B 4BG, UK

Juan Baeza & Alec Fraser

Federal University of Santa Catarina (UFSC), Campus Universitário Reitor João Davi Ferreira Lima, Florianópolis, SC, 88.040-900, Brazil

Erik Persson

Contributions

AB led the conceptual development and structure of the manuscript. EP conducted the searches and data extraction. All authors contributed to screening and quality appraisal. EP and AF wrote the first draft of the methods section. AB, JB and AF performed result synthesis and contributed to the analyses. AB wrote the first draft of the manuscript and incorporated feedback and revisions from all other authors. All authors revised and approved the final manuscript.

Corresponding author

Correspondence to Annette Boaz.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Appendix A.

Additional file 2: Appendix B.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Boaz, A., Baeza, J., Fraser, A. et al. ‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice. Implementation Sci 19, 15 (2024). https://doi.org/10.1186/s13012-024-01337-z

Received : 01 November 2023

Accepted : 05 January 2024

Published : 19 February 2024

DOI : https://doi.org/10.1186/s13012-024-01337-z


  • Implementation
  • Interventions
  • Clinical practice
  • Research evidence
  • Multi-faceted

Implementation Science

ISSN: 1748-5908


Our next-generation model: Gemini 1.5

Feb 15, 2024

The model delivers dramatically enhanced performance, with a breakthrough in long-context understanding across modalities.

A note from Google and Alphabet CEO Sundar Pichai:

Last week, we rolled out our most capable model, Gemini 1.0 Ultra, and took a significant step forward in making Google products more helpful, starting with Gemini Advanced. Today, developers and Cloud customers can begin building with 1.0 Ultra too — with our Gemini API in AI Studio and in Vertex AI.

Our teams continue pushing the frontiers of our latest models with safety at the core. They are making rapid progress. In fact, we’re ready to introduce the next generation: Gemini 1.5. It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.

This new generation also delivers a breakthrough in long-context understanding. We’ve been able to significantly increase the amount of information our models can process — running up to 1 million tokens consistently, achieving the longest context window of any large-scale foundation model yet.

Longer context windows show us the promise of what is possible. They will enable entirely new capabilities and help developers build much more useful models and applications. We’re excited to offer a limited preview of this experimental feature to developers and enterprise customers. Demis shares more on capabilities, safety and availability below.

Introducing Gemini 1.5

By Demis Hassabis, CEO of Google DeepMind, on behalf of the Gemini team

This is an exciting time for AI. New advances in the field have the potential to make AI more helpful for billions of people over the coming years. Since introducing Gemini 1.0, we’ve been testing, refining and enhancing its capabilities.

Today, we’re announcing our next-generation model: Gemini 1.5.

Gemini 1.5 delivers dramatically enhanced performance. It represents a step change in our approach, building upon research and engineering innovations across nearly every part of our foundation model development and infrastructure. This includes making Gemini 1.5 more efficient to train and serve, with a new Mixture-of-Experts (MoE) architecture.

The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro. It’s a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.

Gemini 1.5 Pro comes with a standard 128,000 token context window. But starting today, a limited group of developers and enterprise customers can try it with a context window of up to 1 million tokens via AI Studio and Vertex AI in private preview.

As we roll out the full 1 million token context window, we’re actively working on optimizations to improve latency, reduce computational requirements and enhance the user experience. We’re excited for people to try this breakthrough capability, and we share more details on future availability below.

These continued advances in our next-generation models will open up new possibilities for people, developers and enterprises to create, discover and build using AI.

Context lengths of leading foundation models

Highly efficient architecture

Gemini 1.5 is built upon our leading research on Transformer and MoE architecture. While a traditional Transformer functions as one large neural network, MoE models are divided into smaller “expert” neural networks.

Depending on the type of input given, MoE models learn to selectively activate only the most relevant expert pathways in their network. This specialization massively enhances the model’s efficiency. Google has been an early adopter and pioneer of the MoE technique for deep learning through research such as Sparsely-Gated MoE, GShard-Transformer, Switch-Transformer, M4 and more.
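As a minimal illustration of the routing idea (a generic sparse-MoE sketch, not Gemini's actual implementation), a gating network scores every expert, only the top-k experts are actually run, and their outputs are mixed by the normalized gate weights:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route input x through the top-k experts chosen by a learned gate.

    x:       (d,) input vector
    gate_w:  (d, n_experts) gating weights
    experts: list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                       # one gating score per expert
    top_k = np.argsort(logits)[-k:]           # indices of the k best experts
    # softmax over only the selected experts' scores
    w = np.exp(logits[top_k] - logits[top_k].max())
    w /= w.sum()
    # only the selected experts run; the rest stay idle (the efficiency win)
    return sum(wi * experts[i](x) for wi, i in zip(w, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
gate_w = rng.standard_normal((d, n_experts))
# toy "experts": independent linear layers
experts = [lambda v, W=rng.standard_normal((d, d)): W @ v for _ in range(n_experts)]
y = moe_forward(rng.standard_normal(d), gate_w, experts)
print(y.shape)  # (8,)
```

The key point the sketch shows is that compute per token depends on k, not on the total number of experts, which is how MoE models grow capacity without growing inference cost proportionally.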

Our latest innovations in model architecture allow Gemini 1.5 to learn complex tasks more quickly and maintain quality, while being more efficient to train and serve. These efficiencies are helping our teams iterate, train and deliver more advanced versions of Gemini faster than ever before, and we’re working on further optimizations.

Greater context, more helpful capabilities

An AI model’s “context window” is made up of tokens, which are the building blocks used for processing information. Tokens can be entire parts or subsections of words, images, videos, audio or code. The bigger a model’s context window, the more information it can take in and process in a given prompt — making its output more consistent, relevant and useful.

Through a series of machine learning innovations, we’ve increased 1.5 Pro’s context window capacity far beyond the original 32,000 tokens for Gemini 1.0. We can now run up to 1 million tokens in production.

This means 1.5 Pro can process vast amounts of information in one go — including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code or over 700,000 words. In our research, we’ve also successfully tested up to 10 million tokens.
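As a back-of-envelope sanity check, the text and code figures above are consistent with a 1-million-token window at plausible tokenization rates. The per-unit rates below (about 1.3 tokens per English word, about 33 tokens per line of code) are assumptions for illustration, not Gemini's actual tokenizer behaviour:

```python
# Rough capacity check against a 1,000,000-token context window.
# TOKENS_PER_WORD and TOKENS_PER_CODE_LINE are illustrative assumptions.
CONTEXT_WINDOW = 1_000_000

TOKENS_PER_WORD = 1.3       # assumed average for English prose
TOKENS_PER_CODE_LINE = 33   # assumed average for source code

estimates = {
    "700,000 words of text": round(700_000 * TOKENS_PER_WORD),
    "30,000 lines of code": 30_000 * TOKENS_PER_CODE_LINE,
}
for item, tokens in estimates.items():
    print(f"{item}: ~{tokens:,} tokens (fits: {tokens <= CONTEXT_WINDOW})")
```

Both estimates land just under the 1-million-token budget, which matches the blog post's framing of these figures as upper bounds.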

Complex reasoning about vast amounts of information

1.5 Pro can seamlessly analyze, classify and summarize large amounts of content within a given prompt. For example, when given the 402-page transcripts from Apollo 11’s mission to the moon, it can reason about conversations, events and details found across the document.

Reasoning across a 402-page transcript: Gemini 1.5 Pro Demo

Gemini 1.5 Pro can understand, reason about and identify curious details in the 402-page transcripts from Apollo 11’s mission to the moon.

Better understanding and reasoning across modalities

1.5 Pro can perform highly sophisticated understanding and reasoning tasks for different modalities, including video. For instance, when given a 44-minute silent Buster Keaton movie, the model can accurately analyze various plot points and events, and even reason about small details in the movie that could easily be missed.

Multimodal prompting with a 44-minute movie: Gemini 1.5 Pro Demo

Gemini 1.5 Pro can identify a scene in a 44-minute silent Buster Keaton movie when given a simple line drawing as reference material for a real-life object.

Relevant problem-solving with longer blocks of code

1.5 Pro can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100,000 lines of code, it can better reason across examples, suggest helpful modifications and give explanations about how different parts of the code work.

Problem solving across 100,633 lines of code | Gemini 1.5 Pro Demo

Gemini 1.5 Pro can reason across 100,000 lines of code giving helpful solutions, modifications and explanations.

Enhanced performance

When tested on a comprehensive panel of text, code, image, audio and video evaluations, 1.5 Pro outperforms 1.0 Pro on 87% of the benchmarks used for developing our large language models (LLMs). And when compared to 1.0 Ultra on the same benchmarks, it performs at a broadly similar level.

Gemini 1.5 Pro maintains high levels of performance even as its context window increases. In the Needle In A Haystack (NIAH) evaluation, where a small piece of text containing a particular fact or statement is purposely placed within a long block of text, 1.5 Pro found the embedded text 99% of the time, in blocks of data as long as 1 million tokens.
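The NIAH setup is simple to reproduce in miniature: plant a known fact at a random depth in filler text, ask the model to retrieve it, and score whether the fact comes back. A hedged sketch of such a harness follows; the filler sentences and scoring function are illustrative stand-ins, not Google's actual evaluation code, and the model call is replaced by a canned answer:

```python
import random

def make_haystack(needle, filler_sentences, n_sentences, rng):
    """Bury a 'needle' sentence at a random depth in filler text."""
    doc = [rng.choice(filler_sentences) for _ in range(n_sentences)]
    pos = rng.randrange(len(doc) + 1)
    doc.insert(pos, needle)
    return " ".join(doc), pos

def niah_pass(model_answer, planted_fact):
    """Score one trial: pass iff the answer contains the planted fact."""
    return planted_fact.lower() in model_answer.lower()

rng = random.Random(42)
needle = "The secret launch code is 7-alpha-9."
filler = ["The weather was mild.", "Traffic moved slowly.", "Birds sang outside."]
haystack, depth = make_haystack(needle, filler, n_sentences=10_000, rng=rng)

# A real run would send `haystack` plus a retrieval question to the model;
# here a stand-in answer demonstrates only the scoring path.
stand_in_answer = "The launch code mentioned in the document was 7-alpha-9."
print(niah_pass(stand_in_answer, "7-alpha-9"))  # True
```

Repeating this across many haystack lengths and needle depths, then averaging the pass rate, yields the kind of recall figure quoted above.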

Gemini 1.5 Pro also shows impressive “in-context learning” skills, meaning that it can learn a new skill from information given in a long prompt, without needing additional fine-tuning. We tested this skill on the Machine Translation from One Book (MTOB) benchmark, which shows how well the model learns from information it’s never seen before. When given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person learning from the same content.
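In-context learning of this kind involves no weight updates: the reference material and worked examples travel inside the prompt itself. A hypothetical sketch of assembling such a prompt, where the placeholder strings stand in for the actual Kalamang grammar manual and bilingual sentence pairs used by MTOB:

```python
def build_icl_prompt(reference_text, examples, query):
    """Assemble one long prompt: reference material, worked examples,
    then the new query. No fine-tuning happens; the 'learning' occurs
    entirely inside the context window."""
    parts = [f"Reference material:\n{reference_text}\n"]
    for source, target in examples:
        parts.append(f"English: {source}\nTranslation: {target}\n")
    parts.append(f"English: {query}\nTranslation:")
    return "\n".join(parts)

# Hypothetical stand-ins for the real MTOB inputs.
prompt = build_icl_prompt(
    reference_text="(full grammar manual goes here)",
    examples=[("hello", "(Kalamang greeting goes here)")],
    query="good morning",
)
print(prompt.endswith("Translation:"))  # True
```

A long context window matters here precisely because the entire grammar manual must fit in `reference_text` for the model to learn from it in one shot.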

As 1.5 Pro’s long context window is the first of its kind among large-scale models, we’re continuously developing new evaluations and benchmarks for testing its novel capabilities.

For more details, see our Gemini 1.5 Pro technical report.

Extensive ethics and safety testing

In line with our AI Principles and robust safety policies, we’re ensuring our models undergo extensive ethics and safety tests. We then integrate these research learnings into our governance processes and model development and evaluations to continuously improve our AI systems.

Since introducing 1.0 Ultra in December, our teams have continued refining the model, making it safer for a wider release. We’ve also conducted novel research on safety risks and developed red-teaming techniques to test for a range of potential harms.

In advance of releasing 1.5 Pro, we've taken the same approach to responsible deployment as we did for our Gemini 1.0 models, conducting extensive evaluations across areas including content safety and representational harms, and will continue to expand this testing. Beyond this, we’re developing further tests that account for the novel long-context capabilities of 1.5 Pro.

Build and experiment with Gemini models

We’re committed to bringing each new generation of Gemini models to billions of people, developers and enterprises around the world responsibly.

Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on our Google for Developers blog and Google Cloud blog.

We’ll introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model.

Early testers can try the 1 million token context window at no cost during the testing period, though they should expect longer latency times with this experimental feature. Significant improvements in speed are also on the horizon.

Developers interested in testing 1.5 Pro can sign up now in AI Studio, while enterprise customers can reach out to their Vertex AI account team.

Learn more about Gemini’s capabilities and see how it works.


Quantum Physics

Title: Simulator Demonstration of Large Scale Variational Quantum Algorithm on HPC Cluster

Abstract: Advances in quantum simulator technology are increasingly required because research on quantum algorithms is becoming more sophisticated and complex. State vector simulation utilizes CPU and memory resources in computing nodes exponentially with respect to the number of qubits; furthermore, in a variational quantum algorithm, the large number of repeated runs driven by classical optimization is also a heavy load. This problem has been addressed by preparing numerous computing nodes or simulation frameworks that work effectively. This study aimed to accelerate quantum simulation using two newly proposed methods: efficiently utilizing limited computational resources by adjusting the ratio of MPI and distributed processing parallelism to the target problem settings, and slimming down the Hamiltonian by considering the effect of accuracy on the calculation result. Ground-state energy calculations of a fermionic model were performed using the variational quantum eigensolver (VQE) on an HPC cluster with up to 1024 FUJITSU Processor A64FX nodes connected by InfiniBand; the processor is also used in the supercomputer Fugaku. We achieved a 200-times speedup over conventional VQE simulations and demonstrated 32-qubit ground-state energy calculations in an acceptable time. This result indicates that >30-qubit state vector simulations can be realistically utilized to further research on variational quantum algorithms.
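The exponential scaling the abstract mentions is easy to make concrete: a dense state vector stores 2^n complex amplitudes, so memory doubles with every added qubit. A minimal sketch of the arithmetic, assuming double-precision complex amplitudes (16 bytes each):

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory for a dense state vector: 2**n complex amplitudes.

    bytes_per_amplitude=16 assumes double-precision complex (complex128).
    """
    return (2 ** n_qubits) * bytes_per_amplitude

# Doubling with every qubit is what forces distribution across many nodes:
for n in (30, 32, 34):
    print(f"{n} qubits: {statevector_bytes(n) / 2**30:,.0f} GiB")
```

At 32 qubits the state vector alone takes 64 GiB before any workspace or Hamiltonian terms, which is why the paper distributes the simulation over up to 1024 A64FX nodes.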


Google’s new Gemini model can analyze an hour-long video — but few people can use it

Last October, a research paper published by a Google data scientist, the CTO of Databricks Matei Zaharia and UC Berkeley professor Pieter Abbeel posited a way to allow GenAI models — i.e. models along the lines of OpenAI’s GPT-4 and ChatGPT — to ingest far more data than was previously possible. In the study, the co-authors demonstrated that, by removing a major memory bottleneck for AI models, they could enable models to process millions of words as opposed to hundreds of thousands — the maximum of the most capable models at the time.

AI research moves fast, it seems.

Today, Google announced the release of Gemini 1.5 Pro, the newest member of its Gemini family of GenAI models. Designed to be a drop-in replacement for Gemini 1.0 Pro (which formerly went by “Gemini Pro 1.0” for reasons known only to Google’s labyrinthine marketing arm), Gemini 1.5 Pro is improved in a number of areas compared with its predecessor, perhaps most significantly in the amount of data that it can process.

Gemini 1.5 Pro can take in ~700,000 words, or ~30,000 lines of code — 35x the amount Gemini 1.0 Pro can handle. And — the model being multimodal — it’s not limited to text. Gemini 1.5 Pro can ingest up to 11 hours of audio or an hour of video in a variety of different languages.

Google Gemini 1.5 Pro

Image Credits: Google

To be clear, that’s an upper bound.

The version of Gemini 1.5 Pro available to most developers and customers starting today (in a limited preview) can only process ~100,000 words at once. Google’s characterizing the large-data-input Gemini 1.5 Pro as “experimental,” allowing only developers approved as part of a private preview to pilot it via the company’s GenAI dev tool AI Studio. Several customers using Google’s Vertex AI platform also have access to the large-data-input Gemini 1.5 Pro — but not all.

Still, VP of research at Google DeepMind Oriol Vinyals heralded it as an achievement.

“When you interact with [GenAI] models, the information you’re inputting and outputting becomes the context, and the longer and more complex your questions and interactions are, the longer the context the model needs to be able to deal with gets,” Vinyals said during a press briefing. “We’ve unlocked long context in a pretty massive way.”

Big context

A model’s context, or context window, refers to input data (e.g. text) that the model considers before generating output (e.g. additional text). A simple question — “Who won the 2020 U.S. presidential election?” — can serve as context, as can a movie script, email or e-book.

Models with small context windows tend to “forget” the content of even very recent conversations, leading them to veer off topic — often in problematic ways. This isn’t necessarily so with models with large contexts. As an added upside, large-context models can better grasp the narrative flow of data they take in and generate more contextually rich responses — hypothetically, at least.
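That “forgetting” is often just truncation: when a conversation outgrows the window, the oldest turns get dropped before the model ever sees them. Here is a minimal sketch of that behavior; the function name, the whitespace-based token count, and the budget are all illustrative, since real systems count tokens with a model-specific tokenizer.

```python
# Sketch: naive context-window management for a chat model.
# All names here are illustrative; real APIs count tokens with a
# model-specific tokenizer, not the whitespace split used below.

def fit_to_context(messages, max_tokens):
    """Keep only the most recent messages that fit in the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest-first
        cost = len(msg.split())          # crude stand-in for a token count
        if used + cost > max_tokens:
            break                        # everything older is "forgotten"
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = ["hello there", "tell me about the moon landing",
           "who narrated the telecast", "summarize the jokes in it"]
print(fit_to_context(history, max_tokens=10))
# Only the two most recent turns survive the 10-token budget.
```

A larger window simply raises `max_tokens`, so fewer (or no) turns fall off the back — which is the whole pitch for long-context models.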

There have been other attempts at — and experiments on — models with atypically large context windows.

AI startup Magic claimed last summer to have developed a large language model (LLM) with a 5 million-token context window. Two papers in the past year detail model architectures ostensibly capable of scaling to a million tokens — and beyond. (“Tokens” are subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.”) And recently, a group of scientists hailing from Meta, MIT and Carnegie Mellon developed a technique that they say removes the constraint on model context window size altogether.
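To make the “fan / tas / tic” example concrete, here is a toy chunker. Real tokenizers (BPE, SentencePiece) learn their splits from data rather than cutting at fixed positions; this is intuition only, not how any production tokenizer works.

```python
# Toy illustration of "tokens" as subdivided bits of raw data.
# Real tokenizers learn variable-length subwords from a corpus;
# this fixed three-character chunking is purely for intuition.

def toy_tokenize(word, size=3):
    return [word[i:i + size] for i in range(0, len(word), size)]

print(toy_tokenize("fantastic"))   # ['fan', 'tas', 'tic']
```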

But Google is the first to make a model with a context window of this size commercially available, beating the previous leader Anthropic’s 200,000-token context window — if a private preview counts as commercially available.

Gemini 1.5 Pro’s maximum context window is 1 million tokens, and the version of the model more widely available has a 128,000-token context window, the same as OpenAI’s GPT-4 Turbo.

So what can one accomplish with a 1 million-token context window? Lots of things, Google promises — like analyzing a whole code library, “reasoning across” lengthy documents like contracts, holding long conversations with a chatbot and analyzing and comparing content in videos.

During the briefing, Google showed two prerecorded demos of Gemini 1.5 Pro with the 1 million-token context window enabled.

In the first, the demonstrator asked Gemini 1.5 Pro to search the transcript of the Apollo 11 moon landing telecast — which comes to around 402 pages — for quotes containing jokes, and then to find a scene in the telecast that looked similar to a pencil sketch. In the second, the demonstrator told the model to search for scenes in “Sherlock Jr.,” the Buster Keaton film, going by descriptions and another sketch.

Gemini 1.5 Pro successfully completed all the tasks asked of it, but not particularly quickly. Each took between ~20 seconds and a minute to process — far longer than, say, the average ChatGPT query.

Vinyals says that latency will improve as the model is optimized. Already, the company’s testing a version of Gemini 1.5 Pro with a 10 million-token context window.

“The latency aspect [is something] we’re … working to optimize — this is still in an experimental stage, in a research stage,” he said. “So these issues I would say are present like with any other model.”

Me, I’m not so sure latency that poor will be attractive to many folks — much less paying customers. Having to wait minutes at a time to search across a video doesn’t sound pleasant — or very scalable in the near term. And I’m concerned about how the latency will manifest in other applications, like chatbot conversations and analyzing codebases. Vinyals didn’t say — which doesn’t instill much confidence.

My more optimistic colleague Frederic Lardinois pointed out that the overall time savings might just make the thumb twiddling worth it. But I think it’ll depend very much on the use case. For picking out a show’s plot points? Perhaps not. But for finding the right screengrab from a movie scene you only hazily recall? Maybe.

Other improvements

Beyond the expanded context window, Gemini 1.5 Pro brings other, quality-of-life upgrades to the table.

Google’s claiming that — in terms of quality — Gemini 1.5 Pro is “comparable” to the current version of Gemini Ultra, Google’s flagship GenAI model, thanks to a new Mixture-of-Experts (MoE) architecture composed of smaller, specialized “expert” models. Gemini 1.5 Pro essentially breaks down tasks into multiple subtasks and delegates each to the appropriate expert model, deciding which expert to invoke based on its own predictions.

MoE isn’t novel — it’s been around in some form for years. But its efficiency and flexibility have made it an increasingly popular choice among model vendors (see: the model powering Microsoft’s language translation services).
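The routing idea behind MoE can be sketched in a few lines. Everything below is a toy stand-in, not Gemini’s internals: a “router” scores each expert for a given input, the top-k experts run, and their outputs are combined weighted by the router’s scores.

```python
# Minimal Mixture-of-Experts (MoE) routing sketch. The experts and
# router here are toy functions; in a real model both are learned
# neural networks and the combination happens per token.

def moe_forward(x, experts, router, k=1):
    scores = router(x)                              # one score per expert
    top = sorted(range(len(experts)),
                 key=lambda i: scores[i], reverse=True)[:k]
    total = sum(scores[i] for i in top)
    # Weighted combination of only the selected experts' outputs.
    return sum(scores[i] / total * experts[i](x) for i in top)

experts = [lambda x: x * 2, lambda x: x + 10]            # toy specialists
router = lambda x: [1.0, 3.0] if x > 5 else [3.0, 1.0]   # toy gate

print(moe_forward(4, experts, router))   # routes to expert 0 -> 8.0
print(moe_forward(9, experts, router))   # routes to expert 1 -> 19.0
```

The efficiency win is that only k experts run per input, so total parameters can grow without a proportional increase in compute per query.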

Now, “comparable quality” is a bit of a nebulous descriptor. Quality where it concerns GenAI models, especially multimodal ones, is hard to quantify — doubly so when the models are gated behind private previews that exclude the press. For what it’s worth, Google claims that Gemini 1.5 Pro performs at a “broadly similar level” compared to Ultra on the benchmarks the company uses to develop LLMs while outperforming Gemini 1.0 Pro on 87% of those benchmarks. (I’ll note that outperforming Gemini 1.0 Pro is a low bar.)

Pricing is a big question mark.

During the private preview, Gemini 1.5 Pro with the 1 million-token context window will be free to use, Google says. But the company plans to introduce pricing tiers in the near future that start at the standard 128,000 context window and scale up to 1 million tokens.

I have to imagine the larger context window won’t come cheap — and Google didn’t allay fears by opting not to reveal pricing during the briefing. If pricing’s in line with Anthropic’s, it could cost $8 per million prompt tokens and $24 per million generated tokens. But perhaps it’ll be lower; stranger things have happened! We’ll have to wait and see.
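For a back-of-the-envelope sense of what that pricing would mean, here is the arithmetic using the Anthropic-style rates quoted above. Gemini 1.5 Pro’s actual pricing was not announced, so these numbers are purely hypothetical.

```python
# Hypothetical cost estimate using the Anthropic-style rates quoted
# above: $8 per million prompt tokens, $24 per million generated
# tokens. Gemini 1.5 Pro's real pricing was not announced.

PROMPT_RATE = 8 / 1_000_000    # dollars per prompt token
OUTPUT_RATE = 24 / 1_000_000   # dollars per generated token

def query_cost(prompt_tokens, output_tokens):
    return prompt_tokens * PROMPT_RATE + output_tokens * OUTPUT_RATE

# Filling the full 1M-token window and getting a 1,000-token answer:
print(f"${query_cost(1_000_000, 1_000):.2f}")   # -> $8.02
```

At those rates a single full-window query costs about $8 before the model generates a word, which is why the pricing question matters so much for long-context use cases.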

I wonder, too, about the implications for the rest of the models in the Gemini family, chiefly Gemini Ultra. Can we expect Ultra model upgrades roughly aligned with Pro upgrades? Or will there always be — as there is now — an awkward period where the available Pro models are superior performance-wise to the Ultra models, which Google’s still marketing as the top of the line in its Gemini portfolio?

Chalk it up to teething issues if you’re feeling charitable. If you’re not, call it like it is: darn confusing.

