
Modern Education
Reference:

Methods of data mining and educational analytics

Shirinkina Elena Viktorovna

PhD in Economics

Docent, the department of Management and Business, Surgut State University

628412, Russia, Tyumenskaya oblast', g. Surgut, ul. Lenina, 1, kab. 510

shirinkina86@yandex.ru
 

 

DOI: 10.25136/2409-8736.2022.1.37582

Received: 19-02-2022

Published: 06-05-2022


Abstract: The study is relevant because educational data mining currently raises more questions than it answers: how the analysis is done, what it can be used for and how, which metrics to include in the sample, and how to make forecasts. In the coming years, discussion is expected to give way to the practical implementation of educational analytics in educational processes. The purpose of the study is to systematize the methods of educational data mining, taking into account how educational analytics differs from pedagogical diagnostics and other methods of data collection. The results of the study will help to build a learning strategy and to align the objectives of a training program with the effectiveness of the educational process and the results expected from students. In this regard, the author considers the types of educational analytics. The scientific novelty of the research lies in the systematization of the areas of research interest related to data mining in education and educational analytics. It is shown that educational analytics combined with educational data mining makes it possible to develop accurate models that characterize the behavior of students and their properties, the strengths and weaknesses of content and interaction with it, and team and group dynamics. The practical significance of the study is that the methods considered make it possible to assess the current state of a training system or program, predict the desired results and draw up a roadmap of planned changes. For pedagogical designers and methodologists, the presented methods provide a foundation for optimizing the program, so that students receive a more relevant, engaging and meaningful educational experience.


Keywords:

education, digitalization, educational analytics, educational trends, intellectual analysis, training, continuing education, skills, educational data, effectiveness

This article is automatically translated. You can find original text of the article here.

Introduction

                The relevance of the study is due to the fact that there are currently more questions than specific answers on the topic in the context of intellectual analysis of educational data: how it is done, for what and how we can use it, what metrics to include in the sample and how to make forecasts. Undoubtedly, in the coming years there will be a transition from discussions to the practical implementation of educational analytics in educational processes.

The subject of the study is the educational process. The purpose of the study is to systematize the methods of intellectual analysis of educational data, taking into account how educational analytics differs from pedagogical diagnostics and other methods of data collection. The results of the study will help to build a learning strategy and to align the objectives of a training program with the effectiveness of the educational process and the results expected from students. In this regard, the author considers the types of educational analytics.

The scientific novelty of the research lies in the systematization of areas of research interests related to data mining in education and educational analytics. It is proved that educational analytics in combination with intellectual analysis of educational data makes it possible to develop accurate models that characterize the behavior of students, their properties, weaknesses and strengths of content and interaction with it, team and group dynamics.

Two sets of methods are widely used in the analysis of educational data:

- Learning Analytics;

- intellectual analysis of educational data (Educational Data Mining, EDM).

Learning analytics, or educational analytics, is the measurement, collection, analysis and presentation of data about students and the educational environment in order to understand how learning takes place and to optimize it.

Intellectual analysis of educational data (EDM) is a range of methods for studying data and searching for hidden patterns in them in order to support decision-making in education. Educational Data Mining includes both methods typical of any data mining (clustering, classification, regression, rule search) and methods specific to education, for example psychometric models.

To generate and test hypotheses, it is necessary to determine the appropriate metrics. In addition, it is necessary to determine what data will be needed, where they will come from and for what analytical purposes they will be collected.

Learning analytics together with EDM helps to identify relationships and hidden knowledge in complex data, which, used competently, help to improve the learning process. The main difference between these two sets of methods is that in educational analytics, insights and findings are generated primarily by a specialist analyst, while in Educational Data Mining, many insights are the "unexpected" result of running automated algorithms.

The results of the study will help to build a learning strategy and combine the objectives of the training program with the effectiveness of the educational process and the expected results from the students.

The practical and theoretical significance of the study lies in the fact that the methods considered will allow us to assess the current state of the training system or program, predict the desired results and draw up a roadmap of planned changes. For pedagogical designers and methodologists, the presented methods will become the foundation for optimizing the program. Thanks to the presented methods, students receive the most relevant, engaging and meaningful educational experience.

 

 

Research methodology

The empirical basis of the research consisted of the following works: David Niemi, "Learning Analytics in Education" [6]; Carl Anderson, "Creating a Data-Driven Organization" [9]; Dirk Ifenthaler, Dana-Kristin Mah and Jane Yin-Kim Yau, "Utilizing Learning Analytics to Support Study Success" [4]; and the University 20.35 course "Intellectual analysis of educational data" [8].

In the 2000s, an information technology infrastructure and tools appeared that allow processing and analyzing large amounts of data (Big Data). Very quickly, this technical opportunity developed into a socio-economic phenomenon that seriously affected business, healthcare, manufacturing, the space industry, biotechnology, marketing and many other areas of life, including education.

The revolutionary nature of Big Data can be explained by the three classical characteristics attributed to big data processes: not only Volume, but also Velocity and Variety. Velocity means the ability to process incoming data as quickly as possible, almost in real time; Variety refers to the sources of the data, which differ in quality and properties [15,16].

What did the emergence of big data mean for the education sector? Thanks to educational information systems and Big Data technologies, pedagogy for the first time got a chance to record an extensive array of observations of the learning process, behavior and academic performance of students quickly, continuously and in full.

The gap between the identification of the problem and its solution has decreased many times; the possibilities for interpretation and analysis of the received information have significantly expanded. Moreover, educational analytics systems have made it possible to automate many routine processes, identify problems at an early stage and act proactively.

There are several categories of data that are typically used for educational analytics purposes. The list below is not exhaustive, since the list of data sources adapts to the goals and needs of a particular organization [1,10]:

1. Administrative data - information about the teacher, the methodologist, the availability and support options, the author's experience, the subject of the program or course, etc.

2. Preferred learning media or genres - retrospective indicators of the media or genres the student preferred in cases where a choice was possible, for example, the duration of videos watched, the share of a podcast listened to, the share of a longread actually read, etc.

3. Interaction with educational resources - indicators of interaction during training, including the manner of navigation, answers to exercises and tests, the number of attempts, types of mistakes made, time characteristics associated with the student's activities during training events.

4. Past activity - retrospective indicators of the student's past activity, revealing the assimilation of ideas, skills or competencies at the current moment.

5. Time history - indicators of the nearest context, representing the time history of the student's actions, data about which is available on a particular day.

6. Social indicators - an indicator of the student's interaction with other students and the teacher in the learning process or with recorded speech (with all its various properties, for example, semantic content, prosody, etc.).

7. Demographic information - indicators of the peripheral context: region, age, gender, level of training, etc.

8. Social connections - indicators of the immediate environment: the number of connections, their strength and activity.

9. Type of thinking - data from a questionnaire or self-report on how a student establishes a link between his strategic efforts during training and the development of competencies, as well as how the individual learning process takes place.

10. Emotional state - psychophysiological indicators related to learning, for example, emotional state, sleep quality, nutrition indicators.

The modern learning process involves a variety of educational technologies and platforms, most of which are equipped with their own means of collecting statistics and analytics. Typical data sources include [7,11,12]:

- LMS (Learning Management System), for example Moodle, Canvas, iSpring, Open edX, etc.;

- LXP (Learning Experience Platform, or aggregator of educational projects), for example, Graduated, EdCast;

- TMS (Training Management System, a corporate training management system), for example, Training Space;

- video conferencing services, such as Zoom, Skype, MS Teams;

- webinar platforms, for example Webinar.ru;

- survey services, for example, Kahoot!, Quizizz, Slido, StartExam;

- courses created through constructors, for example, using Articulate 360, iSpring Suite, Adobe Captivate;

- VR and AR platforms, for example Uptale, Microsoft Mixed Reality.

Each of these sources has an internal analytics system, which complicates centralized data collection and subsequent analysis. This problem is partially solved by the LRS — Learning Record Store, a repository of educational records. Some LRS systems use the xAPI standard to transfer data from various sources and store them in a single format.

xAPI is an open specification that describes the format for transmitting statistics between the provider of educational activity (for example, a course, website, mobile application) and the LRS database. It is important here that the provider of educational activity gives the data in the format provided by xAPI.

The Total Learning Architecture (TLA) concept is being developed within the ADL Initiative program of the US Department of Defense (the authors of the SCORM standard). The TLA includes a set of technical specifications for building future learning ecosystems. One of these specifications is xAPI, a standard for transmitting data about completed training. This standard is supported by many popular LMS systems (WebTutor, Moodle), course authoring tools (Articulate Storyline, iSpring) and other educational services.

The basic unit of the xAPI standard is the statement. A statement is a record in JSON format about an action, of the form "The user has completed lesson 1". The number of statements can reach several billion. To collect data in xAPI format, a special storage is needed: the Learning Record Store (LRS). The LRS validates incoming data for compliance with the standard, processes and sorts them, and supplies them for building customizable visual dashboards in specialized educational analytics systems or universal BI tools (PowerBI, Tableau, Qlik).
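A statement of the kind described above can be sketched as a small JSON document with an actor, a verb and an object. The example below is illustrative only: the helper name `build_statement`, the learner e-mail and the activity URL are assumptions made for the sketch, and a real LRS may require further fields and authentication that are not shown here.

```python
import json
import uuid
from datetime import datetime, timezone


def build_statement(learner_email: str, learner_name: str,
                    activity_id: str, activity_name: str) -> dict:
    """Build a minimal xAPI-style statement: who (actor) did what (verb) to what (object)."""
    return {
        "id": str(uuid.uuid4()),  # unique identifier of this statement
        "actor": {
            "objectType": "Agent",
            "name": learner_name,
            "mbox": f"mailto:{learner_email}",
        },
        "verb": {
            # "completed" verb from the ADL vocabulary
            "id": "http://adlnet.gov/expapi/verbs/completed",
            "display": {"en-US": "completed"},
        },
        "object": {
            "objectType": "Activity",
            "id": activity_id,  # hypothetical activity IRI
            "definition": {"name": {"en-US": activity_name}},
        },
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical learner and lesson, used only for illustration.
statement = build_statement(
    "learner@example.com", "Example Learner",
    "https://example.com/courses/demo/lesson-1", "Lesson 1",
)
print(json.dumps(statement, indent=2))
```

The activity provider would send such a record to the LRS, which validates it before storing it.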

Some LRS have built-in learning analytics tools, are equipped with additional applications (for example, to launch e-courses) or are part of the LMS. Popular LRS systems are well-established in the educational analytics market.

External educational analytics platforms and educational record repositories provide extensive opportunities for data aggregation from a variety of sources, their analysis and visualization. It should be borne in mind that when using such solutions, dedicated resources are required for installation, support, and most importantly, scaling.

The most common methods of data analysis in education, systematized by the author on the basis of a review of domestic and foreign studies, are presented in Table 1.

Table 1

The most common methods of data analysis in education

| Method | Goal | Application |
|---|---|---|
| Causal mining (analysis of cause-and-effect relationships) | Identify cause-and-effect relationships | Understand which behavioral traits of learners lead to learning, failure, dropping out of training, etc. |
| Clustering | Identify groups with similar properties | Group training materials or students based on the patterns found |
| Discovery with models (finding relationships using machine learning models) | Use an existing model as a component of the current analysis | Identify the relationship between learner characteristics or contextual variables and learner behavior. Build psychometric frameworks into a machine learning model |
| Distillation of data for human judgment | Present the data in a form suitable for visualization and interaction | Provide methodologists with data that makes it convenient to track and analyze the activity of students and their progress in the program |
| Knowledge tracing | Assess the mastery of knowledge using logs of students' responses to tasks and a cognitive model that maps tasks to the skills required to solve them | Track the dynamics of progress |
| Non-negative matrix factorization | From the students' scores on test tasks, build a matrix of non-negative numbers, which is then factorized into two matrices, one for tasks and one for skills, to simplify the analysis | Evaluate the skills of learners |
| Outlier detection | Identify atypical learners | Determine which learners are not keeping up with the pace of the program or, conversely, are significantly ahead of it |
| Prediction | Infer the target variable from combinations of other variables using classification, regression, density estimation and other methods | Predict a learner's progress and determine their behavior model |
| Process mining | Reconstruct a process from event logs | Draw conclusions about the behavior of students |
| Relationship mining | Study the relationships between variables and identify patterns using association rule mining, correlation analysis and causal mining | Identify the relationship between learners' behavioral patterns and the difficulties they experience in learning |
| Statistical methods | Describe what is happening and draw conclusions using mathematical statistics | Analyze, interpret and draw conclusions from educational analytics data |
| Social network analysis | Analyze the social relationships between the elements of a network | Study the structure and relationships in group activities, as well as the interaction of students in the educational chat |
| Text mining | Extract data from text information | Analyze the contents of chats, homework and the rest of the learners' textual digital footprint |
| Visualization | Display data graphically | Present the data in a visually understandable way |
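As an illustration of one row of Table 1, outlier detection, the sketch below flags learners whose pace differs sharply from that of the group. It uses a simple z-score rule, which is only one of many possible techniques and is not necessarily the one used in the reviewed studies; the function name, the threshold of 2 standard deviations and the sample data are assumptions.

```python
from statistics import mean, pstdev


def find_outliers(progress: dict[str, float], threshold: float = 2.0) -> dict[str, str]:
    """Flag learners whose progress lies more than `threshold` standard deviations from the mean."""
    values = list(progress.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the group
    if sigma == 0:
        return {}  # everyone moves at the same pace, nothing to flag
    flagged = {}
    for learner, value in progress.items():
        z = (value - mu) / sigma
        if z <= -threshold:
            flagged[learner] = "behind"
        elif z >= threshold:
            flagged[learner] = "ahead"
    return flagged


# Hypothetical data: number of lessons completed by each learner so far.
lessons_completed = {
    "learner_01": 10, "learner_02": 11, "learner_03": 9, "learner_04": 10,
    "learner_05": 12, "learner_06": 10, "learner_07": 11, "learner_08": 9,
    "learner_09": 10, "learner_10": 1,
}
print(find_outliers(lessons_completed))  # {'learner_10': 'behind'}
```

With small groups a z-score rule is crude, because a single extreme value inflates the standard deviation; robust alternatives such as median-based rules may suit real data better.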

 

Modern platforms allow you to collect data on a variety of user actions, so for the purposes of educational analytics, information is available about literally every mouse movement.

The areas of research interests related to data mining in education and educational analytics are presented in Table 2.

| Area of research interest | Goal |
|---|---|
| Analysis of learning theories | Analyze how learning theories and learning analytics can be integrated into educational research |
| Analysis of pedagogical strategies | Analyze the application and effectiveness of pedagogical strategies using data mining and educational analytics |
| Program code analysis | Apply data mining methods to programming courses and to code written by students |
| Group training and teamwork | Analyze the dynamics of group work and predict team results |
| Intellectual analysis of the educational program | Analyze the structure of the program, the evaluation system and other components of the program in order to improve it |
| Dashboard analysis | Apply visualization techniques to analyze the collected digital footprint |
| Deep learning | Apply multi-layer neural networks for data mining purposes |
| Detection of cause-and-effect relationships | Identify cause-and-effect relationships in the training data |
| Early warning systems | Predict student behavior that may lead to academic failure, in order to prevent it |
| Analysis of emotional states | Study the influence of emotional states on learning |
| Assessment of the effectiveness of interventions | Evaluate the effectiveness of decisions and of the actions taken on their basis |
| Feature engineering | Automatically identify significant features in the data for training the model |
| Game analytics | Apply data mining and visualization techniques to player interactions and the choices players make during game-based learning |
| Student models | Develop data models that describe learners and are convenient for processing and use |
| Measuring the results of self-study | Apply data mining techniques to evaluate the results of self-study |
| Multimodal learning analysis | Apply machine learning methods to data obtained from wearable devices and sensors to generate insights about learning in different contexts |
| Personalized feedback | Generate feedback automatically or semi-automatically to involve students in the learning process |
| Sentiment analysis (positive/negative) | Automatically detect emotional coloring and tone in training content and in content created by learners |
| Transfer learning | Develop machine learning models that can be transferred to different contexts and domains |
| Identification of trajectories | Study the processes within a course and the logs of the learning system to identify possible stable educational trajectories |
| Semantic analysis of texts | Apply text analysis tools to information from chats, social networks, homework, essays, etc. |

               

As a rule, changes to a training program are made for three complementary purposes:

1. To improve the learner's experience (the student's point of view). Does the program allow the learner to achieve the goal they set (to acquire new skills, change specialty, etc.)? Is the material suited to the learner's level of training?

2. To optimize the curriculum (the point of view of the methodologist / pedagogical designer). How successfully do students achieve their learning goals? Which modules need improvement?

3. To improve the efficiency of the organization (the point of view of the training customer). Are the resources invested in the program paying off in the long run?

The selected goal (or goals) determines the relevant data sources and the indicators whose changes will be monitored and evaluated. The contexts in which different metrics are used may overlap, so each metric must be clearly mapped to a goal.

A metric is just a numerical indicator; in isolation from context and goals, it remains only a figure. It is analytical work that turns a metric into a flag signalling the need to change something, confirming effectiveness or, conversely, refuting a hypothesis.

Metrics of learning effectiveness help to assess the quality of educational content and its individual parts, the level of training of teachers and presenters, the progress of listeners.

Examples of educational effectiveness metrics [2,3,5]:

1. Learner loyalty index (NPS, Net Promoter Score). Calculation: ask the course participants one question: "On a scale from 1 to 10, how likely are you to recommend our course to colleagues or acquaintances?" The metric is the difference between the shares of promoters and detractors: from the percentage of respondents who chose 9 or 10 (promoters), subtract the percentage of respondents who chose 1 to 6 (detractors). Neutral respondents who chose 7 or 8 are counted in the total but belong to neither group.

2. Learner satisfaction index (CSI, Customer Satisfaction Index). It is recommended to measure it after each interaction with the target audience: after a webinar or a completed course module. This indicator shows which elements of the course worked best and which should be improved. Calculation: learners rate their interaction with the educational program as a whole or with its individual elements (exercises, theoretical part, teacher's work, etc.). The rating scale can vary: from 1 to 10, or simply a "yes/no" answer.

3. Completion rate (COR). Shows the percentage of learners who completed the training out of those who enrolled or started. Calculation: divide the number of students who successfully completed the training by the number of those who started it and express the result as a percentage.

4. Churn rate. Shows what percentage of learners dropped out of the training during a certain period of time.
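The calculations for NPS, completion rate and churn rate can be sketched as follows. This is a minimal illustration under assumptions: the function names and sample numbers are hypothetical, the 1 to 10 scale follows the description above, and the churn denominator (learners active at the start of the period) is one common choice that the text does not specify.

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score in percentage points, for answers on a 1-10 scale."""
    if not scores:
        raise ValueError("at least one answer is required")
    promoters = sum(1 for s in scores if s >= 9)   # answers 9-10
    detractors = sum(1 for s in scores if s <= 6)  # answers 1-6
    # Neutral answers (7-8) count only in the denominator.
    return 100.0 * (promoters - detractors) / len(scores)


def completion_rate(completed: int, started: int) -> float:
    """Percentage of learners who finished the course among those who started it."""
    return 100.0 * completed / started


def churn_rate(dropped: int, active_at_start: int) -> float:
    """Percentage of learners who dropped out during the period (assumed denominator)."""
    return 100.0 * dropped / active_at_start


# Hypothetical survey answers and enrollment figures.
answers = [10, 9, 9, 8, 7, 6, 5, 10, 3, 9]
print(nps(answers))             # 20.0: 5 promoters, 3 detractors, 10 answers
print(completion_rate(45, 60))  # 75.0
print(churn_rate(6, 60))        # 10.0
```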

Questions that such metrics help answer:

- What components of the program cause difficulties?

- What elements of the program cause dissatisfaction?

- How fast are the students progressing in the program?

Measuring such indicators is extremely valuable for optimizing the program, but such data is usually difficult to collect automatically: it comes from the support department, the career center, teachers' reports and feedback questionnaires.

Consider the analytics of best practices [8,13].

Case study of the University 20.35 "How to measure soft skills based on a digital footprint".

University 20.35 often faces the need to evaluate group activities, for example, the result of teamwork on a project or the joint work of students in the "startup as a diploma" format. At the same time, it must be possible to single out the individual contribution of each participant from the data on group activities. Universal competencies share one universal problem: it is not always clear what exactly to measure and how. Unlike most skills, which are tied to a specific tool or a specific task, soft skills are not easy to assess. Such competencies are manifested only in activity, together with other skills. In addition, few soft skills are defined so clearly that all analysts and methodologists unanimously accept the definition.

Getting definitions by expert means, no matter how professional the expert's opinion may be, is not always the most effective approach. It does not allow achieving a consensus decision with a large number of experts. Instead, we use the "ontological proximity" approach, which allows us to identify what is meant by a particular soft skill in each case.

A fairly large amount of data has been collected from job descriptions, events, resumes, and articles. Thus, the automatically updated data set includes 150 million job descriptions across Russia and the world.

With the help of semantic analysis of texts using artificial intelligence, an updated description of the professional field was obtained, including functional positions, labor functions, subject areas and skills, including soft skills.

Case "Yandex. Practicum": the metric of cognitive capacity and [13,14].

The educational product "Yandex. Practicum" has a hierarchical structure (see Fig. 1):

Fig. 1. The structure of the educational product [3,13].

Each lesson has educational goals, whose achievement is checked with the help of tasks. The lowest-level elements, tasks, are measured by several metrics:

- difficulty (the proportion of students who completed the task correctly among those who attempted it);

- discriminativeness (the difference in the solvability of the task between strong and weak students, that is, the ability of the task to separate learners into strong and weak);

- the attempt index (the difference in the solvability of the task from attempt to attempt; this index applies only to simulators).

The methodologist sees these coefficients in a dashboard and decides which tasks need to be reworked. For example, it may be necessary to remove tasks with zero solvability, simplify extremely difficult tasks, and improve tasks with low discriminativeness as well as simulator tasks with a zero attempt index.

The dashboard of coefficients is useful but not very visual: methodologists are busy, and it can be difficult to evaluate each problem task by numerical coefficients, so a system of visual, easy-to-read text markers based on these coefficients has been introduced.
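The first two task metrics, difficulty and discriminativeness, can be sketched as follows. The code is an illustration and not the implementation used in the case: the function names and data layout are hypothetical, and "strong" and "weak" students are assumed to be the top and bottom halves of the group by total number of correct answers. The attempt index is not covered.

```python
from typing import Optional

# responses[student][task] is True if the student solved the task correctly.
Responses = dict[str, dict[str, bool]]


def task_difficulty(responses: Responses, task: str) -> Optional[float]:
    """Share of correct solutions among the students who attempted the task."""
    attempts = [r[task] for r in responses.values() if task in r]
    if not attempts:
        return None
    return sum(attempts) / len(attempts)


def task_discrimination(responses: Responses, task: str) -> Optional[float]:
    """Solvability among strong students minus solvability among weak students."""
    ranked = sorted(responses, key=lambda s: sum(responses[s].values()))
    half = len(ranked) // 2
    if half == 0:
        return None  # not enough students to form two groups
    weak, strong = ranked[:half], ranked[-half:]

    def share(group: list[str]) -> Optional[float]:
        attempts = [responses[s][task] for s in group if task in responses[s]]
        return sum(attempts) / len(attempts) if attempts else None

    p_strong, p_weak = share(strong), share(weak)
    if p_strong is None or p_weak is None:
        return None
    return p_strong - p_weak


# Hypothetical answers of four students to three tasks.
sample: Responses = {
    "s1": {"t1": True, "t2": True, "t3": False},
    "s2": {"t1": True, "t2": True, "t3": False},
    "s3": {"t1": True, "t2": False, "t3": False},
    "s4": {"t1": True, "t2": False, "t3": False},
}
for t in ("t1", "t2", "t3"):
    print(t, task_difficulty(sample, t), task_discrimination(sample, t))
```

In this sample, task t3 has zero solvability and task t1 does not separate strong students from weak ones, which are the kinds of tasks a methodologist would rework.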

Skypro case: Metrics for a new profession training product [13,17].

Educational products for obtaining a new profession (vocational training) are offered by many market players: universities, organizations of secondary vocational and additional education, EdTech companies, online universities, etc. [2,5] What metrics can be used to assess the success of retraining students who only yesterday worked as miners or confectioners or drove a tram? Simple solutions are not suitable: for example, the completion rate cannot be taken as the main metric, because the learner's goal is not to reach the end of the course but to get a job in a new specialty.

For a learner who comes to the program to change profession, the important result is getting a job within a clear period of time and with a clear income. To measure and track this result, the following indicators are used:

- the speed of receiving an offer;

- the number of offers;

- the starting salary.

When a program for each profession is developed and assembled:

- the job market is analyzed (the number and dynamics of positions opening and closing), along with data on the starting salaries of employees in entry-level positions;

- a list of necessary competencies is compiled from both the analysis of open positions and the results of 30-50 interviews with HR and hiring managers;

- the program is compiled and validated together with market experts.

This affects the content of the program: business tasks relevant to the market were added to immerse newcomers in the industry and its specifics, and at the end of the program career consultations, practice interviews, help with resume preparation, and retrospectives with graduates after their first applications and interviews were added. As a result, it is advisable to focus on three sets of metrics: internal product metrics, metrics of the effectiveness of the educational program, and market metrics for employment and student feedback.

 

 

               

Conclusions

1.       Educational analytics combined with intellectual analysis of educational data makes it possible to develop accurate models that characterize the behavior of students and their properties, the strengths and weaknesses of content and interaction with it, and team and group dynamics. Data for these purposes is collected from a variety of sources, which requires additional resources for organizing storage of and access to the data and for its subsequent processing for analytical purposes. This problem is partially solved by a store of educational records (LRS); however, specific knowledge in data engineering and data analysis is still indispensable.

2.       The study presents metrics of the effectiveness of an educational product. These indicators help to assess the quality of educational content and its individual parts, the level of training of teachers and presenters, the progress of listeners.

3. External platforms of educational analytics and repositories of educational records provide ample opportunities for data aggregation from various sources, their analysis and visualization. It should be borne in mind that when using such solutions, dedicated resources are required for installation, support, and most importantly, scaling.

4. At the level of learning strategy, educational analytics gives confidence that the decisions taken will correspond to the strategic goals of the educational organization. For those responsible for training, educational analytics in conjunction with the intellectual analysis of educational data provides a framework for prioritization, decision-making and effective allocation of resources.

5. For pedagogical designers and methodologists, the presented methods will become the foundation for optimizing the program. Thanks to the presented methods, students receive the most relevant, engaging and meaningful educational experience.

References
1. Amaeva L.A. (2017) Comparative analysis of data mining methods. Innovative science, 2-1, 27-29.
2. Vilkova K.A., Zakharova U.S. (2020) Educational analytics in traditional education: its role and results. University management: practice and analysis, 24, 3, 59-76.
3. Datsun N.N., Urazaeva L.Yu. (2017) Promising areas of application for learning analytics. Scientific notes of the IUO RAO, 1 (61), 43-46.
4. Dirk Ifenthaler, Dana-Kristin Mah, Jane Yin-Kim Yau. Utilizing Learning Analytics to Support Study Success, 2019. Retrieved from: http://sber.me/?p=292fN
5. Dubovik O.V. (2017) Pedagogical design in Russian education. Education. Science. Innovation: Southern Dimension, 5-6 (46), 59-67.
6. David Niemi. Learning Analytics in Education, 2018. Retrieved from: http://sber.me/?p=kBPrb
7. Ivanova I.A. (2018) Research of resources of the corporate portal in the management of personnel involvement. Management of personnel and intellectual resources in Russia, 7, 1, 27-33.
8. Educational Data Mining, online course, University 20.35. Retrieved from: http://sber.me/?p=2RZbZ
9. Carl Anderson. Creating a Data-Driven Organization, 2015. Retrieved from: http://sber.me/?p=G6p4S
10. Klyachko T.L. Challenges of professional education. Retrieved from: http://www.ifap.ru/library/book557.pdf
11. Corporations teach specialists to survive in new conditions. Retrieved from: https://plus.rbc.ru/news/5f4bc8e27a8aa901222dbcc1
12. Nikolaev N.A. (2016) Improving the efficiency of labor of personnel of small enterprises based on increasing involvement in the organization and development of corporate culture. Human Progress, 2, 2, 2.
13. Sverdlov M.B. (2021) Educational analytics: managing an educational organization and creating content based on data. Retrieved from: http://sber.me/?p=LPG6h
14. Creative Cognition and Brain Network Dynamics. Retrieved from: http://sber.me/?p=tpBRN
15. Kausar S., Oyelere S.S., Salal Ya.K., Hussain S., Cifci M.A., Hilcenko S., Iqbal M.S., Zhu W., Xu H. (2020) Mining smart learning analytics data using ensemble classifiers. International Journal of Emerging Technologies in Learning, 15 (12), 81-102.
16. LinkedIn. Workplace Learning Report. Retrieved from: http://sber.me/?p=gH64M
17. Robust prediction of individual creative ability from brain functional connectivity. Retrieved from: http://sber.me/?p=dMN61

First Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.
The list of publisher reviewers can be found here.

Subject of the study. The article enumerates data analysis methods. According to its title, the text should present methods of data mining and educational analytics, yet the author has not even defined the terms "data mining" and "educational analytics".

Research methodology. The author lists well-known facts and judgments. The presented text is not the result of a scientific study.

Relevance. The chosen research topic is relevant because intelligent data processing methods can significantly accelerate technological processes in organizations of any type of economic activity and any form of ownership. These methods can be useful in both commercial and non-profit organizations. They may be of particular interest to public authorities of the Russian Federation and local governments, including for data processing when monitoring the achievement of the national development goals of the Russian Federation, defined by the Decree of the President of the Russian Federation dated 07/21/2020.

Scientific novelty. The peer-reviewed article contains no scientific novelty. The author merely retells well-known material in a declarative manner. The text also lacks practical and theoretical significance.

Style, structure, content. The style of presentation is partly scientific. Although there are no journalistic or colloquial statements, the sentences are not formulated clearly enough (for example, the first sentence reads: "In addition, it is necessary to determine what data will be needed, where they will come from and for what analytical purposes they will be collected."). The structure of the article is not sufficiently correct or well thought out. In particular, the introduction does not substantiate the relevance of the study or set its goals and objectives. The article begins with the definition of research methods. Normally, the subject of the study is formulated first, the relevance of the problem is justified, the goal is set and the tasks are determined, and only then are the tools needed to solve the problem selected. The article contains a "Research methodology" heading, but nothing is written under it. Most likely, the author postponed filling in this block of the introduction and later forgot to return to it. An article submitted for review should be proofread as thoroughly as possible. The main part of the article retells well-known facts without any assessment or discussion by the author. There is no analysis, no problems are identified on the research topic, and no proposals for solving them are developed.

Bibliography. The author has compiled a bibliographic list of 11 sources, which seems insufficient for the chosen research topic. It is recommended to study further domestic and foreign scientific publications, especially those from 2021. Moreover, the author should analyze numerical material published in statistical databases, which would make it possible to substantiate the author's judgments and conclusions (perhaps that is why they are missing from the reviewed article).

Appeal to opponents. The text of the article not only lacks any scientific discussion of the research results of domestic and foreign authors, but also contains no references to the sources in its list of references.

Conclusions, the interest of the readership. Taking the above into account, the author needs to do substantial work to correct this article: 1) write an introduction that includes all key elements (the relevance, subject, purpose and objectives, methodology, and practical and theoretical significance of the study, with the tools clarified in light of these elements); 2) become acquainted with the research of other authors (both those already in the bibliographic list and other scientific publications, including those from 2021); 3) conduct a comprehensive study of the selected problem using the selected tools; 4) interpret the results obtained, identifying possible causes and consequences; 5) compare the results obtained with those of other authors and identify possible causes of deviations and discrepancies; 6) formulate a set of the author's own recommendations for solving the identified problems. Because the article lacks scientific novelty and does not meet the requirements for scientific papers, it can be approved for publication only after substantial revision. Given the relevance of the research, and provided these comments are addressed, the article may be of potential interest to both the scientific community and practitioners (in virtually all fields of activity, since it is currently impossible to imagine any activity without automated data processing tools).

Second Peer Review


The reviewed article discusses approaches to the application of data mining and educational analytics methods. The author rightly links the relevance of the article to the need to move from discussing the possibilities of intellectual analysis of educational data to the practical implementation of educational analytics in educational processes. As elements of scientific novelty, the article names the systematization of areas of research interest related to data mining in education and educational analytics, as well as the justification for the possibility of developing accurate models that characterize the behavior of students, their properties, the weaknesses and strengths of content and interaction with it, and team and group dynamics.

Distinguishing two sets of methods in the analysis of educational data, educational analytics and intellectual analysis of educational data, the author explains the differences between them and notes that, in order to generate and test hypotheses, it is necessary to define appropriate metrics and to determine what data will be needed, where they will come from, and for what analytical purposes they will be collected. The author believes that, thanks to the advent of educational information systems and Big Data technologies, pedagogy has a chance to register quickly, continuously, and fully an extensive array of observations of the learning process, behavior, and academic performance of students. The article identifies 10 typical categories of data used in educational analytics and provides an overview of educational technologies and platforms that have their own means of collecting statistics and analytics. The most common methods of data analysis in education are presented in a separate table, which gives the name of each of the 14 methods, its purpose, and its scope of application. The following table shows the areas of research interest (21 in total) and their goals.

The article then examines the metrics of learning effectiveness that help to assess the quality of educational content and its individual parts, the level of teacher training, and the progress of students: the index of student loyalty, the index of student satisfaction, the level of profitability (before graduation), and the outflow indicator. It also gives an overview of the analytics of best practices (University 20.35, Yandex.Practicum, Skypro). The conclusions are summarized in 5 points. The bibliographic list includes 17 sources: publications of domestic and foreign scientists and Internet resources. The text contains references to the sources in the list of references.

The material submitted for review has some shortcomings. Firstly, the article does not contain a "Results and their discussion" section, which is generally accepted in modern scientific articles; it is not separated from the sections the article does have: Introduction, Research Methodology, Conclusions, Bibliography. Secondly, table 2 should be formatted in accordance with the accepted rules: it has no title. Thirdly, the methods considered in the article are not illustrated by any examples of actually constructed models of intellectual analysis of educational data, although the presented material is oriented toward this and generally contributes to the transition from discussion to the practical implementation of data mining and educational analytics.

The reviewed material corresponds to the subject of the journal "Modern Education", contains practically significant research results, and is focused on assessing the current state of educational programs and training systems and on improving them. The reviewed material is recommended for publication.