Administrative and municipal law
Reference:

Machine Learning and Big Data for Optimization of Administrative Law (Computing Experience)

Trofimov Egor Viktorovich

ORCID: 0000-0003-4585-8820

Doctor of Law

Deputy Director for Science, St. Petersburg Institute (Branch) of the All-Russian State University of Justice

199178, Russia, g. Saint Petersburg, 10-ya liniya V.O., 19, lit. A, kab. 36

diterihs@mail.ru
 

 
Metsker Oleg Gennad'evich

ORCID: 0000-0003-3427-7932

PhD in Technical Science

Researcher

199178, Russia, g. Saint Petersburg, 10-liniya V.O., 19 lit. A

olegmetsker@gmail.com
 

 

DOI:

10.7256/2454-0595.2022.4.39081

EDN:

IHYLJY

Received:

31-10-2022


Published:

07-11-2022


Abstract: The subject of the research is methods for analyzing and optimizing regulatory administrative and legal regulation on the basis of indicators. A qualitative assessment of the optimization of legislation is demonstrated using the resolution of the Governor of St. Petersburg dated 07.09.2015 No. 61-pg, which defines the main directions of public administration of socio-economic phenomena and processes in St. Petersburg. Comparing the indicators approved by this resolution, which serve the purposes of socio-economic development and administrative and legal regulation, with statistical socio-economic indicators shows how optimal the normative regulation is. Optimality is assessed by how well the normative indicators (goals) correspond to the statistical indicators that are most significant for migration flows in inner-city municipalities, as identified on large data sets by machine learning methods. Machine learning on large data sets identified the two most significant of the normative indicators, that is, of the goals of socio-economic development and normative regulation (landscaping costs and the costs of holding local holidays and sporting events), and also a significant statistical indicator that is not recognized as a goal of territorial development (environmental protection costs). The results also point to the most important areas of activity for higher levels of public authority, corresponding to the significance of indicators for the migration flow: preschool and school education, healthcare for children and elderly citizens, and the creation of an accessible (comfortable) environment for them. The results are of methodological importance, since they open the way to the use of numerical statistical indicators, and can be useful for evaluating the optimization of regulation and legal (regulatory) policy. 
Machine learning based on big data in the social, demographic, economic and environmental fields can become an important tool for optimizing administrative legislation and public administration.


Keywords:

law, artificial intelligence, methodology, digital state, big data, machine learning, statistics, indicator, administrative law, legislation

This article is automatically translated. You can find original text of the article here.

1. Introduction

The spheres of public administration and administrative and legal regulation are extremely extensive, diverse and complex.

They encompass a large volume of socio-legal interactions and cover a significant (if not the largest) share of all socio-legal phenomena and processes. These circumstances have always made it difficult to develop optimal administrative and legal regulation, since the human mind cannot gather, comprehend and analyze within a single intellectual process the huge array of information about the phenomena and processes occurring in this area.

The realities of today, associated with the accumulation of data and the development of computing power, allow us to begin solving problems of optimizing administrative and legal regulation using computer methods and technologies focused on working with big data. Such methods and technologies are suitable not only for processing large amounts of information, but also for detecting complex (implicit) connections between phenomena and processes that are inaccessible to search and substantiation by "manual" methods.

The introduction of high-performance computing and big data into the sphere of public administration and legal regulation is the next stage in the digital transformation of the state and law, on which scientists and practitioners in Russia and abroad are working. Research and development based on big data in this area requires interdisciplinary integration and therefore remains extremely rare, and its results are still modest. The authors have made a systematic review of the development of computer systems and methods in legal research and legal practice in a separate work [1]; here it is worth mentioning only some recent Russian works.

For example, big data on search queries about regional crime from Yandex Internet repositories has been analyzed using the GMDH (group method of data handling) method, which showed fairly high accuracy (94-96%) when checked against official statistics [2]. However, that study was not focused on legal goal-setting, and its methodological status (substitutive or complementary to the traditional legal methodology) unfortunately remained uncertain.

The opposite example is a theoretical consideration of the problems of interpreting the results of big data analysis in legal research [3]. That work, conceived as a legal study, turned out to be detached from the methodological and technological (computer) side of the issue. Its authors did not analyze computer methods and technologies and relied largely on commercial (advertising) information from non-scientific sources. As a result, their unfamiliarity with the computer aspect of the problem led to rather sharp conclusions about the need to set legal regulation against high-tech solutions developed on the basis of big data. It also led to dubious theses about the non-interpretability, closedness, non-discursiveness and retrospectivity of automated decision-making, which contradict decades of development and scientific research in the "law & AI" segment, as well as an extensive body of world computer and interdisciplinary scientific literature.

In 2022, at the X St. Petersburg International Legal Forum, the results of an experiment conducted by Megafon at three magistrates' court districts of the Belgorod region were presented. The experiment was an attempt to automate the processing of applications for a court order, including the generation of record and statistical cards and of the draft court orders themselves [4]. The authors were optimistic, claiming a 96% reduction in the time needed to fill out the case card and an 84% reduction in the time needed to prepare a judicial act. However, the chairman of the Belgorod Regional Court, O. Y. Uskov, who oversaw the experiment on behalf of the judicial system, pointed out that in practice these advantages were offset by the same (if not greater) labor costs for checking the machine output and manually correcting numerous errors. Although this experiment did not involve big data and focused on integrating search and management functions, it was still based on machine learning technologies (including text recognition and structuring), and it should on the whole be considered successful, especially given the positive foreign (Spanish [5], Italian [6], British [7], etc.) experience in developing similar legal content management systems.

In 2021, the Ministry of Justice of Russia announced testing of an automated system for the examination of regulatory legal acts [8]. As of 2022, this system had been implemented at the NPCI under the Ministry of Justice of Russia, with functionality for the automatic detection of corruption-causing factors. However, this development mainly addresses a search task (identifying duplications, unacceptable elements, etc.). Therefore, despite its practical value in automating a number of intellectual operations, it has not revolutionized legal research or legal practice.

2. Problem and purpose

Integration of computer methodology with methods and tasks of legal sciences is a fundamental scientific problem.

In a series of previously published works based on computational experiments in the field of administrative-tort and criminal law, the authors developed and tested interdisciplinary methodological approaches to the automated analysis and qualitative assessment of legal regulation based on mathematical and socio-legal indicators. These approaches seem promising for the further search for solutions to this fundamental problem.

At the same time, despite the importance of these two protective areas (administrative-tort and criminal), they are usually considered less problematic because of their compactness and high degree of systematization. The relevant codes (the Code of Administrative Offences of the Russian Federation, the Criminal Code of the Russian Federation, the Code of Criminal Procedure of the Russian Federation) and the practice of their application receive great attention from the legislator, law enforcement bodies and the scientific community. By contrast, administrative legislation as such, which numbers hundreds of thousands of regulatory legal acts in force (or about 3 million if the municipal level is included), and the practice of its application in both the regulatory and the protective spheres are seen as too massive and heterogeneous for automated processing to begin. Nevertheless, it is precisely the complex nature of socio-legal relationships, clearly manifested in public administration and administrative and legal regulation, that compels us to turn to computational experiments in this area, building on the results obtained on the better-studied administrative-tort and criminal law material.

The purpose of this work is to further develop and test an indicator approach to the qualitative assessment of the optimization of legislation, including an assessment of the applicability of the interdisciplinary methodology previously developed on administrative-tort and criminal law material.

This article presents the results of computational experiments aimed at developing methods of analysis and optimization of regulatory administrative and legal regulation based on indicators. The most important task at this stage of the study, in contrast to the experiments performed earlier (in 2020-2021), was to use as social indicators not a limited set of goals of a pronounced legal nature [9, p. 18, 20], but a wide range of socio-legal goals based on indicators of socio-economic statistics. These indicators accumulate large arrays of numerical data and therefore offer the potential for a transition from a qualitative assessment of legal regulation to a quantitative analysis.

3. Methods and materials

The study was based on the interdisciplinary (computer-legal) methodology developed by the authors on the basis of the indicator approach for the qualitative assessment of the optimization of legal regulation. It includes the dogmatic method, system analysis and expert assessments, as well as computer methods (data collection, cleaning and preprocessing, natural language processing, markup, normalization and data mining, machine learning) [10].

The study was conducted in the subject area of administrative and legal regulation of a broad complex of socio-economic phenomena and processes relating to the territorial development of the region and the comfortable urban environment of a city of federal significance. The authors proceeded from the premise that higher positive migration reflects the socio-economic attractiveness of the region, and that there is no single universally recognized combination of indicators for determining the degree of influence of various factors on migration and for assessing the migration attractiveness of a region [11, pp. 421-422]. Indicators of the migration attractiveness of the urban environment proposed by various researchers include, for example, public health [12], inner-city traffic [13], social life [14], well-being of residents [15], planning, construction and design of housing [16], satisfaction with neighbors and housing [17], physical security [18], and widespread use of information and communication technologies [19].

For a qualitative assessment of the optimization of legislation, the resolution of the Governor of St. Petersburg No. 61-pg dated 07.09.2015 "On monitoring the social and economic development of inner-city municipalities of St. Petersburg and evaluating the effectiveness of local self-government bodies of inner-city municipalities of St. Petersburg" was taken, since it is this normative legal act that defines the main directions of public administration of socio-economic phenomena and processes in the federal city of St. Petersburg. The said resolution approved the indicators on the basis of which the annual monitoring of the social and economic development of inner-city municipalities of St. Petersburg and the evaluation of the effectiveness of the activities of local self-government bodies of inner-city municipalities of St. Petersburg is carried out.

Comparing the indicators approved by this resolution, which serve the purposes of socio-economic development and administrative and legal regulation, with statistical socio-economic indicators should demonstrate how optimal the established regulatory administrative and legal regulation is. Optimality is assessed by how well the normative indicators (goals) correspond to the statistical indicators that are most significant for migration flows in inner-city municipalities, as identified on large data sets by machine learning methods.

To conduct the study, statistical data were collected from the Unified Interdepartmental Information and Statistical System (EMISS). A dataset was formed from the values of 20 indicators of the structure of the migration flow, population size and density, 568 indicators of economic entities, and 1444 indicators of municipal districts characterizing all areas of their development, including activities in the field of culture, construction, communications, business, transport and ecology. The data cover 111 inner-city municipalities of the federal city of St. Petersburg over 4 years (2017-2020).
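The shape of such a dataset can be illustrated with a minimal sketch. It assumes the statistics arrive as long-form records and are pivoted into one row per (municipality, year) pair with one column per indicator; the record values, the identifiers `MO-1` and `MO-2`, and the function name `pivot_wide` are hypothetical and are not taken from EMISS or from the authors' code.

```python
from collections import defaultdict

# Hypothetical long-form records: (municipality, year, indicator, value).
# All names and numbers here are invented for illustration only.
records = [
    ("MO-1", 2017, "population", 52000.0),
    ("MO-1", 2017, "landscaping_costs", 310.5),
    ("MO-1", 2018, "population", 52400.0),
    ("MO-2", 2017, "population", 31000.0),
    ("MO-2", 2017, "landscaping_costs", 120.0),
]


def pivot_wide(rows):
    """Group values by (municipality, year) and collect the indicator names."""
    table = defaultdict(dict)
    columns = set()
    for municipality, year, indicator, value in rows:
        table[(municipality, year)][indicator] = value
        columns.add(indicator)
    return dict(table), sorted(columns)


table, columns = pivot_wide(records)

# One row per (municipality, year); a missing observation becomes None.
matrix = [[table[key].get(col) for col in columns] for key in sorted(table)]

# Sizes implied by the counts reported in the text, assuming every
# municipality-year pair is a row and every indicator is a column.
expected_rows = 111 * 4
expected_columns = 20 + 568 + 1444
print(expected_rows, expected_columns)
```

Under these assumptions the full table would have 444 rows and 2032 indicator columns.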

The data were indexed by municipality and year (rows), with each indicator as a column. The sample was divided into a training set (80%) and a test set (20%), after which a gradient boosting model was trained with XGBoost (XGBRegressor) on the above statistical indicators, with "internal migration growth" as the target. Regularization was performed to improve the model, with the best XGBRegressor parameters: base_score=0.5, booster=None, colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1, importance_type='gain', interaction_constraints=None, learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=0, num_parallel_tree=1, random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method=None, validate_parameters=False, verbosity=None. RMSE (414.73) and R-squared (0.74) were calculated, as well as the significance of predictors using the F1-score metric (for classification of metric error) and the Shapley index (for impact assessment).
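The evaluation step can be sketched in plain Python. The block below is an illustration under stated assumptions, not the authors' pipeline: it shows an 80/20 split and the standard definitions of RMSE and R-squared on invented numbers, and it does not train a gradient boosting model. The helper names `train_test_split`, `rmse` and `r_squared` are defined here for the example and are not library calls.

```python
import math
import random


def train_test_split(rows, test_share=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 444 row indices, assuming 111 municipalities observed over 4 years.
train, test = train_test_split(range(444))

# Invented target values and predictions, for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
print(len(train), len(test), rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

Because RMSE is expressed in the units of the target, a value such as 414.73 can only be judged against the typical scale of internal migration growth per municipality, whereas R-squared is unitless.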

4. Results

The significance of predictors (statistical indicators) using the F1-score metrics (30 predictors) and the Shapley index (20 predictors) is shown in Figures 1 and 2, respectively.

Among the main indicators contributing to the migration model are the contribution of the municipal district to fixed assets, environmental indicators, the cost per square meter and law enforcement costs. This suggests that people migrate deliberately to municipalities where housing and communal services are developing and where environmental protection and law enforcement are provided for. It is worth noting that the amount of housing commissioned is not among the top 20 migration indicators.

 

Fig. 1. Significance of predictors (statistical indicators) using F1-score metrics for internal migration growth.

 

According to the comprehensive analysis of predictors by the Shapley index, the 20 most significant statistical indicators affecting the migration attractiveness of inner-city municipalities of St. Petersburg include only three characteristics of the activity of municipalities:

— environmental protection costs;

— investments in fixed assets;

— activities in the field of culture, sports, leisure and entertainment.

Because of the specifics of the budget expenditures permitted to municipalities, investments in fixed assets mean spending on the improvement of residential neighborhoods (construction of playgrounds, installation of public outdoor exercise equipment, etc.), while activities in the field of culture, sports, leisure and entertainment mean spending on street celebrations and sporting events for the population. Thus, the Shapley index made it possible to identify the three most important targets for internal and external migrants: cleanliness, landscaping and leisure activities.

An interesting finding from the significance of predictors based on the Shapley index is the indicator "women 0-4 [years]", which points to the increased importance in the migration flow of girls aged 0 to 4; statistically, this group aggregates complex (implicit) links with the spectrum of migration factors (socio-economic indicators of the development of the territory). The same interpretation also reveals the structure of the flow, which consists of older men from the CIS countries and women over 90 from the regions of Russia. Thus, the main need in the most popular municipal districts is medical care for these socio-demographic groups, although indicators related to medical care are absent among the most significant predictors.

 

Fig. 2. Significance of predictors (statistical indicators) using SHAP value for internal migration growth.
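
To make the idea of a Shapley-based ranking concrete, the sketch below computes exact Shapley values for a three-feature toy model by averaging marginal contributions over all feature orderings. It is a simplification under stated assumptions: the toy model, its coefficients, the feature names and the all-zero baseline are hypothetical, and tree-based SHAP implementations estimate the same quantity differently (against a background dataset rather than a single baseline).

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) per feature."""
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]  # switch feature i from baseline to its actual value
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [total / len(orderings) for total in totals]


def toy_model(features):
    """Hypothetical migration-growth model with one interaction term."""
    landscaping, events, ecology = features
    return 2.0 * landscaping + 1.0 * events + 0.5 * landscaping * ecology


x = [3.0, 4.0, 2.0]          # invented indicator values for one municipality
baseline = [0.0, 0.0, 0.0]   # assumed reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the interaction between landscaping and ecology is shared equally between the two features, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline.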

 

The official indicators used as the goals of normative regulation and socio-economic development of territories, approved for evaluating the effectiveness of municipalities by the resolution of the Governor of St. Petersburg dated 07.09.2015 No. 61-pg, comprise 16 items, which can be summarized as follows:

— execution of the municipality's budget;

— expenses for the maintenance of municipal employees;

— the amount of contracts concluded with the winners of competitive procedures;

— expenses for landscaping;

— transfer of orphans to guardianship;

— expenses for local holidays and sporting events;

— the percentage of the population who took part in local holidays and sports events;

— circulation of the municipal newspaper.

Machine learning on large data sets made it possible to identify the two most significant of these indicators, that is, of the goals of socio-economic development and normative regulation (landscaping costs and the costs of holding local holidays and sporting events), and also to identify a significant statistical indicator that is not recognized as a goal of territorial development (environmental protection costs).
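
The comparison of normative goals with significant predictors can be sketched as a simple set operation. The block assumes the eight summarized goals listed in the text and the three municipal-activity predictors named in the Shapley analysis; the short keys are hypothetical labels, and the mapping from predictors to goals follows the interpretation given in the text (investments in fixed assets read as landscaping spending, cultural and sports activities read as spending on local events).

```python
# Summarized normative goals from the resolution, as listed in the text.
# The keys are hypothetical short labels, not official indicator names.
normative_goals = {
    "budget_execution",
    "municipal_staff_costs",
    "competitive_contracts",
    "landscaping_costs",
    "orphans_guardianship",
    "local_events_costs",
    "local_events_participation",
    "municipal_newspaper_circulation",
}

# Significant municipal-activity predictors mapped to the goal they
# correspond to under the text's interpretation (None = no matching goal).
significant_predictors = {
    "environmental_protection_costs": None,
    "investments_in_fixed_assets": "landscaping_costs",
    "culture_sports_leisure_activities": "local_events_costs",
}

# Goals confirmed by a significant predictor.
matched = {goal for goal in significant_predictors.values() if goal in normative_goals}

# Significant predictors that no normative goal recognizes.
unrecognized = sorted(name for name, goal in significant_predictors.items() if goal is None)

# Goals with no significant predictor behind them in this comparison.
unsupported = sorted(normative_goals - matched)

print(sorted(matched), unrecognized, len(unsupported))
```

With these inputs the comparison yields two matched goals and one unrecognized predictor, which mirrors the qualitative result stated above; the remaining goals are simply those without a significant municipal-activity predictor in this particular model.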

In addition, the attractiveness of municipalities depends on a whole set of factors that are identified:

— directly (cleanliness, landscaping and organization of leisure activities of residents in the municipality);

— due to the subsequent interpretation of the significance of predictors concerning the characteristics of the migration flow as a target indicator of the analysis (for example, the high significance of the migration of girls from 0 to 4 years).

The goals of normative regulation are set with regard to the level of public authority, which has its own competence (in this case, issues of local significance and certain delegated state powers of the Russian Federation and of the constituent entity of the Russian Federation) and budgeting (in this case, the fixed sources of income and expenses of the local budgets of inner-city municipalities). A qualitative assessment of the optimization of normative regulation therefore also involves the division of competence and powers between levels of public authority. The legal regulation of the goals of socio-economic development of inner-city municipalities of St. Petersburg should correlate not only with the activity of municipalities, but also with the activity of higher levels of public authority, because the powers of municipalities to carry out and finance a number of activities are limited. For example, municipalities in St. Petersburg have neither the competence nor the budget for healthcare, preschool and school education, so setting corresponding goals may be justified, but evaluating the activities of municipalities in these areas is not.

The data obtained make it possible to determine the most important areas of activity of higher levels of public authority, corresponding to the significance of predictors from the characteristics of the migration flow: preschool and school education, healthcare for children and senior citizens, creating an accessible (comfortable) environment for them.

5. Conclusion

An integrated array of official statistical indicators, together with the primary data underlying them, makes it possible to identify priority socio-legal goals: in this case, the main indicators (factors) affecting the attractiveness of a territory for the population, as well as the socio-demographic groups that require increased attention when regulating and administering the quality of life in an urban environment. Such groups include children and elderly people, who need appropriate medical care, education, leisure, a good environment, landscaping and special conditions for movement in an urban environment.

The results obtained are of methodological importance, since they open the way to the use of numerical statistical indicators, and can be useful for evaluating the optimization of normative regulation and legal (regulatory) policy. Machine learning based on big data in the social, demographic, economic and environmental fields can become an important tool for optimizing administrative legislation and public administration.

References
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

A REVIEW of an article on the topic "Optimization of administrative legislation based on machine learning and big data technologies (experience of computational experiments)".
The subject of the study. The article proposed for review is devoted to topical issues of changes and improvements in administrative legislation in connection with the development of modern technologies. As stated in the work itself, "This article presents the results of computational experiments aimed at developing methods of analysis and optimization of regulatory administrative and legal regulation based on indicators." The subject of the study was the norms of legislation, empirical data, and opinions of scientists.
Research methodology. The purpose of the study is explicitly stated in the article. As stated, "The purpose of this work is to further develop and test an indicator approach to the qualitative assessment of the optimization of legislation, including an assessment of the applicability of an interdisciplinary methodology previously developed on administrative-tort and criminal law material." Based on the set goals and objectives, the author has chosen the methodological basis of the study. In particular, the author uses a set of general scientific methods of cognition: analysis, synthesis, analogy, deduction, induction, and others. In particular, the methods of analysis and synthesis made it possible to generalize and separate the conclusions of various scientific approaches to the proposed topic, as well as to draw specific conclusions from empirical data. The authors propose their own methodology. In particular, it is noted that "The research was based on the interdisciplinary (computer-legal) methodology of qualitative assessment of optimization of legal regulation developed by the authors on the basis of an indicator approach, including the dogmatic method, system analysis and expert assessments, as well as computer methods (data collection, purification and preprocessing, natural language processing, markup, normalization and data mining, machine learning)". The most important role was played by special legal methods. In particular, the author actively applied the formal legal method, which made it possible to analyze and interpret the norms of current legislation (legal acts). For example, the following conclusion of the author: "For a qualitative assessment of the optimization of legislation, the decree of the Governor of St. Petersburg No. 61-pg dated 07.09.2015 "On monitoring the social and economic development of inner-city municipalities of St. Petersburg and evaluating the effectiveness of local government bodies of inner-city municipalities of St. Petersburg" was taken, since it is this normative legal act that defines the main directions of public administration of socio-economic phenomena and processes in the federal city of St. Petersburg. The said resolution approved the indicators on the basis of which the annual monitoring of the social and economic development of inner-city municipalities of St. Petersburg and the assessment of the effectiveness of local government bodies of inner-city municipalities of St. Petersburg is carried out." Thus, the methodology chosen by the author is fully adequate to the purpose of the study and makes it possible to study all aspects of the topic in its entirety.
Relevance. The relevance of the stated issues is beyond doubt. There are both theoretical and practical aspects of the significance of the proposed topic. From the point of view of theory, the topic of optimizing administrative legislation based on machine learning and big data technologies is complex and ambiguous. It is usually solved only on the basis of legal research. In the same paper, theoretical results related to the experience of computational experiments are presented, which increases the importance and relevance of the presented research. On the practical side, it should be recognized that there is a need to improve the practice of the activities of bodies on the application of administrative legislation, which can be carried out, inter alia, based on the results of the above study. Thus, scientific research in the proposed field should only be welcomed.
Scientific novelty. The scientific novelty of the proposed article is beyond doubt. Firstly, it is expressed in the author's specific conclusions. Among them, for example, is the following conclusion: "An integral array of official statistical indicators, as well as primary data forming these indicators, allows us to identify priority socio-legal goals, in this case, the main indicators (factors) affecting the attractiveness of the territory to the population, as well as socio-demographic groups that require increased attention when regulating and the administration of the quality of life in an urban environment: such groups include children and the elderly who need appropriate medical care, education, leisure, good ecology, landscaping and special conditions in an urban environment for movement. The results obtained are of methodological importance, since they have the potential to use numerical statistical indicators, and can be useful for evaluating the optimization of regulatory regulation and legal (regulatory) policy. Machine learning based on big data in the social, demographic, economic and environmental fields can become an important tool for optimizing administrative legislation and public administration." These and other theoretical conclusions can be used in further scientific research. Secondly, the author offers original generalizations of practice, which can be used by legislators and specialists in the field under study. Thus, the materials of the article may be of particular interest to the scientific community in terms of contributing to the development of science.
Style, structure, content. The subject of the article corresponds to the specialization of the journal "Administrative and Municipal Law", as it is devoted to legal problems related to the improvement of administrative legislation. The content of the article fully corresponds to the title, as the author considered the stated problems and achieved the research goal. The quality of the presentation of the study and its results should be recognized as fully positive. The subject, objectives, methodology and main results of the study follow directly from the text of the article. The design of the work generally meets the requirements for this kind of work. No significant violations of these requirements were found.
Bibliography. The quality of the literature used should be highly appreciated. The author actively uses the literature presented by authors from Russia and abroad (Trofimov E. V., Metsker O. G., Rogotskaya S., Storozhenko A., Casanovas P., Binefa X., Gracia C., Teodoro E., Galera N., Blázquez M., Poblet M., Carrabina J. and others). A large number of sources in foreign languages should be noted. Thus, the works of the above authors correspond to the research topic, have a sign of sufficiency, and contribute to the disclosure of various aspects of the topic.
Appeal to opponents. The author conducted a serious analysis of the current state of the problem under study. All quotations of scientists are accompanied by author's comments. That is, the author shows different points of view on the problem and tries to argue for a more correct one in his opinion.
Conclusions, the interest of the readership. The conclusions are fully logical, as they are obtained using a generally accepted methodology. The article may be of interest to the readership in terms of the presence in it of the systematic positions of the author in relation to the issues stated with the research topic. Based on the above, summing up all the positive and negative sides of the article, "I recommend publishing".