Administrative and municipal law
Reference:
Trofimov E.V., Metsker O.G.
Machine Learning and Big Data for Optimization of Administrative Law (Computing Experience)
// Administrative and municipal law.
2022. No. 4.
P. 12-24.
DOI: 10.7256/2454-0595.2022.4.39081 EDN: IHYLJY URL: https://en.nbpublish.com/library_read_article.php?id=39081
Machine Learning and Big Data for Optimization of Administrative Law (Computing Experience)
DOI: 10.7256/2454-0595.2022.4.39081
EDN: IHYLJY
Received: 31-10-2022
Published: 07-11-2022
Abstract: The subject of the research is indicator-based methods for the analysis and optimization of regulatory administrative and legal regulation. A qualitative assessment of the optimization of legislation is shown using the example of the resolution of the Governor of St. Petersburg dated 07.09.2015 No. 61-pg, which defines the main directions of public administration of socio-economic phenomena and processes in St. Petersburg. A comparison of the indicators approved by this resolution, which serve the purposes of socio-economic development and administrative and legal regulation, with statistical socio-economic indicators demonstrates how optimal the regulatory regulation is. This optimality is assessed by how well the normative indicators (goals) match the statistical indicators that are most significant for migration flows in inner-city municipalities, as identified on large data sets by machine learning methods. Machine learning on large data sets made it possible to identify the two most significant of the normative indicators, that is, of the goals of socio-economic development and regulatory regulation (the costs of landscaping and the costs of holding local holidays and sporting events), as well as a statistical indicator that is not recognized as a goal of territorial development (environmental protection costs). The results also made it possible to identify the most important areas of activity of higher levels of public authority, corresponding to the significance of indicators for the migration flow: preschool and school education, healthcare for children and elderly citizens, and the creation of an accessible (comfortable) environment for them.
The results obtained are of methodological importance, since they show the potential of using numerical statistical indicators, and can be useful for evaluating the optimization of regulation and legal (regulatory) policy. Machine learning based on big data in the social, demographic, economic and environmental fields can become an important tool for optimizing administrative legislation and public administration.
Keywords: law, artificial intelligence, methodology, digital state, big data, machine learning, statistics, indicator, administrative law, legislation
This article is automatically translated.
1. Introduction
The spheres of public administration and administrative and legal regulation are extremely extensive, diverse and complex. They accumulate a significant amount of socio-legal interactions and cover a significant (if not the largest) share of all socio-legal phenomena and processes. These circumstances have always created serious difficulties for the development of optimal administrative and legal regulation, since the human mind is not able to collect, comprehend and analyze in a single intellectual process the huge array of information about the phenomena and processes occurring in this area. Today's realities, associated with the accumulation of data and the growth of computing power, allow us to begin solving problems of optimizing administrative and legal regulation using computer methods and technologies designed for big data. Such methods and technologies are suitable not only for processing large amounts of information, but also for detecting complex (implicit) connections between phenomena and processes that cannot be found and substantiated by "manual" methods.
The introduction of high-performance computing and big data into the sphere of public administration and legal regulation is the next stage in the digital transformation of the state and law, on which scientists and practitioners in Russia and abroad are working. Research and development based on big data in this area require interdisciplinary integration and therefore remain extremely rare, and their results are still modest. The authors have given a systematic review of the development of computer systems and methods in legal research and legal practice in a separate work [1]; here it is worth mentioning only some recent Russian works. Thus, there is a well-known experience of using big data on search queries about regional crime from Yandex Internet repositories for analytical purposes using the GMDH method, which showed a fairly high accuracy (94-96%) as measured against official statistics [2]. However, this study was not focused on legal goal-setting, and its methodological status (substitutive or complementary to the traditional legal methodology) unfortunately remained uncertain. The opposite example is the theoretical consideration of the problems of interpreting the results of big data analysis in legal research [3]. This work, conceived as a legal study, turned out to be detached from the methodological and technological (computer) side of the issue. Computer methods and technologies were not analyzed by its authors, who largely relied on commercial (advertising) information from non-scientific sources.
As a result, the authors' unfamiliarity with the computer side of the problem led to rather sharp conclusions about the need to set legal regulation against the high-tech solutions developed on the basis of big data. It also led to dubious theses about the non-interpretability, closedness, non-discursiveness and retrospectivity of automated decision generation, theses that contradict decades of development and research experience in the "law & AI" segment as well as an extensive body of international computer science and interdisciplinary literature. In 2022, at the X St. Petersburg International Legal Forum, the results of an experiment run by Megafon at three magistrates' court precincts of the Belgorod region were presented. The experiment was an attempt to automate the processing of applications for court orders, including the generation of accounting and statistical cards and of draft court orders [4]. Despite the optimism of the authors, who claimed a 96% reduction in the time needed to fill out a case card and an 84% reduction in the time needed to prepare a judicial act, the chairman of the Belgorod Regional Court, O. Y. Uskov, who oversaw the experiment on behalf of the judicial system, pointed out that in practice these advantages were offset by the need for the same (if not greater) labor to check the machine results and manually correct numerous errors. Although the experiment did not involve big data and focused on integrating search and management functions, it was conducted on the basis of machine learning technologies (including text recognition and structuring) and should on the whole be considered successful, especially given the positive foreign (Spanish [5], Italian [6], British [7], etc.) experience in developing similar legal content management systems.
In 2021, the Ministry of Justice of Russia announced testing of an automated system for the examination of regulatory legal acts [8]. As of 2022, this system was implemented in the NPCI under the Ministry of Justice of Russia with functionality for the automatic detection of corruption-causing factors. However, this development is mainly related to the search task (identification of duplications, unacceptable elements, etc.), and therefore, despite its evident practical value in automating a number of intellectual operations, it has not revolutionized legal research and legal practice.
2. Problem and purpose
Integration of computer methodology with the methods and tasks of the legal sciences is a fundamental scientific problem. In a series of previously published works based on the results of computational experiments in the field of administrative-tort and criminal law, the authors developed and tested interdisciplinary methodological approaches for the automated analysis and qualitative assessment of legal regulation based on mathematical and socio-legal indicators; these approaches seem promising for the further search for solutions to this fundamental problem. At the same time, despite the importance of these two protective areas (administrative-tort and criminal), they are usually considered less problematic because of their compactness and high degree of systematization. The relevant codes (the Administrative Code of the Russian Federation, the Criminal Code of the Russian Federation, the Code of Criminal Procedure of the Russian Federation) and the practice of their application receive great attention from the legislator, law enforcement and the scientific community.
On the contrary, administrative legislation as such, which numbers hundreds of thousands of regulatory legal acts in force (or about 3 million if the municipal level is included), and the practice of its application in both the regulatory and protective spheres are seen as too massive and heterogeneous for automated processing to begin. Nevertheless, it is the complex nature of socio-legal relationships, clearly manifested in public administration and administrative and legal regulation, that forces us to turn to computational experiments in this area, drawing on the results obtained on the better-studied administrative-tort and criminal law material. The purpose of this work is to further develop and test an indicator approach to the qualitative assessment of the optimization of legislation, including an assessment of the applicability of the interdisciplinary methodology previously developed on administrative-tort and criminal law material. This article presents the results of computational experiments aimed at developing indicator-based methods of analysis and optimization of regulatory administrative and legal regulation. The most important task at this stage of the study, in contrast to the earlier experiments (2020-2021), was to use as social indicators not a limited set of goals of a pronounced legal nature [9, p. 18, 20], but a wide range of socio-legal goals based on indicators of socio-economic statistics, which accumulate large arrays of numerical data and hold the potential for a transition from a qualitative assessment of legal regulation to a quantitative analysis.
3. Methods and materials
The study was based on the interdisciplinary (computer-legal) methodology developed by the authors on the basis of the indicator approach for the qualitative assessment of the optimization of legal regulation, including the dogmatic method, system analysis and expert assessments, as well as computer methods (data collection, cleaning and preprocessing, natural language processing, markup, normalization and data mining, machine learning) [10]. The study was conducted in the subject area of administrative and legal regulation of a vast complex of socio-economic phenomena and processes of territorial development of the region and of the comfortable urban environment of a city of federal significance. The authors proceeded from the premise that higher positive migration reflects the socio-economic attractiveness of a region, and that there is no single universally recognized combination of indicators for determining the degree of influence of various factors on migration and for assessing the migration attractiveness of a region [11, pp. 421-422]. Various scholars count among the indicators of the migration attractiveness of the urban environment, for example, public health [12], inner-city traffic [13], social life [14], well-being of residents [15], planning, construction and design of housing [16], satisfaction with neighbors and housing [17], physical security [18], and widespread use of information and communication technologies [19]. For a qualitative assessment of the optimization of legislation, the resolution of the Governor of St. Petersburg No. 61-pg dated 07.09.2015 "On monitoring the social and economic development of inner-city municipalities of St. Petersburg and evaluating the effectiveness of local self-government bodies of inner-city municipalities of St.
Petersburg" was taken, since it is this normative legal act that defines the main directions of public administration of socio-economic phenomena and processes in the federal city of St. Petersburg. The said resolution approved the indicators on the basis of which the annual monitoring of the social and economic development of inner-city municipalities of St. Petersburg and the evaluation of the effectiveness of the activities of local self-government bodies of inner-city municipalities of St. Petersburg is carried out. The comparison of the indicators approved by this resolution, which serve the purposes of socio-economic development and administrative and legal regulation, with statistical socio-economic indicators should demonstrate how optimal the established regulatory administrative and legal regulation is. This optimality is assessed by the compliance of normative indicators (goals) with the most significant ones (for migration flows in inner-city municipalities) statistical indicators identified on large data sets by machine learning methods. To conduct the study, statistical data were collected from the Unified Interdepartmental Information and Statistical System (EMISS), a dataset was formed from the values of 20 indicators of the structure of the migration flow, population size and density, 568 indicators of economic entities and 1444 indicators of municipal districts characterizing all areas of development of the municipal district, including activities in the field of culture, construction, communications, business, transport, ecology, in the context of 111 inner-city municipalities of the federal city of St. Petersburg for 4 years (2017-2020). The data were measured by the values "municipality" and "year" (rows), the indicator was indicated as a column. 
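The table layout described above (one row per municipality and year, one column per indicator) can be illustrated with a minimal sketch. It assumes that the raw statistics arrive as long-format records; the municipality codes, indicator names and values are hypothetical and are not taken from EMISS.

```python
from collections import defaultdict

# Hypothetical long-format records: (municipality, year, indicator, value).
# Names and numbers are illustrative only, not actual EMISS data.
records = [
    ("MO-1", 2017, "landscaping_costs", 120.0),
    ("MO-1", 2017, "internal_migration_growth", 35.0),
    ("MO-1", 2018, "landscaping_costs", 130.0),
    ("MO-2", 2017, "landscaping_costs", 80.0),
    ("MO-2", 2017, "internal_migration_growth", -4.0),
]


def to_wide(rows):
    """Pivot long records into one dict of indicators per (municipality, year)."""
    wide = defaultdict(dict)
    for municipality, year, indicator, value in rows:
        wide[(municipality, year)][indicator] = value
    return dict(wide)


def to_matrix(wide, columns):
    """Return sorted row keys and a dense matrix; missing cells become None."""
    keys = sorted(wide)
    matrix = [[wide[key].get(column) for column in columns] for key in keys]
    return keys, matrix


wide_table = to_wide(records)
columns = ["landscaping_costs", "internal_migration_growth"]
row_keys, matrix = to_matrix(wide_table, columns)

for key, row in zip(row_keys, matrix):
    print(key, row)
```

With 111 municipalities observed over 4 years, a table of this shape has at most 111 × 4 = 444 rows, while the number of columns is in the thousands, so gaps such as the `None` cell above have to be handled before or during model training.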
The sample was divided into a test set (20%) and a training set (80%), after which a gradient boosting model was trained with XGBoost regression (XGBRegressor) on the above statistical indicators, with "internal migration growth" as the target. Regularization was performed to improve the model; the best XGBRegressor parameters were: base_score=0.5, booster=None, colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1, importance_type='gain', interaction_constraints=None, learning_rate=0.300000012, max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan, monotone_constraints=None, n_estimators=100, n_jobs=0, num_parallel_tree=1, random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method=None, validate_parameters=False, verbosity=None. RMSE (414.73) and R-squared (0.74) were calculated, as well as the significance of predictors using the F1-score metric (for classification of metric error) and the Shapley index (for impact assessment).
4. Results
The significance of predictors (statistical indicators) according to the F1-score metric (30 predictors) and the Shapley index (20 predictors) is shown in Figures 1 and 2, respectively. The main indicators contributing to the migration model include the investment of the municipal district in fixed assets, environmental indicators, the cost per square meter and law enforcement costs. This points to the deliberate nature of migration to municipalities in which housing and communal services are developing and in which a good environment and law enforcement are provided. It is worth noting that the amount of housing commissioned is not among the top 20 migration indicators.
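The 80/20 split and the two quality metrics reported in the Methods section (RMSE and R-squared) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names, the row count of 444 (111 municipalities × 4 years) and the toy target values are hypothetical, and no gradient boosting model is trained here.

```python
import math
import random


def train_test_split(rows, test_share=0.2, seed=0):
    """Shuffle row indices and split the rows into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_share)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical row identifiers: 111 municipalities x 4 years = 444 rows.
rows = list(range(444))
train_rows, test_rows = train_test_split(rows)

# Toy observed and predicted values of the target (illustrative only).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]

print(len(train_rows), len(test_rows))
print(rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

RMSE is expressed in the units of the target (here, migration growth), so the reported value of 414.73 can only be judged against the scale of that indicator, whereas R-squared is dimensionless.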
Fig. 1. Significance of predictors (statistical indicators) using F1-score metrics for internal migration growth.
The results of a comprehensive analysis of predictors obtained by the Shapley index, among the 20 most significant statistical indicators affecting the migration attractiveness of inner-city municipalities of St. Petersburg, include only three characteristics of the activity of municipalities: — environmental protection costs; — investments in fixed assets; — activities in the field of culture, sports, leisure and entertainment. Due to the specifics of budget expenditures allowed for municipalities, investments in fixed assets mean expenses for the improvement of residential neighborhoods (construction of playgrounds, installation of public sports simulators, etc.), and activities in the field of culture, sports, leisure and entertainment - expenses for street celebrations and sporting events for the population. Thus, the Shapley index made it possible to identify the three most important targets for internal and external migrants: cleanliness, landscaping and leisure activities. An interesting fact found in the results of calculating the significance of predictors based on the Shapley index is the identification of the indicator "women 0-4 [years]", which indicates an increased importance in the migration flow of girls under 4 years old, who statistically aggregate complex (implicit) links with the spectrum of migration factors (socio-economic indicators of the development of the territory). Also in this interpretation, you can see the structure of the flow, which consists of older men from the CIS countries and women over 90 from the regions of Russia. Thus, the main needs in the most popular municipal districts are medical care for these socio-demographic groups, although indicators related to medical care are absent among the most significant predictors.
Fig. 2. Significance of predictors (statistical indicators) using SHAP value for internal migration growth.
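The idea behind the Shapley index used for Figure 2 can be shown on a toy example. The sketch below computes exact Shapley values by enumerating feature coalitions against a baseline point; this is an assumption-laden illustration, not the tree-specific algorithm that SHAP libraries apply to gradient boosting models, and the function names, the toy model and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline point.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is practical only for a few features.
    """
    n = len(x)

    def value(coalition):
        point = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(point)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_model(p):
    # Hypothetical model with one interaction term between features 0 and 2.
    return 2.0 * p[0] + 1.0 * p[1] + 0.5 * p[0] * p[2]


x = [3.0, 1.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is shared equally between the two features involved. This is why a predictor such as a demographic group can rank high: its value absorbs part of the effect of the factors it interacts with.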
The official indicators used as the goals of regulatory regulation and socio-economic development of territories, approved for evaluating the effectiveness of municipalities by the resolution of the Governor of St. Petersburg dated 07.09.2015 No. 61-pg, include 16 positions, which in summary are as follows:
- execution of the municipality's budget;
- expenses for the maintenance of municipal employees;
- the amount of contracts concluded with the winners of competitive procedures;
- expenses for landscaping;
- transfer of orphans to guardianship;
- expenses for local holidays and sporting events;
- the percentage of the population who took part in local holidays and sporting events;
- circulation of the municipal newspaper.
Machine learning on large data sets made it possible to identify the two most significant of these indicators, that is, of the goals of socio-economic development and regulatory regulation (the costs of landscaping and the costs of holding local holidays and sporting events), as well as a statistical indicator that is not recognized as a goal of territorial development (environmental protection costs). In addition, the attractiveness of municipalities depends on a whole set of factors that are identified:
- directly (cleanliness, landscaping and organization of leisure activities of residents in the municipality);
- through the subsequent interpretation of the significance of predictors concerning the characteristics of the migration flow as a target indicator of the analysis (for example, the high significance of the migration of girls from 0 to 4 years).
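The comparison of normative goals with the most significant statistical predictors amounts to a set comparison once the two lists are mapped to common labels. The sketch below assumes such a manual mapping, following the article's own reading of "investments in fixed assets" as landscaping and of "activities in the field of culture, sports, leisure and entertainment" as local holidays and sporting events; the short labels are hypothetical, and only the summarized goals listed above are included.

```python
# Summarized normative goals from the resolution (hypothetical short labels).
normative_goals = {
    "budget execution",
    "municipal staff costs",
    "competitive contracts",
    "landscaping costs",
    "guardianship of orphans",
    "local holidays and sports costs",
    "holiday participation share",
    "municipal newspaper circulation",
}

# Municipal-activity indicators among the most significant predictors,
# mapped by hand to the same labels (an assumption of this sketch).
significant_predictors = {
    "environmental protection costs",
    "landscaping costs",
    "local holidays and sports costs",
}

# Goals confirmed by the data, significant factors that regulation omits,
# and goals that the data do not confirm as significant for migration.
confirmed = normative_goals & significant_predictors
missing_from_regulation = significant_predictors - normative_goals
unconfirmed = normative_goals - significant_predictors

print(sorted(confirmed))
print(sorted(missing_from_regulation))
print(len(unconfirmed))
```

The three resulting sets correspond to the three outcomes discussed in the text: goals supported by the statistics, a significant factor absent from the normative list, and goals whose link to migration attractiveness is not confirmed.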
The goals of regulatory regulation are formed with regard to the level of public authority, which has its own competence (in this case, issues of local significance and separate delegated state powers of the Russian Federation and of the subject of the Russian Federation) and its own budgeting (in this case, fixed sources of income and expenses of the local budgets of inner-city municipalities). A qualitative assessment of the optimization of regulatory regulation therefore also includes the issue of how subjects of competence and powers are divided between levels of public authority. The regulatory and legal regulation of the goals of socio-economic development of inner-city municipalities of St. Petersburg should correlate not only with the activity of municipalities, but also with the activity of higher levels of public authority, because the powers of municipalities to carry out and finance a number of activities are limited. For example, municipalities in St. Petersburg do not have the competence or the budget for healthcare, preschool and school education, so setting goals in these fields may be justified, but evaluating the activities of municipalities in them is not. The data obtained make it possible to determine the most important areas of activity of higher levels of public authority, corresponding to the significance of predictors from the characteristics of the migration flow: preschool and school education, healthcare for children and senior citizens, and the creation of an accessible (comfortable) environment for them.
5. Conclusion
An integral array of official statistical indicators, together with the primary data forming these indicators, allows us to identify priority socio-legal goals, in this case the main indicators (factors) affecting the attractiveness of a territory for the population, as well as the socio-demographic groups that require increased attention when regulating and administering the quality of life in an urban environment. Such groups include children and elderly people, who need appropriate medical care, education, leisure, a good environment, landscaping and special conditions for movement in an urban environment. The results obtained are of methodological importance, since they show the potential of using numerical statistical indicators, and can be useful for evaluating the optimization of regulatory regulation and legal (regulatory) policy. Machine learning based on big data in the social, demographic, economic and environmental fields can become an important tool for optimizing administrative legislation and public administration.
References
1. Trofimov, E.V., & Metsker, O.G. (2020). Application of computer techniques and systems in the study of law, intellectual analysis and modeling of legal activity: a systematic review. Proceedings of the Institute for System Programming of RAS, 32(3), 147–170, doi: 10.15514/ISPRAS-2020-32(3)-13.
2. Boldyreva, A., Alexandrov, M., Koshulko, O., & Sobolevskiy, O. (2017). Internet queries as a tool for analysis of regional police work and forecast of crimes in regions. Lecture Notes in Computer Science, 10061, 290–302, doi: 10.1007/978-3-319-62434-1_25.
3. Tikhomirov, Yu.A., Kashanin, A.V., Churakov, V.D., Osipova, P.M., Sklyar, V.D., & Grishina, D.A. (2021). Study of the problems of interpretation of the results of big data analysis in legal research: a final report on research. Moscow: Higher School of Economics. Reg. no. 222021800507-9.
4. Rogotskaya, S., & Storozhenko, A. (2022, July 1). Judicial activism should not go beyond the principle of competitiveness // Federal Chamber of Lawyers of the Russian Federation [Website]. URL: https://fparf.ru/news/fpa/sudebnyy-aktivizm-ne-dolzhen-vykhodit-za-predely-printsipa-sostyazatelnosti/
5. Casanovas, P., Binefa, X., Gracia, C., Teodoro, E., Galera, N., Blázquez, M., Poblet, M., Carrabina, J., Monton, M., Montero, C., Serrano, J., & López-Cobo, J.M. (2009). The e-sentencias prototype: a procedural ontology for legal multimedia applications in the Spanish civil courts. In J. Breuker, P. Casanovas, M. C. A. Klein, E. Francesconi (Eds.), Law, Ontologies and the Semantic Web: Channelling the Legal Information Flood (pp. 199–219). Amsterdam: IOS Press.
6. Boella, G., Di Caro, L., Humphreys, L., Robaldo, L., Rossi, P., & van der Torre, L. (2016). Eunomos, a legal document and knowledge management system for the Web to provide relevant, reliable and up-to-date information on the law. Artificial Intelligence and Law, 24 (3), 245–283, doi: 10.1007/s10506-016-9184-3.
7. García-Constantino, M., Atkinson, K., Bollegala, D., Chapman, K., Coenen, F., Roberts, C., & Robson, K. (2017). CLIEL: context-based information extraction from commercial law documents. Proceedings of the 16th International Conference on Artificial Intelligence and Law (ICAIL’17), London, United Kingdom, June 12–16, 2017 (pp. 79–87). N.Y.: Association for Computing Machinery.
8. The Ministry of Justice intends to use artificial intelligence for the examination of laws. TASS [Website], 2021, May 19. URL: https://tass.ru/obschestvo/11415055?utm_source=google.com&utm_medium=organic&utm_campaign=google.com&utm_referrer=google.com
9. Trofimov, E.V., & Metsker, O.G. (2020). A methodology for the qualitative assessment of optimization of legislation and law enforcement based on the analysis of big data of administrative offenses cases. Law and Politics, 10, 10–26, doi: 10.7256/2454-0706.2020.10.34250.
10. Trofimov, E.V., & Metsker, O.G. (2021). Methodology for the qualitative assessment of the legal optimization (data mining and machine learning on judgment big data in cases of administrative offenses and criminal cases) [monograph]. Saint Petersburg: Saint Petersburg Institute (Branch) of the All-Russian State University of Justice. doi: 10.47645/9785604572863.
11. Yangirova, E.I., Kandaurova, I.R., & Musin, U.R. (2018). Migration attractiveness of the region. Moscow Economic Journal, 4, 420–429, doi: 10.24411/2413-046X-2018-14014.
12. Zhang, R., Zhang, C.-Q., & Rhodes, R.E. (2021). The pathways linking objectively-measured greenspace exposure and mental health: A systematic review of observational studies. Environmental Research, 198 (6), 111233, doi: 10.1016/j.envres.2021.111233.
13. Jin, J. (2019). The effects of labor market spatial structure and the built environment on commuting behavior: Considering spatial effects and self-selection. Cities, 95, 102392, doi: 10.1016/j.cities.2019.102392.
14. Boessen, A., Hipp, J.R., Butts, C.T., Nagle, N.N., & Smith, E.J. (2018). The built environment, spatial scale, and social networks: Do land uses matter for personal network structure? Environment and Planning B: Urban Analytics and City Science, 45 (3), 400–416, doi: 10.1177/2399808317690158.
15. Mouratidis, K. (2018). Built environment and social well-being: How does urban form affect social life and personal relationships? Cities, 74, 7–20, doi: 10.1016/j.cities.2017.10.020.
16. Foster, S., Hooper, P., Knuiman, M., Bull, F., & Giles-Corti, B. (2016). Are liveable neighbourhoods safer neighbourhoods? Testing the rhetoric on new urbanism and safety from crime in Perth, Western Australia. Social Science and Medicine, 164, 150–157, doi: 10.1016/j.socscimed.2015.04.013.
17. Mouratidis, K. (2020). Commute satisfaction, neighborhood satisfaction, and housing satisfaction as predictors of subjective well-being and indicators of urban livability. Travel Behaviour and Society, 21, 265–278, doi: 10.1016/j.tbs.2020.07.006.
18. Lee, K.-Y. (2021). Relationship between physical environment satisfaction, neighborhood satisfaction, and quality of life in Gyeonggi, Korea. Land, 10 (7), 663–675, doi: 10.3390/land10070663.
19. Nevado-Pena, D., Lopez-Ruiz, V.-R., & Alfaro-Navarro, J.-L. (2019). Improving quality of life perception with ICT use and technological capacity in Europe. Technological Forecasting and Social Change, 148, 119734, doi: 10.1016/j.techfore.2019.119734.