Makarova I.L., Ignatenko A.M., Kopyrin A.S. —
Detection and interpretation of erroneous data in statistical analysis of consumption of energy resources
// Software systems and computational methods. – 2021. – ¹ 3.
– P. 40 - 51.
DOI: 10.7256/2454-0714.2021.3.36564
URL: https://en.e-notabene.ru/itmag/article_36564.html
Read the article
Abstract: Monitoring and analysis of consumption of energy resources in various contexts, as well as measuring of parameters (indicators) in time are of utmost importance for the modern economy. This work is dedicated to examination and interpretation of the anomalies of collecting data on consumption of energy resources (on the example of gas consumption) in the municipal formation. Gas consumption is important for the socioeconomic sphere of cities. Unauthorized connections are the key reason for non-technological waste of the resource. The traditional methods of detection of stealing of gas are ineffective and time-consuming. The modern technologies of data analysis would allow detecting and interpreting the anomalies of consumption, as well as forming the lists for checking the objects for unauthorized connections. The author’s special contribution lies in application of the set of statistical methods aimed at processing and identification of anomalies in energy consumption of a municipal formation. It is worth noting that the use of such technologies requires the development of effective algorithms and implementation of automation and machine learning algorithms. The new perspective upon time-series data facilitates identification of anomalies, optimization of decision-making, etc. These processes can be automated. The presented methodology tested on time-series data that describes the consumption of gas can be used for a broader range of tasks. The research can be combined with the methods of knowledge discovery and deep learning algorithms.
Kopyrin A.S., Makarova I.L. —
Algorithm for preprocessing and unification of time series based on machine learning for data structuring
// Software systems and computational methods. – 2020. – ¹ 3.
– P. 40 - 50.
DOI: 10.7256/2454-0714.2020.3.33958
URL: https://en.e-notabene.ru/itmag/article_33958.html
Read the article
Abstract: The subject of the research is the process of collecting and preliminary preparation of data from heterogeneous sources. Economic information is heterogeneous and semi-structured or unstructured in nature. Due to the heterogeneity of the primary documents, as well as the human factor, the initial statistical data may contain a large amount of noise, as well as records, the automatic processing of which may be very difficult. This makes preprocessing dynamic input data an important precondition for discovering meaningful patterns and domain knowledge, and making the research topic relevant.Data preprocessing is a series of unique tasks that have led to the emergence of various algorithms and heuristic methods for solving preprocessing tasks such as merge and cleanup, identification of variablesIn this work, a preprocessing algorithm is formulated that allows you to bring together into a single database and structure information on time series from different sources. The key modification of the preprocessing method proposed by the authors is the technology of automated data integration.The technology proposed by the authors involves the combined use of methods for constructing a fuzzy time series and machine lexical comparison on the thesaurus network, as well as the use of a universal database built using the MIVAR concept.The preprocessing algorithm forms a single data model with the ability to transform the periodicity and semantics of the data set and integrate data that can come from various sources into a single information bank.