Trub I. —
On approximation of the output of a probabilistic model of hierarchical bit indices
// Software systems and computational methods. – 2018. – No. 4.
– P. 102 - 113.
DOI: 10.7256/2454-0714.2018.4.27809
URL: https://en.e-notabene.ru/itmag/article_27809.html
Abstract: The subject of the study is a probabilistic model of hierarchical bit indexes of databases. The object of the study is the output of the model: a three-parameter discrete distribution of the number of indexes needed to execute a database query, parametrized by the rate at which records are written to the database, the average query length, and the size of the large index. The author considers the choice of a hypothesis among known theoretical distributions, a method for testing that hypothesis, the selection of functions approximating the dependence of the expectation on the third parameter, and the selection of a function approximating the dependence of the minimum point of the expectation, taken over the third parameter, on the first two. These dependencies are of interest because choosing the third parameter optimally is the designer's goal, while the first two are the input data of the model. The methodology relies on mathematical statistics, in particular parameter estimation and Pearson's chi-squared test, on methods for constructing best approximations, in particular the method of least squares, and on the theory of third-order curves. The main conclusions of the study are that the best approximation for the studied family of distributions is the Polya distribution, and that the best approximations for the dependence of the expectation on the third parameter are the Bacon-Watts model and the heat capacity model. The author's particular contribution is the derivation of an empirical formula of practical significance: from the first two parameters alone, without cumbersome calculations on the model, the designer can obtain an approximately optimal value of the third parameter and thus construct a database index of optimal size.
The novelty of the research lies in obtaining approximate dependencies for a new type of distribution that cannot be described by a closed-form formula.
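The fitting and testing steps named in this abstract can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes that "Polya distribution" means the negative binomial with a real-valued shape parameter, and the function names (`polya_pmf`, `fit_by_moments`, `pearson_statistic`) and the example moments are hypothetical.

```python
import math

def polya_pmf(k, r, p):
    """P(X = k) for a negative binomial with real shape r > 0 and probability p."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def fit_by_moments(mean, var):
    """Method-of-moments estimates (r, p); requires var > mean (overdispersion)."""
    if var <= mean:
        raise ValueError("moment fit needs a variance greater than the mean")
    p = mean / var
    r = mean * mean / (var - mean)
    return r, p

def pearson_statistic(observed, r, p):
    """Pearson's chi-squared statistic; observed[k] is the count of value k,
    and the last cell pools the tail k >= len(observed) - 1."""
    n = sum(observed)
    probs = [polya_pmf(k, r, p) for k in range(len(observed) - 1)]
    probs.append(1.0 - sum(probs))  # pooled tail probability
    return sum((o - n * q) ** 2 / (n * q) for o, q in zip(observed, probs))

# Illustrative moments only; these numbers are not taken from the article.
r_hat, p_hat = fit_by_moments(mean=4.5, var=11.25)
```

In a test of this kind the statistic would be compared with a chi-squared quantile whose degrees of freedom equal the number of cells minus one, minus the two estimated parameters.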
Trub I., Trub N. —
The model of hierarchical indexes of databases with decision making and its comparison with the minimax model
// Software systems and computational methods. – 2018. – No. 1.
– P. 18 - 36.
DOI: 10.7256/2454-0714.2018.1.25369
URL: https://en.e-notabene.ru/itmag/article_25369.html
Abstract: The subject of the study is the concept of hierarchical bitmap indexes proposed by the authors. To improve the performance of queries with a time filter, indexes are maintained not only for values of the basic unit of time but also for arbitrary larger units that are multiples of it. The object of the study is the construction of a probabilistic model that makes it possible to evaluate the effectiveness of decision making: which bitwise operation, disjunction or exclusive OR, to apply at the next level of the hierarchy when building the result set. The authors focus on justifying the validity of the model and on comparing its results with the previously constructed minimax model, in which the decision followed a pre-established rule and did not depend on the current state of the system. The methodology combines probability theory, multicriteria optimization methods, and computational experiment, together with intuitive assessment of the plausibility of the results. The main conclusions of the study: an analytical model of the dynamic selection of an index operation has been constructed and verified; the proposed selection discipline is shown to give higher performance than the minimax model, and software has been developed to obtain a numerical estimate of this difference; a model for estimating the cost of dynamic decision making is proposed, along with a weight function that makes it possible to evaluate the efficiency of the model with decision making and to choose between the two models for a given choice of weights.
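The choice between disjunction and exclusive OR can be illustrated with a small sketch. It is a simplification under stated assumptions, not the authors' probabilistic model: the hierarchy has two levels (hours within one day), bitmaps are Python integers, only non-empty hours have an index, and the cost is the number of indexes read. The names `build_hourly` and `query` are hypothetical.

```python
from functools import reduce

def build_hourly(record_hours):
    """Map hour -> bitmap (Python int); bit i is set if record i falls in that hour."""
    hourly = {}
    for rec_id, hour in enumerate(record_hours):
        hourly[hour] = hourly.get(hour, 0) | (1 << rec_id)
    return hourly

def query(hourly, day_bitmap, start, end):
    """Answer a query for hours [start, end) of one day.

    Returns (result bitmap, operation used, number of indexes read).
    OR combines the hourly indexes inside the range; XOR starts from the
    day index and removes the hourly indexes outside the range, which works
    because hourly bitmaps are disjoint subsets of the day bitmap.
    """
    inside = [bm for h, bm in hourly.items() if start <= h < end]
    outside = [bm for h, bm in hourly.items() if not (start <= h < end)]
    if len(inside) <= 1 + len(outside):
        return reduce(lambda a, b: a | b, inside, 0), "OR", len(inside)
    return reduce(lambda a, b: a ^ b, outside, day_bitmap), "XOR", 1 + len(outside)

# Illustrative record timestamps (hour of day), not data from the article.
record_hours = [0, 1, 1, 3, 5, 8, 9, 9, 12, 15, 18, 20, 22, 23]
hourly = build_hourly(record_hours)
day_bitmap = reduce(lambda a, b: a | b, hourly.values(), 0)
result, op, cost = query(hourly, day_bitmap, 2, 24)
```

Here the decision depends on which hours currently hold records, which is the kind of state dependence the abstract contrasts with a fixed rule.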
Trub I. —
Probabilistic model of hierarchical database indexes
// Software systems and computational methods. – 2017. – No. 4.
– P. 15 - 31.
DOI: 10.7256/2454-0714.2017.4.24437
URL: https://en.e-notabene.ru/itmag/article_24437.html
Abstract: The subject of the study is the concept of hierarchical bitmap indexes proposed by the author. The idea is that, to improve the performance of queries with a time filter, indexes are maintained not only for values of the basic unit of time but also for arbitrary larger units that are multiples of it. The object of the study is the construction of an analytical probabilistic model of such indexes for the particular case of an exponentially distributed random stream of records written to the database. The author focuses on calculating the discrete distribution of the number of indexes involved in processing a query. The methodology draws on probability theory, combinatorial methods, measure theory, and computational experiment. In addition, it is shown that recent concepts from the theory of cellular automata, such as the Zaitsev neighborhood, can be used to study the features of the proposed model. The main results are as follows: an original, intuitive concept of building indexes is introduced; new, meaningful optimization problems for selecting a hierarchical index system are formulated; a mathematical model is constructed and verified that makes it possible to estimate the efficiency of a chosen hierarchy of indexes. It is shown that in the limiting case the model naturally tends to a set of fractal nature, specifically a variety of Cantor dust, whose Hausdorff-Besicovitch dimension is given by a formula derived in terms of the application parameters of the original problem.
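The effect of keeping indexes for larger time units can be seen in a short sketch. It assumes a two-level hierarchy (a basic unit and a unit 24 times larger, for example hours and days), treats every unit as having an index regardless of whether it contains records, and uses a simple greedy left-to-right cover; the function name `cover` and the example range are hypothetical and are not taken from the article.

```python
def cover(start, end, units=(24, 1)):
    """Greedily cover [start, end) with aligned blocks.

    `units` lists block sizes in basic time units, largest first, and must
    include 1 so that the loop always makes progress.
    Returns a list of (block_start, block_size) pairs.
    """
    blocks = []
    t = start
    while t < end:
        for u in units:
            # Take the largest block that starts exactly at t and fits in the range.
            if t % u == 0 and t + u <= end:
                blocks.append((t, u))
                t += u
                break
    return blocks

hierarchical = cover(20, 75)           # basic-unit and 24-unit indexes
flat = cover(20, 75, units=(1,))       # basic-unit indexes only
```

With these inputs the hierarchical cover reads far fewer indexes than the flat one, which is the saving the probabilistic model is meant to quantify once empty intervals are taken into account.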
Trub I. —
Numerical Modelling of the General Task of Bitmap-Indices Distribution
// Software systems and computational methods. – 2017. – No. 3.
– P. 35 - 53.
DOI: 10.7256/2454-0714.2017.3.22952
URL: https://en.e-notabene.ru/itmag/article_22952.html
Abstract: The subject of the research is a mathematical model, in the form of a system of recurrent integral relations, that describes the distribution of unit intervals in which at least one event of a random stream with an arbitrary distribution function has occurred. The author examines several aspects of the numerical implementation of this system: the Laplace transform method, numerical integration near discontinuity points, stability of the calculations, checks of the validity of the results, and the particularities of machine arithmetic on real numbers. Special attention is paid to the connection between the computed data and the semantics of the applied problem whose solution these data represent. The methodology is based on probability theory (types and properties of distributions), numerical methods (numerical integration, interpolation, the Laplace transform), software implementation of the mathematical model, and computational experiment. The main conclusions are the validity and numerical implementability of the mathematical model created by the author, as well as the substantiation of the numerical solution for an arbitrary distribution of the random stream of events. The novelty of the research is that the author develops a numerical solution of the bitmap-index distribution problem for the Weibull, gamma, and log-normal distributions, among others, and analyzes dependencies of different kinds, such as the index density function and the average number of indexes for a specified interval length.
Trub I. —
Analytical probabilistic modeling of bitmap-indexes
// Software systems and computational methods. – 2016. – No. 4.
– P. 315 - 323.
DOI: 10.7256/2454-0714.2016.4.21091
Abstract: The study is devoted to bitmap indexes as a tool for improving the efficiency of search queries and reporting in databases. The subject of the research is a mathematical model of how the number of indexes required to build a result set satisfying a query depends on the rate at which records are added to the database and on the range of values specified in the query. This characteristic is the most significant one for evaluating query processing performance, because it determines the number of disjunction operations on bit strings required to obtain the result set. The problem arose entirely from practical needs: the speed of report generation has a critical impact on the customer value of commercial database applications. The methodology is probabilistic analytical modeling based on representing the input data as a Poisson process and on the apparatus of mathematical analysis (integral calculus and summation of series) to obtain the final results. The novelty of the research lies in the proposed mathematical model, which makes it possible to pose a wide range of analysis and optimization problems. The problem is solved: the author presents formulas for the distribution of the number of indexes and for the average number of indexes in a single query. The reliability of each result is assessed by means of an alternative approach or plausible reasoning. The paper also poses the problems of constructing a probabilistic model for a distribution of any type and of optimizing query processing using hierarchical bitmap indexes. It should be noted that the formulated problem and the results obtained have independent theoretical value within queueing theory, regardless of the application area.
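A small simulation can make the quantity studied here concrete. The sketch below assumes one bitmap index per basic time unit that contains at least one record, a Poisson stream of records, and a query aligned to unit boundaries; under these assumptions the number of non-empty units in a query of length n is Binomial(n, 1 - exp(-rate)), so its mean is n(1 - exp(-rate)). The article's own formulas may rest on different assumptions, and the names and parameter values below are hypothetical.

```python
import math
import random

def count_nonempty_units(rate, query_len, rng):
    """Simulate a Poisson stream on [0, query_len) and count the unit
    intervals [k, k+1) that contain at least one record."""
    hit = set()
    t = rng.expovariate(rate)          # time of the first record
    while t < query_len:
        hit.add(int(t))                # index of the unit interval containing t
        t += rng.expovariate(rate)     # exponential gap to the next record
    return len(hit)

def expected_nonempty_units(rate, query_len):
    """Mean number of non-empty unit intervals for an aligned query."""
    return query_len * (1.0 - math.exp(-rate))

rng = random.Random(0)                 # fixed seed for reproducibility
rate, query_len, trials = 0.5, 24, 20000
sim_mean = sum(count_nonempty_units(rate, query_len, rng)
               for _ in range(trials)) / trials
```

Comparing `sim_mean` with `expected_nonempty_units(rate, query_len)` is one example of the kind of alternative check on an analytical result that the abstract mentions.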