Library
|
Your profile |
Software systems and computational methods
Reference:
Zelenskii A.A., Gribkov A.A.
Configuration of memory-oriented motion control system
// Software systems and computational methods.
2024. № 3.
P. 12-25.
DOI: 10.7256/2454-0714.2024.3.71073 EDN: TTQBBA URL: https://en.nbpublish.com/library_read_article.php?id=71073
Configuration of memory-oriented motion control system
DOI: 10.7256/2454-0714.2024.3.71073EDN: TTQBBAReceived: 19-06-2024Published: 25-07-2024Abstract: The paper investigates the possibilities of configuring the control cycle, i.e., determining the distribution of time intervals required for the execution of individual control operations across execution threads, which ensures the realizability of control. The object of research in this article are control systems with object-oriented architecture, assuming a combined vertical-horizontal integration of functional blocks and modules that distribute all control tasks among themselves. This architecture is realized by means of an actor instrumental model using metaprogramming. Such control systems are best at reducing control cycle time by performing computational and other control operations in parallel. Several approaches to control cycle configuration are considered: without optimization, with combinatorial optimization in time, with combinatorial optimization in system resources. Also, achieving a near-optimal configuration can be achieved by using adaptive configuration. Research shows that the control system cycle configuration problem has several solutions. Practical obtaining a solution to the configuration problem in the case of combinatorial optimization is associated with significant difficulties due to the high algorithmic complexity of the problem and a large amount of required computations, rapidly growing as the number of operations at the stages of the control cycle. A possible means of overcoming these difficulties is the use of stochastic methods, which sharply reduce the required amount of computation. Also, a significant reduction in the complexity of the task of configuring the control system cycle can be achieved by using adaptive configuration, which has two variants of realization. The first variant is the real-time configuration of the control system cycle. The second variant is the determination of quasi-optimal configuration on the basis of multiple configurations with different initial data and subsequent comparison of the obtained results. Keywords: control system, memory-oriented, configuration, optimization, loop, elements, control operations, execution threads, adaptive, sorting methodsThis article is automatically translated. Introduction The problem of real-time motion control, which is necessary for modern industrial robots, machine tools and other technological equipment, consists of several interrelated components. The central one is the problem of control feasibility, which consists in determining (and ensuring) the possibility of motion control with the speed necessary to ensure a given accuracy in reproducing the speed and trajectory of movement. The research conducted by the authors has shown [1, 2] that the problem of control feasibility causes extremely stringent requirements for the performance of control systems for complex objects. Such complex control objects, in particular, include precision technological equipment with a large number of simultaneously controlled axes (five axes or more). One of the specific features of real-time motion control is the requirement for a high-speed control system, rather than high performance. The performance of a control system (like any computer) is defined as the average number of computational operations (or other control operations) per unit of time, and this calculation is usually performed for sufficiently large amounts of calculations (control operations) over a significant time interval. In most cases, real–time control is implemented in the form of a relatively small set of operations, however, this set of operations must be performed in an extremely short time interval - a control cycle that does not exceed one millisecond, and for the most complex control objects it is reduced to microseconds. As a result, it becomes critical for the realizability of control to ensure the performance parameter of the control system – the magnitude, the inverse duration of the control cycle. In a more precise formulation, performance is the number of units of complexity of an object serviced by the control system per unit of time. According to the methodology developed by the authors and co-authors, the complexity of the control object depends on the number of types of elements and the average number of elements of the same type in the system, the number of types of connections and the average number of connections of the same type in the system, the number of controlled parameters by which the state of an individual element of the system is described, as well as the number of monitored states of the controlled parameter [1]. The speed of the motion control system for a complex object is limited by the action of three groups of factors, figuratively called the power wall, the memory wall and the frequency wall [3]. In order to practically overcome these three "walls" and increase the speed of the motion control system as a result, three tasks must be solved that have a solution by improving the architecture of computing machines: firstly, it is necessary to reduce the volume of the processed data stream through the use of parallel computing; secondly, it is necessary to increase the data transfer rate between the elements of the computer due to the use of in-memory processing (PIM) [4] or near-memory processing (NMC) [5]; thirdly, it is necessary to eliminate queues when accessing the same memory of several computing devices at the same time by physically dividing memory between devices. The memory-oriented architecture of the motion control system allows solving these three tasks, in which data does not move between the processor and memory during calculations, but remains in memory, into which the processor is integrated. The key difference between a memory-oriented architecture and a traditional (processor-oriented) one is the rejection of the management organization, in which it is divided into several levels (for example, strategic, tactical and executive). Instead, there is a combined vertical-horizontal integration of functional blocks and modules that distribute all management tasks among themselves. At the software and algorithmic level, the implementation of a memory-oriented architecture is provided by an actor instrumental model in which each actor (element of the actor model) corresponds to an element of the motion control system, which is reflected in the form of a delay and a control system cycle. Moreover, this element can be either real, corresponding to a functional module as part of a motion control system, or virtual, formed in memory (shared or local for a separate functional module) to perform a private computing task, data conversion or generation of control commands for peripheral devices. The condition of maximum autonomy of actors asynchronously exchanging data in the process of joint implementation of the control function corresponds to the software implementation of a motion control system based on metaprogramming [6]. In this case, each of the actors is generated as necessary in the management process in the form of a separate program that runs in shared memory or in the local memory of a separate functional module. The practical solution to the problem of the feasibility of controlling the movement of a complex object is reduced to parallelizing the execution of control operations and, accordingly, to determining the optimal mutual distribution of time spent on individual operations, taking into account the established algorithmic constraints on the sequence of their execution. Note that the existence of Amdahl's law is connected with the presence of such restrictions [7]. The objective of this article is to solve this problem of optimal allocation of time costs for individual operations within the management cycle. In addition, it is necessary to determine possible algorithms for the practical implementation of optimization calculations in the event that their volume turns out to be significant.
Approaches to configuring the control system cycle Figure 1 shows a diagram of the optimized cycle of the industrial robot control system [2]. Determining the optimal mutual allocation of time spent on individual operations is a task of configuring the cycle of a motion control system. This task can be solved on the basis of three main approaches, each of which, according to the authors, has the right to exist.
Figure 1. Diagram of the optimized SU cycle of an industrial robot: t 111, t 121, t 131, t 141, t 151 — delays of the communication network when transmitting data from sensors and sensors to the sensing modules; t 112 — delay of the vision system, STZ (without INS); t 113 — delay of the artificial neural network (INS) when processing data for STZ; t 122 — delay of the speech recognition system, CPR (without INS); t 123 — delay of the INS when processing data for CPR; t 132, t 142, t 152 — delays of data processing modules from position, speed sensors, etc.; t 211, t 221 — delays of the core: interpreter (not in every cycle) and delay in processing data from the sensing modules; t 311 — delay of the transformation module; t 312 — delay of the interpolation module; t 313 — delay of the equidistant correction module; t 321 — delay of the preview module; t 411 — delay of the acceleration-braking module; t 511, t 521, t 531, t 541 — delays of regulators of executive devices; | | — delay in writing /reading data in shared memory.
The first approach, let's call it configuration without optimization, is based on reproducing a given configuration of the control cycle with a known (determined on the basis of the developer's expert assessment) distribution of elements (delays in processing operations) into groups and execution streams. An example of the implementation of this approach is the diagram of the optimized cycle of an industrial robot control system shown in Fig. 1. If (to obtain a simpler representation) the delays of reading/writing data in shared memory are included in the delays of the corresponding elements, then the cycle duration of the control system in this case is determined as follows:
where u max is the number of groups of operations in the control cycle; p u max is the number of parallel data processing streams in the uth group; m up is the number of elements in the pth parallel data processing streams in the uth group. The resources required to implement this configuration must be no less than the total resources required by all elements in all execution threads at each point in the control cycle. The second approach, let's call it a configuration with combinatorial time optimization, is based on configuring a control cycle from a set of elements divided into groups (determined by the sequence and interconnectedness of elements within the control algorithm), but not distributed across execution streams. During the optimization process, such a distribution is established among the execution streams, in which the duration of the control cycle will be minimal. Let's consider the order of optimization of the time management cycle. The uth group of operations is formed from a set of elements A u = {a u 1,a u 2,...a uj,...a un(u)}, each of which a ui = {t ui,w ui} is characterized by the time t ui required to perform a control operation corresponding to this element, and the w ui resources necessary to implement this element. The required amount of memory and/or computing power (the number of elementary operations required to perform the control operation in question, per unit of time of its execution) can be considered as element resources. The elements of a ui cannot be used in any order. The pairwise definition of the order of the elements is given in the form of a matrix of relations
in which each element k ij ∊ {-1,0,1} defines a sequence of elements i and j: "-1" – j then i, "0" – the order does not matter, "1" – i then j. From the elements of the u-th group, a sample is formed for a separate p-th stream, in which each of its elements is identified with an element from the set of the u-th group:
moreover, the elements are not repeated
and are implemented in the correct order with respect to all elements preceding in time in all (previously defined) flows, including their own,
where the value of j d for the flow d is determined based on the condition i(a uj(p)) is the value of the element number in the initial set of the uth group corresponding to the element a uj(p) in the sample for the pth stream. If K u = -1, then all the elements already included in the selection (from 1 to j) are rearranged based on condition (5). Each element in the sample for the pth stream, as in the original set of the uth group, is determined by the required time and resources:
The composition of the sample for the pth stream of the uth group is a set of m up elements
moreover, the sum of the elements of all streams is equal to the number of elements nu in the original set of the uth group and the elements in different streams (sets) are not repeated:
The latter condition is provided automatically due to the sequential (without omissions) inclusion of elements from the initial set of the u-th group in the samples for the 1st, 2nd and subsequent streams. To implement further optimization, it is necessary to set the function of the total resources used by elements in all flows of the u-th group at a given time:
where the number of the corresponding element j is determined from the expression Now it is possible to determine the duration of the part of the control cycle corresponding to the execution of the operations of the u-th group in the form of a set A u with a given sequence of elements:
moreover, the condition must be fulfilled It should be noted that dependencies (9) and (10) do not imply the possibility of arbitrary delays between elements to change combinations of elements of different streams to minimize the resources required. To obtain the minimum processing time for the operations of the u-th group, it is necessary to conduct a similar analysis for the set A u with other sequences of elements (the number of permutation options N u = n u!) and other values of the number of elements in the streams. The number of permutation options grows rapidly as the number of elements in the set increases. For example, if for n u = 10 (implemented when combining some related elements, see Fig. 1) we have an acceptable value of N u = n u! = 3.6 × 10 6, then for n u = 15 we have N u = n u! = 1.3 × 10 12, and for n u = 20 we have N u = n u! = 2.4 × 10 1 8. The values of the number of elements in the streams for group u can be set as a set
where p max is the number of threads in the uth group, and The number of variants of the set M uj is the number of ordered partitions of the number n u of elements in the original set of the uth group and is equal to j max = 2 n u-1, which in the case of n u = 10 gives j max = 512, in the case of n u = 15 – j max = 1.6 × 10 4, and in the case of n u = 20 – j max = 5.2 × 10 5. Minimum time for processing operations of the u-th group:
where M uj is the jth (from j max) variant of the set M uj, and uj is the ith (from N u) permutation of the original set A u. Based on the above dependencies, the number of options given by τ u(A ui,M ui) is equal to N u · j max = n u! · 2 n u-1 and is, depending on the number of elements in the initial set of the uth group, from 1.8 × 10 9 at n u = 10 to 2.1 × 10 16 at n u = 15 and 1.2 × 10 24 at n u = 20. The latter number of options is too large and cannot be analyzed in practice. The maximum complexity of this optimization algorithm corresponds to the notation O(n!*2 n–1*n 2), where n 2 is the multiplier associated with sorting the sample by iterating from the original set of the uth group. This type of complexity has a critical negative impact on the practical feasibility of a complete search of all possible options. This situation is typical for combinatorial optimization problems. A means of solving this combinatorial optimization problem with a large number of elements in a group is the use of stochastic methods with random generation of a sequence of elements in the initial set A u for the uth group and a set M uj of values of the number of elements in the flows of this group. To increase the efficiency of random sampling, it is possible to purposefully ensure a significant difference in the sequences of elements, divide groups into streams with a comparable number of elements, limit the variability of the number of streams based on the available resources for the entire group and the average values of the resources of individual elements, collect streams from the elements of the initial set not sequentially, but in parallel (alternately adding elements to different streams) and etc. As a result, N u and j max will not depend on n u (they will be approximately constant, set based on the expert assessment of the developer), the complexity of the algorithm will decrease significantly and may correspond to the complexity of sorting, which in the case of simple iteration has the notation O(n 2). The monotony of changes in calculated values near the minimum value of the control cycle duration can serve as a criterion for evaluating the representativeness of a random sample. The duration of the control cycle is determined as follows:
The third approach, let's call it a configuration with combinatorial optimization of resources, as well as the second, is based on configuring the control cycle from a set of elements divided into groups, but not distributed across execution streams. During the optimization process, such a distribution is established among the execution threads, in which the duration of the control cycle does not exceed the specified one, and the required resources will be minimal. Let's consider the procedure for optimizing the resource management cycle. For this optimization variant, the formulas (2-10) obtained earlier for the time optimization variant are valid. Resources required to implement all operations of the u-th group of the management cycle:
When implementing the resource-optimal configuration of the management cycle, the (minimum) resources required will be
moreover, the duration of the control cycle should not exceed the specified maximum value: The configuration of the management cycle with time or resource optimization is optimization tasks related to the group of NP-complete tasks. The limitations of the task of configuring the control cycle with combinatorial optimization are, respectively, the permissible (computational) resources of the control system or the duration of the control cycle. Due to the high algorithmic complexity of the optimization problems under consideration, none of the typical optimization algorithms used for combinatorial optimization can fully reproduce it. In particular, it is not possible to build optimization algorithms on weighted oriented graphs, for example, the Dextra algorithm, the Floyd-Warshell algorithm, etc. The need to use weighted oriented graphs is due to the fact that the configuration elements of the control cycle have weights and sequence constraints. The problem in building optimization algorithms is related to the fact that in well-known algorithms, the weight of the elements (edges of the graph) affects the construction of the route locally, determining the sequence of a pair of vertices, meanwhile, in our optimization problem, the edges of the graph (time or resources of the element), although they have weight, but it affects only in combination with the weights of other edges of the graph. As a result, the same edge weight in different positions of the control cycle configuration has a different effect on the duration of the control cycle or the resources required for management. Existing optimization algorithms, however, can be used to solve local problems within the framework of combinatorial optimization. In particular, the use of comparison algorithms in the ordering of the sample for the pth stream of the uth group reduces the execution time of the sorting algorithm from O(n 2) to O(n log n) [8, pp. 206-207]. Among the best sorting algorithms in terms of time and memory are [9]: merge sort (merge sorting [8, pp. 181-192]), heapify and heap sort (pyramid sorting [10]) – the best, average and worst time O(n log n), memory O(n); quik sort (Hoare sorting) — best and average time O(n log n), worst time O(n 2), memory O(log n); tree sort (binary tree sorting) – best time O(n), average and worst time O(n log n), memory O(n); timsort (hybrid sorting algorithm combining insertion sorting and merge sorting) and binary insertion sort (hybrid sorting algorithm combining insertion sorting with binary insertion location search [11]) – best O(n) time, average and worst O(n log n) time, memory O(n).
Adaptive configuration of the control system cycle Along with the considered options for combinatorial optimization of the configuration of the control cycle in terms of time or resources, adaptive configuration of the control cycle can also be used. This configuration, like the optimization options considered, is applied to groups of elements, and the final results are obtained after combining the results obtained for them. At the same time, the maximum available resources are used at each moment of time, thereby minimizing the execution time of all elements of the group. At each moment of time, we know the elements implemented on each stream of the u-th group {a u(1),a u(2),...a u(p),...a u(p max)}, and each element is taken from the initial set of elements of the u-th group, after which in the original set, it is marked as used. This is implemented by setting the parameter z ui ∊ {1,0} "used" / "not used" for the elements a ui in the original set, along with time and resources. When the execution time of any of the elements ends, an unused element from the original set is assigned in its place (in the same thread), for which two conditions are met. According to the first condition, the assigned qth element must be assigned to execution no later than any of the remaining unused elements, i.e.
and according to the second condition, the assigned qth element must have the maximum resource requirement of all remaining unused in the original set, but the total required resources of all streams (including the stream of the assigned element) must not exceed the permissible maximum value, i.e.
If resources remain in the system after assigning a new element, then another element is assigned according to the same algorithm, for which a new thread is added. After completing the execution of the elements of the 1st group, the execution of the elements of the 2nd group is started, etc. The control cycle ends when the elements in the sets of all groups run out. The launch of adaptive configuration is carried out with the specified expert assessment of the developer of the starting elements (i.e., the elements performed, according to the control algorithms, first). Threads are added as long as the resources involved in all threads are less than those available in the system. To implement adaptive configuration, which in general does not guarantee minimizing the duration of the control cycle, the above sorting optimization algorithms can be used. They are applicable both to the implementation of condition (16) and (17). Adaptive configuration can be used to solve two different tasks: firstly, to distribute the elements of the control cycle in the control process, i.e. real-time configuration; secondly, to determine the quasi-optimal configuration of the control cycle based on multiple adaptive configuration and comparison of its results. In the case of adaptive real-time configuration, the frequency of configuration redefinition is limited by the time interval required for its calculation. This time interval is probably much longer than the duration of the control cycle, i.e. the redefinition of the control cycle configuration occurs once in hundreds or even thousands of control cycles. According to the authors, this is acceptable, since in the case when, due to the correction of the control algorithm, a redefinition of the execution time of the elements is required, it occurs in a time corresponding (for example, for motion control systems) to significant changes in the control object (for example, movements of the working body). Multiple adaptive configuration of the control cycle involves its execution with different permutations of the initial set of elements of a given group. According to the results of the jth adaptive configuration, a set is formed
formed by the elements b ujn = {u, i, p}, where {u, i, p} is the group number, the element number in the original set of elements of the uth group and the flow number for the nth element of the set B uj. The time interval required to complete the elements of the pth stream of the uth group is determined from the condition
The duration of the part of the control cycle during which all elements of the u-th group are executed, corresponding to the j-th implementation of adaptive configuration, is determined from the condition
The shortest duration of the part of the control cycle during which all elements of the u-th group are executed, out of all j max implementations of adaptive configuration:
The shortest duration of the control cycle obtained from the results of multiple adaptive configuration:
Conclusions Let's summarize the research conducted in the article: 1. Real-time management of complex objects requires a high-speed control system to ensure its feasibility. High performance implies the possibility of performing a limited (relatively small) set of control operations in a short time interval – a control cycle not exceeding a unit of milliseconds or even (for the most complex control objects) microseconds. 2. The required reduction in the duration of the control cycle can be achieved by parallel execution of computational and other control operations. This is best ensured by using the memory-oriented architecture of the control system, implemented through an actor instrumental model programmed by metaprogramming tools. 3. In practice, the task of ensuring the feasibility of management is reduced to configuring the control cycle, i.e. determining the mutual distribution of time spent on various control operations along the execution flows according to specified constraints on their sequence and available computing resources of the system. 4. There are three possible approaches to configuring the control cycle: without optimization with a configuration set based on the expert assessment of the developer, as well as configuration with combinatorial optimization in time or system resources. Combinatorial optimization (in both cases) is characterized by extremely high algorithmic complexity and in practice should be simplified by the introduction of stochastic methods. 5. A less accurate and reliable approach to configuring the control system cycle is adaptive configuration, which can be implemented in real time, or, if used repeatedly, as a tool for determining the quasi-optimal configuration of the control system cycle. References
1. Zelenskiy A.A., Kuznetsov A.P., Ilyukhin Yu.V., & Gribkov A.A. (2022). Realizability of motion control of industrial robots, CNC machines and mechatronic systems. Part 1. Machine Building Vestnik, 11, 43-51.
2. Zelensky A.A., Kuznetsov A.P., Ilyukhin Yu.V.V., & Gribkov A.A. (2023). Realizability of motion control of industrial robots, CNC machines and mechatronic systems. Part 2. Machine Building Vestnik, 3, 213-220. 3. Cell Broadband Engine Programming Tutorial. Version 2.0. (2006). IBM Systems and Technology Group, December 15. Retrieved from https://arcb.csc.ncsu.edu/~mueller/cluster/ps3/CBE_Tutorial_v2.0_15December2006.pdf 4. Ghose S., Hsieh K., Boroumand A., Ausavarungnirun R., & Mutlu O. (2018). Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions. Retrieved from https://arxiv.org/abs/1802.00320 5. Singh G., Chelini L., Corda S., Awan A.J., Stuijk S., Jordans R., Corporaal H., & Boonstraz A. (2019). Near-Memory Computing: Past, Present, and Future. August, Microprocessors and Microsystems 71. Retrieved from https://www.researchgate.net/publication/335028505_Near-Memory_Computing_Past_Present_and_Future 6. Zelensky A.A., Ivanovsky S.P., Ilyukhin Yu.V., & Gribkov A.A. (2022). Programming of the trusted memory-centric motion control system for robotic and mechatronic systems. Bulletin of Moscow Aviation Institute, 4, 197-210. 7. Juurlink, B., & Meenderinck, C. (2012). Amdahl's law for predicting the future of multicores considered harmful. ACM SIGARCH Computer Architecture News, 2, 1-9. 8. Knuth, D.E. (2018). The Art of Programming. Volume 3. Sorting and searching. Moscow: E.D. Williams LLC. 9. Time Complexities of all Sorting Algorithms (2023). Geeks for Geeks. Retrieved from https://www.geeksforgeeks.org/time-complexities-of-all-sorting-algorithms/ 10. Heap Algorithms. (2010). Massachusetts Institute of Technology. Retrieved from https://courses.csail.mit.edu/6.006/fall10/handouts/recitation10-8.pdf 11. Binary Insertion Sort. (2023). Geeks for Geeks. Retrieved from https://www.geeksforgeeks.org/binary-insertion-sort/
Peer Review
Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.
|