Software systems and computational methods
Reference:

Performance optimization of machine learning-based image recognition algorithms for mobile devices based on the iOS operating system

Mamadaev Ibragim Magomedovich

iOS Developer, Mail.ru LLC

Russia, Moscow region, Moscow, Tverskoy bulv.

ibragim.m115@gmail.com
Minitaeva Alina Mazhitovna

PhD in Technical Sciences (Candidate of Technical Sciences), Department of Computer Science and Computer Engineering (IU-6), Bauman Moscow State Technical University

Russia, Moscow region, Moscow, Tverskoy bulv.

aminitaeva@mail.ru

DOI:

10.7256/2454-0714.2024.2.70658

EDN:

LDXKKC

Received:

05-05-2024


Published:

13-06-2024


Abstract: Today, mobile devices play an important role in everyday life, and machine learning is one of the key technologies delivering significant benefits to mobile applications. Optimizing machine learning algorithms for mobile devices is therefore an urgent and important task: it aims to develop and apply methods that make effective use of the limited computing resources of such devices. The paper discusses various ways to optimize image recognition algorithms on mobile devices, such as model quantization, model compression, and optimization of the underlying computations. In addition to methods that optimize the machine learning model itself, the paper also reviews libraries and tools for deploying this technology on mobile devices. Each of the described methods has its advantages and disadvantages, so the results propose not only a combination of the described options but also an additional technique: parallelization of the image processing pipeline. The article reviews specific tools and frameworks available for optimizing machine learning performance on iOS and reports our own experiments testing the effectiveness of various optimization methods, together with an analysis of the results and a comparison of algorithm performance. The practical significance of the article is as follows. Improving the performance of machine learning algorithms on iOS mobile devices leads to more efficient use of computing resources and higher system performance, which matters greatly given the limited computing power and energy budget of mobile devices. Optimizing machine learning performance on the iOS platform supports the development of faster and more responsive applications, which improves the user experience and lets developers create new and innovative features. Expanding the applicability of machine learning on iOS mobile devices opens up new opportunities for application development in fields such as pattern recognition, natural language processing, and data analysis.


Keywords:

neural network, machine learning, mobile device, iOS, image recognition, optimization, Apple OS, efficiency, performance, parallelization


Introduction and relevance

Today, mobile devices play an important role in everyone's life, providing a wide range of features and services without which many people can no longer imagine their daily routine. One of the key technologies giving mobile applications a significant advantage is machine learning [1]. It is already used in many leading applications on the market, and large IT companies compete with each other to attract more customers. However, its effective use on mobile devices requires solving a number of difficult problems.

Optimizing machine learning algorithms for mobile devices is an urgent and important task. It aims to develop and apply methods that make effective use of the limited computing resources of mobile devices, minimize power consumption, and achieve high performance on complex machine learning workloads. Such optimizations open up new opportunities for applications such as smart and voice assistants, real-time image and video processing, and automatic data classification.

Along with the growing popularity of Apple's mobile devices [2], there is a corresponding need for machine learning algorithms to run efficiently on limited computing resources, with little memory and with battery capacity constrained by device size.

Analyzing the problems related to machine learning performance on mobile devices running the iOS operating system, the following aspects can be identified:

- delays in the execution of algorithms due to their complexity,

- reduced responsiveness of the user interface when the device's computing resources are overloaded,

- increased energy consumption and, as a consequence, increased heat dissipation.

These aspects degrade the user experience and confront developers with the task of ensuring high application performance.

The purpose of this work is to research and optimize the performance of image recognition algorithms based on machine learning on iOS mobile devices. The main tasks are to study existing optimization methods and techniques, to analyze the performance of various machine learning algorithms, and to assess the impact of various factors on performance.

The article reviews specific tools and frameworks available for optimizing machine learning performance on iOS and reports our own experiments testing the effectiveness of various optimization methods. It also provides an analysis of the results obtained and a comparison of algorithm performance.

The practical significance of this article is as follows:

- Improving the performance of machine learning algorithms on iOS mobile devices leads to more efficient use of computing resources and higher system performance, which is very important given the limited computing power and energy resources of mobile devices.

- Optimizing machine learning performance on the iOS platform supports the development of faster and more responsive applications, which improves the user experience and allows developers to create new and innovative features and capabilities.

- Expanding the applicability of machine learning on iOS mobile devices opens up new opportunities for application development in fields such as pattern recognition, natural language processing, and data analysis.

 

1 Overview of existing solutions

On iOS mobile devices, many variations of machine learning algorithms are used to solve a wide range of tasks [3]. They include classification algorithms, regression, clustering, neural networks, and deep learning [4].

Prominent examples are classification algorithms such as logistic regression and the support vector machine, which are widely used for pattern recognition and data classification on mobile devices.

These methods have relatively low complexity and scale well to large amounts of data [5]. Regression algorithms, which include linear regression and the least squares method, are used to predict numerical values from raw data and are widely applied in forecasting and data analysis tasks on mobile devices. Clustering, in turn, is a method of grouping similar objects based on their characteristics; algorithms such as k-means and DBSCAN are used on mobile devices to process data and discover hidden structure. Neural networks and deep learning are among the most popular machine learning approaches today, since they can process complex data, images, and texts while achieving high accuracy in classification, recognition, and content generation tasks. Proper use of optimization techniques can significantly improve the efficiency of all these algorithms on mobile devices.
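
To make the clustering example above concrete, the sketch below shows the two core steps of k-means (cluster assignment and centroid update) for one-dimensional data. It is purely illustrative: the function names and data shapes are our own simplification, and real mobile pipelines work with multidimensional feature vectors and iterate the two steps until convergence.

```swift
// Assignment step: map each point to the index of its nearest centroid.
func assignToClusters(points: [Double], centroids: [Double]) -> [Int] {
    precondition(!centroids.isEmpty, "at least one centroid is required")
    return points.map { point in
        centroids.indices.min(by: { abs(centroids[$0] - point) < abs(centroids[$1] - point) })!
    }
}

// Update step: recompute each centroid as the mean of its assigned points.
func updateCentroids(points: [Double], assignments: [Int], k: Int) -> [Double] {
    (0..<k).map { cluster in
        let members = zip(points, assignments).filter { $0.1 == cluster }.map { $0.0 }
        return members.isEmpty ? 0 : members.reduce(0, +) / Double(members.count)
    }
}
```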

 

2 Quantization of models

One of the key optimization methods is model quantization [6]. It reduces the size of the model and lowers the demands on computing resources by representing weights and activations with lower precision. In other words, quantization is the process of reducing the precision of the weights by rounding them to a coarser representation. A visual illustration of this process is shown in Figure 1.


Figure 1 – Example of quantizing a single neuron's weights with a 4-fold reduction in bit depth

Quantization speeds up computation and reduces memory usage while only slightly affecting model accuracy. One of its main advantages is the reduction in model size, which lowers device memory requirements and directly accelerates computation. In addition, quantization makes it possible to use specialized hardware accelerators, such as the Neural Engine [2] in Apple chips, which are designed, among other things, to perform low-precision operations efficiently. However, quantization has a significant drawback: it can degrade model accuracy, especially when the precision of the weight and activation representations is reduced aggressively.

This drawback is offset by an important advantage: quantization can be applied both during model training and afterwards, so the operation can be performed even after the application has been delivered to users.
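
The following is a conceptual sketch of symmetric linear 8-bit quantization of a weight vector, illustrating the idea in Figure 1. It is not the Core ML Tools workflow used for real models; it only shows how precision is traded for a roughly 4x smaller representation (Float32 to Int8).

```swift
// Quantize a weight vector to Int8 with a single shared scale factor.
func quantize(_ weights: [Float]) -> (values: [Int8], scale: Float) {
    let maxAbs = weights.map(abs).max() ?? 0
    // Avoid a zero scale for an all-zero weight vector.
    let scale = max(maxAbs, .leastNonzeroMagnitude) / Float(Int8.max)
    let values = weights.map { Int8(clamping: Int(($0 / scale).rounded())) }
    return (values, scale)
}

// Recover approximate Float weights; the rounding error is the accuracy cost.
func dequantize(_ values: [Int8], scale: Float) -> [Float] {
    values.map { Float($0) * scale }
}
```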

 

3 Compression of models

Another optimization method is model compression, which reduces the size of the model by removing unnecessary or redundant parameters. One form of model compression is pruning [7] (clipping or thinning the network). Graphs of model accuracy and performance as a function of the pruning percentage are shown in Figure 2.

Figure 2 – Typical test-set accuracy and performance of pruned models as a function of the degree of pruning (ResNet50 on ImageNet, top-1 accuracy)

Unlike quantization, pruning can only be applied to an already trained model.

Compressing models in this way also has its advantages: it reduces the size of the model, which simplifies deployment and speeds up loading on devices, and it can lower memory and computing requirements. However, compression carries a risk of losing information and model accuracy, since some compression methods remove parameters or connections that affect the performance and effectiveness of the model.

Nevertheless, despite these disadvantages, the method is considered in this article, because even with minimal pruning levels it can, in combination with other optimization methods, give acceptable results in accuracy and performance.
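
As a conceptual sketch of magnitude-based pruning ("thinning"), the function below zeroes the weights with the smallest absolute values until the requested sparsity is reached. Real pruning is applied to a trained model with dedicated tooling such as Core ML Tools; this only illustrates the principle behind Figure 2.

```swift
// Zero out the smallest-magnitude weights so that `sparsity` (0...1)
// of all weights become zero.
func prune(_ weights: [Float], sparsity: Double) -> [Float] {
    precondition((0.0...1.0).contains(sparsity), "sparsity must be in [0, 1]")
    let sortedMagnitudes = weights.map(abs).sorted()
    let cutoff = Int(Double(sortedMagnitudes.count) * sparsity)
    guard cutoff > 0 else { return weights }   // 0% sparsity: nothing to remove
    let threshold = sortedMagnitudes[min(cutoff, sortedMagnitudes.count) - 1]
    return weights.map { abs($0) <= threshold ? 0 : $0 }
}
```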

 

4 Optimization of calculations

Optimizing the calculations themselves is another important aspect. It may include using more efficient algorithms, optimizing computational graphs, offloading calculations to the graphics processor (GPU), or using a specialized hardware accelerator such as a tensor processing unit (TPU) [8]. The peculiarity of such a chip is that it is designed specifically for working with models and processing multidimensional data. A simplified diagram of a tensor processing unit is shown in Figure 3.

Figure 3 – Tensor processing unit (TPU) from Nvidia

Optimizing the calculations can significantly accelerate machine learning algorithms. Using more efficient algorithms, optimizing computational graphs, and distributing computations to specialized hardware accelerators can substantially improve performance. However, these methods require a good understanding of the algorithms and computational models, as well as experience in implementing and optimizing them.
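
On iOS, part of this distribution of calculations is exposed through the model configuration API: Core ML can be asked to run inference on the CPU, GPU, and Neural Engine. The sketch below is a hedged illustration of that configuration; `compiledModelURL` is an assumed path to a compiled .mlmodelc bundle, not a model from this article.

```swift
import Foundation
import CoreML

// Load a compiled Core ML model with an explicit compute-unit policy.
func loadModel(at compiledModelURL: URL) throws -> MLModel {
    let configuration = MLModelConfiguration()
    // .all allows Core ML to use the CPU, GPU and Neural Engine; options
    // such as .cpuAndGPU or .cpuOnly restrict execution, e.g. for profiling.
    configuration.computeUnits = .all
    return try MLModel(contentsOf: compiledModelURL, configuration: configuration)
}
```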

 

5 Selection of machine learning frameworks and tools

In addition, there are frameworks and tools designed specifically to optimize machine learning performance on iOS. They include the CoreML library [9], Metal Performance Shaders, and the Metal API, among others.

Figure 4 shows the operation scheme of the CoreML framework: a conventional model built for desktop-class hardware is converted into a special optimized format for mobile devices, which is then processed by the library itself and supplied to the mobile application under development.


Figure 4 – "CoreML" operation diagram
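
The sketch below is a minimal illustration of the flow in Figure 4, assuming an image classification model has already been compiled and loaded (for example, with the loader from the previous sketch); it wraps the Core ML model with Vision and classifies a single image.

```swift
import CoreML
import Vision
import CoreGraphics

// Run an image-classification Core ML model through Vision and print
// the top label. The `model` is assumed to be a classifier.
func classify(_ image: CGImage, with model: MLModel) throws {
    let visionModel = try VNCoreMLModel(for: model)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let observations = request.results as? [VNClassificationObservation],
              let best = observations.first else { return }
        print("Top label: \(best.identifier), confidence: \(best.confidence)")
    }

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
}
```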

The Metal Performance Shaders framework contains a collection of highly optimized compute and graphics shaders designed for easy and efficient integration into a mobile application. These data-parallel primitives are tuned to exploit the specific hardware features of each family of graphics processing units (GPUs) in order to deliver optimal performance, so applications using the framework achieve excellent performance without having to create and maintain hand-written shaders for each GPU family. Metal Performance Shaders can be used together with other existing resources of the application (such as MTLCommandBuffer, MTLTexture, and MTLBuffer objects) and shaders [9]. The framework supports the following functionality (a short usage sketch follows the list):

- applying high-performance filters to images and extracting statistical and histogram data from them,

- implementing and running neural networks for machine learning training and inference,

- solving systems of equations, factorizing matrices, and multiplying matrices and vectors [10][11],

- accelerating ray tracing with high-performance ray intersection and geometry testing.
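
As a hedged illustration of the first item, the sketch below applies one of the built-in MPS image filters (a Gaussian blur) to an existing texture; the textures, device, and command queue are assumed to be created elsewhere in the application.

```swift
import Metal
import MetalPerformanceShaders

// Apply a Gaussian blur from source to destination on the GPU.
func applyBlur(from source: MTLTexture, to destination: MTLTexture,
               device: MTLDevice, queue: MTLCommandQueue) {
    let blur = MPSImageGaussianBlur(device: device, sigma: 2.0)
    guard let commandBuffer = queue.makeCommandBuffer() else { return }
    blur.encode(commandBuffer: commandBuffer,
                sourceTexture: source,
                destinationTexture: destination)
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
}
```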

The Metal library, in turn, is a low-level, low-overhead software interface for hardware-accelerated 3D graphics and computing, developed by Apple and first released in iOS 8. Metal combines functionality similar to OpenGL and OpenCL in a single package. It is designed to improve performance by providing low-level access to the hardware capabilities of the graphics processor (GPU) for applications on iOS, iPadOS, macOS, and tvOS, and can be compared with low-level APIs on other platforms such as Vulkan and DirectX 12. Metal is object-oriented, which allows it to be used from programming languages such as Swift, Objective-C, or C++. According to Apple's promotional materials, MSL (the Metal Shading Language) is a single language that allows closer integration of graphics and compute programs [12].
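
For illustration, the following hedged sketch shows a minimal Metal compute dispatch from Swift: a tiny MSL kernel is compiled at runtime and scales a buffer of floats on the GPU. The kernel, buffer size, and threadgroup size are illustrative assumptions rather than code from the experiments.

```swift
import Metal

// Illustrative MSL kernel: multiply every element of a buffer by a factor.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;
kernel void scale(device float *data     [[buffer(0)]],
                  constant float &factor [[buffer(1)]],
                  uint id [[thread_position_in_grid]]) {
    data[id] *= factor;
}
"""

func runScaleKernel() throws {
    guard let device = MTLCreateSystemDefaultDevice(),
          let queue = device.makeCommandQueue() else { return }

    let library = try device.makeLibrary(source: kernelSource, options: nil)
    guard let function = library.makeFunction(name: "scale") else { return }
    let pipeline = try device.makeComputePipelineState(function: function)

    var input = [Float](repeating: 1.0, count: 1024)
    var factor: Float = 2.0
    guard let buffer = device.makeBuffer(bytes: &input,
                                         length: input.count * MemoryLayout<Float>.stride),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder() else { return }

    encoder.setComputePipelineState(pipeline)
    encoder.setBuffer(buffer, offset: 0, index: 0)
    encoder.setBytes(&factor, length: MemoryLayout<Float>.stride, index: 1)
    encoder.dispatchThreads(MTLSize(width: input.count, height: 1, depth: 1),
                            threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    // The scaled values can now be read back from buffer.contents().
}
```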

There are analogues of these libraries for devices running the Android operating system, but they are not considered here, because the article focuses specifically on Apple devices and their A-series chips.

These tools provide optimized functions and APIs that enable efficient use of the hardware capabilities of the devices. However, using these frameworks requires additional efforts to integrate existing models and algorithms, as well as to study their features and capabilities.

 

6 Conducting experiments on combining algorithms

To conduct experiments on optimizing machine learning algorithms on iOS, a methodology based on a systematic study of the algorithms' parameters and settings was used. A key stage of the experiments was determining optimal values for parameters such as the learning rate, the batch size, and the number of training epochs; these parameters were chosen because they have the greatest impact on the speed and quality of training. Special attention was also paid to choosing the network structure and an optimization algorithm adapted specifically to the iOS platform, which made it possible to significantly improve the performance of machine learning algorithms on iOS devices.
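
As a hedged sketch of how such an experiment grid could be expressed in code, the snippet below enumerates combinations of learning rate, batch size, and epoch count; the concrete values are illustrative and are not the exact grid used in the experiments.

```swift
// A simple container for one training configuration.
struct TrainingConfiguration {
    let learningRate: Double
    let batchSize: Int
    let epochs: Int
}

// Illustrative candidate values; each combination would be trained and profiled.
let learningRates = [0.001, 0.01]
let batchSizes = [16, 32, 64]
let epochCounts = [5, 10, 20]

let grid: [TrainingConfiguration] = learningRates.flatMap { lr in
    batchSizes.flatMap { batch in
        epochCounts.map { epochs in
            TrainingConfiguration(learningRate: lr, batchSize: batch, epochs: epochs)
        }
    }
}
```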

To obtain the most accurate result with the most efficient algorithms, the following experiments were carried out:

- Each of the listed neural network optimization methods [12] was considered in combination with the others: quantization [13], compression, the use of a TPU, and the framework used. Various combinations were tried in search of the best efficiency; some of the approximate combinations are shown in Table 1.

- Because quantization and compression reduce the accuracy of neural networks quite drastically [14][15], a separate measurement was carried out without them, using only the TPU chip and two separate combinations with different frameworks: CoreML and Metal.

Table 1 – Approximate options for combining optimization methods

Quantization | Compression | TPU      | Framework
None         | 0%          | Used     | CoreML
Small        | 25%         | Used     | CoreML
Average      | 50%         | Not used | Metal
Strong       | 75%         | Not used | Metal
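
As a hedged sketch, the Table 1 combinations can also be encoded as Swift types so that experiment runs can iterate over them programmatically; the row pairing below simply mirrors the table above and adds no new results.

```swift
enum QuantizationLevel { case none, small, average, strong }
enum InferenceFramework { case coreML, metal }

struct ExperimentCase {
    let quantization: QuantizationLevel
    let pruning: Double            // fraction of removed weights
    let usesAccelerator: Bool      // dedicated accelerator on or off
    let framework: InferenceFramework
}

// The four approximate combinations from Table 1.
let table1Cases: [ExperimentCase] = [
    ExperimentCase(quantization: .none,    pruning: 0.00, usesAccelerator: true,  framework: .coreML),
    ExperimentCase(quantization: .small,   pruning: 0.25, usesAccelerator: true,  framework: .coreML),
    ExperimentCase(quantization: .average, pruning: 0.50, usesAccelerator: false, framework: .metal),
    ExperimentCase(quantization: .strong,  pruning: 0.75, usesAccelerator: false, framework: .metal)
]
```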

 

The result confirmed the hypothesis that using compression and quantization drastically reduces the accuracy of the original neural network: the accuracy of the machine learning algorithm fell several-fold, although the speed of operation increased by an order of magnitude.

At the same time, the second experiment gave good results: combining the TPU chip with the CoreML and Metal frameworks increased performance without reducing accuracy, with one caveat: each framework must be used only for the tasks it is suited to, namely running machine learning algorithms with CoreML and processing 2D/3D images and models with Metal.

During the experiments, another possible direction for optimization was also revealed: splitting the processing pipeline into two parts, one for the CPU and one for the GPU, over matching equivalence classes of the input data (a sketch of this idea follows).
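
The following is a hedged sketch of that split, assuming the batch is simply divided in half between a CPU-based path and a GPU/Neural Engine-backed path; `processOnCPU` and `processOnGPU` are placeholders for real preprocessing and inference pipelines, not code from the experiments.

```swift
import Foundation
import CoreGraphics

// Split a batch of images into two parts, process them concurrently on
// different hardware paths, and join the results.
func processInParallel(images: [CGImage],
                       processOnCPU: @escaping (CGImage) -> String,
                       processOnGPU: @escaping (CGImage) -> String,
                       completion: @escaping ([String]) -> Void) {
    let midpoint = images.count / 2
    let cpuSlice = Array(images[..<midpoint])
    let gpuSlice = Array(images[midpoint...])

    var cpuResults: [String] = []
    var gpuResults: [String] = []
    let group = DispatchGroup()

    DispatchQueue.global(qos: .userInitiated).async(group: group) {
        cpuResults = cpuSlice.map(processOnCPU)
    }
    DispatchQueue.global(qos: .userInitiated).async(group: group) {
        gpuResults = gpuSlice.map(processOnGPU)
    }
    group.notify(queue: .main) {
        completion(cpuResults + gpuResults)
    }
}
```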

           

Conclusion

The article has presented the main ways to optimize machine learning algorithms; however, to achieve the best result it is necessary to combine several of the described approaches.

After experimenting with the described optimization algorithms, an attempt was made to combine all available optimization methods to achieve the best result and to address the problem described earlier: the insufficient effectiveness of individual optimization methods [16][17].

The research and experiments showed that combining compression and quantization drastically reduces the accuracy of the original neural network. Therefore, to achieve optimization with acceptable accuracy losses, it is recommended to use only one of the methods that modify the algorithm itself. The toolkit developers' recommendation to combine the capabilities of the dedicated chip with one of the frameworks was confirmed empirically [18].

Another result of the experiments was the identification of a new direction for optimization: splitting the input data into equivalence classes so that processing proceeds in parallel not only on the GPU and TPU but also on the CPU. An approximate partitioning scheme is shown in Figure 5. Although the central processor is not designed for this kind of operation, with proper partitioning it provided an increase in execution speed.

Figure 5 – Splitting the processed object into equivalence classes

As further areas of work, it is planned to investigate the application of the proposed method to mobile devices running the Android operating system [19], as well as to implement in practice the synthesis of several algorithms for optimizing machine learning models.

For the practical implementation of such a solution, a separate study comparing the performance of various machine learning algorithms on iOS mobile devices will be conducted as part of the author's dissertation research. Its main goal is to determine the effectiveness of different algorithms and compare their performance in order to identify those most suitable for use on iOS devices. To allow comparison with other studies, standard datasets were used, such as MNIST for handwritten digit recognition and ImageNet [20] for image classification. The experiments will take into account factors that can affect algorithm performance, such as the size of the dataset, the complexity of the model, and the selected parameters. Experiments were also conducted with various settings of the frameworks and libraries used with machine learning algorithms on mobile devices in order to assess their impact on performance.

References
1. Zhang, Y., Liu, Y., Chen, T., & Geng, U. Mobile Deep Learning for Intelligent Mobile Applications: An Overview. IEEE Access, 8, 103586–103607.
2. Apple Developer Documentation. Core ML – Optimizing for on-device performance. Retrieved from https://developer.apple.com/documentation/coreml/optimizing_for_on-device_performance
3. Rastegari, M., Ordonez, V., Redmon, J., & Farhadi, A. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. Proceedings of the European Conference on Computer Vision (ECCV), pp. 525–542.
4. Sikhotan, H., Mark, A., Riandari, F., & Rendell, L. Effective optimization algorithms for various machine learning tasks, including classification, regression and clustering. IEEE Access, 1, 14–24. doi:10.35335/idea.v1i1.3
5. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510–4520.
6. Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861.
7. Han, S., Mao, H., & Dally, W. J. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. arXiv:1510.00149.
8. Google TensorFlow Lite documentation. TensorFlow Lite. Retrieved from https://www.tensorflow.org/lite
9. Thakkar, M. Beginning Machine Learning in iOS: CoreML Framework. Apress. doi:10.1007/978-1-4842-4297-1
10. Minitaeva, A.M. (2022). Decision-making in conditions of interval assignment of preferences of decision makers. Proceedings of the conference "Information Technologies in Management" (ITU-2022): 15th Multi-Conference on Management Problems, St. Petersburg, October 4–6, 2022. St. Petersburg: Concern CSRI Elektropribor, pp. 197–200.
11. Minitaeva, A.M. (2023). A multi-model approach to forecasting nonlinear non-stationary processes in optimal control problems. Irreversible Processes in Nature and Technology: Proceedings of the Twelfth All-Russian Conference, in 2 volumes, Moscow, January 31 – February 3, 2023. Moscow: Bauman Moscow State Technical University, pp. 438–447.
12. Kochnev, A. Conceptual foundations of the practical use of neural networks: problems and prospects. Society and Innovations. doi:10.47689/2181-1415-vol4-iss1-pp1-10
13. Courbariaux, M., Bengio, Y., & David, J.-P. BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv:1602.02830.
14. Li, G., Gao, W., & Wen, G. Quantization techniques. doi:10.1007/978-981-97-1957-0_5
15. Samsiana, S., & Syamsul, A. Machine learning algorithms using the vector quantization learning method. doi:10.1051/e3sconf/202450003010
16. Ajani, S., & Atayero, A. Overview of machine learning on embedded and mobile devices: optimization and applications. doi:10.3390/s21134412
17. Sandler, M., Howard, A., & LeCun, Y. MobileNetV3: A highly efficient scalable model for mobile computer vision. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13840–13848.
18. Chen, B., Danda, R., & Yuan, Ch. Towards the theft of deep neural networks on mobile devices. Security and Privacy in Communication Networks, pp. 495–508. doi:10.1007/978-3-030-90022-9_27
19. Jarmuni, F., & Fawzi, A. Launching neural networks in Android. University of Ottawa. Introduction to Deep Learning and Neural Networks with Python, pp. 247–280. doi:10.1016/B978-0-323-90933-4.00001-2
20. Bykov, K., & Muller, K. The dangers of watermarked images in ImageNet. Artificial Intelligence. ECAI 2023 International Workshops, pp. 426–434. doi:10.1007/978-3-031-50396-2_24

First Peer Review


The article is devoted to the research and optimization of machine learning algorithms for image recognition on mobile devices running iOS. The paper discusses various optimization methods, such as quantization and compression of models, as well as the use of specialized frameworks and tools to improve performance. The research methodology includes an analysis of existing solutions, experiments testing the effectiveness of various optimization methods, and a comparative analysis of algorithm performance. The authors used specific tools and frameworks such as CoreML and Metal Performance Shaders to test and optimize machine learning models.

The relevance of the work is due to the widespread use of mobile devices and the increasing demand for applications using machine learning. Limited computing resources, power consumption, and the need to ensure high performance on mobile devices make this topic very significant and in demand. The scientific novelty of the article lies in the proposal of an integrated approach to optimizing the performance of machine learning algorithms on iOS mobile devices. The article presents new combinations of optimization methods, such as the combined use of quantization and compression of models, which allows for higher performance and efficiency.

The style of presentation is scientific, and the text is well structured. The article includes an introduction, an overview of existing solutions, a description of optimization methods, experimental results, and a conclusion. Each part is logically connected to the previous one, which facilitates the perception of the material. The content of the article corresponds to the stated topic and covers all key aspects of the study. The bibliography contains relevant and recent sources, including scientific articles and documentation on the frameworks and optimization methods used. However, it is recommended to add more references to current research and publications related to mobile applications and machine learning. The authors consider in detail the disadvantages and limitations of the proposed methods, which shows their objectivity and commitment to a comprehensive analysis of the problem. The article provides comparisons with similar solutions, which strengthens the argumentation and scientific significance of the work.

The conclusions of the article are logical and justified. The authors summarize the results of the experiments and suggest directions for further research. The practical significance of the work lies in the possibility of applying the proposed optimization methods in real mobile applications, which will be of interest to developers and researchers in the field of machine learning and mobile technologies.

Recommendations for improvement:
1. Clarify the methodology of the experiments and add more details about the parameters and settings of the algorithms used.
2. Increase the number of modern sources in the bibliography to better reflect the current state of research.
3. Expand the section on the practical application of the proposed methods to include more examples and cases.
4. Include a discussion of possible limitations and potential risks when using the proposed optimization methods in real conditions.

The article is a significant contribution to the field of optimization of machine learning algorithms on mobile devices. It has scientific novelty, relevance, and practical significance.
If the above recommendations are fulfilled, the work may be recommended for publication.

Second Peer Review


The subject of the study. Judging by the title, the article should be devoted to optimizing the performance of image recognition algorithms based on machine learning for mobile devices running the iOS operating system.

The research methodology is based on data analysis and synthesis. It is valuable that the author uses graphical tools to present the results obtained; at the same time, the absence of source attributions under the tables and figures is noticeable. The author pays special attention to the experiment, which confirmed the author's hypothesis.

The relevance of studying the optimization of machine-learning-based image recognition performance on iOS devices is beyond doubt, since the digitalization of socio-economic processes accelerates its adoption, which, among other things, affects the saving of financial resources. The potential readership is interested in the possibilities of applying the results obtained to the problem of ensuring the technological sovereignty of the Russian Federation.

Scientific novelty is present in the material submitted for review. For example, it is related to the substantiation of the thesis that "the use of compression and quantization algorithms radically reduces the accuracy of the initial neural network – the accuracy of the machine learning algorithm has fallen several times, although the speed of operation has increased by an order of magnitude." It would also be advantageous to indicate in the text of the article the potential readership and specific areas of application of the results obtained.

Style, structure, content. The style of presentation is scientific. The structure of the article is laid out by the author. It is recommended to add a block "Discussion of the results obtained" and to transform part of the conclusion into a section "Further directions of scientific research". Familiarization with the content showed a logical presentation of the material within the stated structural elements.

Bibliography. The author has compiled a bibliographic list of 20 titles. It is valuable that it contains both domestic and foreign authors. It would also be interesting to study specific statistical data describing the practice of using machine-learning-based image recognition algorithms on mobile devices in recent years; this would allow the author to further substantiate the relevance of the study with specific numerical justification.

Appeal to opponents. Despite the compiled list of scientific publications, no scientific discussion was found in the text of the peer-reviewed article. When finalizing the article, the author is recommended to address this remark. This would allow the author to show explicitly the increase in scientific knowledge that the author has undoubtedly achieved, but which in the current edition is not presented as advantageously as it could be.

Conclusions, the interest of the readership. Taking into account all of the above, we conclude that the article has been prepared at a high level and has scientific novelty and practical significance. Revising the article based on the comments indicated in the text would further expand its potential readership.