
Litera
Reference:

Standards for Assessing the Quality of Political Discourse Translation Using Large Language Models

Lyu Myao

ORCID: 0000-0003-2346-0600

PhD in Philology

Associate Professor; Faculty of Russian Language and Literature; Peking University

5 Yiheyuan Street, Beijing, 100871, China

liumiaolm@pku.edu.cn

DOI:

10.25136/2409-8698.2025.4.73946

EDN:

VVAHQM

Received:

29-03-2025


Published:

05-04-2025


Abstract: The subject of this research is the development of a comprehensive system of standards for assessing the quality of political discourse translation using large language models. The study focuses on creating an integrative approach to evaluating the translation of Chinese political discourse into Russian, which is particularly relevant in the context of strengthening the Russian-Chinese strategic partnership. The developed three-level assessment model takes into account the specifics of political discourse, characterized by a high degree of terminological density, conceptual specificity, and ideological marking. The model is based on parameters of informational integrity, linguistic accuracy, and intercultural readability, providing a comprehensive analysis of translation at all levels: from lexical and syntactic correspondence to the transfer of semantic relations and cultural-pragmatic aspects. Special attention is given to applying the technological capabilities of large language models to the automated analysis of translation problems and to improving the translation of political texts. The proposed system takes into account the requirements of the "Basic Standards for Translating Chinese Political Discourse," adapting them to the context of Russian-Chinese intercultural communication. The research methodology is based on a systematic analysis of classical and modern translation assessment models, followed by functional modeling of an integrative approach that combines linguistic theories, computational methods, and the capabilities of large language models for translation assessment. The scientific novelty of the research lies in bridging the gap between theoretical models of translation assessment and their practical implementation by integrating traditional linguistic approaches, computational methods, and industry standard requirements with the technological capabilities of large language models.
For the first time, a detailed structure for assessing the quality of political discourse translation with 10 second-level parameters and 36 third-level parameters is proposed, providing a comprehensive analysis of translation activity. It has been proven that the use of large language models significantly increases the efficiency and objectivity of assessment through deep semantic analysis and automated diagnosis of translation problems. The developed system has high practical value, being applicable both for analyzing completed translations and for predicting potential difficulties in translating political texts in the context of Russian-Chinese intercultural communication.


Keywords:

assessment standards, political discourse, large language models, integrative approach, information integrity, linguistic accuracy, intercultural readability, automated translation diagnostics, Russian-Chinese communication, translation parameters

This article is automatically translated.

Introduction

In the context of the global transformation of the world order and the formation of a polycentric system of international relations, effective intercultural communication is becoming a fundamental factor in geopolitical interaction. The Russian-Chinese strategic partnership, which has reached an unprecedentedly high level in the 21st century, is a vivid example of a constructive dialogue between civilizations based on mutual respect and understanding of national interests. Adequate translation of political texts plays a central role in sustaining such a dialogue. Because these texts are terminologically dense, conceptually specific and ideologically marked, assessing the quality of their translation requires special approaches. It is the correspondence of the translation of political discourse to its pragmatic functions that determines the effectiveness of international communication at the highest level. However, ensuring this correspondence is becoming increasingly difficult, since in the context of the information society and digital diplomacy, traditional methodological paradigms for assessing translation quality show certain limitations. These limitations stem both from the increased volume of interlanguage communication and from the growing complexity of political discourse itself, which makes the development of specialized evaluation criteria necessary.

The problem of assessing the quality of translation has historically developed in two directions: linguistic and computational. In the field of developing standards for assessing the quality of translation, the Russian school of translation studies, founded by A. V. Fedorov [1], formed fundamental equivalence criteria, which were developed in the works of L. S. Barkhudarov [2] through the prism of the multidimensional nature of translation activities. A significant contribution to the creation of evaluation standards was made by V. N. Komissarov [3], who proposed a multi-level system of criteria, including functional adequacy and socio-cultural conventionality as key parameters for evaluating political texts. The Western translation tradition has contributed to the standardization of evaluation through the work of K. Reiss [4], who linked the evaluation criteria with the functional typology of texts, J. House [5], who developed a model based on register parameters, and M. Williams [6], who proposed an argumentation-centred approach that is relevant for evaluating the translation of political texts with their complex argumentative structure.

In parallel, computational methods for automatic assessment of translation quality were developed, such as the BLEU and NIST metrics, which quantify the degree of agreement between translations and reference texts, and methods for predictive quality modeling without reference translations [7]. Approaches based on key linguistic phenomena also developed; an important role here was played by the work of Yu Shiwen [8], which focused on assessing the ability of systems to accurately convey certain language structures. In the comparative evaluation of translations, traditional scoring gave way to ranking, which shows higher consistency between experts, as confirmed by research [9], and has been used at WMT workshops since 2010.

An analysis of these methods shows that automatic metrics make evaluation efficient and objective, while linguistic approaches better account for the semantic depth and cultural aspects of translation. However, both areas faced methodological limitations, and it was only with the advent of large language models at the end of 2022 that fundamentally new opportunities opened up for creating integrated translation quality assessment systems. Research on standardizing quality assessment with large language models is developing in several directions. Complex metrics have been developed that make it possible to objectify an assessment based on several parameters simultaneously [10]; [11], and criteria for assessing the cultural and pragmatic adequacy of political translation have been proposed [12]. Standards of multilevel assessment have been formulated that take into account the structural, semantic and functional features of the text [13], and adaptive systems of criteria have been created for various types of political discourse [14]. The fundamental criteria of equivalence and adequacy have been methodologically rethought in light of the capabilities of large language models [15], and new assessment parameters have been proposed that take into account the pragmatic potential of translation in interstate communication [16-18]. Although a number of studies on the use of large language models in translation quality assessment have appeared over the past 2-3 years, these works are either surveys, or focus solely on technical aspects, or are limited to linguistic and translation theories. Studies that combine the linguistic and technical approaches while focusing on a specific text domain and developing a concrete assessment system are extremely rare.

This study proposes an integrative methodological approach to the development of such standards, synthesizing the achievements of linguistic translation theory and computational analysis methods with the functional capabilities of large language models. This methodological convergence makes it possible to overcome the limitations of the individual approaches and to form a comprehensive system of standards for evaluating translation quality. Special attention is paid to political discourse, which is characterized by a high degree of cultural conditioning, implicitness, ideologically loaded vocabulary and rhetorical techniques, all of which make its translation particularly difficult. The proposed assessment system includes multilevel criteria covering structural and semantic equivalence as well as pragmatic adequacy, and uses large language models for an in-depth analysis of the contextual and connotative aspects of the translation of political texts.

The purpose of this research is to develop a scientifically based and technologically feasible system of standards for assessing the quality of translation of Chinese political discourse into Russian. To achieve this goal, the following research objectives are formulated:

(1) To analyze the existing linguistic and translation studies models for assessing the quality of translation;

(2) To consider computational methods for automatic translation evaluation;

(3) To substantiate the principles of integrating traditional theories with large language model technologies;

(4) To develop a comprehensive translation quality assessment model that takes into account information integrity, linguistic accuracy, and cross-cultural readability.

The scientific novelty of the research lies in the development of an integrative model of standards for assessing the quality of translation of political discourse, based on a methodological synthesis of linguistic and technological approaches with the implementation of the functionality of large language models.

The theoretical significance of the work is determined by its contribution to the methodology of translation quality assessment and to the conceptualization of the role of large language models in the translation process. The practical value lies in the possibility of applying the developed standards to improve the quality of translation of Chinese political discourse into Russian, which contributes to the intensification of Russian-Chinese strategic cooperation and to the effectiveness of intercultural communication in the context of modern geopolitical transformations.

1. Assessment of translation quality from the standpoint of linguistics and translation studies

Linguistic and translation studies approaches to assessing the quality of translation provide a fundamental basis for understanding the multidimensional nature of translation. This section examines the key theoretical models that form the basis of the proposed integrative assessment system.

1.1 The Reiss Translation Evaluation Model

K. Reiss, a leading representative of the German school of translation studies in the 1970s, proposed a functional approach to assessing the quality of translation in her work "Translation Criticism: The Potentials and Limitations" [4]. Based on the model of Karl Bühler [19], who identified three functions of language, Reiss developed a typology of texts: informative, expressive and operative, later adding audio-medial texts. The key thesis of Reiss is: "The type of text is the primary factor influencing the choice of the translation method" [4, p. 17]. The Reiss assessment system includes intra-linguistic factors (semantic, lexical, grammatical, stylistic) and extra-linguistic factors (situation, topic, time, place, addressee, sender, emotional factor).

Although Reiss did not develop quantitative assessment methods, her approach became the basis for subsequent research. As Zhang Meifang notes [20], the theory has limitations: the classification of language functions is simplified, and the criteria for typology of texts are debatable.

1.2 The functional-pragmatic House model

J. House is one of the most influential theorists in modern translation studies. Her model for evaluating translation quality, developed in the late 1970s, is still widely discussed and applied. House's theory is presented in two key works: "A Model for Translation Quality Assessment" [21] and the revised version "Translation Quality Assessment: A Model Revisited" [5]. The theoretical basis of the House model is systemic functional linguistics, in particular Halliday's register theory. House considers translation as the replacement of the source text with a text in the target language while maintaining semantic and pragmatic equivalence. Her assessment model includes register analysis (field, tenor, and mode), where field refers to the topic and content of the text, tenor refers to the relationship between the participants in communication, and mode refers to the channel and method of language use. In the revised version of the model, House introduces the concept of genre as a socially recognized type of text that is essential for evaluating the quality of translation. House emphasizes the importance of functional equivalence by proposing the concepts of "overt translation" and "covert translation." An overt translation is recognizably a translated text, whereas a covert translation seeks to create a functionally equivalent text in the culture of the target language. House classifies translation errors as "overtly erroneous errors," which include plain linguistic errors, and "covertly erroneous errors," which involve a violation of functional equivalence.

Despite its significant influence, the House model has been criticized by some scholars. According to Yuan Hong [22], although the model offers a detailed analysis, it lacks quantitative standards for an objective assessment. Nevertheless, House's theory provides a comprehensive framework for evaluating translation quality, especially through its attention to contextual factors and functional equivalence.

1.3 The Williams Evaluation Model

Malcolm Williams is one of the most important theorists in the field of translation quality assessment at the beginning of the 21st century. In his book "Translation Quality Assessment: An Argumentation-Centred Approach" [6], he proposed an innovative model for evaluating translation quality that combines traditional linguistic methods with argumentation theory. Williams argues that all texts have a universal argumentation structure, which makes his approach applicable to various types of texts in translation practice. The Williams model includes an analysis of the argument schema (claim, grounds, warrant, backing, qualifier and rebuttal), organizational relationships (problem-solution, conclusion-reason), propositional functions and connecting elements, types of argumentation, rhetorical figures and narrative strategies. According to Williams, these parameters cover all the information and purposes of the text at the macro and micro levels, providing a comprehensive approach to evaluating translation. The evaluation process in the Williams model involves determining the argument schema and structural relationships in the source text, analyzing the coherence of the target text, evaluating key segments, and comparing propositional functions, types of argumentation, and narrative strategies, followed by an overall assessment of the quality of translation. Williams has developed evaluation scales with different parameters and weights for different types of texts. For example, for argumentative texts more weight is given to the argument schema and stylistic aspects, whereas for statistical reports propositional functions and terminology are more important.

Despite the systematic and practical value of the Williams model, especially for evaluating complex argumentative texts, it has been criticized for its complexity, requiring in-depth knowledge of linguistics and argumentation theory, as well as for the lack of clarity of some quantitative evaluation criteria.

1.4 Newmark Translation Evaluation Concept

Peter Newmark, a renowned British translation theorist, has made significant contributions to the study of translation quality assessment. Although he did not develop a formal assessment model, his ideas on this issue are widely presented in the works "Approaches to Translation" [23] and "A Textbook of Translation" [24]. Newmark's approach to evaluating translation quality is based on his understanding of the essence of translation as the art of recreating the original author's information and ideas in another language. Newmark offers a functional assessment that takes into account the type of text (expressive, informative or vocative) and emphasizes the importance of linguistic accuracy, for which he developed the method of "componential analysis," which involves a detailed study of the original and the translation at the level of words and sentences. An important place in Newmark's theory is occupied by cultural factors and the concept of "cultural words" that he introduced. He distinguishes between semantic and communicative translation, holding that different methods are suitable for different types of texts and should be evaluated accordingly. Newmark has also developed a systematic classification of translation errors, including linguistic, cultural, and pragmatic errors, which supports a more objective assessment of translation quality.

1.5 The contribution of Chinese scientists to the theory of translation evaluation

Research on translation quality assessment in China began somewhat later than in the West, but it has produced significant results. Chinese scholars have not only introduced the scientific community to Western theories, but also developed their own models for evaluating the quality of translation. Three of the most influential Chinese models are considered below.

1.5.1 Gu Zhengkun's Best Approximation Model

Gu Zhengkun proposed a "best approximation" model [25] based on a multidimensional complementary theory of translation standards. He developed a system of standards: "absolute standard (original) – highest standard (best approximation) – specific standards." In this system, the highest standard, "best approximation," is defined as "the degree of reliability with which a translation mimics the content and form of the original." Peng Chunyan [26] modified this system, emphasizing the multidimensional, social nature and evolutionary nature of translation standards.

1.5.2 The Si Xianzhu Model of Functional Linguistics

Si Xianzhu has developed an assessment model based on systemic functional linguistics. In his opinion, "the essence of translation lies in the equivalence of the meanings of the original and the translation at the semantic, pragmatic and textual levels" [27]. This model involves analyzing transitivity, mood, and modality in the sentences of the original and the translation to identify ideational and interpersonal meanings, as well as deviations of the translation from the original. These deviations are then classified and evaluated to determine the degree of equivalence of the texts.

1.5.3 He Sannin's Relevance Theory Model

He Sannin applied the theory of relevance to the assessment of translation quality. He believes that the maximum relevance of a translation lies in the desire to match the original, which is the theoretical basis for quality assessment [28]. He Sannin emphasizes that evaluating the quality of a translation "is related to the coincidence of the meanings of the original and the translation at all levels – from the sentence to the text, from the style to the impact on the reader, striving for maximum relevance of contextual effects." His model includes the microelements of evaluation: "interlanguage relevance", "textual relevance" and "general relevance".

Translation quality assessment models developed by both Western and Chinese scholars reflect the evolution of translation theory and the desire for an objective assessment. Although linguistic models provide a theoretical basis, their practical application is often hampered by complexity and low efficiency. In this context, automated assessment methods offer solutions that increase the speed and objectivity of the process.

2. Computer methods for automatic assessment of translation quality

Modern methods of automatic translation quality assessment offer effective solutions to overcome the limitations of linguistic models. Based on computer technology, they provide a quantitative, fast and objective analysis of translations. The four main approaches (reference-based assessment, reference-free assessment, testing of key language elements, and comparative ranking) complement traditional theoretical assessment models.

2.1 Evaluation using a reference translation

2.1.1 BLEU algorithm

The BLEU algorithm, proposed by Papineni and colleagues at IBM in 2002, evaluates translation quality by measuring the n-gram overlap between a machine translation and one or more reference translations [29]. The method computes n-gram precision, applies a brevity penalty, and takes a weighted geometric mean of the precisions for n-grams of different lengths. The advantages of BLEU include ease of calculation and speed, as well as an objective score from 0 to 1. However, the method has limitations: it does not evaluate semantic equivalence, focusing only on lexical overlap, is not sensitive enough to structural changes, and requires several reference translations for reliable evaluation.
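The three components described above (clipped n-gram precision, the brevity penalty, and the geometric mean) can be illustrated with a minimal sentence-level sketch. It assumes pre-tokenized input, uniform weights and no smoothing; the function names are illustrative and do not come from any particular toolkit, and production implementations differ in tokenization and corpus-level aggregation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the reference that contains it most often."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (a simplification)."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean collapses if any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: compare with the reference length closest to the candidate.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    penalty = 1.0 if c > r else math.exp(1 - r / c)
    return penalty * math.exp(log_mean)
```

Because the score multiplies precisions, a single missing 4-gram drives an unsmoothed sentence-level score to zero, which is one reason BLEU is normally reported at corpus level.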

2.1.2 The NIST algorithm

The NIST algorithm, developed by Doddington in 2002 [30], is an improved version of BLEU. It also uses n-gram matching, but introduces the concept of information weight. NIST calculates the accuracy of n-grams based on the information weight, uses an arithmetic mean instead of a geometric one, and optimizes the penalty factor for sentence length. The key innovation is the assignment of higher weights to rare n-grams, which increases the accuracy and distinctiveness of the assessment. Despite these improvements, NIST retains some limitations of n-gram-based methods.
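The idea of information weight can be shown in a few lines. The sketch below assumes the commonly cited definition, in which the weight of an n-gram is the base-2 logarithm of the count of its (n-1)-word prefix divided by the count of the n-gram itself, both taken from the reference data. It covers only the weighting step, not the full NIST score, and the helper names are illustrative.

```python
import math
from collections import Counter


def ngram_counts(sentences, n):
    """Count n-grams (as tuples) over a list of tokenized reference sentences."""
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def information_weights(reference_sentences, n):
    """Info(w1..wn) = log2(count(w1..w(n-1)) / count(w1..wn)).

    For unigrams the numerator is the total number of reference words.
    Rare n-grams therefore receive higher weights than frequent ones.
    """
    full = ngram_counts(reference_sentences, n)
    if n == 1:
        total = sum(full.values())
        return {gram: math.log2(total / count) for gram, count in full.items()}
    prefix = ngram_counts(reference_sentences, n - 1)
    return {gram: math.log2(prefix[gram[:-1]] / count) for gram, count in full.items()}
```

With two toy references, "the cat sat" and "the dog sat", the rare word "cat" gets a weight of log2(6), while the frequent word "the" gets only log2(3).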

2.2 Quality assessment without reference translation

Quality Estimation (QE), proposed by Specia and colleagues in 2010 [7], uses supervised machine learning to predict the quality of a translation without reference texts. This approach includes extracting linguistic and statistical features from the source text and the machine translation, training a model on labeled data, and predicting a quality score. The advantages of QE include flexibility and practicality, and the ability to apply it at the level of words, sentences and documents. However, the method requires high-quality training data and careful feature selection, and its ability to generalize is limited.
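The pipeline described above (feature extraction, training on labeled data, prediction) can be sketched in miniature. The features and the nearest-neighbour predictor below are deliberately simple stand-ins chosen for illustration; they are assumptions of this sketch, not the feature set or learner used in [7].

```python
import math

PUNCTUATION = set(",.;:!?")


def extract_features(source_tokens, target_tokens):
    """Hypothetical surface features: lengths, length ratio, punctuation mismatch."""
    src_len = len(source_tokens)
    tgt_len = len(target_tokens)
    ratio = tgt_len / src_len if src_len else 0.0
    src_punct = sum(tok in PUNCTUATION for tok in source_tokens)
    tgt_punct = sum(tok in PUNCTUATION for tok in target_tokens)
    return (float(src_len), float(tgt_len), ratio, float(abs(src_punct - tgt_punct)))


def predict_quality(features, labelled_examples, k=3):
    """Predict a score as the mean label of the k nearest labelled examples.

    labelled_examples is a list of (feature_tuple, quality_label) pairs
    produced by human annotation; no reference translation is involved.
    """
    if not labelled_examples:
        raise ValueError("at least one labelled example is required")
    ranked = sorted(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)
```

The dependence on labeled examples is visible directly in the signature: without a representative training set the prediction has nothing to generalize from, which is the limitation noted in the text.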

2.3 Testing based on key language elements

Assessment methods based on key language points assess the quality of translation through predefined language elements covering key semantics and common errors. Pioneering work in this field was carried out by Yu Shiwen from Peking University in the early 1990s [8]. His team has developed a comprehensive assessment system for Chinese-English machine translation, including test specifications and a special translation quality description language. The system classified the language test points into 9 categories, covering aspects from vocabulary to the translation of complex sentences. Although this method provides comprehensive analysis, it requires significant development costs and has limited flexibility.

2.4 Comparative ranking based on translation quality assessment

In the field of comparative assessment, there has been a shift from numerical scoring to quality ranking methods. Studies show [9]; [31] that when evaluators determine the relative superiority of translations, they agree more often than when they assign absolute scores (the Kappa coefficient is 0.37-0.56 for ranking versus 0.22-0.25 for scoring). Since 2010, ranking has been the official method of human evaluation at the Workshop on Machine Translation (WMT). The ranking problem is formalized as ordinal classification and solved using machine learning algorithms. Although ranking does not reflect the size of the qualitative differences, it provides a relatively reliable and practical approach to evaluating machine translation.
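To make the agreement figures concrete, the sketch below computes Cohen's kappa for two evaluators who each label the same items. The source does not state which kappa variant underlies the reported values, so the two-rater form is an assumption of this example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items on which the raters coincide.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each rater labelled at random with their own label rates.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

For example, if two evaluators agree on three of four pairwise preferences and chance agreement is 0.5, kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5, which falls within the range reported for ranking.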

Automatic translation assessment methods offer effective solutions to problems where traditional linguistic approaches face difficulties. However, they are limited in their understanding of deep semantics and cultural nuances. The optimal approach to assessing the quality of translation, especially for such a specific field as Chinese political discourse, should combine linguistic theory with computational methods, taking into account industry guidelines such as the "Basic Norms for Translating Chinese Political Discourse into English."

3. The "Basic Norms for Translating Chinese Political Discourse into English" as the basis for assessment

When developing criteria for evaluating the translation of Chinese political discourse, it is necessary to take into account the genre features of the text. As noted by Reiss, the type of source text is a determining factor in the translation process, influencing not only the translator's strategy, but also the entire decision-making process during translation [4].

The "Basic Norms for Translating Chinese Political Discourse into English" define Chinese political discourse as "a special form of expression formed by the party and the government in the process of public administration" (hereinafter referred to as the "Norms") [32, p. 2]. According to this document, the basic principle of translating political discourse is a balance between fidelity to the original and readability, with the priority of fidelity in case it is impossible to achieve both. The document sets out four general requirements: a deep understanding of the original text before starting translation; taking into account the peculiarities of thinking of a foreign audience while remaining faithful to the original; improving the translation of key concepts with a deeper understanding of them; international cooperation and taking into account the opinions of native speakers of the target language. These principles and requirements form an important basis for developing criteria for evaluating the quality of translation of Chinese political discourse.

4. Integration of theories of translation quality assessment based on large language models

The analysis of linguistic theories and computer methods for assessing the quality of translation shows the need for an integrated approach to evaluating the translation of Chinese political discourse. Such an approach should take into account both the theoretical depth of traditional linguistic models and the effectiveness of computer methods, as well as the specifics of political discourse. With the advent of large language models, it has become possible to overcome the limitations of existing assessment methods.

The history of the development of translation quality assessment models reflects the evolution from a purely linguistic approach to an interdisciplinary one. The linguistic models of Reiss [4], House [5], and Newmark [23] laid the theoretical foundation, emphasizing the importance of text type, functional equivalence, and cultural factors. At the same time, computational assessment methods were developed, from the analysis of key language elements [8] to automatic assessment metrics (BLEU, NIST) and the latest machine learning methods [7]. Each of these approaches has its advantages and limitations in the context of evaluating translation quality. Based on the analysis of existing models and taking into account the requirements of the "Norms", the following theoretical foundations were chosen to create an effective system for evaluating the translation of political texts:

(1) The theory of text types [4] as a basis for assessing the conformity of translation with the genre features of political discourse;

(2) Error type analysis [23] to identify lexical, syntactic, and semantic deviations;

(3) The method of key elements of assessment [8] for checking the translation at the lexical, syntactic and pragmatic levels;

(4) Evaluation of the translation of culturally marked vocabulary [23], especially important for political texts with Chinese characteristics;

(5) A system of weighted assessment parameters adapted to the specifics of political discourse.

The use of large language models makes it possible to integrate the advantages of various approaches, providing both the depth of analysis and the effectiveness of automatic assessment of the quality of translation of Chinese political discourse.

4.1 Criteria for evaluating translation quality based on large language models

In accordance with the five classical theories of translation quality assessment discussed earlier and the requirements of the "Norms", we have developed a system for evaluating the quality of translation of Chinese political discourse based on three key criteria. Despite the terminological differences in the works of different researchers (for example, Reiss's "style" and House's "pragmatics"), our approach is based on the fundamental principle stated in the Norms: "The basic principle of translating Chinese political discourse is a balance between accuracy and readability" [32, p. 4]. Based on this, we have identified three main dimensions for evaluating translation quality: information integrity, linguistic accuracy, and cross-cultural readability.

4.1.1 Information integrity

Information integrity is a criterion focused on the source text and evaluates the degree of information transfer of the original in translation. Large language models can effectively analyze the semantic structures of source and translated texts, identify key information points and compare the correspondence between them, which increases the effectiveness of evaluation and allows you to identify the smallest discrepancies that may be missed during manual evaluation.

4.1.2 Linguistic accuracy

Linguistic accuracy is focused on the translated text and evaluates the degree of correct use of linguistic norms and expressive means of the target language. Large language models are able to quickly identify lexical and syntactic errors, detect inappropriate phrases or modes of expression, and offer more authentic alternatives.

4.1.3 Cross-cultural readability

Cross-cultural readability is focused on the reader of the target language and evaluates how well the translation corresponds to the language habits and cultural expectations of the target audience. Large language models can analyze the language style of translation, identify expressions that may cause cultural misunderstanding, and offer alternatives that are more appropriate to the culture of the target language.
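The three dimensions can be combined into a single score with a weighted scheme of the kind listed among the theoretical foundations. The sketch below is illustrative only: the weight values are hypothetical, chosen merely to reflect the priority that the "Norms" give to fidelity, and are not figures from this study.

```python
def weighted_score(dimension_scores, weights):
    """Combine per-dimension scores (each in 0..1) into one weighted average."""
    if set(dimension_scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(dimension_scores[d] * weights[d] for d in weights) / total_weight


# Hypothetical weights: fidelity-related information integrity counts most.
EXAMPLE_WEIGHTS = {
    "information_integrity": 0.5,
    "linguistic_accuracy": 0.3,
    "cross_cultural_readability": 0.2,
}

example_scores = {
    "information_integrity": 0.9,
    "linguistic_accuracy": 0.8,
    "cross_cultural_readability": 0.7,
}
overall = weighted_score(example_scores, EXAMPLE_WEIGHTS)
```

Dividing by the sum of the weights keeps the result in the 0..1 range even when the weights are not normalized, so different text types can use different weight sets without rescaling.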

4.2 Parameters for evaluating translation quality based on large language models

Determining the parameters requires weighing many factors. Our approach combines the theoretical foundations of classical translation theories, the requirements of industry standards, and the technical capabilities of large language models. Following the strategy of "focusing on industry standards, guidelines on linguistics and translation theory, and implementation using computer technology," we have developed detailed parameters for each of the three main dimensions.

4.2.1 Information Integrity parameters

1) The degree of information integrity

(1) Have the key policy terms or concepts been translated?

(2) Has any important data or statistical information been omitted?

(3) Have important components of sentences or paragraphs been translated?

(4) Have the examples or quotations from the original been translated?

2) The degree of reliability of the information

(1) Has the meaning of political terms or professional vocabulary been changed?

(2) Are the numbers, dates, or units of measurement translated correctly?

(3) Are the polysemous words translated correctly?

(4) Are idioms or stable expressions translated correctly?

3) Semantic relations

(1) Are the cause-effect relationships expressed correctly?

(2) Is the juxtaposition relationship expressed correctly?

(3) Is the parallel relationship expressed correctly?

(4) Are the conditional relations expressed correctly?

4.2.2 Linguistic accuracy parameters

1) The lexical level

(1) The correctness of inflection (declension of nouns, adjectives, numerals, etc.)

(2) Correct conjugation of verbs

(3) The correctness of the verb form

(4) The correct use of parts of speech

2) Syntactic level

(1) Correct word order

(2) Agreement of the subject and predicate

(3) Correctness of the structure of subordinate clauses

(4) Correctness of prepositional combinations

3) Redundant expression

(1) The presence of unnecessary content words

(2) The presence of unnecessary function words

(3) The presence of unnecessary repetitions

4) Punctuation and spelling

(1) The correct use of punctuation marks

(2) Correct spelling of words

(3) The correct use of uppercase and lowercase letters

4.2.3 Cross-cultural readability parameters

1) The style level

(1) Has the style of political documents been preserved?

(2) Are the tone and intonation of the original preserved?

(3) Is the length of the sentences adequate (not too long or short)?

(4) Is the translation of rhetorical techniques appropriate?

2) Emotional level

(1) Is the emotional orientation of the original accurately conveyed?

(2) Is the emotional intensity of the original accurately expressed?

(3) Is the persuasiveness or impact of the original accurately expressed?

3) Cultural level

(1) Are the cultural keywords translated correctly?

(2) Are the cultural background and cognitive habits of readers of the target language taken into account?

(3) Are differences in political culture handled correctly?

The developed system for assessing the quality of translation of Chinese political discourse covers the main elements considered in traditional translation theories, such as the accuracy of information transfer and adherence to language norms, and also emphasizes the importance of intercultural communication, reflecting the specifics and complexity of translating political discourse. The technical advantages of large language models, including semantic understanding, grammatical analysis, an extensive knowledge base, and understanding of context, significantly increase the accuracy and effectiveness of assessment in all three dimensions.
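
The hierarchy above can be written down as a small data structure, which also makes the parameter counts easy to check. The following Python sketch records the number of third-level questions under each second-level parameter and aggregates pass rates into a score out of 100. The equal dimension weights, the pass-rate averaging, and the names `RUBRIC`, `DIMENSION_WEIGHTS`, `count_parameters`, and `score` are assumptions for illustration only; the weighting coefficients of the actual system are not listed in this section.

```python
# Number of third-level questions per second-level parameter,
# as listed in sections 4.2.1-4.2.3.
RUBRIC = {
    "Information integrity": {
        "Degree of information integrity": 4,
        "Degree of reliability of the information": 4,
        "Semantic relations": 4,
    },
    "Linguistic accuracy": {
        "Lexical level": 4,
        "Syntactic level": 4,
        "Redundant expression": 3,
        "Punctuation and spelling": 3,
    },
    "Cross-cultural readability": {
        "Style level": 4,
        "Emotional level": 3,
        "Cultural level": 3,
    },
}

# ASSUMPTION: equal weights per dimension (placeholder values only).
DIMENSION_WEIGHTS = {name: 1 / len(RUBRIC) for name in RUBRIC}


def count_parameters(rubric):
    """Return (number of second-level, number of third-level) parameters."""
    second = sum(len(params) for params in rubric.values())
    third = sum(n for params in rubric.values() for n in params.values())
    return second, third


def score(passed, rubric=RUBRIC, weights=DIMENSION_WEIGHTS):
    """Aggregate check results into a score on a 0-100 scale.

    `passed` maps (dimension, parameter) to the number of third-level
    questions answered satisfactorily; missing keys count as zero.
    """
    total = 0.0
    for dim, params in rubric.items():
        # Pass rate of each second-level parameter, averaged per dimension.
        rates = [passed.get((dim, p), 0) / n for p, n in params.items()]
        total += weights[dim] * sum(rates) / len(rates)
    return round(100 * total, 1)


# Example input: every third-level question is satisfied.
all_passed = {(d, p): n for d, ps in RUBRIC.items() for p, n in ps.items()}
```

Calling `count_parameters(RUBRIC)` gives 10 second-level and 36 third-level parameters, and `score(all_passed)` reaches the top of the one-hundred-point scale. Different genres of political discourse could be accommodated by passing a different `weights` mapping.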

Conclusion

The conducted research represents the integration of linguistic theories and technological capabilities of large language models in the field of assessing the quality of translation of political discourse. The developed three-level model, which covers the parameters of information integrity, linguistic accuracy and intercultural readability, creates a methodologically sound system for an objective assessment of translation activities.

The proposed system with 10 second-level parameters and 36 third-level parameters, supplemented by weighting coefficients and a one-hundred-point gradation of quality, provides the necessary flexibility in evaluating various genres of political discourse. The integration of large language models significantly increases the efficiency of identifying structural and stylistic inconsistencies and contributes to the automated diagnosis of translation problems.

References
1. Fedorov, A. V. (1958). Introduction to the theory of translation: (linguistic problems) (2nd ed., revised). Foreign Languages Publishing House.
2. Barkhudarov, L. S. (1975). Language and translation. International Relations.
3. Komissarov, V. N. (1980). Linguistics of translation. International Relations.
4. Reiss, K. (1971). The possibilities and limits of translation criticism. M. Hueber.
5. House, J. (1997). Translation quality assessment: A model revisited. Gunter Narr Verlag.
6. Williams, M. (2004). Translation quality assessment: An argumentation-centred approach. University of Ottawa Press.
7. Specia, L., Raj, D., & Turchi, M. (2010). Machine translation evaluation versus quality estimation. Machine Translation, 24, 39-50. https://doi.org/10.1007/s10590-010-9077-2
8. Yu, S. (1993). Some studies in computational linguistics. Application of language and writing, 3, 55-64.
9. Callison-Burch, C., Fordyce, C. S., Koehn, P., Monz, C., & Schroeder, J. (2007). (Meta-) Evaluation of machine translation. In Proceedings of the Second Workshop on Statistical Machine Translation (pp. 136-158).
10. Kocmi, T., & Federmann, C. (2023). Large language models are state-of-the-art evaluators of translation quality. arXiv preprint arXiv:2302.14520.
11. Denisenkov, V. V., & Chesnikov, L. S. (2024). Strategies for optimization and assessment methods for fine-tuning large language models. International Journal of Humanities and Natural Sciences, 4-1(91), 180-184. https://doi.org/10.24412/2500-1000-2024-4-1-180-184
12. Liu, M., Shao, Q., & Xie, G. (2024). Multi-Agent Approach to Political Discourse Translation: From Large Language Models to MAGIC-PTF System. Litera, 11, 28-46. https://doi.org/10.25136/2409-8698.2024.11.72197
13. Zhao, J., & Li, X. (2024). Research on the construction and application of translation agents based on large language models. Teaching Foreign Languages with Electronic Technologies, 5, 22-28, 75, 108.
14. Li, D., Wang, H., & Liu, S. (2025). National capabilities of translation technologies and large language models. Shanghai Translation, 2, 18-24.
15. Lu, Q., et al. (2023). Error analysis prompting enables human-like translation evaluation in large language models. arXiv preprint arXiv:2303.13809.
16. Huang, H., et al. (2023). Towards making the most of LLM for translation quality estimation. CCF International Conference on Natural Language Processing and Chinese Computing.
17. Zhang, B., Haddow, B., & Birch, A. (2023). Prompting large language models for machine translation: A case study. International Conference on Machine Learning. PMLR.
18. Zhao, Y., Zhang, H., & Yang, Y. (2024). Comparative study of the quality of large language models in translating texts-an example of the translation of "Blooming". Teaching Foreign Languages with Electronic Technologies, 4, 60-66, 109.
19. Bühler, K. (1934). Theory of language: The representative function of language. Gustav Fischer.
20. Zhang, M. (2005). Functional approach to translation research. Shanghai Foreign Language Education Press.
21. House, J. (1977). A model for translation quality assessment. TBL-Verlag Narr.
22. Yuan, H. (2007). The explanatory power of text typology and functional linguistics in translation quality assessment. Journal of Hunan University of Humanities and Technology, 1, 158-161.
23. Newmark, P. (1981). Approaches to translation. Pergamon Press.
24. Newmark, P. (1988). A textbook of translation (Vol. 66). Prentice Hall.
25. Gu, Z. (1989). On the multidimensional complementary theory of translation standards. Chinese Translation, 1, 16-20.
26. Peng, C. (2004). A new theory of translation standards-a revision of the translation standards established by Professor Gu Zhengkun. Scientific Journal of Sun Yat-sen University, 5, 237-241.
27. Si, X. (2004). Research on the model of translation quality assessment from the perspective of functional linguistics. Teaching Foreign Languages, 4, 45-50.
28. He, S. (2015). Research on translation quality assessment models. Central Translation Press.
29. Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 311-318).
30. Doddington, G. (2002). Automatic evaluation of machine translation quality using N-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research (pp. 138-145).
31. Duh, K. (2008). Ranking vs. regression in machine translation evaluation. In Proceedings of the Third Workshop on Statistical Machine Translation (pp. 191-194).
32. Basic norms of translation of Chinese political discourse into English (Eds.). (2023). Foreign Language Literature Press.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

In the reviewed article, the subject of research is the assessment of the quality of translation of political discourse using large language models. Its relevance is argued, on the one hand, by the crucial role of "adequate translation of political texts, which, due to their high degree of terminological saturation, conceptual specificity and ideological labeling, require special approaches to assessing the quality of translation activities" in ensuring constructive intercultural dialogue and, on the other hand, by the insufficient number of comprehensive studies of an integrative methodological approach to the development of standards for assessing the quality of translation of political discourse, synthesizing the achievements of linguistic translation theory and computer analysis methods with the functionality of large language models: "Despite the fact that over the past 2-3 years, a number of studies have appeared on the use of large language models in assessing the quality of translation, these works are either overview-based, or focused solely on technical aspects, or limited only to linguistic and translation theories. It is extremely rare to find studies that combine these two approaches and at the same time focus on specific text areas with the development of specific assessment systems." The theoretical basis of the work comprises studies by such Russian and foreign researchers as L. S. Barkhudarov, V. N. Komissarov, K. Bühler, A. V. Fedorov, K. Reiss, V. V. Denisenkov, L. S. Chesnikov, He Sanning, Si Xianzhu, Peng Chunyan, P. Newmark, J. House, T. Kocmi, C. Federmann and others, devoted to various aspects of translation theory and practice, modern translation technologies, large language models, translation quality assessment models, etc.
The bibliography of the article includes 32 sources; it seems sufficient to summarize and analyze the theoretical aspect of the problem under study, corresponds to the specifics of the subject and the substantive requirements, and is reflected on the pages of the manuscript. All quotations of scientists are accompanied by the author's comments. The research methodology is determined by the goal ("development of a scientifically based and technologically feasible system of standards for assessing the quality of translation of Chinese political discourse into Russian") and the objectives ("to analyze existing linguistic and translation studies models for assessing the quality of translation; to consider computer methods for automatic evaluation of translations; to substantiate the principles of integration of traditional theories and technologies of large language models; to develop a comprehensive translation quality assessment model that takes into account information integrity, linguistic accuracy and intercultural readability") and is complex in nature: general scientific methods of analysis, synthesis, and generalization are used, along with the descriptive method, discursive and comparative methods, socio-cultural analysis, etc.
In the course of the work, the key theoretical models that formed the basis of the proposed integrative assessment system are consistently considered (the Reiss translation assessment model, the House functional and pragmatic model, the Williams evaluation model, the Newmark translation assessment concept), as well as the contribution of Chinese scientists to the theory of translation assessment; computer methods for automatic assessment of translation quality (assessment with a reference, assessment without a benchmark, testing of key language elements and comparative ranking); "basic norms of translation of Chinese political discourse into English" as the basis for assessment; integration of theories of translation quality assessment based on large language models. A three-level model of translation quality assessment has been developed, covering the parameters of information integrity, linguistic accuracy and intercultural readability, which creates a methodologically sound system for an objective assessment of translation activities. The results obtained, of course, have theoretical significance and are determined by the contribution to the development of the methodology for assessing the quality of translation and the conceptualization of the role of large language models in the translation process. The practical value is due to the possibility of applying the developed standards to optimize the quality of translation of political discourse from Chinese into Russian, which "contributes to the intensification of Russian-Chinese strategic cooperation and to improving the effectiveness of intercultural communication in the context of modern geopolitical transformations." The style of presentation meets the requirements of scientific description and is characterized by consistency and accessibility. The content of the work corresponds to the title, the logic of the research is clear. 
The article has a complete form; it is quite independent, original, will be interesting and useful to a wide range of people and can be recommended for publication in the scientific journal Litera.