Computational Methods in Comparative Media Analysis: Operationalization of Conflict Narrative Research Using the Example of CNN and Al Jazeera

Соколов А.В.

doi:10.7256/2454-0749.2025.12.77328

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Back to contents

Philology: scientific researches

Reference:

Sokolov, A.V. (2025). Computational Methods in Comparative Media Analysis: Operationalization of Conflict Narrative Research Using the Example of CNN and Al Jazeera. Philology: scientific researches, 12, 103–120. https://doi.org/10.7256/2454-0749.2025.12.77328

Computational Methods in Comparative Media Analysis: Operationalization of Conflict Narrative Research Using the Example of CNN and Al Jazeera

Sokolov Aleksandr Vladimirovich

PhD in Philology

Professor; Department of Television and Radio Broadcasting; Academy of Media Industry
Associate Professor; Department of Mass Communications and Media Business, Faculty of Social Sciences and Mass Communications; Financial University under the Government of the Russian Federation

127521, Moscow, Oktyabrskaya str., 105, building 2.

alvsokolov@fa.ru

DOI:

10.7256/2454-0749.2025.12.77328

EDN:

PXYKFT

Received:

12/16/2025

Published:

12/25/2025

Abstract: The subject of the research is the application of computational methods in the comparative analysis of media narratives that shape representations of armed conflicts in the global media landscape. The object of the study consists of news materials from the international media outlets CNN and Al Jazeera related to the Israeli-Palestinian conflict. The author analyzes aspects of the topic such as the operationalization of theoretical concepts in media analysis, the comparison of narrative strategies across different media, and the possibilities of scalable analysis of media discourse. Special attention is given to how media not only reflect events but also construct meaning models of conflict through framing, tonality, and attribution of agency. The article examines differences in thematic priorities, emotional coloring, and narrative structure in the materials from the two media systems that represent different political and institutional contexts. It discusses the limitation of traditional qualitative analysis methods when working with large text corpora and the necessity of integrating computational tools into media research. Thus, the study aims to identify persistent narrative differences and to form a reproducible analytical framework for comparative studies of media discourse on conflicts. The methodology of the research is based on a hybrid approach that combines natural language processing methods, thematic modeling, and interpretative analysis using large language models. The main findings of the conducted research reveal systematic differences in the narrative and discursive strategies of CNN and Al Jazeera when covering the same conflict. It has been established that Al Jazeera's materials are characterized by a consistently negative tonal background and an emphasis on the humanitarian dimension of the conflict, while CNN creates a more fragmented narrative alternating between negative and positive plots related to diplomatic events. The novelty of the research lies in the development and testing of a meta-model for comparative media analysis, which combines quantitative computational methods with qualitative interpretation of media texts. A significant contribution of the author is the operationalization of classic theories of media studies through specific NLP metrics and procedures, which enhances the reproducibility and transparency of the analysis. It has been shown that the integration of large language models expands the possibilities for semantic comparison of narratives without losing analytical depth. The conclusion is made regarding the continued influence of institutional factors in media systems on the formation of media reality even in the context of digitalization and automation of analysis.

Keywords:

comparative media analysis, computational hermeneutics, natural language processing, large language models, media narratives, framing, operationalization, meta-model, conflict discourse, media analysis

This article is automatically translated.

Introduction

The modern global media sphere demonstrates the intensification of information confrontation, in which various media organizations construct competing narrative realities. The Middle East conflict represents a paradigmatic case of a mediatized confrontation, where the information dimension becomes no less significant than the military-political one ^[25]. Al Jazeera and CNN, as global media brands, broadcast diametrically opposed geopolitical positions: the Qatari channel focuses on the Palestinian perspective ^[6], the American corporation supports the strategic alliance of the United States with Israel.

The relevance of the study is determined by a number of factors. First, the critical theory of media proceeds from the fact that the media are not a neutral mirror of reality, but act as an element of institutional power that shapes public opinion and ideology, constructing meanings through the selection and emphasis of certain frames and thereby participating in the production of social reality, and not just its reflection ^{[15, 14]}.

Secondly, traditional qualitative media analysis methods have limited scalability, which hinders the processing of large corpora. Thirdly, modern computational methods integrate Natural Language Processing technologies, thematic modeling and large language models (LLM), providing innovative tools for large-scale analysis of media discourse ^{[12, 8, 24]}.

The analysis of current works demonstrates the lack of research that operationalizes the theoretical concepts of media analysis through computational methods in the Russian academic tradition. Foreign works on computational media studies focus on automated content analysis ^[9], coding using large language models (LLM-assisted coding) ^[10], sentiment analysis of large datasets ^[17]. The Russian-speaking field is dominated by theoretical and review works on framing/agenda (agenda-setting) ^{[5, 2, 3, 4]}, whereas systematic operationalization through NLP/LLM is still found on a spot ^[1].

The scientific novelty of the present study lies in the development of a meta-model for the comparative study of media narratives, integrating critical media theory, computational social science and hermeneutic interpretation. An attempt is made to implement a systematic operationalization of media analysis based on the synthesis of NLP libraries with generative AI models, ensuring reproducibility with theoretical validity of the conclusions.

The aim of the research is to develop and empirically validate a meta-model for comparative analysis of media narratives using computational hermeneutics. The research objectives include: theoretical substantiation of the concept of computational hermeneutics in media analysis; operationalization of framing, the theory of "agenda" and propaganda models through NLP metrics; development of the architecture of a hybrid analytical platform; empirical testing of the meta-model based on the comparative analysis of Al Jazeera and CNN.

The theoretical significance is determined by the validation of classical media research paradigms in the context of the digital media environment and the expansion of the methodological tools of computational media research. The practical significance is due to the development of a set of software tools for automated comparative analysis, relevant for journalistic practice, media education and political analytics.

Research methodology

The methodological framework of the research is based on the concept of computational hermeneutics^[1] ("computational hermeneutics") – interpretation through computation, and synthesizes objectifying NLP procedures with a hermeneutic understanding of semantic structures ^[27]. The meta-model of comparative research is structured in four interrelated dimensions. The ontological dimension postulates that media narratives do not reflect, but constitute, social reality through selection, framing, and attribution of meanings to events. The epistemological dimension is based on methodological pluralism: quantitative NLP methods objectify patterns of the surface structure of the text, qualitative interpretation reveals deep semantic structures, LLM-mediated analysis implements scalable hermeneutics.

The methodological dimension operationalizes three levels of comparison. The first level represents a quantitative comparison of linguistic patterns: the frequency distribution of lexical units, the tonality of discourse through the VADER sentiment analyzer ^[17], the density of named entities through spaCy Named Entity Recognition^[2], thematic clusters through Latent Dirichlet Allocation^[7]. The second level includes semantic comparison of narrative structures: dominant frames according to the R. Entman model ^[14], attribution of agency, causal schemes, modality of statements. The third level represents a pragmatic comparison of discursive effects: legitimizing strategies, constructing an enemy image, and forming affective communities.

The institutional dimension integrates the structural analysis of media systems through the propaganda model of E. Herman and N. Chomsky ^[15]: filters of ownership, advertising, sources, ideology are operationalized as independent variables that determine narrative production. Comparative analysis reveals how different filtering configurations produce divergent narratives for Al Jazeera (government funding, regional geopolitical position) and CNN (corporate ownership, advertising dependence, alignment with American foreign policy).

The research is implemented through a two-component analytical platform that integrates computational linguistics methods, thematic modeling, and transformeroriented semantic comparison. The first module presents a web parsing and NLP processing system in Python using the following libraries: requests for HTTP communication, beautifulsoup4 for DOM structure parsing, newspaper3k for structured content extraction, spaCy for linguistic annotation, VADER ^[17] for sentiment analysis, geopy for geocoding, gensim for thematic modeling using latent Dirichlet placement (LDA). The second module is a React-based web application for comparative AI analysis using the Google Gemini API.

The empirical basis of the study was a two-component corpus of news materials from the "Middle East" sections of the Al Jazeera and CNN news agencies for October 31, 2025^[3]. Case parameters: Al Jazeera – 15 publications, CNN – 20 publications, total volume – 35 news items, total tokenized length – approximately 89,000 tokens. The representativeness of the sample is ensured by the temporal identity of the collection period and the completeness of the coverage of publications of the relevant section for the analyzed period.

The data collection procedure included: HTTP request to target pages of the "Middle East" categories; parsing via BeautifulSoup using specific selectors; filtering relevant URLs by structural patterns; asynchronous processing via ThreadPoolExecutor; structured data extraction via newspaper3k. The linguistic annotation included: Named Entity Recognition via spaCy en_core_web_sm to identify personalities, geopolitical units, and organizations; Sentiment Analysis via VADER to calculate a composite tonality score in the range from -1 to +1 (sentiment score); geocoding via the Nominatim API to convert geographical entities into coordinates; thematic modeling using LDA with parameters: the number of topics K is five, the iteration passes is ten, the Variational Bayes algorithm.

LLM integration was carried out through the Google Gemini API with the configuration: gemini-2.5-pro model, temperature 0.2 to minimize stochasticity, responseMimeType application/json for structured output, responseSchema for validating the response structure. The industrial architecture is structured in four sections: role specification (expert in media analysis of Middle Eastern journalism), task formulation (identification of dominant narratives, comparison of sentiment scores, analysis of keyword divergence), data injection (serialized corpora in JSON format), definition of the output format (JSON Schema specification with the fields overallSummary, alJazeeraAnalysis, cnnAnalysis, comparisons, comparativeInsight).

Validation of the results was carried out through computational triangulation – convergence of conclusions of various methods. If statistical NLP analysis, LDA topic modeling, and LLM interpretation produce congruent conclusions about narrative differences, the validity of the conclusion increases. The divergence of methods indicates the need for an in-depth qualitative interpretation to resolve contradictions.

Results

The analysis of tonality using the VADER sentiment analyzer revealed a systematic divergence of the emotional polarity of texts^[4]. The average tone of Al Jazeera publications shows a high degree of negativity: M = - 0.67, SD is 0.45. The average tonality of CNN is M = - 0.24, SD = 0.78. When compared with the absolute value of the average tonality of Al Jazeera materials (the modulus of which is about 2.8 times higher than 0.24), it can be concluded that CNN discourse turned out to be "~180% less negative", in other words, the negative bias of Al Jazeera is almost 3 times exceeds the negative of CNN. At the same time, it is important to emphasize that we are talking about comparing the modules of average values, and not about a direct percentage reduction in negative tonality. The categorical distribution of tonality demonstrates that Al Jazeera is characterized by monotonous negative discourse: 93.3% of publications are classified as highly negative, which indicates a strategy for constructing an image of humanitarian catastrophe and systemic violence. CNN demonstrates a bipolar distribution: 60% of highly negative materials plus 35% of highly positive publications, which indicates an editorial policy of balancing critical reporting (strikes, casualties) and positive diplomatic narratives (truce, hostage release). The absence of a gray area in both buildings indicates a high degree of polarization of the media discourse.

The most negative Al Jazeera publication received a negative 0.9985 with the headline "Acute trauma: the indelible wounds of Gaza's children from Israel's war" (Acute trauma: The ever-present wounds of Gaza's children from Israel's war), focusing on the clinical description of the psychological trauma of the child population using medical terminology to enhance the emotional impact. The most positive CNN publication received a plus 0.9983 tone with the headline "As Israel celebrates the hostages' return, Trump basks in the glory" (As Israel celebrates the hosts' homecoming, Trump baskets in the spotlight), focusing on diplomatic triumph, emotions of liberation, and the role of the United States as a mediator.

Frequency analysis of named entities through spaCy NER revealed contrasting geopolitical optics of the sources^[5]. Al Jazeera demonstrates the dominance of Israel (70 mentions) as the primary aggressor actor, high representation of RSF (Rapid Support Forces) and El Fasher (35 plus 34 mentions, respectively) how to focus on the Sudanese conflict, the Palestinians (18 mentions) as a collective sacrifice, Al Jazeera self-reference (20 mentions) as a meta-discursive marker of one's own role as an information actor. CNN shows the dominance of Gaza (207 mentions) and Hamas (188 mentions) as a territorial and organizational focus, not an ethnic one, D. Trump (63 mentions) Focusing on the role of the United States as a diplomatic mediator, Iran (64 mentions) as a geopolitical extension of the narrative to regional players, the UN (35 mentions) as an emphasis on international institutions.

A contrasting discursive structure is manifested in chains of actors: Al Jazeera constructs the sequence Israel – US – Palestinians as oppressor – ally – victim, CNN constructs the sequence Gaza – Hamas – Israel as territory – organization –state. This differentiation indicates the different attribution modes of agency and patient in media narratives.

The extraction of keywords by the TF-IDF algorithm revealed semantic patterns of lexical priorities^[6]. Al Jazeera demonstrates the vocabulary of the humanitarian crisis: "prisoners" (5 mentions), "bodies" (5), "killed" (3) as the dominance of lexical units of death and imprisonment; "cease-fire" (8), "truce" (truce) (6) as a focus on ending violence as an imperative. CNN demonstrates the vocabulary of the diplomatic process: "hostages" (10 mentions) quantitatively exceeds "prisoners" (prisoners) as a terminological framework legitimizing Israeli operations through the semantics of "hostages" (hosts), implying the legal validity of the release; "deal" (3), Trump (trump) (3) as the commercialization of diplomacy, the personification of the process; "war" (7) as an explicit designation of conflict by military action.

The terminological divergence of "prisoners" versus "hostages" represents a semantic struggle for legitimacy: prisoners depersonalizes and criminalizes the status of detainees, implying the illegality of detention; in turn, hosts legally justifies the Israeli position through the framework of anti-terrorism, constructing a legal reality where military operations are legitimate as the release of prisoners.

Thematic modeling using LDA with parameters K equal to five, passes equal to ten^[7] revealed structural differences in latent topics. Al Jazeera constructs a multilocation narrative: "The humanitarian crisis in Gaza" as a dominant topic (probability 0.030 for Israel, 0.023 for nuclear), "Conflict in Sudan/Darfur" (Sudan Darfur conflict) as a parallelism of humanitarian disasters, "Lebanon/Hezbollah dynamics" (Lebanon Hezbollah dynamics) as a regional extension, "Middle East geopolitics" as contextualization, "Journalist targeting" as a meta-discourse of repression against the media. CNN demonstrates a monofocus structure: all five topics are centered on the Israel–Hamas–Gaza triad with varying accents – "Hostage negotiations", "Gaza military operations", "Trump diplomacy", "the implementation of the cease-fire regime" (Ceasefire implementation), "Regional security threats" (Regional security threats).

Geographical analysis through Nominatim geocoding revealed spatial discourse^[8]. Al Jazeera demonstrates a dispersed geography: "Sudan" (Sudan) (3 mentions), "Darfur" (Darfur) (2), "Khartoum" (Khartoum) (2), "North Darfor" (North Darfor) (2) – 9 mentions of Sudanese locations, which indicates a strategy of conflict parallelism – "Gaza" and "Darfur" as equivalent humanitarian crisis zones. CNN demonstrates concentrated geography: "Gaza" (15 mentions), "Gaza City" (10), "Gaza Strip" (5) – 30 mentions of Gaza-territories, detailing Israel through "Tel Aviv" (6 mentions) and "Jerusalem Jerusalem (5) as a representation of a differentiated space, diplomatic hubs through Egypt (9 mentions) and Qatar (4) as an emphasis on intermediary states.

The integration of AI analytics through the Google Gemini API with a temperature of 0.2 revealed implicit narrative structures^[9]. Constructing agency: Al Jazeera attributes active agency to Israel as a subject of violence, the Palestinians are represented as objects of suffering; CNN distributes agency among Hamas as the initiator, Israel as a reacting actor, D. Trump as a mediator. The temporality of the narrative: Al Jazeera constructs the continuum of trauma through the discursive markers "omnipresent wounds", "ongoing" (ever-present wounds, ongoing), the historical context of colonialism; CNN creates the discreteness of events through the markers "truce enters the second week", "after the agreement" (ceasefire enters second week, following agreement), the procedural nature of negotiations. Legitimization strategies: Al Jazeera legitimizes resistance through the dehumanization of the occupier through the terms "no mercy", "mass killings"; CNN legitimizes Israeli actions through the legal framework "after violating the ceasefire", "staging hostage discovery".

Discussion of the results

The results obtained demonstrate the heuristic validity of the developed meta-model of the comparative study of media narratives. The convergence of the conclusions of three methodological paradigms – statistical NLP analysis, thematic modeling, and LLM interpretation - confirms the concept of computational triangulation as a validation mechanism in computational media research. The quantitative metrics of tonality correlate with the qualitative findings of the Gemini analysis.: A statistically significant differentiation of the average tonality scores (minus 0.67 versus minus 0.24, t-test p less than 0.001) corresponds to the LLM identification of humanitarian disaster construction versus diplomatic management.

The results confirm the theory of framing in the context of the digital media environment. Al Jazeera implements a framework of systemic violence: diagnosing the problem as "genocide of the Palestinian people," attributing causality to "Israel with the support of the United States," normative assessment as "morally unacceptable," prescribing solutions as "ending the occupation." CNN implements a framework of geopolitical management: diagnosing the problem as a "protracted military conflict," attributing causality to "Hamas initiator – Israel reacting actor," normative assessment as "tragic but requires an understanding of security concerns," prescribing solutions as "a diplomatic settlement mediated by the United States." This divergence validates the thesis that frames are determined not only by what they include, but also by what they omit, registering the "imprint of power" in news texts.

The propaganda model (Herman and Chomsky) demonstrates sustained relevance for explaining systematic patterns of media production. CNN has classic corporate filters: Warner Bros. property Discovery creates shareholder pressure in the context of profitability, which intensifies the focus on "advertiser-friendly content"; advertising dependence creates economic pressure on content that attracts a wealthy audience; dependence on official sources is manifested in the predominance of quoting Israeli government agencies, the Israel Defense Forces (IDF), the government, and American diplomats The ideology of fear is realized through the "terrorist framing of Hamas."

Modified filtering dynamics operate for Al Jazeera: government funding by the government of Qatar eliminates advertising dependence but creates alignment with the geopolitical interests of a regional power; sources are mainly Palestinian witnesses, UN reports, human rights organizations; ideology is constructed through a framework of anti-imperialism and support for national liberation movements. The systematic action of these filters produces different information realities for audiences, which confirms the model's thesis about the media as ideological apparatuses of influence groups (states).

The theory of setting the second-level agenda ^[20] is validated through differential attribution of characteristics to actors. Al Jazeera attributes to Israel the characteristics "aggressive", "occupier", "intruder" (perpetrator) through a high-frequency combination with the lexemes "merciless" (no mercy), "mass killings" (mass murders); Palestinians attribute the characteristics: "victims" (victims), "traumatized", "suffering" through collocations with "acute trauma", "indelible wounds". CNN attributes to Hamas the characteristics of "violator", "militant", "hostage-taker" through the collocations "violating ceasfire", "staging discovery"; Israel attributes the characteristics of "reactionary subject" (respondent), "security-concerned" (security-concerned), "negotiator" (negotiator); D. Trump (Trump) attributes the characteristics of "dealmaker" (dealmaker), "mediator" (mediator), "seeking attention, seeking publicity" (spotlight-seeker). This attributive differentiation not only determines what the audience thinks about, but also how to think about these actors.

The methodological innovation of the research is the operationalization of the concept of computational hermeneutics through a hybrid workflow. Automated NLP procedures ensure the scalability of large corpus analysis while objectifying quantitative patterns, which is unattainable by traditional qualitative methods due to resource constraints. LLM integration provides semantic comparison, identifying articles on identical events even with different terminology "prisoners returned vs hostages released", and framing analysis that recognizes the rhetorical strategies of "victim framing" and "legitimization". Human interpretation provides theoretical validity and critical reflection on the limitations of methods. Triangulation of the three approaches minimizes the weaknesses of each: NLP provides objective metrics but does not interpret semantics, LLM provides depth but can hallucinate, a person introduces theory but is subjective.

The limitations of the study include the small volume of the corpus (35 articles), which requires validation on large datasets for statistical power of conclusions; the one-time slice on October 31, 2025, which does not take into account the diachronic evolution of the narrative; the English-language focus, which does not analyze Arabic-language publications of Al Jazeera Arabic, which may demonstrate a different narrative; the lack of multimodal analysis of visual content (photographs, video), which limits the completeness of understanding media structures; limitations of VADER for detecting irony and sarcasm, which requires the use of BERT-based sentiment models in future research.

Comparison with the results of other researchers demonstrates the convergence of conclusions. M. Elmasry found a 60-fold difference in framing Israeli actions as self-defense versus Palestinian as aggression in Instagram^[10]-Western media posts, which corresponds to our conclusions about the terminological struggle of prisoners versus hosts ^[13]. K. El Damanhouri and F. Saleh in their study established that Al Jazeera America quoted only Palestinian citizens and always distinguished between militants and civilians when covering Palestinian casualties. At the same time, about 15% of CNN articles did not specify whether the deceased was a militant or a civilian ^[11], which confirms our conclusion about the humanitarization of victims in the Al Jazeera discourse. Hossain et al. In the computer text analysis of Facebook^[11] posts, differential framing based on organizational filters was identified, which validates the applicability of the propaganda model to modern digital media ^[16].

The theoretical implications of the research include the expansion of the conceptual apparatus of computational media research through the introduction of the term computational hermeneutics as a methodological paradigm synthesizing quantitative and qualitative approaches; the operationalization of classical theories of media analysis (framing, agenda, propaganda model) through NLP metrics and LLM interpretation, which provides empirical testability of theoretical postulates; the development of a meta-model of comparative research of media narratives as a theoreticala methodological framework for reproducible analysis.

Practical implications include the development of software tools for automated collection, processing and comparative analysis of news corpuses, which can be adapted for the study of other media conflicts; the relevance of the results for journalistic practice through increased reflexivity in relation to their own framing strategies and structural limitations of media production; applicability to media education through the development of critical media literacy of audiences aware of the media construction of reality; relevance for diplomatic intelligence through the understanding of media frames as indicators of the geopolitical positions of states.

Conclusion

In this study, a meta-model for comparative analysis of media narratives was developed and empirically validated using computational hermeneutics that integrates Natural Language Processing, thematic modeling, and LLM-mediated semantic comparison. The key results demonstrate a systematic divergence of the media narratives of Al Jazeera and CNN: tonal differentiation is 180% (minus 0.67 versus minus 0.24), the terminological struggle of prisoners versus hosts manifests competing frames of legitimacy, attribution of agency constructs different causal schemes (passive victim versus active aggressor in Al Jazeera, the mutual agency of Hamas – Israel –Trump in CNN), geographical ideology is implemented through the dispersive multilocation of Al Jazeera (Sudan, Lebanon) versus the concentrated Gas-centric structure of CNN.

The developed meta-model provides a theoretical and methodological framework for reproducible analysis of media structures of reality through four dimensions: ontological (media narratives constitute reality), epistemological (methodological pluralism), methodological (three levels of comparison of linguistic patterns, narrative structures, discursive effects), institutional (filters of the propaganda model). Operationalization is carried out through a two-component analytical platform that automates web parsing, NLP annotation, LDA topic modeling, and LLM comparison.

The methodological innovation consists in the introduction of the concept of computational hermeneutics as a synthesis of objectifying NLP procedures with a hermeneutic understanding of semantic structures, implemented through computational triangulation – the convergence of conclusions from statistical analysis, thematic modeling and AI interpretation. This approach overcomes the limitations of traditional methods: automation provides scalability, LLMs provide depth of semantic analysis, and human interpretation provides theoretical validity.

The theoretical significance is determined by the validation of classical media research paradigms (Entman framing, McCombs second-level agenda-setting, Herman-Chomsky propaganda model) in the context of the digital media environment, which confirms the stable explanatory power of these theories. The structural filters of media systems continue to determine information production despite technological democratization: algorithms and platform ownership create new filtering mechanisms in addition to the classic corporate and government ones.

Prospects for further research include expanding the corpus to 500 plus articles over a three-month period for statistical power of conclusions; longitudinal design with weekly sampling to track the diachronic evolution of narrative; integration of multimodal analysis of visual content through CLIP (Contrastive Language-Image Pre-training); application of BERT-based sentiment models to overcome the limitations of VADER in detecting irony; cross-national expansion on BBC, Reuters, RT, Al Arabiya to identify global media framing patterns; integration of multilingual NLP models (XLM-RoBERTa) to analyze Arabic-language publications.

The final thesis of the study is that in the context of the global mediatization of conflicts, information warfare is becoming no less significant than military warfare. Al Jazeera and CNN don't just report on conflict – they constitute different interpretations of conflict through divergent narrative practices, forming incompatible epistemological realities for their audiences. Computational hermeneutics as a methodological paradigm provides a scalable study of these media structures while maintaining theoretical validity and interpretative depth, which expands the toolkit of contemporary computational media studies.

^[1] Computational hermeneutics is an iterative method of text analysis that combines computing techniques and hermeneutic understanding to identify hidden semantic structures and interpret discourse through the correlation of parts and the whole. See ^{[21, 19, 23]}

^[2] spaCy-NER and named entity density are used at the NLP toolkit level, see ^[26].

^[3] The choice of a one-day period (31.10.2025) is due to the need to fix a coherent media context that excludes the cumulative effects of the news cycle and calendar distortions, as well as the representativeness of the point of high geopolitical saturation. The analysis of the Middle East section was chosen due to its discursive density, conflict sensitivity, and global significance, which makes it possible to identify the structural mechanisms of framing, semantic divergence, and narrative models in a synchronous media stream. This design conforms to the procedures of event-based sampling, micro-temporal discourse analysis, and computational hermeneutics, ensuring experimental and methodological purity and comparability of discursive patterns. See ^{[22, 18]}

^[4] VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-oriented algorithm for tonality analysis optimized for media and news texts. He assigns values of positive, negative, and neutral valence to each utterance, as well as an integral composite index. This allows us to identify the direction and intensity of emotional assessments in the corpus and compare them between sources.

^[5] Named Entity Recognition (NER) in the spaCy library automatically identifies and classifies entities in the text (for example, states, cities, organizations, persons). Frequency counting of detected entities makes it possible to capture thematic focuses and map the geopolitical accents of sources, correlating the intensity of mentions with the narrative orientation of the corpus.

^[6] The TF-IDF (Term Frequency – Inverse Document Frequency) algorithm calculates the significance of words in a corpus, increasing the weight of lexemes specific to a particular document and decreasing the weight of lexemes often found in other texts. Thus, the key terms reflecting the local thematic accents and semantic priorities of the discourse are highlighted.

^[7] The Latent Dirichlet Allocation (LDA) algorithm, a probabilistic model that distributes words and documents by latent topics, was used to extract hidden thematic structures. The parameter K = 5 fixes the number of allocated topics, ensuring the interpretability of a limited volume corpus; the parameter passes = 10 means a tenfold complete passage through the data to stabilize the distributions and improve the convergence of the model.

^[8] Nominatim is a geocoder that uses OpenStreetMap data to convert geographical mentions to coordinates and vice versa. Automatic comparison of toponyms with spatial metadata allows you to reconstruct the geographical configuration of events and visualize the territorial focus of the media discourse.

^[9] The Google Gemini model is used in the controlled generation mode with the parameter temperature = 0.2, which minimizes the stochasticity of the output and enhances the determinacy of interpretations. The low temperature provides a more stable identification of hidden semantic connections and stable narrative patterns, allowing you to capture the implicit structures of discourse while maintaining strict semantic discipline of response.

Instagram Facebook ^[10] The American multinational holding company Meta Platforms inc (Facebook and Instagram social networks) has been recognized as extremist and its activities are prohibited in the Russian Federation.

Instagram Facebook and Instagram ^[11] The American multinational holding company Meta Platforms inc (Facebook and Instagram social networks) has been recognized as extremist and its activities are banned in the Russian Federation.

The article is published in its final version as approved following the last positive peer review recommending acceptance for publication. It incorporates revisions made by the author in response to prior negative peer review reports that did not recommend publication. All peer review reports, including initial negative reviews, are published in open access alongside the article. All versions of the author’s revisions are archived in the publisher’s repository and may be made available upon reasonable request in accordance with Elsevier’s editorial policies and applicable data availability requirements.
Read all reviews on this article

References

1. Alekyan, M. V., & Erisyanz, D. E. (2024). The application of the automated sentiment analysis method for studying the specifics of Telegram channel content. Systemic Transformations of Journalism, 5-6.
2. Aslanov, I. A. (2020). Metaphorical framing in media texts and communication about depression: Results of content analysis and experiment. Bulletin of Moscow University. Series 10: Journalism, 6, 3-22. https://doi.org/10.30547/vestnik.journ.6.2020.322
3. Dunas, D. V., Salikhova, E. A., Tolokonnikova, A. V., & Babyna, D. A. (2022). Establishing the agenda and the framing effect: On the necessity of conceptual unity in media research of "digital youth." Bulletin of Moscow University. Series 10: Journalism, 4, 47-78. https://doi.org/10.30547/vestnik.journ.4.2022.4778
4. Dunas, D. V., Tolokonnikova, A. V., Babyna, D. A., Boiko, O. A., & Sidorov, E. A. (2025). The agenda in social media and federal news agencies: The thematic gap and semantic unity (on the example of "digital youth" in Russia). Journal of Siberian Federal University. Humanities, 18(1), 178-193.
5. Kazakov, A. A. (2015). Agenda-setting theory vs. framing: On the correlation of approaches. Politia: Analysis. Chronicle. Forecast, 1(76), 103-113.
6. Sokolov, A. V. (2024). Contextualization of the pan-Arab narrative "Al-Jazeera": Thematic classification, framing, and sentiment analysis of news content. Journalist. Social Communications, 4(56), 97-110.
7. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
8. Bojić, L., Zagovora, O., Zelenkauskaite, A., et al. (2025). Comparing large language models and human annotators in latent content analysis of sentiment, political leaning, emotional intensity, and sarcasm. Scientific Reports, 15, Article 11477. https://doi.org/10.1038/s41598-025-96508-3
9. Burggraaff, C., & Trilling, D. (2020). Through a different gate: An automated content analysis of how online news and print news differ. Journalism, 21(1), 112-129. https://doi.org/10.1177/1464884917716699
10. Chew, R., Bollenbacher, J., Wenger, M., Speer, J., & Kim, A. (2023). LLM-assisted content analysis: Using large language models to support deductive coding. arXiv. https://doi.org/10.48550/arXiv.2306.14924
11. Damanhoury, K. E., & Saleh, F. (2017). Is it the same fight? Comparative analysis of CNN and Al Jazeera America's online coverage of the 2014 Gaza War. Journal of Arab & Muslim Media Research, 10(1), 85-103. https://doi.org/10.1386/jammr.10.1.85_1
12. Dunivin, Z. O. (2025). A computational qualitative approach to large-scale characterization of cultural identities on social media. SocArXiv. https://doi.org/10.31235/osf.io/5wvhx_v1
13. Elmasry, M. H. (2024). Images of the Israel-Gaza war on Instagram: A content analysis of Western broadcast news posts. Journalism & Mass Communication Quarterly, 102(3), 695-721. https://doi.org/10.1177/10776990241287155
14. Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51-58. https://doi.org/10.1111/j.1460-2466.1993.tb01304.x
15. Herman, E. S., & Chomsky, N. (2002). Manufacturing consent: The political economy of the mass media. Pantheon Books.
16. Hossain, A., Abdul Wahab, J., & Khan, M. S. R. A. (2022). A computer-based text analysis of Al Jazeera, BBC, and CNN news shares on Facebook: Framing analysis on COVID-19 issues. SAGE Open, 12(1). https://doi.org/10.1177/21582440211068497
17. Hutto, C. J., & Gilbert, E. (2014). VADER: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the International AAAI Conference on Weblogs and Social Media, 8(1), 216-225. https://doi.org/10.1609/icwsm.v8i1.14550
18. Kim, H., Jang, S. M., Kim, S.-H., & Wan, A. (2018). Evaluating sampling methods for content analysis of Twitter data. Social Media + Society, 4(2). https://doi.org/10.1177/2056305118772836
19. Kommers, S., et al. (2025). Computational hermeneutics: Evaluating generative AI as a cultural technology. SSRN. https://doi.org/10.2139/ssrn.5409144
20. McCombs, M., Llamas, J. P., Lopez-Escobar, E., & Rey, F. (1997). Candidate images in Spanish elections: Second-level agenda-setting effects. Journalism & Mass Communication Quarterly, 74, 703-717. https://doi.org/10.1177/107769909707400404
21. Mohr, J. W., Wagner-Pacifici, R., & Breiger, R. L. (2015). Toward a computational hermeneutics. Big Data & Society, 2(2). https://doi.org/10.1177/2053951715613809
22. Pavelko, R. L., & Grabe, M. E. (2017). Sampling, content analysis. In The International Encyclopedia of Communication Research Methods (pp. 1-10). Wiley-Blackwell. https://doi.org/10.1002/9781118901731.iecrm0223
23. Picca, D., Schnyder, A., & Romele, A. (2024). Computational hermeneutics of emotion: A comparative study of emotional landscapes in Dostoevsky's Crime and Punishment. Humanities and Social Sciences Communications, 11, Article 1428. https://doi.org/10.1057/s41599-024-03955-w
24. Schroeder, H., Aubin Le Quéré, M., Randazzo, C., Mimno, D., & Schoenebeck, S. (2025). Large language models in qualitative research: Uses, tensions, and intentions. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-15.
25. Seib, P. (2008). The Al Jazeera effect: How the new global media are reshaping world politics. Potomac Books.
26. Taher, H. A., Alabid, N., & Hasan, B. M. (2025). Integration named entity recognition and latent Dirichlet allocation to enhance topic modeling. Annals of Emerging Technologies in Computing, 9(2), 20-30. https://doi.org/10.33166/AETiC.2025.02.002
27. van Atteveldt, W., Welbers, K., & Van der Velden, M. A. C. G. (2019). Studying political decision making with automatic text analysis. In Oxford Research Encyclopedia of Politics (pp. 1-11). Oxford University Press. https://doi.org/10.1093/acrefore/9780190228637.013.957

First Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.
The list of publisher reviewers can be found here.

The subject of the research is the modern global media sphere, which demonstrates the intensification of information confrontation between various media organizations, which arises as a result of the construction of competing narrative realities, the analysis of which is devoted to this article. The emphasis on the communicative design of media spaces makes us think about effective lexical methods of influence in the information environment. The author focuses on two main media giants: Al Jazeera and CNN, they are analyzed by the author not only as global media brands, but also information platforms that broadcast diametrically opposed geopolitical positions, the author focused on the main concepts that form the data of the media space. The relevance of the research is undeniable, since modern narratives of society, politics and culture are mostly formed in the modern world by the main media standards, therefore it is necessary to understand the patterns and features of the formation of the communicative field of the impact of these information platforms on a single person. The novelty of the research is unjustified, since in the world of global computerization, public opinion is mainly formed by lexical paradigms that prevail in various media giants. In our opinion, the author successfully chose the platforms for analysis, since they broadcast opposing narratives, but at the same time at the same level of popularity among users, this allows us to trace the communicative hooks that attract users and prepare the necessary recommendations in the future for the effective design of such media spaces in order to influence the masses. From this point of view, the study is quite interesting for philologists, sociologists, political scientists and, in general, for a wide range of readers. The style of the article corresponds to the scientific one, the author states the thesis, illustrates it with examples from the information platforms of the main media giants he analyzes, Al Jazeera and CNN, and draws logical conclusions. The structure of the article meets the requirements of the journal and has an introduction, methods, main part and conclusion. The introduction is clearly structured and maintains relevance, novelty, main goals and objectives. The content has the logic of scientific presentation. The bibliography of the article contains sources from the last 10 years, which confirms the modern approach to the issue under study, and the author of the study refers to fundamental classical studies, which indicates the study of this issue in a retrospective aspect, in addition, by involving worldwide media space material, which shows the study of the issue comprehensively. The conclusions of the article contain in a focused form the intermediate conclusions of the entire article and may be of interest to a wide audience: philologists, linguists, literary critics, teachers and students, sociologists, political scientists, philosophers. Therefore, the article is interdisciplinary in nature and is located at the junction of several sciences, therefore it may be of interest to a wide audience of readers. The article may be recommended for publication.

Second Peer Review

The reviewed article is devoted to the development and empirical testing of a computational meta-model for comparative analysis of media narratives based on the coverage of the Middle East conflict by global media resources CNN and Al Jazeera. The subject of the research is formulated clearly and consistently: the author considers the media narrative as a form of constructing social reality and analyzes how the institutional, ideological and technological parameters of media systems are reflected in the linguistic and discursive structures of news texts. The choice of the research object is justified both from the point of view of the political significance of the conflict and from the point of view of the methodological productivity of comparing two media actors representing different geopolitical and cultural positions. The research methodology is one of the strongest aspects of the work. The author builds a complex but logically coherent framework based on a synthesis of critical media theory, computational social science, and a hermeneutic approach. Using the concept of computational hermeneutics makes it possible to combine quantitative methods of processing large text corpora (NLP, thematic modeling, tonality analysis, NER) with qualitative interpretation of narrative structures. The detailed data collection procedure, the parameters of the corpus, the libraries and algorithms used, as well as the principles of validation of results through computational triangulation ensure reproducibility of the study and compliance with modern standards of computational media studies. Especially noteworthy is the careful integration of LLM models as an analytical tool rather than an autonomous interpreter, which reduces the risk of methodological distorting effects. The relevance of the article is beyond doubt. In the context of the global mediatization of conflicts and the growing role of information warfare, the need for scalable and theoretically sound methods of media discourse analysis becomes obvious. The author convincingly shows the limitations of traditional qualitative methods when working with large corpora and demonstrates the potential of computational approaches to operationalize classical framing theories, agenda setting, and propaganda models. The emphasis on the lack of such empirical research in the Russian-speaking academic tradition is particularly significant, which makes the work methodologically and institutionally in demand. The scientific novelty of the research lies in the development of a meta—model of comparative media analysis that integrates several levels of analysis — linguistic, semantic, pragmatic and institutional - into a single analytical architecture. What is new is not only the synthesis of the methods themselves, but also the way they are operationalized: quantitative metrics of tonality, frequency, and thematic distribution correlate with qualitative interpretations of frames, agency, and legitimization strategies. The empirical results obtained — differences in tonal structure, terminological choice ("prisoners" vs. "hosts"), geographical focus, and agency attribution — convincingly demonstrate the heuristic potential of the proposed model. The style and structure of the article are characterized by a high degree of scientific reflexivity and conceptual density. The text is logically organized, consistently moving from a theoretical justification to a description of the methodology, then to the results and their interpretation. At the same time, the article is full of specialized terminology and complex syntactic constructions, which requires a high degree of preparation from the reader. In a number of fragments, there is an overload of technical implementation details (API description, model parameters), which somewhat shifts the balance towards the methodological report; some of this information could be included in the application without prejudice to the main argumentation core. The bibliography of the article is representative and relevant. The list includes both classical works on media theory (Entman, Herman, Chomsky, McCombs), as well as modern research in the field of computational methods, LLM-assisted content analysis and NLP. The presence of both Russian-speaking and English-speaking sources indicates a good command of the international scientific context and correct theoretical navigation. An appeal to opponents is realized through a comparison of different approaches to media analysis and a demonstration of their limitations. The author does not engage in direct controversy, but consistently shows that traditional methods do not provide scalability and reproducibility, while purely computational approaches need hermeneutical interpretation. This position looks balanced and methodologically mature. The conclusions of the study follow logically from the presented analysis and have both theoretical and applied significance. The work will be of interest to media discourse researchers, specialists in political communication, digital journalism, as well as graduate students and undergraduates in the humanities who master computational methods of text analysis. The presented article is a methodologically strong, conceptually rich and relevant research. The comments made are purely advisory in nature and do not affect the overall positive assessment. We believe that the article can be recommended for publication in a scientific journal.

Journals

Books

Computational Methods in Comparative Media Analysis: Operationalization of Conflict Narrative Research Using the Example of CNN and Al Jazeera