Litera
Reference:

Forms of submission of materials in data journalism

Baranova Ekaterina Andreevna

ORCID: 0000-0003-1794-9936

Doctor of Philology

Professor, Department of Communication Management and Relationship Management, RSSU

4 Wilhelm Pieck St., Moscow, 129226, Russia

kat-journ@yandex.ru
Shnaider Anna Aleksandrovna

ORCID: 0000-0002-4836-073X

Doctor of Philology

Professor, Department of Management and Hospitality, Baltic International Academy

4 Lomonosova iela, Riga, 1003, Latvia

anshnaider@gmail.com

DOI:

10.25136/2409-8698.2022.3.37556

Received:

16-02-2022


Published:

17-03-2022


Abstract: The term "data journalism" appeared in 2005, but the scientific community has not yet agreed on a single definition, and researchers interpret it differently. This is largely because they disagree about the essence of the phenomenon: whether it is a new direction in the development of journalism or a new genre or format for presenting information. In preparing this article, E.A. Baranova conducted expert interviews with Konstantin Poleskov, editor of the website novayagazeta.ru; Roman Anin, editor of the investigations department of Novaya Gazeta; Tina Berezhnaya, Advisor to the General Director for Information Technology of the TV channel Russia Today; and Alexey Smagin, a graduate of the Department of Data Journalism of the Higher School of Economics and a practicing specialist at Novaya Gazeta. The experts were asked about the prerequisites for the emergence of data journalism, the forms in which data journalism materials are presented, the new competencies media workers need, and the ethical problems that may accompany the development of data journalism. The subject of the research is data journalism as a new direction in the development of journalism. The object of the research is the forms of submission of materials in data journalism. The authors studied data materials published on the websites of Russian and foreign media and on the Yandex platform. The article identifies five forms of content submission in data journalism: analytical article, picture, cards, longread, and interactive multimedia project. The authors conclude that data journalism today gives new development to traditional journalistic genres: the analytical article, the investigation, and news genres. The development of data journalism also raises ethical issues.
There are currently almost no data journalism specialists in Russia; developing this direction requires reforming educational programs in journalism.


Keywords:

Data journalism, Data material, Data department, Big data, Forms of presentation of material, Computer journalism, Data journalist, Longread, Interactive research

This article is automatically translated.

 

Introduction

In the 1950s, the concept of "computer-assisted reporting" (CAR) appeared in the USA [Houston B. Fifty Years of Journalism and Data: a Brief History. Global Investigative Journalism Network. Available at: https://gijn.org/2015/11/12/fifty-years-of-journalism-and-data-a-brief-history/ (accessed: 15.02.2022)]. American reporters began to use automated systems to process large amounts of information. For example, in 1952 such a system helped the CBS television company process the results of the presidential election [ibid.].

The term "data journalism" appeared fifty years later. It is believed to have been introduced by Adrian Holovaty, a journalist at WashingtonPost.com [1]. In 2005, he launched the project chicagocrimes.org, for which he wrote a program that analyzed information published on the official website of the Chicago Police Department. The program generated reports on all crimes committed in the city during the week, which helped fill the site with news. Holovaty said that journalism combined with computer technology makes it possible to collect information that is "growing exponentially" today [2, p. 33], pass it through a filter (the editorial office), and present it in a certain way. In other words, the tasks are the same as in traditional journalism; only the methods used to accomplish them differ.
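The kind of program described above can be sketched in a few lines of Python. This is a minimal illustration and not Holovaty's actual code: the record fields (`date`, `category`) and the sample rows are hypothetical, and a real project would first have to fetch and parse the police department's published data.

```python
from collections import Counter
from datetime import date

# Hypothetical sample of incident records (assumption: fields "date" and "category").
incidents = [
    {"date": date(2005, 5, 2), "category": "theft"},
    {"date": date(2005, 5, 3), "category": "burglary"},
    {"date": date(2005, 5, 4), "category": "theft"},
    {"date": date(2005, 5, 10), "category": "theft"},
]

def weekly_report(records):
    """Count incidents per (ISO year, ISO week) and per category."""
    report = {}
    for rec in records:
        iso = rec["date"].isocalendar()
        week = (iso[0], iso[1])
        report.setdefault(week, Counter())[rec["category"]] += 1
    return report

report = weekly_report(incidents)
for week, counts in sorted(report.items()):
    print(week, dict(counts))
```

Grouping by ISO week keeps the report aligned with calendar weeks; an editor would still decide which of the counted incidents become news.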

For data journalism to develop, automated programs that process and analyze large amounts of information are not enough; the data must also be open. In this regard, an important date in the history of data journalism in Russia was 1999, when the official website of Rosstat (Federal State Statistics Service) appeared. Various studies based on its data then began to appear. Today Rosstat publishes data on many spheres of life, which opens up great potential for full-fledged research.

In 2010, in accordance with Decree of the Government of the Russian Federation No. 367 of May 26, 2010, the EMISS database appeared [3, p. 212]. In 2013, Federal Law No. 112-FZ "On Amendments to the Federal Law 'On Information, Information Technologies and Information Protection' and the Federal Law 'On Ensuring Access to Information on the Activities of State Bodies and Local Self-Government Bodies'" was adopted [Federal Law No. 112-FZ of June 7, 2013. URL: http://www.consultant.ru/document/cons_doc_LAW_147222/ (accessed: 26.01.2022)]. All state bodies were obliged to publish reports on their activities. These changes gave an impetus to the development of data journalism in Russia.

In the West, data journalism began to develop much earlier. For example, The Guardian's Datablog, which many experts [4, 5, 6] consider an illustrative example of European data journalism, was created in 2009. While the first data departments in foreign media appeared in the 2000s, the first data department in a Russian newsroom (Novaya Gazeta) opened only in 2018. Its journalists use machine methods of big data analysis to identify trends. For example, in 2018 the data department examined 60,000 sentences handed down by Russian courts in extremism-related cases. The journalists saw a clear trend: the courts struck out the features qualifying the offences as extremist because, in the courts' view, they were not reliable [7, p. 42].

Data journalism: definition of the concept

Although data journalism has existed for more than a decade, the scientific community has not yet agreed on a single definition of the term, and researchers interpret it differently. This is largely because scholars consider data journalism from different angles: as a direction of the media industry closely related to the development of computer and computational journalism [8]; from the perspective of the epistemology of data journalism (the influence of big data on modern journalism) [9]; and from the perspective of journalism education (new requirements for training journalists in programming, statistics, and data visualization) [10, 11].

British data journalist Paul Bradshaw notes in the "Handbook of Data Journalism" that this direction is characterized by the new opportunities that arise when the traditional "nose for news" and the skill of finding information are combined with the ability to present to the reader an engaging story based on a huge amount of diverse numerical information that has become publicly available [Bradshaw, Paul. (2013). Ethics in data journalism: accuracy. Online journalism blog. Retrieved from: https://onlinejournalismblog.com/2013/09/13/ethics-indata-journalism-accuracy/ (accessed: 15.02.2022)].

The manual created by the Al Jazeera Media Institute states that data journalism is the process of extracting meaning from data in order to write an article, and that the journalist necessarily uses methods of data analysis and interpretation [An Introductory GuideBook. Data Journalism. Al Jazeera Media Institute. Retrieved from: https://institute.aljazeera.net/sites/default/files/2019/Data%20Journalism%20En%20-%20Web.pdf (accessed: 15.02.2022)]. Greek data journalism researchers Andreas Veglis and Charalampos Bratsas define data journalism as the process of extracting useful information from data, writing articles based on that information, and adding visualizations (in some cases interactive) that help users understand the significance of a story. The distinguishing feature of data journalism is the visualization in the article, which helps present complex information in an accessible form [12].

M.N. Sherstyukova, writing in the journal "Media. Information. Communication", describes the working principles of data journalists. She points out that their main source of information is databases of all kinds, whose contents can be presented as maps, graphs, summaries, tables, and lists [13].

S.I. Simakova, in the article "Data journalism as a media trend", describes this phenomenon as a tool for presenting important information that might otherwise go unnoticed by the public, as well as an effective way of critically examining certain issues [14, p. 483].

M.G. Shilina, in turn, gives a broader definition of data journalism, describing it as the process of creating convergent content based on arrays of computer and Internet data. She adds that data journalism can and should be understood even more broadly: as "a set of specific skills for searching, analyzing, visualizing information from digital metadata sources for the formation of interactive formats for the unique presentation of author's analytical content and effective interaction with the audience; this is the format of current journalism, the format of media text (media content), the method of its creation, broadcast, consumption, which can be used as a meta-method and meta-basis for other genres" [15]. D.V. Nerents also considers data journalism in the context of the transformation of the genre forms of modern journalism [16].

The above definitions allow us to conclude that foreign and Russian researchers understand data journalism in similar ways. However, they differ on the essence of the phenomenon: whether it is a new direction in the development of journalism or a new genre or format for presenting information.

Both foreign and Russian authors are convinced that the development of data journalism is associated with new journalistic competencies. A data journalist needs programming skills (knowledge of Python) and the ability to work with tools for data extraction (Tabula, DocumentCloud), data cleaning and analysis (Google Sheets, OpenRefine), and data visualization (Datawrapper, Infogram, Flourish).
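To give a sense of what the cleaning step involves, here is a minimal Python sketch that uses only the standard library in place of a tool such as OpenRefine. The column names (`region`, `year`, `budget_mln`) and the sample rows are hypothetical and serve only to show typical problems: stray whitespace, inconsistent capitalization, a decimal comma, and a missing value.

```python
import csv
import io

# Hypothetical raw export (assumption: these columns and values are invented).
RAW = """region,year,budget_mln
 Moscow ,2019,"1 250,5"
moscow,2020,1300
Riga,2019,
"""

def clean_rows(text):
    """Normalize region names and parse numbers; keep missing values as None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Remove thousands spaces and switch the decimal comma to a point.
        raw_budget = row["budget_mln"].replace(" ", "").replace(",", ".")
        rows.append({
            "region": row["region"].strip().title(),
            "year": int(row["year"]),
            "budget_mln": float(raw_budget) if raw_budget else None,
        })
    return rows

cleaned = clean_rows(RAW)
for row in cleaned:
    print(row)
```

Keeping the missing value as `None` rather than replacing it with zero is a deliberate choice here: it leaves the gap visible for the journalist to investigate.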

Journalists used data in their materials before, but technological progress has created new opportunities for integrating data into research. The technological revolution of the 21st century has allowed the media to open new horizons and see other paths of development.

Variety of forms of submission of data materials

Data journalism uses several types of data to present information: numbers, text, geolocation, date and time, and multimedia. M.G. Shilina defines multimedia as "a complex convergent representation of information in any digital format (conditionally verbal, conditionally visual, conditionally auditory, etc.), which allows you to create a unique type of content applicable on any media communication channel" [15].

The main task of a journalist is to structure the data correctly. Traditional news journalists who do not have programming skills can also cope with this work. Structuring data makes it possible to conduct unusual research, identify patterns, and present information to the audience in a new way.

The form of content presentation is becoming an increasingly important component of modern journalism. As researchers emphasize, "new technologies shift the focus of attention from the content of journalistic publications to the form of its production" [17, p. 410]. In this regard, it seems relevant to study data journalism from the perspective of the most common forms in which data materials are presented. The authors conducted a content analysis of materials published on the websites of Russian and foreign media, in the catalog of the Open Data Hub of Russia (Open Data Hub statistics), and on the YouTube and Yandex platforms in order to identify possible forms of presentation of data materials. We believe that five forms of content submission can be distinguished in data journalism today.

Any data material is based on machine analysis of a large amount of information (big data). The analysis can be carried out within the newsroom of a single media outlet, or journalists can use third-party data.

We distinguish the following forms of submission of materials in data journalism.

1.     Analytical article. This is one of the most common forms of submission of materials in data journalism: an article with elements of analysis and deep processing of a large amount of data. Many such materials prepared by the data department are published in the "Investigations" section of Novaya Gazeta. Consider the article ["Judges hold us for a dummy" https://novayagazeta.ru/articles/2019/03/20/79929-sudi-derzhat-nas-za-bolvanku (accessed: 01.02.2022)].

The material is accompanied by links to the documents that were analyzed by machine methods (50,000 judicial acts); each file can be opened and examined independently [Data for the Novaya Gazeta text "Judges hold us for a dummy". https://hubofdata.ru/dataset/judgement-copies (accessed: 01.02.2022)]. The article is illustrated with an infographic created by the editorial data department. Such investigations usually take a lot of time and come out once or twice a month. They differ from ordinary investigative journalism in that they are based on the analysis of publicly available big data. As a rule, each specific task requires separately written computer code that selects the necessary data and highlights patterns through machine analysis.

This example shows that data journalism gives new development to such traditional genres as the analytical article and the investigation [18].

2.     Picture. This format of data material can often be found on the RBC website. As an example, consider the material [Which of the ministers lives not only on a salary. https://www.rbc.ru/politics/27/12/2017/5a43b3239a794779ab9a13e2 (accessed: 01.02.2022)]. By analyzing data taken from the sites government.ru and minfin.ru, the authors were able to compare the salaries and declared incomes of ministers. Data journalists participated in the preparation of the material.

3.     Cards. A picture material has one main component: the picture itself, often in JPG format, which as a rule compares data on some set of indicators. A cards material contains three or more pictures linked by a common theme. This format is often used when it is necessary to show specific volumes, for example of investments or income. Consider the material prepared by RIA Novosti on theater financing [Financing theaters in Russia and Europe. https://ria.ru/20190930/1559214764.html (accessed: 01.02.2022)]. The data obtained by analyzing a number of sources are shown in concise form against the background of images with which the audience has specific associations.

Picture and cards materials illustrate the main principle of data journalism: "building material not around news, but around statistical, background information, figures, reporting documentation, summaries" [19, p. 146].

4.     Longread. As an example, consider the material published on the Yandex platform [A film in which. https://yandex.ru/company/researches/2019/whatsthemovie (accessed: 01.02.2022)]. The company collected data from the search queries of users who were looking for movies without remembering or knowing their titles. Humorous posters were created from the most common combinations of queries, and recommendations were given on how to find a particular movie quickly. Users could also search for movies by certain words.

5.     Interactive multimedia project. This format of data materials is especially popular abroad. In 2019, The Pudding website received an award for General Excellence in Online Journalism in the Micro Newsroom category. The site hosts a variety of projects. One of them is devoted to gender and racial inequality: a crossword helps the user find out which of the famous historical figures were white and which were Black. By clicking on a certain area of the crossword puzzle, the user sees how its visual component changes. In the projects on The Pudding website, the reader acts as a creator: how the material changes further depends on the reader's actions [Playable mini puzzles. https://pudding.cool/2020/11/crossword-puzzles/ (accessed: 01.02.2022)].

Consider another example: the material "New Trade Routes: the Silk Road Corridor", published on the Financial Times website [One belt, one road. Financial Times. https://ig.ft.com/sites/special-reports/one-belt-one-road/?mhq5j=e3 (accessed: 15.02.2022)]. The user can trace the Silk Road on an interactive map and get important information in condensed form.
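The "separately written computer code" mentioned under the analytical article form, which selects the necessary data and highlights patterns, might look roughly like the Python sketch below. It is an illustration only, not the code used by any newsroom: the document excerpts, the article numbers, and the search phrases are invented, and real court texts would need far more careful parsing.

```python
import re
from collections import Counter

# Hypothetical excerpts standing in for machine-readable court decisions.
documents = [
    "The court excluded the qualifying feature under Article 282 as unproven.",
    "Sentence under Article 280; the qualifying feature was retained.",
    "Under Article 282 the court excluded the qualifying feature.",
]

ARTICLE = re.compile(r"Article\s+(\d+)")
EXCLUDED = re.compile(r"excluded the qualifying feature", re.IGNORECASE)

def tally(docs):
    """Count decisions per article and how many mention an excluded feature."""
    total, excluded = Counter(), Counter()
    for text in docs:
        match = ARTICLE.search(text)
        if not match:
            continue  # skip documents that name no article
        article = match.group(1)
        total[article] += 1
        if EXCLUDED.search(text):
            excluded[article] += 1
    return total, excluded

total, excluded = tally(documents)
for article in sorted(total):
    print(article, f"{excluded[article]}/{total[article]} decisions excluded a feature")
```

A tally like this only points to a possible pattern; the journalist still has to read a sample of the matched documents to confirm that the phrase means what the code assumes.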

 

Ethical aspects of data content

The development of data journalism raises a problem related to the machine processing of data. Foreign researchers of data journalism ask whether programs should take into account the parameters of objectivity, responsibility, and accuracy when creating texts. There is also the question of data integrity, because missing elements can lead to bias in the resulting content [Bradshaw, Paul. (2013). Ethics in data journalism: accuracy. Online journalism blog. Retrieved from: https://onlinejournalismblog.com/2013/09/13/ethics-indata-journalism-accuracy/ (accessed: 15.02.2022)]. Lin Weeks, in the article "Media law and copyright implications of automated journalism", notes that content created by machine methods raises complex questions about copyright. Weeks suggests that the rights could even be assigned to the computer program itself [20].

The main ethical problem of data journalism is the need to be accurate and to provide the proper context for the story [Bradshaw, Paul. (2013). Ethics in data journalism: accuracy. Online journalism blog. Retrieved from: https://onlinejournalismblog.com/2013/09/13/ethics-indata-journalism-accuracy/ (accessed: 15.02.2022)]. This can affect how journalists analyze data, compile reports, and publish them. Eric Litke, an American data journalist and practitioner, notes that a journalist is responsible for understanding the data, the period the study covers, the changes that have been made to the data, and the potential errors the data may contain. He says that data cannot simply be taken as given: a journalist is obliged to ask a number of questions and find answers to them before using the data in research [McBride, Rebekah E. D. (2016). The Ethics of Data Journalism. Retrieved from: https://core.ac.uk/download/pdf/188108922.pdf (accessed: 15.02.2022)]. The volume of materials produced by data journalists will evidently increase every year, and ethical issues will probably be even more relevant for a data journalist than for a traditional journalist.
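Some of the questions Litke lists can be turned into routine checks that run before any analysis. The Python sketch below is one hedged way to do this; the field names (`id`, `date`, `value`) and the records are hypothetical, and checks like these complement rather than replace a journalist's judgment about where the data came from.

```python
from datetime import date

# Hypothetical dataset (assumption: field names are invented for illustration).
records = [
    {"id": 1, "date": date(2021, 1, 15), "value": 10.0},
    {"id": 2, "date": date(2021, 3, 1), "value": None},
    {"id": 2, "date": date(2021, 3, 1), "value": None},
    {"id": 3, "date": date(2021, 6, 30), "value": 7.5},
]

def audit(rows):
    """Summarize the period covered, missing values, and duplicate ids."""
    dates = [r["date"] for r in rows]
    ids = [r["id"] for r in rows]
    return {
        "period": (min(dates), max(dates)),
        "missing_values": sum(1 for r in rows if r["value"] is None),
        "duplicate_ids": len(ids) - len(set(ids)),
    }

summary = audit(records)
print(summary)
```

Reporting the period alongside the counts makes it harder to present, say, half a year of data as if it covered a full year.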

 

Conclusion

Many researchers say that data journalism is a new direction in the development of mass media. In Russia, indeed, it appeared relatively recently, but in the USA and Europe it has existed for more than a decade.

We live in the era of big data, and data journalism is becoming a trigger for new forms of content presentation. Many data materials are built on serious machine analysis of a large amount of information (big data). Thus, each material, regardless of the form in which it is presented to the audience, contains a great deal of research. Where an ordinary journalist turns to expert comments, personal experience, and other publications when creating a material, a data journalist also turns to big data. As a result, a data journalist produces a media product that at first glance may not differ in any way from ordinary journalistic material.

Today, data journalism gives new development to such traditional journalistic genres as the analytical article and the investigation. Pictures and cards, in turn, give new development to news genres. The longread, a genre of online journalism, can also be built on big data.

For data journalism to develop, as much official data as possible must be publicly available. When official bodies publish their reports online, the work of a data journalist becomes easier, more accurate, and more predictable.

There are currently almost no data journalism specialists in Russia; developing this direction requires reforming educational programs in journalism.

The development of data journalism also raises ethical questions, such as whether programs should take into account the parameters of objectivity, responsibility, and accuracy when creating texts.

References
1. Howard Alexander Benjamin. The Art and Science of Data-driven Journalism. Columbia Journalism School, 2014. 144 p.
2. Weigend A. Big Data. The Whole Technology in One Book. – M.: Eksmo, 2018. – 384 p.
3. Melnik G.S., Vinogradova S.M. Business Journalism: A Textbook. – St. Petersburg: Peter, 2010. – 304 p.
4. Borges-Rey E. Unravelling Data Journalism: a Study of Data Journalism Practice in British Newsrooms // Journalism Practice. 2016. No. 10 (7). Pp. 833-843.
5. Knight M. Data Journalism in the UK: a Preliminary Analysis of Form and Content // Journal of Media Practice. 2015. No. 16 (1). Pp. 55-72.
6. Stalph F. Classifying Data Journalism. A Content Analysis of Daily Data-Driven Stories // Journalism Practice. 2017. No. 12 (10). Pp. 1332-1350.
7. Baranova E.A. Convergence journalism: A Textbook for University Students. – M.: Urayt, 2021. 156 p.
8. Flew T., Spurgeon C., Daniel A., Swift A. The promise of computational journalism // Journalism Practice. 2012. No. 6 (2). Pp. 157–171.
9. Parasie S. Data-driven revelation? Epistemological tensions in investigative journalism in the age of big data // Digital Journalism. 2015. No. 3 (3). Pp. 364–380.
10. Hewett J. Learning to teach data journalism: Innovation, influence and constraints // Journalism. 2015. No. 17 (1). Pp. 119–137.
11. Yarnall L., Johnson J. T., Rinne L., Ranney M. A. How Post-secondary Journalism Educators Teach Advanced CAR Data Analysis Skills in the Digital Age // Journalism & Mass Communication Educator. 2008. No. 63 (2). Pp. 146-164.
12. Veglis A. & Bratsas C. Reporters in the age of data journalism: The case of Greece // Journal of Applied Journalism & Media Studies. 2017. No. 6 (2). Pp. 225-244.
13. Sherstyukova M.N. Data-journalism as a new word to the mass media system // Media. Information. Communication. 2012. No. 1. Pp. 12-14.
14. Simakova S.I. Data Journalism as Mediatrend // Bulletin of the Lobachevsky University of Nizhny Novgorod. 2014. Issue 2 (2). Pp. 481-484.
15. Shilina M. G. Data Journalism in the Structure of Media Communication // Mediascope. 2013. No. 1.
16. Nerents D. V. Thematic diversity of data stories in modern mass media // Almanac Accents, new in mass communication. 2019. Issue 3-4 (162-163). Pp. 9-18.
17. Svitich L.G. Change the journalistic profession in mediakonvergencii // Bulletin of the Chelyabinsk State University. 2015. No. 5 (360). Pp. 406-414.
18. Shilina A. Data journalism in the foreign quality press (the case of specialized resources of the Guardian and the New York Times newspapers) // Bulletin of the Moscow State University. 2019. No. 5. Pp. 135-19.
19. Lisitsin M.E. The definition of the "data journalism" in modern research articles // Communication Studies. 2018. No. 3. Pp. 144-154.
20. Weeks Lin. Media law and copyright implications of automated journalism // Journal of intellectual property and entertainment law. 2014. Volume 4, number 1. 94 p.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

Data journalism is a fairly new direction in the development of mass media. In Russia it appeared relatively recently, but in the USA and Europe it has existed for more than a decade. At the same time, the general trends in how this form is implemented are similar. The reviewed article evaluates data journalism from the standpoint of how material is presented in the so-called open "print". I think that the object of the study is non-trivial and new, so a generalization of data on a new form of media is quite relevant. In my opinion, this material can serve as a basis for new research on related topics. The methodology of the work draws on analytical, statistical, and empirical principles. Such an approach to the question is quite justified, especially since the author notes: "data journalism today gives a new development to such traditional journalistic genres as an analytical article or an investigation. It can be noted that the picture or cards, in turn, give a new development to news genres. The genre of online journalism, the longread, can also be made based on the use of big data. For the development of data journalism, it is necessary that as much official data as possible appear in the public domain. Therefore, when official authorities upload their reports to the Internet, it makes it easier for a data journalist to work and makes it more accurate and predictable." The novelty of the work also lies in its systematic approach to evaluating information: a classification of the types and forms of submission of materials in data journalism. The author supports this classification with arguments and illustrative examples. References and citations are given in full; the publication's requirements for the design of footnotes are met.
The essence of the issue as a whole is expressed and disclosed, and the author's point of view on the topic is clear. The work has an objective tone, and the style is properly scientific: for example, "despite the fact that such a phenomenon as data journalism has existed for more than a decade, there has not yet been a single definition of the term "data journalism" in the scientific community, researchers give different interpretations. This is largely due to the fact that scientists consider data journalism from different angles: as a direction of the media industry, closely related to the development of computer and computational journalism; from the perspective of the epistemology of data journalism (the influence of the phenomenon of big data on modern journalism); from the perspective of a topic related to journalistic education (new requirements for training journalists in programming, statistics, data visualization)", or "foreign and Russian authors are confident that the development of data journalism is associated with the emergence of new journalistic competencies. It is necessary to have programming skills (knowledge of Python), work with certain technical tools such as data extraction (Tabula, DocumentCloud), data cleaning and analysis (Google Sheets, OpenRefine), data visualization tools (Datawrapper, Infogram, Flourish)", or "the main ethical problem of data journalism is the need to be accurate and provide the proper context for the story. This can affect how journalists analyze data, compile reports and publish them," etc. The work is complete, independent, and original; it is noteworthy that the author enters into a direct conceptual dialogue with opponents, agreeing with some and debating others. I think this material will be of interest both to specialists in the study of new media and to novice researchers.
The goal of the work has been achieved and the set tasks have been solved as a whole; the text does not need to be edited or corrected. I recommend the peer-reviewed article "Forms of submission of materials in data journalism" for open publication in the scientific journal "Litera".