
Litera
Reference:

The Method of Applying Mini-Corpora to the Analysis of Language Variation of Etiquette Expressions in Bilingual Language Situations in the Sociolinguistic Aspect.

Soloveva Anna Andreevna

ORCID: 0000-0001-9472-5966

PhD in Philology

Senior teacher, Russian as foreign language department, Russian State Agrarian University

127247, Russia, Moscow, Beskudnikovsky passage, 4-2-84

yarlee@yandex.ru

DOI:

10.25136/2409-8698.2023.10.39183

EDN:

WGLTKE

Received:

16-11-2022


Published:

16-10-2023


Abstract: The author examines the creation of a mini-corpus and the methodology for applying it to problems of language choice in speech etiquette. The stages of compiling a mini-corpus are considered, from the specifics of material collection to the construction of theoretical models of the use of speech etiquette in each language. The terminological apparatus of the study is introduced. The features of different methods are compared, and the need for a comprehensive approach to compiling a mini-corpus is substantiated. A shortage of corpora for studying language choice in bilingual situations is noted. The advantages and disadvantages of the chosen methodology are described. In particular, the author points out the need to select authentic examples of spontaneous speech from videos, following the method of non-participant observation. The place of the mini-corpus technique among the methods used in this study is shown. The result is a corpus-based description of the algorithms by which speakers use etiquette expressions in a particular language under the influence of various configurations of factors and parameters of the social context. It is concluded that the method yields material that is necessary and sufficient for the analysis without requiring the use and processing of a large array of data. A prospect for further research is to apply the method to other situations of bilingualism in order to supplement and correct the theoretical models obtained in the course of the study. Thus, the method of compiling bilingual mini-corpora makes it possible to reach a level of generalization of linguistic facts sufficient to build universal models for interpreting the patterns of language choice of etiquette expressions in bilingualism.


Keywords:

sociolinguistics, corpus linguistics, bilingualism, etiquette expressions, methods, language choice, content analysis, Catalan language, Hindi language, mini-corpus

This article is automatically translated.

Introduction

Linguistic research involves working with linguistic facts and linguistic units of different levels. The collection and processing of language units should correspond to the goals and objectives of the study. In addition, the methods must follow the logic of scientific analysis appropriate to linguistic tasks. The choice of methodology for linguistic research is actively studied in modern Russian linguistics [1],[6],[7],[9].

This study examines a special category of language units – speech etiquette.

Speech etiquette in this case means "the regulating rules of speech behavior, a system of nationally specific stereotyped, stable communication formulas adopted and prescribed by society to establish contact between interlocutors, maintain and interrupt contact in the chosen key" [8, 11]. For the study of speech etiquette, it is necessary to work at the lexical level of the language. At the same time, the concept of context is important, which in a bilingual language situation is complicated by the coexistence of two languages within one socio-cultural community, and, accordingly, by a bilingual context.

In paradigmatic terms, the bilingual language situation confronts the speaker with a choice between units in two languages. The speaker makes this choice based on the sociolinguistic context, the type of addressee and the language situation itself [4, 10].

From the point of view of syntagmatics, in some cases it is also necessary to take into account the linguistic context. At the same time, it is important to take into account the peculiarities of the etiquette expressions themselves.

Such initial data require the researcher to develop a methodology for fixing units of speech etiquette while preserving the characteristics of the speech situation and social context.

Developing a methodology for this kind of research is relevant because, despite the large number of monolingual automated corpora, compiling corpora that include texts in two or more languages for a single language situation remains a problem. The methodology studied here involves the creation of such corpora as a tool for studying the multilingual environment.

The purpose of this study is to build a step-by-step methodology for creating a mini-corpus necessary for conducting a study of etiquette expressions in situations of bilingualism.

The objectives of the study include: 1) to demonstrate the principle of operation of the methodology for compiling a mini-corpus; 2) to prove the effectiveness of this technique by analyzing the data of language situations in Catalonia and Northern India; 3) to outline the scope and prospects of using this method in linguistic research.

Methods

Etiquette expressions are special units of the language, since they perform a number of functions that distinguish them from other units. R. O. Jakobson considered the phatic function to be one of the main functions of etiquette in language, noting that "the phatic function is carried out through the exchange of ritual formulas or even whole dialogues, the sole purpose of which is to maintain communication" [10, 67]. The main purpose of messages with this function is to "establish, continue or interrupt communication, check whether the communication channel is working, attract the attention of the interlocutor or make sure that he listens attentively" [10, 67].

Two bilingualism situations were chosen for the study – Northern India and Catalonia (Spain).

The parameters of these language situations correspond to those required by the study: language variation is present in both, and both are influenced by similar parameters [4, 10] and factors of the social context, which makes it possible to compare them.

A number of sociolinguistic methods are used to analyze language variation, including observation, questionnaires, quantitative and qualitative content analysis [3].

The methodological core in the study of linguistic units in the situation of bilingualism is content analysis, namely its qualitative variety. "Qualitative content analysis is aimed at understanding the phenomena under study; at analyzing the relationships and processes between these phenomena; it is focused on covering the totality and complexity of the phenomena under study and is aimed at studying individual cases" [2, 70].

In this study, content analysis and the methodology developed on its basis logically continue the methods of sociolinguistic research listed by W. Labov, since they rely on linguistic facts obtained through non-participant observation.

A significant advantage of using qualitative corpus-based content analysis in sociolinguistics is that the study allows us to consider units both in a linguistic and social context at the same time, which is necessary for a more accurate understanding of the patterns of language choice.

The material for content analysis was the units of speech etiquette in two languages, collected in mini-corpora.

A mini-corpus is understood here as a corpus containing 100 occurrences of the units under study. The need to compile a dedicated mini-corpus arose from the lack of specialized bilingual corpora of etiquette expressions for the regions studied that are based on speakers' spontaneous speech. In addition, this type of corpus was chosen because, in statistical terms, it meets the criteria for qualitative content analysis aimed at studying individual phenomena of the language.

The compilation of the mini-corpus consisted of several stages.

At the first stage, the video material had to be found, selected and recorded. The corpus is based on transcripts of the video materials. The videos had to contain spontaneous speech, so preference was given to the following genres: practical jokes, social experiments, conversational video blogs, documentaries about the lives of ordinary people, and hidden-camera footage. The type of video was recorded in the mini-corpus table.

The selected videos were reviewed to identify the speech situations under study and the units of speech etiquette in them. Each unit of interest was then entered into a special table, an example of which is presented below. The table also recorded the language context, a description of the situation, a link to the source, the timecode, the location, the group of etiquette expressions, and a translation into Russian.

Translation is particularly difficult: to understand and analyze the language situations, both languages of each bilingual region must be translated as accurately as possible. It is also necessary to mark the language of each etiquette unit and of its context, so that the influence of the language context on the choice of language can be separated from the influence of social factors and parameters proper. All of these steps are time-consuming but necessary for obtaining sound analytical results and for building on them theoretical models of the use of speech etiquette in each language in bilingualism.
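The kinds of information recorded for each unit can be sketched as a simple record type. This is only an illustration: the article lists what was recorded but not a concrete schema, so the class name, every field name and the sample entry below are assumptions invented for the example.

```python
from dataclasses import dataclass


# Hypothetical schema: field names are illustrative, not taken from the article.
@dataclass(frozen=True)
class EtiquetteRecord:
    expression: str        # the etiquette unit as spoken
    expression_lang: str   # "L1" or "L2"
    context_lang: str      # "L1", "L2" or "L1+L2" (mixed context)
    group: str             # e.g. "greeting", "farewell", "gratitude"
    situation: str         # short description of the speech situation
    video_type: str        # e.g. "video blog", "hidden camera"
    source_url: str        # link to the source video
    timecode: str          # position of the unit in the video
    location: str
    translation: str

    def languages_differ(self) -> bool:
        """True when the context is not solely in the unit's own language."""
        return self.context_lang != self.expression_lang


# Invented sample entry, for illustration only.
record = EtiquetteRecord(
    expression="gràcies", expression_lang="L1", context_lang="L1+L2",
    group="gratitude", situation="customer thanks a shop assistant",
    video_type="video blog", source_url="https://example.org/video",
    timecode="03:12", location="Barcelona", translation="thank you",
)
print(record.languages_differ())
```

Keeping the language of the unit and the language of the context in separate fields is what makes it possible to compare them later.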

Thus, mini-corpora of speech etiquette units were created on the basis of video materials.

From the point of view of corpus linguistics, the corpora can be characterized as oral, specific and synchronic. Each regional corpus includes about 1000 words and 100 occurrences of etiquette expressions.

After fixing the units and the accompanying context, it was necessary to divide the contexts into linguistic and social. In this case, the task was to prove that the language of etiquette and the language of context are not directly related, that the language context does not fully determine the language of etiquette expressions. This required the detection of situations where the language of speech etiquette units and the language of the context do not coincide. For this, Catalan and Hindi were conventionally designated as L1, and Spanish and English as L2.

 

Table 1. Correlation between the context language and the language of etiquette expressions (EV) in examples from video materials.

Region (number of examples) | EV and context languages are the same: L1 | EV and context languages are the same: L2 | EV and context languages do not match: L1+L2
India | 26 | 27 | 47
Catalonia | 48 | 7 | 45

The table shows that in India almost half of the observed examples (47 of 100) occurred in a mixed language context. The situation in Catalonia is the same (45 of 100). The regions differ in the role of L2 contexts, which is greater in India (27 examples) than in Catalonia (7 examples).

However, both cases confirm the assumption of an indirect dependence of the language of the etiquette on the language of the context. Thus, for all the studied groups of etiquette expressions in Catalonia and for the majority of such groups in India, a mixed language context is a normal environment for using etiquette expressions, the same as a monolingual context.

Thus, based on the data in the table, it can be assumed that the discrepancy between the language context and the language of etiquette expressions in most groups is evidence that the choice of the EV language is determined not only by the language of the context. It is logical to assume that such a choice can also be influenced by factors and parameters of the social context. And this conclusion is true for both India and Catalonia, which indicates similar general principles of choosing the language of etiquette expressions.
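The comparison drawn from Table 1 can be reproduced with a few lines of Python. This is a minimal sketch: the counts are copied from Table 1, while the variable and function names are assumptions made for the example.

```python
# Counts copied from Table 1 (number of examples per context type).
table1 = {
    "India":     {"L1": 26, "L2": 27, "L1+L2": 47},
    "Catalonia": {"L1": 48, "L2": 7,  "L1+L2": 45},
}


def context_shares(counts):
    """Return each context type's share of the region's examples."""
    total = sum(counts.values())
    return {context: n / total for context, n in counts.items()}


shares = {region: context_shares(counts) for region, counts in table1.items()}
for region, region_shares in shares.items():
    print(region, {context: f"{p:.0%}" for context, p in region_shares.items()})
```

Because each regional mini-corpus contains 100 examples, the shares coincide with the raw counts. Normalizing would matter only when comparing corpora of different sizes.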

At the same time, it was also necessary to prove that the language of EV is not fully determined by the group of etiquette expressions either. To do this, it was necessary to establish that in each of the groups of etiquette there are units of two languages.

 

Table 2. The ratio of groups of etiquette expressions and context languages.

Group | L1: India | L1: Catalonia | L2: India | L2: Catalonia | L1+L2: India | L1+L2: Catalonia
Greeting | 5 | 15 | 4 | 1 | 10 | 5
Farewell | 0 | 6 | 4 | 3 | 0 | 7
Apology | 0 | 7 | 4 | 1 | 7 | 1
Gratitude | 2 | 6 | 5 | 0 | 8 | 9
Appeal | 20 | 13 | 9 | 2 | 22 | 23

 

Table 2 shows differing ratios of units in the first and second languages; however, no consistent relationship between a particular language and a group of etiquette expressions can be traced.
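One way to inspect Table 2 is to list, for every group and region, which of the three columns (L1, L2, L1+L2) contain at least one example. The sketch below copies the counts from Table 2. The names and the "at least one example" criterion are assumptions made for illustration, not the author's procedure.

```python
# Counts copied from Table 2: group -> region -> (L1, L2, L1+L2).
table2 = {
    "Greeting":  {"India": (5, 4, 10),  "Catalonia": (15, 1, 5)},
    "Farewell":  {"India": (0, 4, 0),   "Catalonia": (6, 3, 7)},
    "Apology":   {"India": (0, 4, 7),   "Catalonia": (7, 1, 1)},
    "Gratitude": {"India": (2, 5, 8),   "Catalonia": (6, 0, 9)},
    "Appeal":    {"India": (20, 9, 22), "Catalonia": (13, 2, 23)},
}
COLUMNS = ("L1", "L2", "L1+L2")


def attested_columns(counts):
    """Return the column labels that have at least one example."""
    return [label for label, n in zip(COLUMNS, counts) if n > 0]


for group, regions in table2.items():
    for region, counts in regions.items():
        print(f"{group:10} {region:10} {attested_columns(counts)}")
```

A group confined to a single column in both regions would point to a fixed link between that group and a language. The listing makes it easy to see whether any group behaves that way.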

The next logical stage was the search for factors and parameters within the social context that influence the choice of the language of etiquette expressions.

The analysis of the social context included a sequential consideration of each of the four parameters ("Friend" / "Stranger", social status, gender and quantity) in their influence on the choice of EV to the relevant types of addressee in each of the regions.

To do this, the number of situations of using L1 and L2 for the recipient types for each of the parameters was calculated.
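The counting step can be sketched with a standard-library counter. The observations below are invented purely to show the mechanics and are not data from the study. The tuple layout and all names are likewise assumptions.

```python
from collections import Counter

# Invented observations: (parameter, addressee type, language of the expression).
observations = [
    ("familiarity", "Friend", "L1"),
    ("familiarity", "Friend", "L2"),
    ("familiarity", "Stranger", "L2"),
    ("status", "equal", "L1"),
    ("status", "unequal", "L2"),
    ("status", "unequal", "L2"),
]


def tally(rows):
    """Count uses of each language per (parameter, addressee type) pair."""
    return Counter(rows)


counts = tally(observations)
for (parameter, addressee, lang), n in sorted(counts.items()):
    print(parameter, addressee, lang, n)
```

The same tally extends to the gender and quantity parameters by adding rows with the corresponding labels.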

The analysis also took into account the degree of influence of social factors of prestige and language loyalty, depending on the language situation and the type of recipient.

Further, examples of the use of etiquette expressions with the direct application of the method of qualitative content analysis were considered in detail. The correlation of social and linguistic contexts was analyzed, and the reasons for the language choice in each case were suggested. 

The result of the analysis of video materials was a comparison of the role of context components in the choice of language in the studied regions.

Results

Applying this technique to the collected data results in the construction of theoretical models [5, 98-103] that explain the patterns of choosing one of the two languages in the analyzed speech situations.

Model 1 showed that the "Friend"/"Stranger" parameter is associated with the level of language loyalty in the region. With a high level of language loyalty, speakers communicating with "Friend" addressees are more likely to choose an etiquette expression in L1. If the level of language loyalty is low and L2 is a prestigious language, etiquette expressions in both L1 and L2 are to be expected.

Model 2 describes the choice of the language of etiquette expressions for addressee types with different social status. When the interlocutors have equal social status, the language of the etiquette expressions is not significant. With unequal status, however, the choice of language is influenced by the level of language loyalty. With a high level of language loyalty, etiquette units usually match the language of the context, and speakers try to avoid mixing languages. With a low level of language loyalty and a prestigious L2, etiquette units in L2 are used for an addressee of unequal status.

Model 3 reflects the patterns of language choice for the "Friend" addressee type and shows that neither language is preferred in this case. The same holds when the addressee is a male stranger. When the addressee is an unfamiliar woman, L1 is chosen if the bilingual system already has its own etiquette units in that language; otherwise, they are replaced by etiquette expressions from L2.

In Model 4, a consistent choice of language depending on the number of addressees can be identified only for the "Stranger" addressee type: with a high level of language loyalty, L1 is used more often, but if L2 is prestigious in the region, expressions in L2 are more likely. In general, however, the language of etiquette expressions addressed to several people is the same as the language used in that situation for a single addressee.

The scope of this technique may include language situations where speakers use speech etiquette in more than one language.

The prospect of studying the methodology is its application in other situations of bilingualism to supplement and correct the theoretical models obtained during the study.

Thus, the method of compiling bilingual mini-corpora allows us to reach the level of generalization of linguistic facts sufficient to build universal models for interpreting the patterns of choice of the language of etiquette expressions in bilingualism.

Discussion

The described technique has an advantage over other methods of studying language choice, in particular over quantitative content analysis, in that it yields objective data from a limited number of linguistic facts. Compared with the questionnaire method, which is often used in sociolinguistics, the mini-corpus technique eliminates the speaker's self-reflection and, consequently, the influence of the speaker's self-perception on the results; it also significantly reduces the influence of the observer on the speaker.

The technique continues the tradition of non-participant observation and takes that method to a new level by using video recordings of spontaneous speech as its data source, which greatly simplifies and speeds up the researcher's work.

In addition, the proposed methodology made it possible to identify a promising direction for research on language choice: the incomplete correspondence between survey data and the results of analyzing video materials.

The disadvantage is the complexity of collecting and compiling the corpus database, which at this stage is associated with problems of its automation, since each unit must be considered and described according to several parameters (time of mention, place, characteristics of the speaker, etc.). 

References
1. Golubkova, E. E. (2015). The use of linguistic corpora in solving semantic problems. Methods of cognitive analysis of word semantics: computer-corpus approach, pp. 39-80. Moscow, YASK Editorial house.
2. Kirpikov, A. R. (2018). Qualitative content analysis as a research method. XXI International Conference in memory of Professor L. N. Kogan "Culture, personality, society in the modern world: methodology, experience of empirical research", 22-23 of March, Ekaterinburg: URFU, 67-74.
3. Labov, W. (1975). On the mechanism of language changes. New in linguistics, Vol. VII. Moscow, Progress Ed.
4. Soloveva, A. A. (2019). The degree of influence of the parameter of social status on the choice of speech etiquette in bilingualism. World of Science. Sociology, philology, cultural studies, 10(1), 10. Retrieved from https://sfk-mn.ru/PDF/10FLSK119.pdf
5. Soloveva, A. A. (2022). Etiquette expressions in bilingualism: a sociolinguistic analysis, pp. 98-103. Moscow, Russcience Ed.
6. Suleimanova, O. A. (2020). Principles and methods of linguistic research, 2nd edition, Languages of the world Ed.
7. Chernyavskaya, V. E. (2018). Discursive analysis and corpus methods: Necessary link of evidence? Explanatory possibilities of qualitative and quantitative approaches. Questions of cognitive linguistics, 2, 31-37.
8. Formanovskaya, N. I. (2008). Russian speech etiquette: Linguistic and methodological aspects. Moscow, URSS.
9. Frumkina, R. M. (1980). Linguistic hypothesis and experiment (on the specifics of hypotheses in psycholinguistics). Hypothesis in modern linguistics, pp. 183-216. Moscow: Science Ed.
10. Jakobson, R. (1960). Closing Statement: Linguistics and Poetics. Style in Language. Ed. by Thomas A. Sebeok. – Cambridge, Mass.; New York; London: The Technology Press of Massachusetts Institute of Technology; John Wiley & Sons, Inc.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

Language units of different levels call for different approaches to analysis; as the author of the reviewed work notes, "the collection and processing of language units should correspond to the goals and objectives of the study. In addition, the methods must have the logic of scientific analysis of linguistic tasks proper." The paper examines a special category of language units, speech etiquette; in my opinion, the choice is fully justified. "The purpose of this study is to build a step-by-step methodology for creating a mini-corpus necessary for conducting research on etiquette expressions in situations of bilingualism." The material is well structured: having defined a stable and objectively formulated set of tasks, the author proceeds to a point-by-point analysis of the problem. All stages of the study are commented on. The introductory block explains the essence of the study in full: "two bilingualism situations were chosen for the study – Northern India and Catalonia (Spain). The parameters of these language situations correspond to the required ones. So, in both situations, there is a linguistic variation. They are also influenced by similar parameters and factors of the social context, which makes it possible to compare them." In my opinion, the chosen methodology, content analysis, is well suited to testing the problem: "in this study, the content analysis method and the methodology developed on its basis logically continue the methods of sociolinguistic research listed by W. Labov, since it is based on linguistic facts obtained during non-participant observation." The reviewed material is clearly practical in character and is convenient to use in designing and conducting thematically related research. Throughout the text, the style corresponds to the scientific register; no serious errors or violations were found.
The main block of the article is an analytical section evaluating the mini-corpora of speech etiquette units compiled from video materials. The data are systematized quite successfully in the form of tables; graphical and visual generalization is highly productive for language research. The author reasonably introduces intermediate conclusions into the work, which allow an interested reader to follow the development of the argument: for example, "thus, based on the data in the table, it can be assumed that the discrepancy between the linguistic context and the language of etiquette expressions in most groups is evidence that the choice of EV language is determined not only by the language of the context. It is logical to assume that such a choice may also be influenced by factors and parameters of the social context. And this conclusion is valid for both India and Catalonia, which indicates similar general principles of choosing the language of etiquette expressions," or "thus, the method of compiling bilingual mini-corpora allows you to reach the level of generalization of linguistic facts sufficient to build universal models for interpreting the patterns of choosing the language of etiquette expressions in bilingualism", etc. I believe that the research has achieved its goal and that the tasks set have been solved; the material is interesting, relevant, and new within the chosen field of linguistics. I recommend the article "The methodology of using mini-corpora to analyze the linguistic variation of etiquette expressions of bilingual linguistic situations in the sociolinguistic aspect" for open publication in the journal "Litera".