Stellenbosch Papers in Linguistics Plus, Vol. 42, 2013, 93-110
doi: 10.5842/42-0-170

The pragmatic markers anyway, okay, and shame: A South African English corpus study¹

Kate Huddlestone and Melanie Fairhurst
Department of General Linguistics, Stellenbosch University, South Africa
E-mail: [email protected]

Abstract

Pragmatic markers are “a class of short, recurrent linguistic items that generally have little lexical import but serve significant pragmatic functions in conversation” (Andersen 2001:39). While pragmatic markers are receiving growing consideration in the literature, pragmatic markers in South African English have been given little attention compared to other varieties of English. This paper provides a description of the distribution and functions of the pragmatic markers okay, anyway and shame as they occur in the spoken component of the South African version of the International Corpus of English (ICE). Using the commercially available concordance program, WordSmith Tools, all instances of okay, anyway and shame were identified in the corpus and all non-pragmatic marker instances were then excluded. The remaining instances of okay, anyway and shame were then hand-coded to determine the primary functions that these elements exhibit. The classification of the functions of the pragmatic markers was carried out according to Fraser’s (1996, 1999, 2006) framework for identification of pragmatic markers. The findings of the corpus investigation included identifying the functions of okay as both a conversation-management marker and a basic marker, as well as its role in turn-taking. Anyway was found to function as an interjection, a mitigation marker, a conversation-management marker and a discourse marker. Shame, as a uniquely South African pragmatic marker, was found to function both as an interjection and as a solidarity marker, as an expression of sympathy or sentiment.
Keywords: discourse markers, pragmatic markers, South African English, corpus linguistics

¹ This paper is partially based on work done for the second author’s Master’s thesis (Fairhurst 2013).

http://spilplus.journals.ac.za

1. Introduction

Africa offers many opportunities to study both New Englishes and World Englishes. While first-language varieties of South African English (SAE) are not considered to be New Englishes, South Africa’s many other languages have had a profound effect on the variety of the English language that is spoken in the country today, making it quite unique, as Crystal (2008:143) concurs:

“I had studied the evolution of South African English over the years. There is nothing quite like it in the English-speaking world. The vocabulary is the really striking thing. It is hugely distinctive and diverse, thanks to the number of languages which feed it. There are eleven official languages in South Africa. Each one borrows wildly from the others. And English borrows most of them all.”

The starting point for the study from which this article developed was the desire to delve into some of what makes SAE so unique. One aspect of a language that is strongly influenced by culture is that of pragmatics, how language is used and interpreted in context. The decision was made therefore to focus on pragmatic markers, as part of the vocabulary of SAE, and in light of their important role in contributing to pragmatic meaning. Such markers add little, if anything, to the semantic content of an utterance. Rather, they provide information on the speaker and on the speaker’s attitude, among other aspects of the linguistic situation. Due to the nature of pragmatic markers, they are thought to reflect a speaker’s cultural and linguistic background, and so to be ideal for contributing to an examination of what makes a particular first-language variety of English unique.
Aijmer and Simon-Vandenbergen (2009) note that most studies of pragmatic markers place the emphasis on (spoken) corpus data, as corpora “make it possible to investigate the distribution of pragmatic markers in speech and writing and in different registers”. For this reason, we elected to work with the International Corpus of English (ICE) for South Africa, ICE-SA, as this was the only spoken language corpus of SAE we were able to gain access to at the time.

Various researchers have examined different aspects of SAE, including non-native varieties such as Black SAE, with some even making use of corpora (cf. Mesthrie 1992, 2002; Gough 1996; Lass 2002; De Klerk and Gough 2002; Jeffery 2003; Jeffery and Van Rooy 2004; De Klerk 2005; De Klerk, Adendorff, de Vos, Hunt, Simango, Todd and Niesler 2006; Da Silva 2007 and Bekker 2009). However, there is little research to be found on pragmatic aspects of SAE.

This article will give a brief historical description of the variety of English examined in the study, SAE, followed by a general characterisation of pragmatic markers. A brief sketch of the field of corpus linguistics will then be provided, including a description of the corpus and the methodology used in the study. Finally, the data analysis and discussion will conclude the article.

2. South African English

The English language holds a very interesting place in the South African linguistic landscape which goes back to when the British took over the government of the Cape Colony from the Dutch in 1795. The early years of British rule in South Africa centred on the Cape as a stopover for ships travelling to and from the East. Most of the English speakers living in the Cape at the time were military and government officials. In the late 1810s, Britain decided to expand its hold on South Africa and to start settling in some other areas of the country.
The main goal at the time was to create a buffer between the Xhosa-occupied Eastern Cape and the British-settled Western Cape. For this purpose, the British government started providing assisted passage and land grants in the Eastern Cape, around the Fish River (Mesthrie 2002:108). In 1820, a group of about 5000 British settlers arrived in the Eastern Cape. Although the English speakers were, at the time, outnumbered by the Dutch speakers, Lord Charles Somerset declared English to be the official language of the Cape Colony in 1822 (Mesthrie 2002:108). Even in the Boer Republics, which were established in the Free State and Transvaal, English was considered to be the language of the well-educated (Mesthrie 2002:109). In the 1840s and 1850s, a second large wave of settlers arrived in the Natal region. The third and most diverse wave of settlers, however, arrived between around 1875 and 1904, when gold was discovered and first came to be mined in the Witwatersrand. Although the settlers from the different waves mentioned would have brought with them different dialects and varieties of English, it would seem that “standard” SAE was mostly influenced by the first English-speaking settlers from the 1820s (Mesthrie 2002:109).

English has a fair distribution throughout South Africa, as both a first and second language, although it is more prominent in the metropolitan and urban areas. English in South Africa is not monolithic; it has a wide range of varieties. Clear distinctions can be made between White SAE, Coloured SAE, Indian SAE and Black SAE, with the latter being a predominantly second-language variety of English.
Many people speak an African mother tongue at home, but go through their school careers in English; because of this, “South Africa’s second-language varieties of English are heavily marked at every level of linguistic structure by the primary language of their speakers” (Kamwangamalu 2006:162). This is of particular interest to the current study because, although all the data collected for the study were from people who received their schooling in English to matriculation level or beyond, their English might be marked by specific features if they are fluent bilinguals or multilinguals, or if English is not their mother tongue.

3. Pragmatic markers

Pragmatic markers (PMs) serve several purposes in discourse. One of their primary functions is to point to features of the context indexically (Schiffrin 1987). Aijmer and Simon-Vandenbergen (2009) further characterise PMs as reflexive, because they comment on the utterance, and thus assist in the interpretation thereof. Östman (1995, cited in Aijmer and Simon-Vandenbergen 2009) refers to PMs as the “windows” that hearers use to make deductions and assumptions about the speaker’s attitude and opinion. Holker (1991, cited in Aijmer and Simon-Vandenbergen 2009) lists four key features which can be used to characterise PMs: (i) PMs do not affect the truth conditions of an utterance; (ii) PMs add nothing to the propositional content of an utterance; (iii) PMs are related to the speech context or situation, rather than to the situation under discussion; and (iv) the function of the PM is emotive and expressive, rather than referential, denotative or cognitive.

PMs have been studied in various fields in linguistics, and the definition of a PM depends greatly on the linguistic approach that is taken in a particular study, which also influences whether or not an element is considered to be a PM.
For this reason, the same element has also been referred to, variously, as “discourse particle”, “pragmatic marker”, “segmentation marker”, “modal particle” and “pragmatic particle”. In this paper, we use the term “pragmatic marker” and focus on the uses of PMs as outlined by Fraser (1996, 1999, 2006).

The first type of PM is the basic PM, with such markers conveying the illocutionary force of the speaker. The second type of PM is the commentary marker, which is used to indicate the fact that the following segment of discourse is connected to the previous segment. There are several types of commentary markers laid out by Fraser (1996, 1999, 2006). The third type of PM identified by Fraser is the parallel marker, which, in contrast to a commentary marker, is used to indicate that the following segment of discourse is separate from the previous segment. One of the subtypes of the parallel-marker type of PM is the conversation-management marker. The fourth and final type of PM is the discourse marker (DM).

4. Corpus linguistics

“In the language sciences a corpus is a body of written text or transcribed speech which can serve as a basis for linguistic analysis and description. Over the last three decades the compilation and analysis of corpora stored in computerized databases has led to a new scholarly enterprise known as corpus linguistics.” (Kennedy 1998:1)

The compiling of corpora for linguistic purposes has been performed since the 1950s; however, the field expanded significantly with the rise in computer technology. According to Baker (2007:1), corpus linguistics involves using “large bodies of naturally occurring language data stored on computers”, as well as “computational procedures which manipulate this data in various ways”, in order to find linguistic patterns.
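As an illustration of the kind of computational procedure Baker describes, a keyword-in-context (KWIC) concordance can be sketched in a few lines of Python. This is a minimal sketch only and does not reproduce the workings of any particular concordance package; the function name `kwic`, the `width` parameter and the sample text (adapted from ICE-SA utterances quoted in Section 6.1) are assumptions made for illustration.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word hit.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits


# Sample text adapted from utterances quoted in Section 6.1 (illustrative only).
sample = (
    "Okay As I mentioned in the beginning uhm as a scientist. "
    "You mustn't take it any more OK. "
    "Art is essentially mysterious okay and truth has to be comprehensible."
)

hits = kwic(sample, "okay")
for left, node, right in hits:
    # Right-align the left context so the search word lines up in a column.
    print(f"{left:>30} [{node}] {right}")
```

A search of this kind only finds candidate forms. The spelling variant OK would need its own search, and each concordance line still has to be read in context to decide whether the form is functioning as a pragmatic marker.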
Stegmeier (2012) provides a summary of the different research perspectives that can be adopted for corpus linguistics, as is illustrated in Figure 1.

Figure 1. Research perspectives in corpus linguistics (Stegmeier 2012:96)

The present study falls under the quantitative/qualitative aspect of corpus linguistic research, as both small-scale statistical and context-based data are presented and analysed. The corpus used in the current study originated as part of the ICE project, which aimed to compile parallel corpora of varieties of contemporary English (Nelson 2006). The ICE corpora have a common corpus design and a common methodology (Greenbaum 1996), and data are collected for the project only in countries where English is either the first language of, or is used as a second official language by, adult speakers of the language.

4.1 International Corpus of English

The ICE corpora consist of 200 samples of written texts and 300 samples of spoken texts, all 2000 words in length, making a total of one million words for each corpus. The samples are drawn from several specified aspects of day-to-day life (see Table 1 as an illustration of how an ICE corpus is compiled).

Table 1. Design of ICE corpora

SPOKEN (300)
  Dialogues (180)
    Private (100)
      Face-to-face conversations (90)
      Phone calls (10)
    Public (80)
      Classroom lessons (20)
      Broadcast discussions (20)
      Broadcast interviews (10)
      Parliamentary debates (10)
      Legal cross-examinations (10)
      Business transactions (10)
  Monologues (120)
    Unscripted (70)
      Spontaneous commentaries (20)
      Unscripted speeches (30)
      Demonstrations (10)
      Legal presentations (10)
    Scripted (50)
      Broadcast news (20)
      Broadcast talks (20)
      Non-broadcast talks (10)

WRITTEN (200)
  Non-printed (50)
    Student writing (20)
      Student essays (10)
      Exam scripts (10)
    Letters (30)
      Social letters (15)
      Business letters (15)
  Printed (150)
    Academic writing (40)
      Humanities (10)
      Social sciences (10)
      Natural sciences (10)
      Technology (10)
    Popular writing (40)
      Humanities (10)
      Social sciences (10)
      Natural sciences (10)
      Technology (10)
    Reportage (20)
      Press news reports (20)
    Instructional writing (20)
      Administrative writing (10)
      Skills/Hobbies (10)
    Persuasive writing (10)
      Press editorials (10)
    Creative writing (20)
      Novels and short stories (20)

Although the ICE corpora can stand alone as a useful tool for research, their true value comes from the fact that they are exactly comparable, and therefore indispensable to today’s study of World Englishes.

4.2 ICE-SA

SAE was originally not going to be included in the ICE corpora, for political reasons; however, this ban was eventually lifted and research began in June 1992 (Jeffery 2003:341). Chris Jeffery of the University of Port Elizabeth was the lead researcher from the start, but worked with teams collecting data from all over the country. The initial plan was that all the data used would be collected between 1990 and 1996. The set time frame, however, proved to be too restrictive and so was left open-ended.
The population to be sampled had to be 18 years of age or older, and they had to have completed their education in English up to matriculation level (Jeffery 2003). This corpus has yet to be released via the ICE website and was made available to the researchers by Bertus van Rooy (NWU), who, through his collaboration with Jeffery and in his role as director of the South African component of the International Corpus of Learner English (ICLE) project, now has control of the ICE-SA project.

Table 2 provides a statistical characterisation of the make-up of ICE-SA’s spoken component. As can be seen from the number of tokens (running words) in the text, ICE-SA is not complete, falling approximately 200 000 words short of the 600 000 word target for ICE corpora. As Jeffery (2003:343) notes, certain categories, specifically the Spoken Monologue section, are difficult to fill, while access to private telephone calls is also problematic. It is worthwhile noting that about half of the words in the corpus are contained in what can be characterised as private conversations/dialogues, which one could argue are the most authentic types of spoken discourse. In this respect, then, one can consider the ICE-SA corpus to be sufficiently representative of SAE, taking into account its current size.

Table 2. Statistical composition of ICE-SA

                                      ICE-SA Overall
  tokens (running words) in text      407 254
  tokens used for word list           404 285
  types (distinct words)              19 240
  type/token ratio (TTR)              5

Unfortunately, the pre-final state of ICE-SA hinders comprehensive corpus analysis in one respect: a portion of the transcriptions that comprise the corpus, specifically certain transcriptions of face-to-face conversations, lacks mark-up.
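As a check on the figures in Table 2, the type/token ratio can be recomputed directly from the type and token counts. The sketch below assumes the usual definition (types divided by tokens, times 100) and assumes that the ratio is based on the "tokens used for word list" row rather than on the running-word count; the function name `type_token_ratio` is hypothetical.

```python
def type_token_ratio(types, tokens):
    """Types divided by tokens, times 100 (assumed definition of TTR)."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return types / tokens * 100


# Counts taken from Table 2 (ICE-SA, spoken component).
TYPES = 19_240
TOKENS_FOR_WORD_LIST = 404_285  # assumption: this row is the denominator

ttr = type_token_ratio(TYPES, TOKENS_FOR_WORD_LIST)
print(f"TTR = {ttr:.2f}")
```

Under these assumptions the ratio comes to roughly 4.76, which is in line with the value of 4.75 reported in Section 5 and with the rounded value of 5 in Table 2.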
Furthermore, some might see the fact that the corpus is not tagged as a drawback; however, as Hunston (2002:93) points out, “the categories used to annotate a corpus are typically determined before any corpus analysis is carried out, which in turn tends to limit, not the kind of question that can be asked, but the kind of question that usually is asked”. As the present study is corpus-driven, pre-tagged text is not required; rather, the raw text is examined directly and, as Sinclair (2004:191) notes, “patterns of this uncontaminated text are able to be observed”.

One final problematic aspect of the spoken component of ICE-SA is the apparent lack of comprehensive metadata for all the texts included in the corpus. While Jeffery (2003:343) states that, for example, “each speaker’s population group is identified in the header”, identifying metadata – including speakers’ sociological and linguistic background – is not consistently indicated across all the texts included in the corpus.

5. Methodology

Statistics on the composition of the corpus were determined using the concordance program WordSmith Tools (Scott 2012). As can be seen in Table 2, the total number of words for the spoken component of ICE-SA is approximately 400 000 words, with an overall type/token ratio² (TTR) of 4.75. An initial search was undertaken to determine the prevalence of various pragmatic markers, specifically, anyway, but, I mean, ja, just, like, no, now, oh, okay/ok, right, shame, so, well and you know. Figure 2 presents a screenshot of the WordSmith concordances of shame in the ICE-SA corpus as an illustration of the results of such a search.

Figure 2. WordSmith Tools screenshot

The choice of these specific markers was determined by various factors. Firstly, we considered the literature to determine which specific PMs had been examined as particularly representative of culture or group.
Secondly, we considered Fraser’s (1996) categorisation of PMs when looking at representative PMs of different categories. Thirdly, we used our own intuitions about which PMs are likely to be unique to SAE. As one of the characteristics of PMs is “multi-categoriality” (Schourup 1999:234), it was essential to determine which of the instances in the search results were non-PMs, and exclude them from the analysis. Given that the scope of the study from which this article developed was limited, we restricted our subsequent investigation to three PMs, namely okay, anyway and shame, based on their prevalence in the corpus³, and, in the case of shame, on its uniquely South African nature.

² Number of Types divided by Number of Tokens times 100.

Once the concordance list of total occurrences for each word had been obtained, the lists were examined, line by line, and all the instances of PMs were selected. Figure 3 graphically represents, for each corpus, the total number of instances of each word found versus the number of instances of that word as a PM. Interestingly, while PM uses of okay and shame make up 90% and 95% of the total number of occurrences of these elements, respectively, anyway occurs as a PM only 55% of the time.

[Figure 3: bar chart comparing total occurrences and PM occurrences for okay/ok, anyway and shame]

Figure 3. The total and PM occurrences of okay, anyway and shame in each corpus

6. Data analysis and discussion

In this section, we discuss the various occurrences of the three PMs, characterising their distribution and identifying and illustrating the primary functions that these PMs perform in the ICE-SA corpus, as representative of educated SAE.

6.1 Okay

The PM okay (and its alternate OK/ok) is the most frequent of the three PMs. The PM okay occurs in various utterance positions in the ICE-SA corpus.
Approximately 40% of the instances of okay occur in utterance-initial position or as the only element in an utterance. The second-most prevalent position for okay is in utterance-final position, followed by its occurrence in utterance-medial position. However, given the nature of transcribed speech, with its lack of prosodic indications, it is possible that a more accurate analysis of some instances of utterance-medial okay would be as utterance-initial or -final. For a small number of occurrences of okay, it is not possible to determine what positions they occupy, although in all such cases, okay occupies an utterance-peripheral position. Figure 4 presents a graphical representation of the number of times okay as a PM occurs in each utterance position in the ICE-SA corpus.

³ Results of less than 500 concordance lines.

[Figure 4: pie chart of PM okay by utterance position. Legend: occurs in isolation; occurs in utterance-initial position; occurs in utterance-medial position; occurs in utterance-final position; unable to determine. Segment values: 11, 33, 89, 108, 86.]

Figure 4. Utterance position of PM okay in ICE-SA

Examples (1)-(3)⁴ illustrate the occurrence of okay in the three utterance-related positions, respectively.

(1) <$A> Okay As I mentioned in the beginning uhm as a scientist (SAE, s2a-027)

(2) … Art is essentially mysterious okay and truth has to be comprehensible otherwise it's not truth … (SAE, s1b-003)

(3) <$H> … You mustn't take it any more OK <$K> The doctor said I must take it (SAE, s1a-083)

Gaines (2011:3292) notes that various studies of the PM okay have shown that it performs “an almost bewildering array of functions”. Some of these functions observed in the corpus will be discussed and illustrated, after which an analysis of the distribution of this element in the corpus will be provided in order to highlight some interesting aspects of this PM.

The PM okay is able to serve several functions in the utterance-initial position.
One function of okay in this position is to draw attention to the speaker, as illustrated in (4). In terms of this function, okay plays an important role in the indication of turn-taking. This function of okay is a way for the speaker to acknowledge their turn and to prepare to speak.

(4) <$C>I can't say that <$A><#>okay that's uh uh now I want to ask you why did didn't you stop the vehicle you were just nine metres behind the vehicle (SAE, s1b-066)

Another function of the PM okay is that of introducing a new topic. As with the turn-taking use of okay illustrated in (4), in cases such as that illustrated in (5), okay functions as a parallel marker (Fraser 1996:168). Specifically, okay is used as a conversation-management marker, a subtype of parallel markers (Fraser 1996:168), as the speaker uses it to steer the conversation towards a forgotten or unrelated topic.

⁴ The examples have been presented in this article as they appear in ICE-SA.

(5) … and the high density plastic Both of them are recyclable okay the question is what happens to the stuff when once we collect (SAE, s2a-027)

When okay appears as a PM in the utterance-final position, it serves one of two functions. The first function, as with okay in utterance-initial position, has to do with turn-taking. Okay acts as an indicator that the speaker has finished speaking, and that it is now the other individual’s turn to start talking. As was mentioned before, the next speaker will often start their turn with the PM okay to reinforce the turn transition. Beach (1993:341) refers to this function as a “projection device for turn and, at times, speaker transition”.
The second, and simplest, reason for okay appearing in the utterance-final position is that the speaker is giving others the option of asking for clarification of what they have just said, a so-called “tag-positioned comprehension check” (Broderick and Broderick 2003, cited in Gaines 2011:3292). As indicated in Table 1, a portion of the corpus is made up of classroom interactions and unscripted speeches such as those found in the lecture hall. In such educational situations, the educator is often seen to end an utterance with okay. Here, okay functions as an informal way of asking whether the students have understood what has been said, and whether they are ready to move on to the following aspect of the topic. This is illustrated by the utterance in (6).

(6) the history of or the narrative of spirit on the way to truth <,>okay That’s not a problem for him (SAE, s1b-003)

The final position in which okay appears is the utterance-medial position. Okay occurs in this position for several reasons. A primary reason is that the speaker needs to pause in order to collect their thoughts, but does not want the pause to be silent, as demonstrated in (7). In these instances it performs a gap-filling function.

(7) <$A> Then the only thing that I want OK is just an explanation from you (SAE, s1b-004)

In some cases, okay appears in the utterance-medial position but acts as if it were in the utterance-initial position. This occurs when the speaker is reporting speech. Often a speaker starts reported speech in exactly the same way in which the speech was given, starting with the PM okay, as we see in (8).

(8) What about if we collect the stuff and we say to people OK I'll give you the bread but in exchange I want one bag of plastic (SAE, s2a-027)

The examples above illustrate that okay as a PM appears most prominently as a conversation-management marker (Fraser 1996:185), as it is used to control the flow of the conversation.
In such control, okay is often used to take the floor or to introduce reported speech, thought processes or an offer. In terms of the distribution of okay in the corpus, there are some interesting observations to be made. Firstly, the highest number of occurrences of okay for one single text was found in an