Research Article
https://doi.org/10.12973/ijem.8.1.55

International Journal of Educational Methodology
Volume 8, Issue 1, 55 - 68.
ISSN: 2469-9632
https://www.ijem.com/

Preliminary Indicators of EFL Essay Writing for Teachers’ Feedback Using Automatic Text Analysis

Rong Phoophuangpairoj, Rangsit University, THAILAND
Piyarat Pipattarasakul*, Rajamangala University of Technology Krungthep, THAILAND

Received: October 19, 2021 ▪ Revised: December 2, 2021 ▪ Accepted: January 31, 2022

Abstract: During the coronavirus disease 2019 (COVID-19) pandemic, English as a foreign language (EFL) students have had to study and submit their assignments and quizzes through online systems as electronic files instead of hard copies. This has created an opportunity for teachers to use computer tools to conduct a preliminary assessment of the students’ writing performance and then give them timely advice. Hence, this paper proposed several indicators, namely essay readability scored by Flesch Reading Ease (FRE), essay length and writing errors, together with a method to assist teachers in providing writing feedback to students. The results showed large differences in FRE and in the number of words, sentences, paragraphs and errors. K-means clustering was applied to classify students into groups based on the writing proficiency indicators. The findings also revealed that the number of words and sentences in the essays indicated some deficiencies. The concept of the paragraph should be reinforced, while specific errors such as misspelling, grammatical and typographical errors need to be eliminated. This study showed that computer tools should be integrated to process students’ essays so that teachers can pinpoint problems and make suggestions to their students in a timely manner. Lastly, the results can serve as guidelines for teachers to develop and adjust teaching materials pertinent to writing in order to enhance the writing performance of EFL learners.
Keywords: Clustering, EFL, essay writing feedback, readability, writing errors.

To cite this article: Phoophuangpairoj, R., & Pipattarasakul, P. (2022). Preliminary indicators of EFL essay writing for teachers’ feedback using automatic text analysis. International Journal of Educational Methodology, 8(1), 55-68. https://doi.org/10.12973/ijem.8.1.55

Introduction

Writing is an essential communication skill that helps English as a foreign language (EFL) learners attain professional achievement in education and business, as written communication via email and other communication technologies has come to play a crucial role in international workplaces (Kongkaew & Cedar, 2018; Rattanadilok Na Phuket, 2015; Watcharapunyawong & Usaha, 2013). To master writing, learners need to possess a variety of writing competences such as vocabulary knowledge, grammatical structures, organization, punctuation and spelling (Anh, 2019; Kampookaew, 2020). As a result, writing is regarded as the hardest skill for EFL learners to acquire, and writing teachers need to spend a substantial amount of time and energy to enhance the writing quality and proficiency of their students (Watcharapunyawong & Usaha, 2013). Therefore, teachers are the key mentors and facilitators who analyze the students’ weaknesses so as to minimize errors and strengthen their writing skills. Kamberi (2013) concluded that, from the students’ perspective, teacher feedback is an important tool in improving writing skills. Many English language learners preferred feedback that was informative on content and provided metalinguistic explanations for orthographic and grammatical errors (Bastola & Hu, 2021; Zhang et al., 2021). However, inadequate and delayed feedback might hinder the development of the learners’ writing performance (Kim, 2018; Sepasdar & Kafipour, 2019).
Numerous studies on writing errors have been conducted in the Thai context, revealing that grammatical accuracy is the area Thai EFL teachers need to emphasize most. Grammatical competence can be defined as the ability to produce correct sentences in writing assignments (Kampookaew, 2020). EFL learners still have limited grammatical competence in the sense that they can compose short simple sentences, but not complex ones (Cahyono et al., 2016). Thus, Wali and Madani (2020) asserted that EFL learners practiced paragraph writing as the foundation for developing other forms of academic writing.

* Corresponding author: Piyarat Pipattarasakul, Faculty of Liberal Arts, Rajamangala University of Technology Krungthep, Bangkok, Thailand.
© 2022 The Author(s). Open Access - This article is under the CC BY license (https://creativecommons.org/licenses/by/4.0/).

Nowadays, computers play a major role in language education, as technology can perform many tasks, such as searching and computing, faster than humans. During the global pandemic, online education replaced face-to-face classes and facilitated interaction between students and teachers (Rapanta et al., 2020). This has turned the crisis into an opportunity in the sense that digital technologies overcome some teaching and learning restrictions by removing physical and contextual limitations. In particular, Alavi (2021) posited that writing courses greatly benefit from online classes because learners can post their work through online platforms. The online teaching context allows teachers to provide feedback to students in the form of returned assignments or exam answers as files. Several computer tools can be used to identify writing errors and compute the readability of an essay.
The readability level of a passage is measured by the average number of syllables per word together with the average number of words per sentence (Zamanian & Heydari, 2012). A high readability level, or high FRE, indicates that the students composed essays comprising many short words with a low number of words per sentence. This might imply that the students have insufficient grammar knowledge. When readability is low, readers need to spend significant effort to untangle overly complex sentence structures and vocabulary.

Automatic Readability Tool for English (ARTE) is a program that can automatically calculate a variety of readability formulas for texts (Choi & Crossley, 2020). Grammar and Mechanics Error Tool (GAMET) is a computer program that aids in identifying different types of errors such as duplication, grammar, misspelling, typography and white space errors. GAMET can check hundreds of English essays at the same time and provide the error-checking results for each plain text file. Tool for the Automatic Analysis of Cohesion (TAACO) is a tool developed to compute text cohesion indices (Crossley et al., 2015). TAACO is used to find the number of words, sentences and paragraphs in each text file. ARTE, GAMET and TAACO have advantages over other programs because they can be easily installed on a local hard drive and can process many text files simultaneously without internet access.

A clustering algorithm is used to automatically classify data into groups. K-means clustering groups data into K groups by finding the means (centroids) of the K clusters to which the data points are allocated. Each data point is allocated to the cluster with the nearest centroid, so that the distances between the data points and the K centroids are minimized. The centroid of each cluster is then recomputed from the data in that cluster.
K-means clustering is an iterative process of computing the centroids of the clusters and finding the members of each cluster, keeping the clusters as compact as possible, until the data points no longer change clusters (Shovon & Haque, 2012). The centroid, or mean, of each group is used as the representative of that group’s data. Students can be divided into groups based on readability grade levels, readability scores, the number of errors in an essay, etc. Teachers can classify students into groups according to the attributes of the students’ essays that interest them. Using a clustering technique such as K-means, the teachers do not have to sort the data and classify them into groups by themselves.

Hence, this work proposed a method to provide rapid writing feedback using automatic text analysis. Computer tools were used to find indicators consisting of readability scores, the number of words, sentences and paragraphs, and writing errors, which the teachers can use to make decisions on giving feedback to the students, contributing to enhanced writing performance. In this work, K-means clustering was studied to group the essays of EFL students based on the information obtained from the computer tools. The teachers can use these indicators as guidelines to give rapid feedback to their students.

Research objectives

The objectives of this study were:
(1) To study the writing indicators consisting of the readability of essays and the number of words, sentences, paragraphs and errors in the essays written by Thai EFL students
(2) To propose a method of combining a clustering technique with the indicators to provide feedback

Literature Review

Feedback is indispensable for improving the students’ writing ability and is widely practiced in EFL writing courses all over the world (Lv et al., 2021). The methods of giving feedback have been studied by several researchers. Hartshorn (2008) postulated that feedback must be manageable, timely, and meaningful.
Providing feedback requires tremendous and continuous effort from both teachers and students. The teachers need to dedicate their time and energy to correcting piles of papers. Meanwhile, they also need to keep track of the students’ progress, decide which points to reinforce and then make meaningful remarks individually. The students, for their part, should be committed to producing their pieces of work (Alvira, 2016).

For years, the coronavirus disease has gravely threatened people worldwide. Most educational institutes have shifted from classroom-based teaching to online learning. About 1.2 billion children in 186 countries were reportedly affected by the school closures (Ho & Tai, 2021). This has caused immense learning losses in education. In connection with writing courses, Yang (2016) suggested that online peer feedback could improve the writing quality of the students (Huisman et al., 2019; Noroozi & Hatami, 2019; Pham et al., 2020).

Writing research has shifted its focus from the writing process to the relationship between the characteristics of the final text and quality indicators (Crossley, 2020). This is partly because writing processes and writing strategies might be implicit, whereas text characteristics are concrete evidence that can be easily and promptly identified, contributing to enhanced writing performance. With regard to errors, researchers have investigated errors in writing because the results could be used to analyze their causes, which are important attributes reflecting the students’ competence (Khumphee & Yodkamlue, 2017; Kuptanaroaj, 2019; Rattanadilok Na Phuket, 2015; Waelateh et al., 2019). Consequently, these errors should be minimized. A wide range of research on the types of errors made by EFL students has covered grammatical and lexical errors.
For example, Kuptanaroaj (2019) analyzed grammatical errors in English essays written by undergraduate students. The results revealed that sentence fragments were the most frequently committed errors (218 errors or 20.57%). The second most frequent errors concerned subject-verb agreement (168 errors or 15.85%), followed by noun endings (144 errors or 13.58%), while the type of grammatical error with the lowest frequency of occurrence was word order (16 errors or 1.51%). Moreover, Khumphee and Yodkamlue (2017) investigated common types of grammatical errors based on their frequency of occurrence in the English essay writing of Thai EFL undergraduate students. Suvarnamani (2017) studied grammatical and lexical errors, particularly tense, fragment and collocation errors, found in the descriptive writing of first-year Arts students at Silpakorn University. Narrative essays composed by Thai university students were analyzed by Rattanadilok Na Phuket (2015), who reported that the students most frequently made errors arising from translation from Thai words. Other errors were related to word choice, verb tense, prepositions and commas.

Readability refers to the level of comprehensibility of texts, which depends on the number of words in a sentence and syllables in a word (Turkben, 2019). The FRE grades a text on a 0-100 scale (Flinton et al., 2018). Higher readability means that an essay is easier to read. For example, FRE readability scores of 0-50 mean that the text is very difficult and that its approximate reading grade level is higher education. Yulianto (2019) stated that the best text contains shorter sentences and words, and that a score in the range of 60-70 is widely acceptable. Reading difficulty stems from longer sentences and polysyllabic words.
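To make the dependence on sentence length and word length concrete, the following minimal sketch applies the standard published Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The function name and the example counts are hypothetical, and the sketch assumes the counts are already available; tools such as ARTE may count syllables and sentences differently, so their scores need not match it exactly.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease score computed from raw counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    # Longer sentences and longer words both lower the score.
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts for one essay (not taken from the study's data).
score = flesch_reading_ease(total_words=250, total_sentences=17, total_syllables=350)
print(round(score, 2))
```

Because both terms are subtracted, an essay made of short sentences and short words scores high, which is why a very high FRE is treated in this study as a possible sign of limited vocabulary and sentence structure.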
Readability indices such as Flesch-Kincaid Reading Ease and Flesch-Kincaid Grade Level were used to investigate the writing tasks of students registered in ENG214 English Writing, and a high average Flesch-Kincaid Reading Ease value of 83.2 was reported (Rodsawang, 2017).

In general, existing automated essay scoring systems incorporate the length of essays as a quality indicator. It was assumed that longer essays tended to have better quality than short ones (Lim et al., 2021). In contrast, essays with a low number of words, sentences and paragraphs may be considered ineffective, resulting from the student’s insufficient lexical and grammatical knowledge.

Aside from the pertinent factors previously mentioned, Macedo et al. (2018) applied the K-means algorithm to classify groups of distance learning students based on the grammar errors they made when studying online. Another relevant study conducted by Wright et al. (2018) used a K-means 3-cluster procedure to categorize students while they practiced English conversation online.

Methodology

Research Design

To answer the research questions, ARTE was used to find the FRE readability index while GAMET was utilized to identify the errors in the essays. TAACO found the number of words, sentences and paragraphs in each essay. Several classic readability formulas, such as Flesch Reading Ease, Flesch-Kincaid Grade Level, Automated Readability Index and SMOG, can be easily derived using ARTE to provide preliminary feedback. In this study, FRE was applied because it is integrated in widely used writing software such as Grammarly and is widely accepted. Each tool was executed to extract the characteristics of the essays, whose text files were all stored in a computer folder.
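As a rough illustration of batch processing a folder of essay files, the sketch below counts words, sentences and paragraphs with regular expressions. It is a stand-in written for this illustration and not the procedure implemented in TAACO; the function names `basic_counts` and `counts_for_folder`, the `.txt` extension and the rule that a blank line separates paragraphs are all assumptions.

```python
import re
from pathlib import Path


def basic_counts(text):
    """Count words, sentences and paragraphs in one essay (simple heuristics)."""
    # Assumption: paragraphs are separated by at least one blank line.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Assumption: a word is a run of letters, digits or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    return {
        "words": len(words),
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
    }


def counts_for_folder(folder):
    """Apply basic_counts to every .txt file in a folder, keyed by file name."""
    return {
        path.name: basic_counts(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.txt"))
    }


# Hypothetical two-paragraph essay used only to show the output shape.
sample = "I stay home. I study online!\n\nLife changed a lot."
print(basic_counts(sample))
```

Such counts are only approximations: abbreviations, missing spaces after full stops and single-newline paragraph breaks would all be miscounted, which is one reason dedicated tools are preferable for the actual analysis.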
Data Collection

A total of 96 English essays were collected from Thai English major undergraduate students enrolled in an Essay Writing course. In the writing quiz, the students were instructed to write a narrative essay on the topic of “New normal in your life during the global pandemic COVID-19” with a length of 3-5 paragraphs. The quiz was conducted online and the duration was three hours. The text files were submitted to the teachers. Before the texts were processed with the computer tools, the students’ names, student IDs and the essay topic were removed.

Instruments

ARTE, GAMET and TAACO were applied to analyze the students’ essays. These three programs were free computer tools that could be easily downloaded and installed on a computer, while SPSS (Rangsit University License) was used to cluster the essays into groups.

Data Analysis

The results were analyzed and K-means clustering was applied to group the essays into 3-5 groups based on the FRE, the number of words and the number of errors. The clustering method assisted the teachers in grouping the students based on their proficiency, enabling them to make decisions more effectively on providing advice and eventually adjusting the course to suit the students’ competence.

Findings / Results

Descriptive Statistics of the Preliminary Indicators

Table 1 showed the number of essays, ranges, minimum values, maximum values, means and standard deviations of the preliminary indicators studied in this work. The range of each indicator showed the heterogeneity of the students’ skills. As a result, the teachers could use these indicators to identify the students’ shortcomings and provide them with timely feedback.

Table 1. Descriptive Statistics of the Preliminary Indicators

| Indicator | Range | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| Flesch Reading Ease | 37.31 | 47.70 | 85.01 | 69.83 | 8.44 |
| Number of words | 427 | 49 | 476 | 247.27 | 80.35 |
| Number of sentences | 27 | 3 | 30 | 16.70 | 5.56 |
| Number of paragraphs | 9 | 1 | 10 | 4.53 | 1.58 |
| Number of errors | 16 | 0 | 16 | 3.99 | 3.32 |
| Number of grammatical errors | 10 | 0 | 10 | .97 | 1.46 |
| Number of misspelling errors | 8 | 0 | 8 | 1.59 | 1.76 |
| Number of typographical errors | 11 | 0 | 11 | 1.24 | 1.91 |
| Number of duplication errors | 1 | 0 | 1 | .02 | .14 |
| Number of white space errors | 2 | 0 | 2 | .17 | .50 |

Flesch Reading Ease

Figure 1 showed the readability index of the essays composed by the students, which could help the teachers to know which essays obtained low or high FRE readability scores. These scores enabled the teachers to decide which essays needed their vocabulary usage and sentence structures investigated, in particular the ones with an FRE higher than 80. If the students wrote essays with very high readability, it was assumed that they were unable to use advanced vocabulary and construct complex sentence structures.

Figure 1. FRE of Each Essay (x-axis: essay no.; y-axis: FRE)

Number of Words, Sentences and Paragraphs in Each Essay

Figures 2-4 illustrated the number of words, sentences and paragraphs in each essay. The results indicated that the number of words and sentences varied greatly among the essays. Figures 2-3 showed that one student wrote a short essay consisting of a low number of words and sentences (49 words and 3 sentences), so the teachers should pay more attention to this student.

Figure 2. The Number of Words in Each Essay (x-axis: essay no.; y-axis: number of words)

Figure 3. The Number of Sentences in Each Essay (x-axis: essay no.; y-axis: number of sentences)

Figure 4 presented the number of paragraphs in each essay.
The graph indicated that 5 students wrote essays with 8 or 10 paragraphs and 4 students wrote essays with 7 paragraphs. Three students composed essays with only one paragraph. According to the test instructions, the students were asked to compose an essay consisting of 3-5 paragraphs. Writing too many or too few paragraphs may indicate that the students do not understand how to combine sentences into a paragraph or that they divide sentences into too many paragraphs. Therefore, it is necessary for the teachers to give feedback to their students to improve their performance.

Figure 4. The Number of Paragraphs in Each Essay (x-axis: essay no.; y-axis: number of paragraphs)

Errors in Essays

Figures 5-10 exhibited the number of errors, including the types of errors, derived from GAMET. The results allowed the teachers to identify the students who committed a high number of errors and then focus on the grammatical, misspelling and typographical errors to pinpoint the students’ weaknesses. Examples of the grammatical errors were shown in Table 2.

Figure 5. The Number of Errors in Each Essay (x-axis: essay no.; y-axis: number of errors)

Figure 6. The Number of Grammatical Errors in Each Essay (x-axis: essay no.; y-axis: number of grammatical errors)

Table 2. Examples of the Grammatical Errors

| No. | Sentence | Error (reported from GAMET) |
|---|---|---|
| 1 | Since the outbreak of COVID-19 Everything in the world has changed. | …Maybe a comma, question or exclamation mark is missing, or the sentence is incomplete and should be joined with the following sentence. |
| 2 | Sometimes, we could misunderstanding and confused of online chatting. | The verb 'could' requires the base form of the verb: 'misunderstand' |
| 3 | Both in terms of life, society, economy, education, as well as national security that has changed too much to return to normal as before. | Probable usage error. Use 'and' after 'both'. |
| 4 | First in life making my life more difficult. Because the disease can spread easily. | Maybe a comma, question or exclamation mark is missing, or the sentence is incomplete and should be joined with the following sentence. |
| 5 | I want my normal life back anyway because I thinking of my friends. | Did you mean 'I am'? |
| 6 | Next, my mind and my feeling is bad too because I need to stay at home and I can't go outside to meet a friends or anyone as much as I want that make me feel sad and upset. | Don't use indefinite articles with plural words. Did you mean 'a friend' or simply 'friends'? |
| 7 | In conclusion, before pandemic is the best things in nowadays everyone need it to return our normal life. | nowadays is used without 'in'. Use simply: 'nowadays'. |
| 8 | … you have get less income because COVID makes people don’t come outside just stay at home … | Use past participle here: 'got', 'gotten'. |

Figure 7. The Number of Misspelling Errors in Each Essay (x-axis: essay no.; y-axis: number of misspelling errors)

Examples of the misspelling errors were shown in Table 3.

Table 3. Examples of the Misspelling Errors

| No. | Sentence | Error (reported from GAMET) |
|---|---|---|
| 1 | I thing the government should be manage this situation better. | Did you mean 'think' or 'thinks'? |
| 2 | For example, studying online is much different than going to university | Did you mean 'different 'from''? 'Different than' is often considered colloquial style. |
| 3 | It’s very good weather and I can breath comfortable and I feel fresh. | Did you mean 'breathe'? 'breath' is a noun. |
| 4 | I was really laughing at myself but I kept doing this everyday until it became my new routine. | 'Everyday' is an adjective. Did you mean 'every day'? |
| 5 | Everyone have to stay at home in stead of using vacation time somewhere beautiful. | Did you mean 'instead of'? |
| 6 | Currently, Thailand has a epidemic and the consequence is that there are not enough hospitals for people infected with COVID-19 … | Use 'an' instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour' |
| 7 | I can't go outside by not wearing a mask and carry a alcohol spray and need social distancing. | Use 'an' instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour' |
| 8 | I rarely go out with my friends and i can't study at university. | Did you mean 'I'? |

Figure 8. The Number of Typographical Errors in Each Essay (x-axis: essay no.; y-axis: number of typographical errors)

According to the findings, the students’ main typographical problems involved capitalization and punctuation marks such as periods, commas and quotation marks, as shown in Table 4.

Table 4. Examples of the Typographical Errors

| No. | Sentence | Error (reported from GAMET) |
|---|---|---|
| 1 | I have to eat the same food every day. such as fried eggs or instant noodles because I can't go out. | This sentence does not start with an uppercase letter |
| 2 | We can't go back to the old way of life, like eating at a dining table together. living a life called social distancing. | This sentence does not start with an uppercase letter |
| 3 | I never thought there was a disease that could change the lives of so many people. until it intensified … | This sentence does not start with an uppercase letter |
| 4 | Especially the trading of parcels and food delivery service or some restaurants have to screen every table to prevent the spread. and keep the distance of sitting between the dining table ... | This sentence does not start with an uppercase letter |
| 5 | The daily lives of many people have to change in order to survive in age no matter what they do, they have to be very careful. due to COVID-19… | This sentence does not start with an uppercase letter |
| 6 | Today I'm going to talk about the New normal in my life during the global pandemic COVID-19.” | Unpaired symbol: '"' seems to be missing |
| 7 | The situation is not good right now. Finally he went out with his friends. | Did you forget a comma after a conjunctive/linking adverb? |
| 8 | Therefore. I want COVID-19 to disappear so that no family would lose their family and return to normal life. | Did you forget a comma after a conjunctive/linking adverb? |

The results in Figures 9 and 10 suggested that very few students made duplication and white space errors.

Figure 9. The Number of Duplication Errors in Each Essay (x-axis: essay no.; y-axis: number of duplication errors)

To eliminate the duplication errors, the teachers could remind the students to reread their work before submission. Examples of the duplication errors were shown in Table 5.

Table 5. Examples of the Duplication Errors

| No. | Sentence | Error (reported from GAMET) |
|---|---|---|
| 1 | For example, in the past, I could go to the the beach, ate some seafood, and spent my life to the fullest. | Possible typo: you repeated a word |
| 2 | … learn to safe myself and learn to adapted of the the pandemic. | Possible typo: you repeated a word |

Figure 10. The Number of White Space Errors in Each Essay (x-axis: essay no.; y-axis: number of white space errors)

As with the duplication errors, the teachers could also mark the white space errors and present them to the students so that they would be more careful when using punctuation such as commas, spaces and periods. Examples of the white space errors were shown in Table 6.

Table 6. Examples of the White Space Errors

| No. | Sentence | Error (reported from GAMET) |
|---|---|---|
| 1 | …there was an epidemic ,it's COVID19.I followed the news every day during that time because I had never been born with an epidemic. | Put a space after the comma, but not before the comma |
| 2 | I will be doing new activities in the university and get to know seniors, teacher and new friends.I can do everything without mask. | Add a space between sentences |
| 3 | I go a lot of country such as Lao , Japan , Singapore. | Put a space after the comma, but not before the comma |
| 4 | I said don't to go .The situation is not good right now. | Don't put a space before the full stop |

Based on the findings, misspelling, grammatical and typographical errors were the three most frequent types of errors found in the narrative essays. To solve these problems, it is suggested that the teachers add content to the materials to reinforce the knowledge of capitalization and punctuation, such as commas, possessive apostrophes, spaces and full stops.

Clustering the Students’ Essays

Based on the derived results, the essays were divided into 3-5 groups, classified according to the FRE and the number of words, sentences, paragraphs and errors. The clustering outputs allowed the teacher to decide which groups, and how many students in each group, needed more advice.

Clustering the Essays Based on FRE

The results of the essay clustering based on FRE were shown in Table 7. When dividing the essays into 5 groups, the means of the clusters were 51.25, 58.71, 66.13, 72.34 and 80.51, respectively.
The number of essays in the first through fifth clusters was 6, 9, 29, 29 and 23, respectively. As a rule, a score between 81 and 90 indicates that an essay is easy to read, equivalent to the level of US grade 6 students (Pranay & MacDermid, 2017). The higher the readability score, the easier the text is. Therefore, the obtained results guided the teachers to pay attention to the fifth group, which had a high mean readability of 80.51. It was speculated that these students might lack vocabulary knowledge and command of sentence structures.

Table 7. Results of Clustering the Essays Based on the FRE

| Number of groups | Indicator | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|
| 3 groups | FRE (Means) | 55.73 | 68.45 | 79.22 | | |
| 3 groups | The number of essays | 15 | 51 | 30 | | |
| 4 groups | FRE (Means) | 54.44 | 65.18 | 71.91 | 80.51 | |
| 4 groups | The number of essays | 12 | 28 | 33 | 23 | |
| 5 groups | FRE (Means) | 51.25 | 58.71 | 66.13 | 72.34 | 80.51 |
| 5 groups | The number of essays | 6 | 9 | 29 | 29 | 23 |

Clustering Essays Based on the Number of Words

The results of clustering the essays based on the number of words were shown in Table 8.

Table 8. Results of Clustering the Essays Based on the Number of Words

| Number of groups | Indicator | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|
| 3 groups | The number of words (Means) | 166 | 233 | 338 | | |
| 3 groups | The number of essays | 33 | 29 | 34 | | |
| 4 groups | The number of words (Means) | 153 | 219 | 317 | 424 | |
| 4 groups | The number of essays | 22 | 38 | 30 | 6 | |
| 5 groups | The number of words (Means) | 49 | 193 | 294 | 348 | 443 |
| 5 groups | The number of essays | 1 | 55 | 23 | 13 | 4 |

According to the results, when dividing the essays into 4 groups, the teachers may investigate the short essays in the first group, which were written by 22 students and had an average of 153 words. When dividing the essays into 5 groups, the teachers need to pay attention to the shortest essay, consisting of only 49 words, in the first group.
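Clustering on a single indicator, as in Tables 7 and 8, can be sketched with a small one-dimensional K-means routine. The code below is an illustrative plain-Python version of the assign-and-recompute loop, not the SPSS procedure used in this study: the function name `kmeans_1d`, the evenly spread initial centroids and the example word counts are all assumptions, and a different initialization can yield different groups.

```python
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Cluster numbers into k groups; returns (centroids, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Assumption: start from centroids spread evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each value joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # no value changed cluster, so the process has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return centroids, labels


# Hypothetical word counts for ten essays (not the study's data).
word_counts = [49, 150, 160, 170, 230, 240, 250, 330, 340, 350]
centroids, labels = kmeans_1d(word_counts, k=3)
for c, centre in enumerate(centroids):
    print(f"Group {c + 1}: mean = {centre}, essays = {labels.count(c)}")
```

The group means and group sizes printed by such a routine correspond to the two rows reported for each number of groups in the tables, which is the information a teacher would use to decide which group to examine first.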
Clustering Essays Based on the Number of Sentences

The results of essay clustering based on the number of sentences were shown in Table 9.

Table 9. Results of Clustering the Essays Based on the Number of Sentences

| Number of groups | Indicator | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|
| 3 groups | The number of sentences (Means) | 10 | 16 | 23 | | |
| 3 groups | The number of essays | 25 | 41 | 30 | | |
| 4 groups | The number of sentences (Means) | 10 | 14 | 20 | 26 | |
| 4 groups | The number of essays | 20 | 28 | 37 | 11 | |
| 5 groups | The number of sentences (Means) | 3 | 12 | 18 | 23 | 28 |
| 5 groups | The number of essays | 1 | 38 | 31 | 21 | 5 |

When dividing the essays into 5 groups, the results suggested that the teachers promptly examine the short essay containing 3 sentences and find out the causes. In addition, when dividing the essays into 4 groups, the essays in the first group, composed by 20 students and averaging 10 sentences, also needed investigation.

Clustering Essays Based on the Number of Paragraphs

The results of clustering the essays based on the number of paragraphs were shown in Table 10.

Table 10. Results of Clustering the Essays Based on the Number of Paragraphs

| Number of groups | Indicator | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|
| 3 groups | The number of paragraphs (Means) | 3 | 5 | 8 | | |
| 3 groups | The number of essays | 20 | 67 | 9 | | |
| 4 groups | The number of paragraphs (Means) | 2 | 4 | 7 | 10 | |
| 4 groups | The number of essays | 6 | 75 | 13 | 2 | |
| 5 groups | The number of paragraphs (Means) | 2 | 4 | 6 | 7 | 10 |
| 5 groups | The number of essays | 6 | 75 | 6 | 7 | 2 |

The concept of the paragraph is indispensable in composing writing tasks. Some essays contained more or fewer than the 3-5 paragraphs required by the instructions. For example, when dividing the essays into 5 groups, the 6 essays in the first group consisted of 2 paragraphs, while 7 essays containing 7 paragraphs belonged to the fourth group and 2 essays with 10 paragraphs belonged to the fifth group. The teachers might look into the details of the contents to check the students’ understanding of the concept of the paragraph, because a paragraph must present at least one central topic.
Clustering Essays Based on the Number of Errors

Table 11 showed the results of clustering essays based on the number of errors.

Table 11. Results of Clustering the Essays Based on the Number of Errors

| Number of groups | Indicator | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|---|
| 3 groups | The number of errors (Means) | 1 | 6 | 16 | | |
| 3 groups | The number of essays | 49 | 45 | 2 | | |
| 4 groups | The number of errors (Means) | 1 | 4 | 8 | 16 | |
| 4 groups | The number of essays | 40 | 37 | 17 | 2 | |
| 5 groups | The number of errors (Means) | 1 | 4 | 7 | 10 | 16 |
| 5 groups | The number of essays | 40 | 29 | 18 | 7 | 2 |

In relation to the errors, it is recommended that the teachers use their own discretion in probing the errors, depending on how high the number of errors found is. If the essays were classified into 3 groups, the teachers might focus