- Background of ELT Curriculum in Iran
- Corpus Linguistics: A Breakthrough
- Vocabulary Presentation in ELT Textbooks
- Literature Review Related to Textbook Analysis
A CORPUS-BASED EVALUATION OF VOCABULARY PRESENTATION IN SELECTED PRESCRIBED IRANIAN ENGLISH TEXTBOOKS
by
Reza Gholami
Members of the Supervisory Committee
Dr. Nooreen binti Noordin,
Prof. Dr. Shameem Rafik-Galea
Dr. Habsah Binti Hussin
Statement of the Problem
Background
Objectives
Research Questions
This study aimed to evaluate and examine the appropriateness of vocabulary presentation in the Iranian English textbooks of Junior High school known as the Prospect Series through a corpus-based investigation.
Introduction
• Selecting ELT textbooks for the schools of Iran was formerly unsatisfactory and non-scientific.
• The students’ inability to speak English fluently after graduation is attributed to prescribed textbooks.
• To tackle this, the “Prospect Series” was introduced to develop the students’ oral skills and language use.
• To counter the effects of the former teaching methodology, CLT was adopted by the Prospect Series.
• Formerly, not all four language skills were given emphasis, whereas the Prospect Series is claimed to work on all of these skills.
• With the introduction of the new series, its evaluation becomes important with regard to many topics and issues, among them the new vocabulary, to see whether the textbooks are efficient.
• Yet, to this date there has been no quantitative research elaborating on vocabulary presentation, such as a corpus-based study.
• To do so, various criteria can be considered; this research analyzes word frequency, vocabulary loading within and across the studied textbooks, coverage of the 2000 most frequent words from the GSL and of the AWL words, and the recycling and distribution patterns of the GSL and AWL words within the Prospect Series (Prospects 1-3).
• Vocabulary, as a side skill, provides insightful information about the learners’ performances and abilities.
• The current research is designed to project whether the students studying the Prospect Series could be expected to be efficient users of the language or not.
- The former educational system of Iran lacked efficacy to meet students’ needs, and was stagnant, top-down, and continuously changing
- The MoE announced a fundamental reform in 2010, ratified in late 2012, changing the number of grades and textbooks in schools: primary school (6 grades), junior high school (3 grades), and high school (3 grades).
- As for ELT, the state English textbooks were entirely revised introducing “English for School Series, Prospect” for the junior high school and “Vision Series” for the high school.
- To ensure the quality and efficiency of the newly introduced ELT textbooks in Iran, comprehensive research has to be conducted to assess the success of this reform.
- To fulfill this, the current study adopted a corpus-based method to evaluate vocabulary presentation in the Prospect Series, given the paucity of research to date on vocabulary presentation.
Research Question 1: What are the general characteristics of the frequency of the words contained in the Prospects Series?
- Subquestion 1: How many words are introduced in the Prospects Series?
- Subquestion 2: What are the most frequent words contained in the Prospects Series?
Research Question 2: What are the characteristics of vocabulary loading in the Prospects Series?
- Subquestion 1: What are the characteristics of vocabulary loading of each textbook of the Prospects Series?
- Subquestion 2: What are the general characteristics of vocabulary loading in the lessons in each textbook of the Prospects Series?
Research Question 3: To what extent are the words in the GSL and the AWL covered in the Prospects Series?
- Subquestion 1: To what extent are the words in the GSL covered in the Prospects Series?
- Subquestion 2: To what extent are the words in the AWL covered in the Prospects Series?
- Subquestion 3: Which words in the GSL are missing in the entire set of the Prospects Series?
- Subquestion 4: Which words in the AWL are missing in the entire set of the Prospects Series?
Research Question 4: What are the recycling characteristics of the words from the GSL and the AWL in the Prospects Series?
- Subquestion 1: How often are the words in the GSL being recycled in the Prospects Series?
- Subquestion 2: How often are the words in the AWL being recycled in the Prospects Series?
Research Question 5: How is the distribution of the randomly selected words from the GSL and the AWL within and across the entire set of the Prospects Series?
- Subquestion 1: How often are the randomly selected words from the GSL being distributed within and across the entire set of the Prospects Series?
- Subquestion 2: How often are the randomly selected words from the AWL being distributed within and across the entire set of the Prospects Series?
- To determine the characteristics of the frequency of the words contained in the Iranian Junior High School English Language Textbooks (Prospects 1-3)
- To investigate the characteristics of vocabulary loading within and across the Iranian Junior High School English Language Textbooks (Prospects 1-3).
- To identify the extent to which the words in the GSL and the AWL are covered in the Iranian Junior High School English Language Textbooks (Prospects 1-3).
- To identify the recycling characteristics of the words from the GSL and the AWL in the Iranian Junior High School English Language Textbooks (Prospects 1-3).
- To determine the distribution of the randomly selected words from the GSL and the AWL within and across the entire set of Iranian Junior High School English Language Textbooks (Prospects 1-3).
Literature Review
- The population for the mentioned corpus is defined as the prescribed English language textbooks used by junior high school students all over the country.
- The textbooks studied are those of the first three years of junior high school (Prospect 1, Prospect 2, and Prospect 3), in addition to Vision 1, the only published book available for the next level, the three years of high school.
- All these books are published by Textbook Publishing Company of Iran, Tehran (2016).
- This study employed a Purposive Sampling Method (Mukundan & Aziz, 2009).
- These textbooks are the latest ones, published in 2013, and are still in use. They are all available and accessible online (at: http://www.chap.sch.ir/) in PDF format.
Data Collection Procedure
Purposive Sampling Method
- WordSmith Tools Version 7.0
a corpus-based analysis
descriptive statistics/descriptive analysis
a quantitative approach
- RANGE and Frequency Program
Methodology
- First, all the textbooks were downloaded from the website of Iranian School Books, accessible at http://www.chap.sch.ir.
- Then, the PDF files were converted into text files ready to be used by WordSmith Tools and the RANGE and Frequency Program.
- The lessons of each textbook were collected and saved as separate files to represent the data of that textbook.
- For data analysis, descriptive analyses were obtained.
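The file-preparation steps above can be sketched in code. This is a minimal illustration, not the procedure used in the study (which relied on WordSmith Tools and the RANGE and Frequency Program). It assumes a hypothetical folder layout with one folder per textbook and one already converted `.txt` file per lesson; the folder and function names are invented for the example.

```python
from pathlib import Path


def load_textbook(folder: Path) -> str:
    """Join all lesson .txt files in one textbook folder, in filename order."""
    lessons = sorted(folder.glob("*.txt"))
    return "\n".join(p.read_text(encoding="utf-8") for p in lessons)


def load_series(root: Path) -> dict[str, str]:
    """Map each textbook folder name under `root` to its combined lesson text."""
    return {
        d.name: load_textbook(d)
        for d in sorted(root.iterdir())
        if d.is_dir()
    }


# Hypothetical usage, assuming folders such as corpus/prospect1/lesson01.txt:
# series = load_series(Path("corpus"))
# print({name: len(text) for name, text in series.items()})
```

Keeping one file per lesson makes it possible to compute measures both lesson by lesson and for a whole textbook from the same data.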
(Heatley, Nation & Coxhead, 2002)
(Mukundan & Aziz, 2009; Catalán & Francisco, 2008; Koosha & Akbari, 2010; Sodagar, 2010)
RQ 1
Subquestion 2: What are the most frequent words contained in the Prospects Series?
Subquestion 1: How many words are introduced in the Prospects Series?
RQ1: What are the general characteristics of the frequency of the words contained in the Prospects Series?
Prospect 3: 35 function words vs. 15 content words
Function words Shared in Prospect Series: 20 words
All of the top 50 words in BNC belong to the class of function words (Bartsch, 2004).
The top 100 words in BNC account for 45% of all the words in BNC (Cook, 2013).
Learning these 100 words allows students to identify roughly half of the words they encounter (Cook, 2013).
Iran
Prospect 1: 25 function words vs. 25 content words
The top 50 most frequent words
Prospect 2: 29 function words vs. 21 content words
(a, about, and, are, can, do, he, how, I, in, is, it, she, the, then, to, what, with, you, & your)
Function words shared only by two textbooks: 10 words
The ranking and frequency of the most frequent words (function and content words) in none of the Prospect Series textbooks accord well with those of BNC.
In the Iranian textbooks, some words are recycled even more than in BNC, while others are either absent or poorly presented.
at, his, my, no, of, on, there, they, we, & where
Content words
Prospects 1-3:
- Tokens: 12,424
- Types: 1,179
Non-shared function words: 9 words
did, does, for, from, her, if, not, that, & who
Spain
Results and Discussion
Malaysia
(Catalán & Francisco, 2008)
Function words
Content words Shared in Prospect Series: 6 words
Form 1: 4,730
Form 2: 4,738
Form 3: 5,309
(Mukundan & Khojasteh, 2011)
Changes for ESO:
- Tokens: 32,251
- Types: 3,238
New Burlington:
- Tokens: 40,449
- Types: 3,764
answer, ask, conversation, listen, talking, & teacher
Form 4: 75,154
Form 5: 81,420
Form 4: 7,788
Form 5: 7,994
Content words shared only by two textbooks: 10 words
(Mukundan & Aziz, 2009)
check, examples, friend, like, practice, questions, say, student, work, & yes
Form 1-5:
Tokens: 322,787
Types: 14,732
Conclusion
“the”: 1st in BNC, 1st in Prospects 2 and 3, and 3rd in Prospect 1.
“of”, “in”, “it”, and “was”: absent in the Prospect Series’ top ten, contrary to BNC.
“was”: absent in the top 50s in Prospects 1-3.
“of”: 2nd in BNC, absent in Prospect 1, 27th rank in Prospect 2 and 32nd in Prospect 3.
“in”: 5th in BNC, 20th in Prospect 1, 14th in Prospect 2, and 21st in Prospect 3.
“it”: 7th in BNC, 49th in Prospect 1, 12th in Prospect 2, and 13th in Prospect 3.
“your”: not in BNC’s top 50, but the most frequent word in Prospect 1, 17th in Prospect 2, and 9th in Prospect 3.
“and”: 3rd in BNC, more frequent than “the” in Prospect 1, 3rd in Prospect 2, and 4th in Prospect 3.
“if”: 47th in BNC, but appears only in Prospect 1 (42nd).
“his”: 27th in BNC, absent in Prospect 2, but 41st in Prospect 1 and Prospect 3.
“can”: 48th in BNC but 13th, 6th, and 20th in Prospects 1-3, respectively.
Non-shared content words: 23 words
Conclusion
Although function words predominate over content words in the top 50s, not all of the top 50 words are function words, as they would be expected to be.
address, age, Ali, below, card, city, classmates, doing, English, fill, have, health, job, letters, name, number, old, play, sentences, some, sounds, spell, & write
(Bartsch, 2004; Cook, 2013)
The numbers of both types and tokens in the Iranian Junior High School textbooks are below standards compared to textbooks at the same educational level in other countries.
- Produced by WordList (Wordsmith Tools V.7)
- Only the quantity of tokens and types; not the lemmas
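As a rough illustration of what counting tokens and types without lemmatization involves, the sketch below counts running words (tokens) and distinct word forms (types) and lists the most frequent words. It is not WordSmith's WordList and its tokenization rule is an assumption made for this example: words are lowercased and only ASCII letters and straight apostrophes are treated as part of a word.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract word tokens (assumed rule: letters, optional apostrophes)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def count_tokens_and_types(text: str) -> tuple[int, int]:
    """Return (tokens, types): running words and distinct word forms, no lemmatization."""
    tokens = tokenize(text)
    return len(tokens), len(set(tokens))


def most_frequent(text: str, n: int = 50) -> list[tuple[str, int]]:
    """Return the n most frequent word forms with their frequencies."""
    return Counter(tokenize(text)).most_common(n)


# Toy sentence invented for the example, not taken from the textbooks.
sample = "What is your name? My name is Ali. What is your job?"
print(count_tokens_and_types(sample))  # (12, 7)
print(most_frequent(sample, 1))        # [('is', 3)]
```

Because no lemmatization is applied, forms such as "talk" and "talking" would count as two separate types.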
Poor recycling of both Function words and Content Words
RQ 2
RQ 2: What are the characteristics of vocabulary loading in the Prospects Series?
Subquestion 1: What are the characteristics of vocabulary loading of each textbook of the Prospects Series?
Subquestion 2: What are the general characteristics of vocabulary loading in the lessons in each textbook of the Prospects Series?
No lemmatization
(Catalán & Francisco, 2008; Mármol, 2011)
WordList tool
Criteria:
Standardized Type/Token Ratio (STTR)
(Mukundan & Aziz, 2009)
Density Ratios (TTR)
(Jin et al., 2012; Mármol, 2011; Nation, 1990)
Consistency Ratios
(Jin et al., 2012)
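The two density measures named above can be illustrated with a short sketch. The plain type/token ratio (TTR) divides types by tokens; the standardized ratio (STTR) averages the TTR over consecutive chunks of equal size so that texts of different lengths can be compared. The chunk size of 1000 tokens and the fallback to plain TTR for short texts are assumptions of this sketch, not settings reported in the study.

```python
def type_token_ratio(tokens: list[str]) -> float:
    """TTR as a percentage: distinct word forms divided by running words."""
    if not tokens:
        return 0.0
    return 100.0 * len(set(tokens)) / len(tokens)


def standardized_ttr(tokens: list[str], chunk_size: int = 1000) -> float:
    """Mean TTR over consecutive full chunks of `chunk_size` tokens.

    A trailing partial chunk is ignored. If the text is shorter than one
    chunk, this sketch falls back to the plain TTR (an assumption).
    """
    chunks = [
        tokens[i:i + chunk_size]
        for i in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        return type_token_ratio(tokens)
    return sum(type_token_ratio(c) for c in chunks) / len(chunks)


# Toy token list invented for the example.
toy = ["a", "b", "a", "b", "a", "b", "c", "d"]
print(type_token_ratio(toy))                # 50.0
print(standardized_ttr(toy, chunk_size=4))  # 75.0
```

In the toy example the first chunk has a TTR of 50 and the second a TTR of 100, so the standardized value (75) differs from the overall TTR (50), which shows why the two measures are reported separately.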
Vocabulary loading and density in the studied Prospect Series could be considered reasonable.
RQ 3
Subquestion 2: To what extent are the words in the AWL covered in the Prospects Series?
Subquestion 3: Which words in the GSL are missing in the entire set of the Prospects Series?
Subquestion 1: To what extent are the words in the GSL covered in the Prospects Series?
RQ 3: To what extent are the words in the GSL and the AWL covered in the Prospects Series?
Subquestion 4: Which words in the AWL are missing in the entire set of the Prospects Series?
GSL (West, 1953): the 2000 most frequently occurring words
AWL (Coxhead, 2000): 570 word families (lemmas)
Missing words listed in Tables 4.20 and 4.21
(Heatley, Nation & Coxhead, 2002)
Missing words listed in Table 4.23
RANGE program
BL1: 1st 1000 most frequent words
BL2: 2nd 1000 most frequent words
BL3: 570 AWL words
The corpus of the Prospect Series was lemmatized
Note: No. of GSL headwords totaled 1983
(Nation, 2001; Mármol, 2011)
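The idea of checking a corpus against a base list can be sketched as a set comparison: the share of list words found in the corpus, plus the list words that never occur. This is only an illustration and not the RANGE program. It matches exact word forms, whereas the study lemmatized the corpus and RANGE works with its own base-list files; the tiny word list below is hypothetical.

```python
def coverage(corpus_types: set[str], word_list: set[str]) -> tuple[float, list[str]]:
    """Return (percentage of list words found in the corpus, sorted missing words)."""
    if not word_list:
        return 0.0, []
    covered = word_list & corpus_types
    missing = sorted(word_list - corpus_types)
    return 100.0 * len(covered) / len(word_list), missing


# Hypothetical mini-list and corpus types, invented for the example.
toy_list = {"name", "school", "analyze", "book"}
toy_corpus_types = {"name", "is", "school", "teacher"}

percent, missing = coverage(toy_corpus_types, toy_list)
print(percent)  # 50.0
print(missing)  # ['analyze', 'book']
```

Running the same function once per base list (first 1000, second 1000, AWL) would give one coverage figure and one missing-word list for each.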
Malaysian textbooks cover 91.3% in Forms 1-5, while Forms 1-3 cover around 68%
(Mármol, 2011; Mukundan & Aziz, 2009)
RQ 4
Subquestion 1: How often are the words in the GSL being recycled in the Prospects Series?
RQ 4: What are the recycling characteristics of the words from the GSL and the AWL in the Prospects Series?
Subquestion 2: How often are the words in the AWL being recycled in the Prospects Series?
Criterion for recycling: 7 repetitions
(Mukundan & Aziz, 2009; Nation, 2001; Thornbury, 2002)
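A minimal sketch of applying a repetition criterion to a word list follows. It makes two assumptions that the slides do not state: a word counts as recycled when it occurs at least 7 times, and the recycling rate is computed over the list words that appear in the corpus at least once. The function name and the toy data are invented for the example.

```python
from collections import Counter


def recycling_report(tokens: list[str], word_list: list[str], threshold: int = 7) -> dict:
    """Split list words into recycled, low-recycled, and absent, using corpus frequency."""
    freq = Counter(tokens)
    recycled = sorted(w for w in word_list if freq[w] >= threshold)
    low = sorted(w for w in word_list if 0 < freq[w] < threshold)
    absent = sorted(w for w in word_list if freq[w] == 0)
    present = len(recycled) + len(low)
    # Assumed denominator: list words that occur at least once in the corpus.
    rate = 100.0 * len(recycled) / present if present else 0.0
    return {"recycled": recycled, "low": low, "absent": absent, "rate": rate}


# Toy corpus invented for the example.
toy_tokens = ["name"] * 8 + ["job"] * 3 + ["city"] * 7
report = recycling_report(toy_tokens, ["name", "job", "city", "area"])
print(report["recycled"])  # ['city', 'name']
print(report["low"])       # ['job']
print(report["absent"])    # ['area']
```

The `low` list corresponds to the kind of low-recycling word list that a table of poorly repeated words would contain.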
A recycling rate of 71.9% was reported by Mukundan & Aziz (2009) for Forms 1-5
List of words with a low recycling rate given in Table 4.26
List of words with a low recycling rate given in Table 4.29
RQ 5
RQ 5: How is the distribution of the randomly selected words from the GSL and the AWL within and across the entire set of the Prospects Series?
Subquestion 1: How often are the randomly selected words from the GSL being distributed within and across the entire set of the Prospects Series?
Subquestion 2: How often are the randomly selected words from the AWL being distributed within and across the entire set of the Prospects Series?
R Software (Version R-3.4.0 for Windows)
(R Core Team, 2013)
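The study carried out this step in R; the sketch below uses Python only to illustrate the general idea of drawing a random sample of list words and tabulating how often each occurs in every textbook. The sample size, the seed, and the toy data are assumptions made for the example and do not reproduce the study's procedure.

```python
import random
from collections import Counter


def sample_words(word_list: list[str], k: int, seed: int = 0) -> list[str]:
    """Draw k distinct words at random; a fixed seed keeps the sample reproducible."""
    rng = random.Random(seed)
    return rng.sample(sorted(word_list), k)


def distribution(books: dict[str, list[str]], words: list[str]) -> dict[str, dict[str, int]]:
    """For each word, return its frequency in each textbook."""
    counts = {name: Counter(tokens) for name, tokens in books.items()}
    return {w: {name: counts[name][w] for name in books} for w in words}


# Toy token lists invented for the example.
toy_books = {
    "Prospect 1": ["name", "name", "city"],
    "Prospect 2": ["name", "job"],
    "Prospect 3": ["job", "job", "job"],
}
table = distribution(toy_books, ["name", "job"])
print(table["name"])  # {'Prospect 1': 2, 'Prospect 2': 1, 'Prospect 3': 0}
print(table["job"])   # {'Prospect 1': 0, 'Prospect 2': 1, 'Prospect 3': 3}
```

Replacing the textbook names with lesson names gives the within-textbook view of the same distribution.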
Conclusions
Implications of the Findings
Recommendations for Future Research
• A guideline to provide recommendations on the teaching of English.
• A guideline for textbook compilers and designers in Iran
• A guideline for English teachers working for the MoE.
• A framework for the students.
• A guideline for researchers in the field of textbook analysis and corpus linguistics.
- RQ 1:
- Number of types and tokens below standards
- inconsistencies regarding the number of function and content words
- inconsistencies in the number of shared and non-shared words.
- poor recycling of Function and Content Words.
- inconsistent ranking and frequency of the top most frequent words.
- RQ 2
- rather satisfactory vocabulary loading across the books of different grades but poor loading lesson by lesson.
- RQ 3:
- Poor coverage of GSL and AWL words in the entire corpus and a big load of missing words in both categories
- RQ 4:
- Poor recycling and repetition of the GSL and AWL words across all the studied books.
- RQ 5:
- Poor distribution of randomly selected words across all the textbooks.
Summary and Conclusion
- analyze parts of speech.
- replicate this study using other references, such as BNC, instead of the GSL.
- study other aspects apart from vocabulary, such as prepositions, modal auxiliary words, grammar, syntax, etc.
- devise checklists specialized for vocabulary presentation and distribute them among teachers to find out whether their views match the findings on vocabulary reported in this research.
This research concludes that the textbooks do require revisions with reference to vocabulary presentation although they have been introduced recently.