
A CORPUS-BASED EVALUATION OF VOCABULARY PRESENTATION IN SELECTED PRESCRIBED IRANIAN ENGLISH TEXTBOOKS

by

Reza Gholami

Members of the Supervisory Committee

Dr. Nooreen binti Noordin,

Prof. Dr. Shameem Rafik-Galea

Dr. Habsah Binti Hussin

November 23 2017

Statement of the Problem

Objectives

Research Questions

Background of the Study

This study aimed to evaluate and examine the appropriateness of vocabulary presentation in the Iranian English textbooks of Junior High school known as the Prospect Series through a corpus-based investigation.

To determine the characteristics of the frequency of the words contained in the Iranian Junior High School English Language Textbooks (Prospects 1-3)

To investigate the characteristics of vocabulary loading within and across the Iranian Junior High School English Language Textbooks (Prospects 1-3).

To identify the extent to which the words in the GSL and the AWL are covered in the Iranian Junior High School English Language Textbooks (Prospects 1-3).

To identify the recycling characteristics of the words from the GSL and the AWL in the Iranian Junior High School English Language Textbooks (Prospects 1-3).

To determine the distribution of the randomly selected words from the GSL and the AWL within and across the entire Iranian High School English Language textbooks (Prospects 1-3).

  • The problems associated with the Iranian system of ELT include identifying clear goals, the output, the causes, and the nature. Such problems are caused in part by the performance of the English education system, and/or they may stem from the growing diversity of the students’ needs as well as the inefficiency of the English textbooks (Maftoon et al., 2010).
  • The method of selecting the English language textbooks prescribed for Iranian students at different stages is unsatisfactory and far from scientific (Janfeshan & Nosrati, 2014).
  • Another issue is the students’ inability to speak English fluently; they are unable to communicate and handle English effectively after graduating from high school, and their general English seems to be poor (Dahmardeh, 2009; Sadeghi, 2005; Vaezi, 2008). Some researchers claim that this inability is due to the prescribed English textbooks they study in public schools.
  • To tackle this issue, the “Prospect Series” was introduced with the aim of developing the students’ oral skills, language use, and communication proficiency in real life situations. Having emphasized that the ELT program in Iran has failed to promote learners’ lifelong communicative abilities, Safari and Rashidi (2015) blamed textbooks, status of English in the educational system, multi-level classes, and pre-service and in-service teacher training for this failure.
  • To counter the effects of the previous teaching methodology, Communicative Language Teaching (CLT) was introduced by the MoE as the central method for applying the Prospect Series (Janfeshan & Nosrati, 2014).
  • The Iranian language textbooks in public schools avoid presenting topics related to the culture of English speaking countries due to the pro-active strategies adopted to protect the national culture (Mahboudi & Javdani, 2012).
  • Moreover, not all the four language skills are given adequate emphasis in the ELT curriculum. Instead, it is mainly concerned with reading and grammar. Sharifi and Khalkhali (2014) declared that one of the fundamental aspects of the education system is the issue of curriculum. Yet, research evidence suggests that less emphasis is laid on this aspect. Perhaps the most important cause of this research paucity is the emerging knowledge in the curriculum. Evidence suggests that in the recent years, the development of the curricula in Iran is in search of a new identity based on the past experiences, the current conditions, and the requirements of the international community. Thus, the formation of a comprehensive study on different aspects of the curricula and its relations with the other phenomena of the educational system seems inevitable.
  • The changes in the levels of basic education since 2012, including the latest changes are effective for the efficient education system in Iran. Nevertheless, apparently coherent and balanced discussion concerning the effectiveness of this system requires further scrutiny (Sharifi & Khalkhali, 2014). Those involved in the management aspects of the curriculum in Iran are suggested to gain a deeper knowledge towards the balance of the elements of the curriculum and have respect to the balance in the subtlest layers of the design, implementation and evaluation of the curricula (Sharifi & Khalkhali, 2014).
  • Having accentuated that the ELT program in Iran is newly revised by introducing the Prospect Series and the Vision Series in the junior high school and the high school respectively, evaluating the new textbooks is underscored with reference to many topics and issues. As mentioned earlier, the textbooks are blamed for the Iranian students’ incapability of fluent communication; of diverse skills and subskills being introduced by the textbooks to be evaluated, the current research is interested in analyzing the appropriateness of vocabulary presentation in the available Prospect Series in the Iranian junior high school.
  • One of the merits of vocabulary analysis in a second language textbook is that it deals with a countable, measurable quality (Milton, 2009). In addition, ELT textbooks should be used sensibly because they cannot meet the demands of every classroom context (Williams, 1983). Nonetheless, when teachers are asked to evaluate textbooks, they need a device on which to base their judgments and qualify their decisions (Ansary & Babaii, 2002). In some instances, textbooks are selected hastily and without regard to systematic principles (Ansary & Babaii, 2002).
  • In general, English textbooks are thought to be the most indispensable materials needed for enhancing the students’ language skills while they are also considered as the containers of the vocabulary input (Catalán & Francisco, 2008). In determining the appropriateness of the vocabulary presentation in English textbooks, there is a variety of criteria to be considered. Of all the existing criteria, this research was mainly interested in analyzing the words frequency, vocabulary loading within and across the studied textbooks, coverage of the 2000 most frequent words from the West’s General Service List (GSL) and the Academic Word List (AWL) (Coxhead, 2000), as well as recycling and distribution pattern of the words in the GSL and AWL within the studied textbooks, namely the Prospect Series (Prospects 1-3).
  • Milton (2009) declares that the most essential way in which words differ is the frequency with which they occur in a textbook. The frequency rate is important as it determines which words a learner is likely to encounter and how often these words are to be encountered (Milton, 2009). Furthermore, a strong association exists between the frequency of a certain vocabulary and its learning likelihood according to Meara’s (1992) model of a frequency profile (Milton, 2009) i.e. the higher frequency for a word implies a higher chance for it to be acquired (Milton, 2009). To date, there has been no examination of the words’ frequency of the Prospect Series, comparing them with native references such as the GSL and the AWL.
  • As asserted by Nation (2001a), the 2000 most frequent words in the GSL have been found to be so imperative in language learning that almost anything should be done to teach them and help learners learn them (Nation, 2001a). Milton (2009) declares that words in the GSL are the most beneficial to the learners because knowing these words will enable them to recognize about 80% of any normal text. Furthermore, it is underlined that a significant association exists between the text coverage and comprehension (Milton, 2009), implying that the students will be able to understand a foreign language more efficiently if they have the knowledge of more words. Therefore, identifying the extent to which the words in the GSL and the AWL are covered in addition to identifying the words from the two mentioned reference lists missing in the Iranian English textbooks of junior high school seems an inevitable task as it definitely reveals significant data on how the vocabulary is presented in such textbooks; an account which has never been examined in Iran to this date. Besides, knowing the missing words from the 2000 frequent word list can enable us to enhance the students’ learning by providing lists of the missing words for further classroom practices. To this date, it is not known what proportion of these vocabularies is actually missing in the Iranian textbooks.
  • Moreover, there should be a report on whether the textbooks of the different educational levels in Iran present an acceptable increase in vocabulary loading and input. It is assumed that each book must increase in difficulty and vocabulary load, both within itself and across the textbooks of any educational system. Indeed, there is no clear and informative analysis of the word load and distribution throughout the Prospects Series (1-3), which motivates further research and investigation.
  • Having discussed the importance of textbook evaluation, especially of the Prospect Series as the latest English language textbooks prescribed for the Junior High School, and considering that vocabulary presentation, as a side skill, provides insightful information about the learners’ performances and abilities, the current research was established to address the mentioned gaps and issues. In detail, this research aims at running a corpus-based study of the appropriateness of vocabulary presentation in the Prospect Series with reference to word frequency, coverage of the GSL and AWL words, vocabulary loading, and distribution of the selected words in the entire Prospect Series.


Research Question 1: What are the general characteristics of the frequency of the words contained in the Prospects Series?

  • Subquestion 1: How many words are introduced in the Prospects Series?
  • Subquestion 2: What are the most frequent words contained in the Prospects Series?

Research Question 2: What are the characteristics of vocabulary loading in the Prospects Series?

  • Subquestion 1: What are the characteristics of vocabulary loading of each textbook of the Prospects Series?
  • Subquestion 2: What are the general characteristics of vocabulary loading in the lessons in each textbook of the Prospects Series?

Research Question 3: To what extent are the words in the GSL and the AWL covered in the Prospects Series?

  • Subquestion 1: To what extent are the words in the GSL covered in the Prospects Series?
  • Subquestion 2: To what extent are the words in the AWL covered in the Prospects Series?
  • Subquestion 3: Which words in the GSL are missing in the entire set of the Prospects Series?
  • Subquestion 4: Which words in the AWL are missing in the entire set of the Prospects Series?

Research Question 4: What are the recycling characteristics of the words from the GSL and the AWL in the Prospects Series?

  • Subquestion 1: How often are the words in the GSL being recycled in the Prospects Series?
  • Subquestion 2: How often are the words in the AWL being recycled in the Prospects Series?

Research Question 5: How are the randomly selected words from the GSL and the AWL distributed within and across the entire set of the Prospects Series?

  • Subquestion 1: How often are the randomly selected words from the GSL being distributed within and across the entire set of the Prospects Series?
  • Subquestion 2: How often are the randomly selected words from the AWL being distributed within and across the entire set of the Prospects Series?
  • The former educational system of Iran was not effective in meeting the students’ needs, did not keep pace with post-modernism, and remained stagnant (Chahardahcheriki & Shahi, 2012), top-down (Ghorbani, 2009), and continuously changing (Sabzian et al., 2013).
  • Hence, the MoE announced a fundamental reform in 2010 which was ultimately ratified in the late 2012 (Kheirabadi & Alavimoghaddam, 2014b).
  • This major reform in the educational system changed the number of grades and textbooks in schools accordingly (Safari & Rashidi, 2015).
  • Regarding the ELT as a key component of this major reform, the state English textbooks, which were based on archaic methods, were entirely revised (Safari & Sahragard, 2015a).
  • The fundamental reform in the educational system led to changes of grades in primary school (6 grades), junior high school (3 grades), and high school (3 grades) (Kheirabadi & Alavimoghaddam, 2014a).
  • The former English textbook series was replaced by “English for School Series, Prospect” for the junior high school and “Vision Series” for the high school.
  • The Prospect Series, i.e. Prospects 1-3, was introduced into public schools from grade one to grade three of the junior high school. Prospect 1 has been designed for beginner students within the age range of 11-12, while Prospect 2 is used for students aged 13-14, and Prospect 3 for students aged 14-15.
  • To ensure the quality and efficiency of the newly-introduced ELT textbook in Iran, comprehensive and objective research has to be conducted to bring better understanding of the success of this reform.
  • In the past decades a movement known as "textbook evaluation" began to emerge whose goal was to construct checklists based on which a book could be analyzed in detail for assuring its usefulness and practicality (Najafi Sarem et al., 2013).
  • Indeed, textbooks are proclaimed to be the containers of vocabulary input as a possible concept in the EFL field (Thornbury, 2002).
  • Thornbury (2002) has considered textbooks as sources for words, claiming that the vocabulary input is realized in the actual content of the books through segregated vocabulary activities, integrated text-based activities, grammar explanations, and task instructions.
  • According to the existing research work concerning the textbook evaluation, diverse ways have been so far proposed by the researchers to assist the instructors in being more methodical and unbiased (Allehyani, Burnapp, & Wilson, 2017a; Margana & Widyantoro, 2017).
  • Concerning the need for evaluating the Prospect Series which has been newly introduced to the ELT curriculum of Iran and because of paucity of research to this date elaborating on vocabulary presentation in the Prospect textbooks (1-3), of the existing methods for evaluation, the current study adopted a corpus-based method to evaluate vocabulary presentation in the Prospect Series.

  • The population for the mentioned corpus is defined as the prescribed English language textbooks used by junior high school students all over the country.
  • The corpus comprises the English textbooks of the first three years of junior high school, namely Prospect 1, Prospect 2, and Prospect 3, in addition to Vision 1, the only published book available for the next level, i.e. the three years of high school.
  • All these books are published by Textbook Publishing Company of Iran, Tehran (2016).
  • This study employed a Purposive Sampling Method (Mukundan & Aziz, 2009).
  • These textbooks are the latest ones published in 2013 which are still in use. They are all available and accessible online (at: http://www.chap.sch.ir/) in PDF format.
  • Research Design
  • The Study Corpus
  • Data Collection Procedure
  • Research Instrumentation
  • Data Analysis: Techniques and Tools for Measuring
  • WordSmith Tools Version 7.0
  • RANGE and Frequency Program

3.5. Data Collection Procedure

The data collection procedure consisted of several phases. In the first phase, all the textbooks were downloaded from the website of Iranian School Books, accessible at http://www.chap.sch.ir. Then, the PDF files were converted into text files so that they could be used by WordSmith Tools (Version 7.0), the RANGE and Frequency Program (Heatley, Nation & Coxhead, 2002), and R Software. After the data for the individual lessons had been collected, the lessons of each textbook were combined and saved as separate files to represent that textbook, as needed for comparing the textbooks with one another. For data analysis, descriptive analyses were obtained, which are detailed in the next section. An overview of the data collection methods is shown in Figure 3.1.

  • R Software
  • The results proved that almost in all the researched aspects, the textbooks had inadequacies and shortages.
  • The results indicated discrepancies concerning the most frequent words in each textbook as compared to the BNC, inconsistent and poor loading especially within the textbooks, very limited coverage of the GSL and AWL words, unacceptable recycle, repetition, and distribution of the words from the GSL and AWL within the entire set of the studied textbooks.
  • This research introduced a list of the GSL and AWL words missing within the textbooks, as well as a list of the words being recycled fewer than seven times.
  • Samples of the contextual analyses and dispersion of the selected words have been also dealt with.
  • This research concludes that the textbooks do require revisions with reference to vocabulary presentation although they have been introduced recently.

The results of this study could be used as guidelines to provide recommendations on the teaching of English. As Biber, Conrad and Reppen (1994) have suggested, corpus-based research sheds new light on some of our most basic assumptions about English grammar, and as a result it offers the possibility of more effective and appropriate pedagogical applications.

Other than using grammar books, internet sources, dictionaries and the textbooks, teachers may consider using a concordancer. A concordancer is a computer program which is used to find the occurrences of every single word or phrase in a text (Sinclair, 1991). Teachers could also retrieve concordance entries from the accessible website and prepare exercises for their students. Concordancing is one example where “the technology can be used to promote autonomous learning.” Such an approach may help in the “empowerment of students” (Butler, 1990).

In general, the results and findings discussed in this research could be useful, first hand, for the textbook compilers and designers in Iran. First, because there are two books Vision 2 and Vision 3 which have not been compiled and introduced yet, and knowing all the results reported here could help them in designing the textbooks so that the deficiencies and shortages would be compensated in the upcoming textbooks. The textbook compilers can take into account the loading patterns reported here and apply the right method when the lessons in each textbook are organized, as well as across the two upcoming textbooks. As for the coverage of the words from the GSL and AWL, they can take into account the introduced list of missing words so that such important words could be presented in the upcoming textbooks. By this the exposure to the high frequency words would be increased, hopefully resulting in a better learning especially when the students graduate from high school levels and intend to continue their studies by going to colleges and universities. As for the inadequate repetition and distribution of the words from the GSL and AWL, the textbook compilers could also endeavor to apply and implement a better repetition and distribution, at least for the upcoming textbooks, while taking into account the results of this study when they intend to revise the already introduced textbooks of Prospect Series and Vision 1.

In addition to textbook compilers and policy makers, the English teachers working for the MoE could benefit from the findings in diverse ways. First and foremost, they could balance their teaching once they know, from the findings of this research, that some lessons are more difficult in terms of vocabulary loading; they could plan to teach those difficult lessons over a longer period of time so that their students are not exhausted by exposure to so many new types, which supports better learning. In addition, the lists of missing GSL and AWL words could be used by the teachers so that, whenever they think it appropriate, they can add some words to their lesson plans and thereby expose their learners to more vocabulary, compensating for the very low coverage of these words. For instance, this could be achieved by selecting reading excerpts or listening activities through which GSL and AWL words that have not been introduced in the books could be taught. Knowing which words have been repeated fewer than 7 times, as given in the two lists of GSL and AWL words recycled fewer than 7 times, could enable the teachers to bring more repetition and use of such words into their classes. This would bring more understanding of and exposure to the most important high frequency words and academic words. Finally, when the teachers are aware that some words are poorly distributed in certain textbooks, they can compensate for this by creating situations in which such words are taught at the levels where they are weakly distributed or absent.

Likewise, the students could benefit from the results by having a list of the words mentioned above so that they could do self-studies and this would help, especially the ones who are more enthusiastic, so that without the teachers’ effort, the learners themselves could plan for a better learning. Generally, researchers in the field of textbook analysis and corpus linguistics could benefit from the findings especially when they deal with Prospect Series and Vision Series and this enables them to have a clear and reliable perspective about the newly-designed textbooks in Iran. The next section elaborates on the recommendations for future research.

This study focused only on the presentation of types and tokens; other studies could analyze parts of speech, including adjectives, nouns, verbs, and adverbs.

Other research could replicate this study and instead of the GSL, other references such as BNC could be applied when they focus on coverage of high frequency words, their distribution, and recycling.

Future research could focus on other aspects apart from vocabulary, such as prepositions, modal auxiliary words, grammar, syntax, etc.

Researchers could also devise checklists specialized for vocabulary presentation and distribute among the teachers to find out if their ideas resemble the realities regarding the vocabularies as reported in this research.

WordSmith Tools Version 7.0

Afterwards, ten words were randomly selected by utilizing R Software (Version R-3.4.0 for Windows), downloadable from http://wbc.upm.edu.my/cran/bin/windows/base/. R is an open-source software environment and programming language for statistical computing and graphics. Universiti Putra Malaysia is a pioneer in providing the Comprehensive R Archive Network (CRAN) mirror among higher education institutions in Malaysia, available at http://wbc.upm.edu.my/cran/. It is asserted that this software provides a wide variety of statistical and graphical techniques, and is highly extensible (R Core Team, 2013).

Using R Software, with the precondition of randomly selecting 10 words from a range of 1-790, the software yielded a list of 10 numbers (Appendix 1). By referring to the lists of the words from the GSL presented in the Prospect Series and Vision 1, the numbers given by R Software were assigned to the words, and the results are detailed in Table 4.30 and Table 4.31.
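The random selection step can be illustrated with a short sketch. The study itself used R; the Python version below is only an analogue, and it assumes that the ten positions were drawn without replacement. The function name `pick_word_indices` and the seed value are hypothetical and serve only to make the example reproducible.

```python
import random
from typing import List, Optional


def pick_word_indices(n_words: int, k: int, seed: Optional[int] = None) -> List[int]:
    """Draw k distinct 1-based positions from the range 1..n_words."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # sample() draws without replacement, so no position is picked twice
    return sorted(rng.sample(range(1, n_words + 1), k))


# Ten positions out of a list of 790 covered words, as described above.
indices = pick_word_indices(790, 10, seed=2017)
print(indices)
```

Each number returned would then be looked up in the alphabetical list of covered words to obtain the word that is analyzed for distribution.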

  • Developed by Mike Scott (1996), WordSmith Tools is useful for textbook adaptation (Scott, 2016) which provides immediate displays of word frequency lists; concordances, which show all the uses of a given word in its contexts; and lists of keywords (Ghadessy, Henry and Roseberry, 2001).

  • The reliability of WordSmith Tools has been verified by several studies on various corpora which have made use of such tools to analyze texts (Mukundan, 2004; and Scott, 2001 to name a few).
  • To address the objectives of this study, only the WordList and Concord tools were employed.
  • Concordance is such software that can be utilized (Mukundan & Nimehchisalem, 2008) for vocabulary loads, frequencies, input, and so forth.
  • The researchers are able to identify the frequency occurrence of the given vocabulary with the help of the WordList tool of the computer software.
  • WordSmith Tools are actually integrated programs which scrutinize the behavior of the words in a given text.
  • These tools are useful for language teachers, students and researchers if they are interested in examining the language patterns (Mukundan & Norwati, 2009).
  • As regards the Concord tool, it can be used to analyze learner corpora as well as morphology, lexical semantics, collocations, and, to some extent, syntax and discourse analysis, among many other features. A concordance is expedient in comparative studies, in addition to investigating error patterns (Shamsudin, Zainal, Zaid, & Seliman, 2008).
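What a concordance shows can be sketched in a few lines of Python. This is not the Concord tool of WordSmith Tools, only a minimal keyword-in-context stand-in; the function name `concordance`, the `width` parameter, and the sample sentence are assumptions made for illustration.

```python
import re
from typing import List


def concordance(text: str, keyword: str, width: int = 3) -> List[str]:
    """Return every occurrence of keyword with `width` words of context on each side."""
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # the keyword is bracketed so it stands out in each line
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


sample = "Listen to the teacher. The teacher is talking to a student."
for line in concordance(sample, "teacher", width=2):
    print(line)
```

Lines of this kind let a teacher or researcher see a word in each of its contexts, which is the basis for the contextual analyses mentioned in this study.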

RANGE (Heatley, Nation & Coxhead, 2002) yields the number of words and a text frequency figure. RANGE comprises three unique lists: list one and two (hereafter L1 and L2) represent the 2000 most frequent words in English, while L3 includes words that are not found in the first 2000 words but are frequent in secondary school and university texts. These lists are based on the GSL and the AWL.

RANGE differentiates between three unique classes: tokens, types and families. While for some purposes the tokens and types were considered in this study for the analyses, in other instances only the families (lemmas) were the categories of analysis. RANGE is downloadable and free (http://www.victoria.ac.nz/lals/about/staff/paul-nation).
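The idea of coverage, missing words, and under-recycled words can be sketched as follows. This is not the RANGE program: RANGE works with word families, whereas this simplified Python sketch matches exact word forms only. The function names, the toy text, and the three-word reference list are hypothetical; the threshold of seven occurrences follows the recycling criterion used in this study.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage_report(text: str, reference_words: Iterable[str], min_recycle: int = 7) -> Dict:
    """Compare a textbook text against a reference list such as the GSL or AWL."""
    counts = Counter(tokenize(text))
    reference = {w.lower() for w in reference_words}
    covered = sorted(w for w in reference if counts[w] > 0)
    missing = sorted(reference - set(covered))
    # covered words that occur fewer than min_recycle times
    under_recycled = sorted(w for w in covered if counts[w] < min_recycle)
    percent = 100 * len(covered) / len(reference) if reference else 0.0
    return {
        "coverage_percent": percent,
        "covered": covered,
        "missing": missing,
        "under_recycled": under_recycled,
    }


toy_text = "I like school. I like my teacher. " + "school " * 6
report = coverage_report(toy_text, ["school", "teacher", "book"])
print(report)
```

Running the same function once per textbook and once on the combined files would give per-book and whole-series figures; a family-based analysis would additionally need a mapping from each inflected form to its headword.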

With reference to the GSL, it is said that it contains 2000 headwords and was developed in the 1940s. The frequency figures for most items are based on a 5,000,000 word written corpus. Percentage figures are given for different meanings and parts of speech of the headword. In spite of its age, some errors, and its solely written base, it still remains the best of the available lists because of its information about the frequency of meanings, and West's careful application of criteria other than frequency and range. The classic list of high frequency words is Michael West's General Service List (1953). The 2000 word GSL is of practical use to teachers and curriculum planners as it contains words within the word family each with its own frequency. For example, excited, excites, exciting and excitement come under the headword excite. The GSL was written so that it could be used as a resource for compiling simplified reading texts into stages or steps. West and his colleagues produced vast numbers of simplified readers using this vocabulary. This is actually a very old list being based on frequency studies done in the early decades of this century. Doubts have been cast on its adequacy because of its age (Richards, 1974) and the relatively poor coverage provided by the words not in the first 1000 words of the list (Engels, 1968).

Vocabulary Loading in Prospect Series

Objective Two was established to address the general characteristics of vocabulary loading in the Prospect Series:

• To investigate the characteristics of vocabulary loading within and across the Iranian Junior High School English Language Textbooks (Prospects 1-3).

In line with this, the following research questions and subquestions were formulated:

• Research Question 2: What are the characteristics of vocabulary loading in the Prospects Series?

• Subquestion 1: What are the characteristics of vocabulary loading of each textbook of the Prospects Series?

• Subquestion 2: What are the general characteristics of vocabulary loading in the lessons in each textbook of the Prospects Series?

Table 4.8 presents the results yielded by the WordList tool of WordSmith Tools (Version 7) with reference to the Standardized Type/Token Ratio (STTR), density ratios (TTR), and consistency ratios. Diverse criteria have been employed in this section: the STTR to measure the density level of the textbooks, in addition to density ratios (TTR) and consistency ratios, to address Research Questions Three and Four. A higher STTR indicates that a textbook introduces more types for every 1000 tokens (Mukundan & Aziz, 2009). It needs to be highlighted that Jin, Tong, Nor, Tarmizi, and Mahmad (2012) used both the density ratio and the consistency ratio to discuss vocabulary density and text difficulty in their research. As for the density ratio (TTR), they note that the highest density ratio indicates that the passages are crammed with a large number of tokens and many introductions of new words. As asserted by Nation (1990), in order to calculate the lexical density index (LDI or TTR) of a given text, the number of different words (types) is divided by the total number of words (tokens) in the text. This index has been used by many researchers to discuss vocabulary input in textbooks (such as Jin et al., 2012; Mármol, 2011). Moreover, Jin et al. (2012) acknowledge that the consistency ratio can be obtained using a simple formula, i.e. dividing the number of tokens by the number of types. The consistency ratio shows after how many words a new word is introduced in the textbook; the lowest ratio means that a particular textbook or lesson is difficult, as new words are introduced frequently.
Mármol (2011) notes that the lexical density of a text may indicate its difficulty: texts with low density (less than 40-50%) are considered not dense and relatively easy to understand, whereas texts over 60-70% LDI are lexically dense and more complex to read.
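The three measures described above can be sketched in Python. The density ratio and consistency ratio follow the formulas given in the text, with the density ratio expressed as a percentage. The STTR function is an assumption about the procedure: it averages the TTR of consecutive full chunks of 1000 tokens and ignores a final incomplete chunk, which may differ in detail from what WordSmith Tools does. All function names and the toy sentence are hypothetical.

```python
from typing import List, Optional


def density_ratio(tokens: List[str]) -> float:
    """TTR / LDI as a percentage: types divided by tokens, times 100."""
    if not tokens:
        raise ValueError("empty token list")
    return 100 * len(set(tokens)) / len(tokens)


def consistency_ratio(tokens: List[str]) -> float:
    """Tokens divided by types: roughly, after how many words a new word appears."""
    if not tokens:
        raise ValueError("empty token list")
    return len(tokens) / len(set(tokens))


def standardized_ttr(tokens: List[str], chunk_size: int = 1000) -> Optional[float]:
    """Mean TTR over consecutive full chunks; None if the text is shorter than one chunk."""
    starts = range(0, len(tokens) - chunk_size + 1, chunk_size)
    ratios = [density_ratio(tokens[s:s + chunk_size]) for s in starts]
    return sum(ratios) / len(ratios) if ratios else None


toy_tokens = "the cat sat on the mat the cat".split()
print(density_ratio(toy_tokens))                    # 5 types over 8 tokens
print(consistency_ratio(toy_tokens))                # 8 tokens over 5 types
print(standardized_ttr(toy_tokens, chunk_size=4))   # mean of two 4-token chunks
```

Because a plain TTR falls as a text gets longer, the chunked average is the fairer figure when textbooks of different lengths are compared, which is why the STTR is reported alongside the other two ratios.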

As observed in Table 4.8 and Figure 4.2, there is a steady increase in the total number of tokens from Prospect 1 to Prospect 3. There is also an upward trend in the total number of types found in the Prospect 1-3 textbooks, with a noteworthy growth in favor of the Prospect 3 textbook in terms of the sum of types. Again, for the purpose of these research objectives, the unit of analysis is considered to be the word, regardless of whether some types belong to the same headword or lemma.

Table 4.8: Lexical Density Index for the Prospect Series

As for this section, the current research aimed at determining and identifying the top fifty words (both grammatical and content word types) based on the following Subquestion:

• Subquestion 2: What are the most frequent words contained in the Prospects Series?

The top fifty words for each textbook in Prospect Series taught at Junior High Schools of Iran, alongside their occurrences and percentages with respect to the total number of tokens in the text are tabulated in descending order in Table 4.3, Table 4.4, and Table 4.5.
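A frequency list of this kind, with occurrences and percentages of the total tokens in descending order, can be sketched as below. This is an illustration and not the WordList tool itself; the tokenization rule, the function name `top_words`, and the sample sentence are assumptions.

```python
import re
from collections import Counter
from typing import List, Tuple


def top_words(text: str, n: int = 50) -> List[Tuple[str, int, float]]:
    """Return the n most frequent words as (word, occurrences, percent of all tokens)."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(tokens)
    if total == 0:
        return []
    # most_common() already sorts by descending frequency
    return [(w, c, 100 * c / total) for w, c in Counter(tokens).most_common(n)]


sample = "The cat and the dog and the bird"
for word, count, percent in top_words(sample, n=2):
    print(f"{word}\t{count}\t{percent:.2f}%")
```

Applied to the text file of each textbook with `n=50`, the function would yield one table per book in the same layout as the tables referred to above.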

One noticeable point is that there is a predominance of grammatical words (function words) over content words in the three textbooks under study. The criteria for deciding on function words are based on Cook (2013), who introduced the fifty most frequent words in English based on the BNC. In Prospect 1, the number of function words equals the number of content words, 25 for both. In Prospect 2, function words amount to 29 versus 21 content words. In Prospect 3, the balance favors function words, with 35 versus 15 content words.

In terms of shared function words, 20 words were shared by all three textbooks: a, about, and, are, can, do, he, how, I, in, is, it, she, the, then, to, what, with, you, and your. Ten words were shared by only two textbooks: at, his, my, no, of, on, there, they, we, and where. There were also 9 non-shared words, which occurred in only one textbook: did, does, for, from, her, if, not, that, and who.

As regards the content words, only 6 are shared by all three textbooks: answer, ask, conversation, listen, talking, and teacher. A further 10 words are shared by only two textbooks: check, examples, friend, like, practice, questions, say, student, work, and yes. Finally, 23 words are unique to a single book: address, age, Ali, below, card, city, classmates, doing, English, fill, have, health, job, letters, name, number, old, play, sentences, some, sounds, spell, and write.
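The grouping into words shared by three, two, or one textbook can be sketched as follows. The function name is hypothetical, and the three short lists are invented toy data that only borrow words mentioned above; they are not the actual top-fifty lists.

```python
from collections import Counter

def sharing_profile(*word_lists):
    """Group words by the number of lists in which they appear."""
    membership = Counter(word for words in word_lists for word in set(words))
    profile = {}
    for word, n_lists in membership.items():
        profile.setdefault(n_lists, set()).add(word)
    return profile

# Invented toy lists, for illustration only.
book1 = ["answer", "listen", "name", "teacher", "student"]
book2 = ["answer", "listen", "teacher", "student", "check"]
book3 = ["answer", "listen", "teacher", "health", "check"]

profile = sharing_profile(book1, book2, book3)
for n_lists in sorted(profile, reverse=True):
    print(n_lists, sorted(profile[n_lists]))
```

The same bookkeeping offers a quick consistency check on the reported counts: for the content words, 3 × 6 + 2 × 10 + 23 = 61, which equals the 25 + 21 + 15 content words found across the three top-fifty lists.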

Table 4.3: The top fifty words in Prospect 1

A closer look at the results in Table 4.8 and Figure 4.2 shows that Prospect 3 has the highest density level (STTR = 30.72) of the three English language textbooks. This implies that at this grade (Grade 9 of junior high school) students are expected to handle a larger number of words; therefore, more types are introduced. Prospect 1 has the lowest density level (STTR = 22.53), followed by Prospect 2 with an STTR of 24.30. In other words, the density level grows steadily from Prospect 1 to Prospect 3. To sum up, when STTR is considered, the first book of the series (Prospect 1) carries the lightest vocabulary load, the load increases grade by grade, and the last book (Prospect 3) carries the heaviest load, with the highest STTR of 30.72.

It can therefore be claimed that lexical variation differs across the three books. Overall, the STTR for the three books is 26.74, which denotes a relatively low level of difficulty; that is, the three books have a reasonable loading for junior high school students if only STTR is taken into account. When the density ratio is examined in more detail, however, Prospect 3 has the highest ratio in the series, meaning that it is crammed with a large number of tokens and many newly introduced words. The consistency ratio of the same textbook is 6.75, implying that a new word is introduced roughly every 6 to 7 words. This ratio is lower than those of Prospect 1 and Prospect 2 (7.38 and 7.81, respectively), meaning that Prospect 3 is the more difficult textbook because new words are introduced more frequently, and this should be the case. On the other hand, a point of debate arises when the density and consistency ratios of Prospect 1 and Prospect 2 are compared, although the variation between them is slight. Although the STTR values follow an upward trend from the first to the last textbook, the density and consistency ratios shown in Table 4.8 and Figure 4.3 indicate that Prospect 1 has a slightly higher density ratio and a correspondingly lower consistency ratio than Prospect 2. Prospect 1 is thus a bit denser than the textbook that follows it, and this should not be the case. To conclude, vocabulary loading and density in the Prospect Series could be considered reasonable, as rather low indexes were expected given that these textbooks are mostly for elementary levels and gradually move toward more advanced levels.

Figure 4.3: Density and Consistency Ratios of the Prospect Series as a Whole

To answer Subquestion One, the results produced by the WordList tool of WordSmith Tools (Version 7) are tabulated in Table 4.2 and plotted in Figure 4.1, which show the total number of running words (tokens) and the total number of distinct words (types) in the textbooks. To fulfil this objective, only the numbers of tokens and types were taken into account, without lemmatization, because lemmatization would cluster the types of a word family under a single headword and so reduce the number of words counted. Accordingly, Prospect 1 introduces 3,382 running words and 458 types. Prospect 2 contains 3,891 tokens, of which 498 are types. Prospect 3 increases the number of tokens to 5,151 and continues the upward trend in types (763).
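As a worked check, the token and type counts above can be turned into density and consistency ratios. The sketch below assumes the plain definitions (types divided by tokens, and tokens divided by types); the function name is hypothetical, and the density percentages are computed here from the counts rather than copied from Table 4.8.

```python
# Token and type counts as reported for each textbook.
counts = {
    "Prospect 1": (3382, 458),
    "Prospect 2": (3891, 498),
    "Prospect 3": (5151, 763),
}

def ratios(tokens, types):
    """Return (density ratio in percent, consistency ratio), rounded to two decimals."""
    density = round(100.0 * types / tokens, 2)
    consistency = round(tokens / types, 2)
    return density, consistency

for book, (tokens, types) in counts.items():
    density, consistency = ratios(tokens, types)
    print(f"{book}: density {density:.2f}%, consistency {consistency:.2f}")
```

The consistency ratios obtained this way (7.38, 7.81, and 6.75) agree with the values reported for Prospects 1, 2, and 3.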

Table 4.2: The total number of tokens and types in Prospects 1-3

Characteristics of Vocabulary Presentation in Prospect Series

• Research Question 1: What are the general characteristics of the frequency of the words contained in the Prospect Series?

• Subquestion 1: How many words are introduced in the Prospect Series?

• Subquestion 2: What are the most frequent words contained in the Prospect Series?

Table 4.1: General Description of the Prospect Series (1-3)

Apart from the variations observed among the textbooks in function and content words, both shared and non-shared, Cook (2013) lists the fifty most frequent words in the BNC, as shown in Table 4.6. As Bartsch (2004) points out, all of the top 50 words in the BNC are function words, which are typically contrasted with content words on the basis of the presence or absence of lexical content. Cook (2013) also notes that the top 100 words account for 45% of all the words in the BNC, so that learning these 100 words helps learners identify roughly half of the words they might encounter in English. With the prominence of the 50 most frequent BNC words and this estimate for the top 100 words in mind, this study compared the three textbooks of the Prospect Series to determine how these words are presented.

Indeed, a glance at the top ten words reveals discrepancies: 6 words are shared, but their rankings are very inconsistent both across the textbooks and in comparison with the BNC. The first word in the BNC is “the”; it ranks first in Prospects 2 and 3 but third in Prospect 1, which is unusual. Another odd pattern is that “of”, “in”, “it”, and “was”, which are among the top ten words in the BNC, are absent from the top ten words of the Prospect Series. “Was” does not appear in any of the top 50 lists of Prospects 1-3. “Of” is absent from the Prospect 1 list, ranks 27th in Prospect 2, and ranks 32nd in Prospect 3, compared with 2nd in the BNC. “In” ranks 20th in Prospect 1, 14th in Prospect 2, and 21st in Prospect 3, compared with 5th in the BNC; again, an awkward presentation. “It” ranks 49th in Prospect 1, 12th in Prospect 2, and 13th in Prospect 3, compared with 7th in the BNC. “Your”, the most frequent word in Prospect 1, does not appear among the top 50 words of the BNC at all, meaning that it has been recycled extraordinarily often, even more than “that”. It takes the 17th position in Prospect 2 and the 9th in Prospect 3, which means that all three textbooks use it more heavily than the BNC does. Moreover, “and” is even more frequent than “the” in Prospect 1, which should not be the case; it is presented appropriately only in Prospect 2 (3rd rank, as in the BNC), while it ranks 4th in Prospect 3.

There are many other discrepancies when the textbooks of the Prospect Series are compared with one another and with the BNC. For example, “if” appears only in Prospect 1, where it ranks 42nd, higher than in the BNC (47th). “His” is absent from Prospect 2 and ranks 41st in the other two textbooks, lower than in the BNC (27th). “Can” is another example: it is the 48th most frequent word in the BNC but ranks 13th, 6th, and 20th in Prospects 1-3, respectively, so it is more frequent in the textbooks than in the BNC. The discussion so far suffices to claim that the ranking and frequency of the most frequent words in none of the Prospect textbooks agree with those of the BNC: some words have been recycled even more than in the BNC, while others are either absent or poorly presented.
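The rank comparisons discussed above can be made explicit by subtracting each word's BNC rank from its rank in each textbook. The sketch below uses only ranks quoted in this discussion; the dictionary and function names are hypothetical, and `None` marks a word that is absent from a textbook's top 50 list.

```python
# BNC ranks and (Prospect 1, Prospect 2, Prospect 3) ranks quoted in the discussion.
BNC_RANK = {"the": 1, "of": 2, "in": 5, "it": 7, "can": 48}
TEXTBOOK_RANKS = {
    "the": (3, 1, 1),
    "of": (None, 27, 32),
    "in": (20, 14, 21),
    "it": (49, 12, 13),
    "can": (13, 6, 20),
}

def rank_shifts(word):
    """Textbook rank minus BNC rank for each book; None if the word is not in the top 50."""
    bnc = BNC_RANK[word]
    return tuple(None if rank is None else rank - bnc
                 for rank in TEXTBOOK_RANKS[word])

for word in BNC_RANK:
    print(f"{word:<4} BNC {BNC_RANK[word]:>2}  shifts {rank_shifts(word)}")
```

A positive shift means the word ranks lower (is relatively less frequent) in the textbook than in the BNC, and a negative shift means it ranks higher, as is the case for “can” in all three textbooks.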

The same awkward pattern is evident in the presentation of content words. “Name” ranks 9th in Prospect 1 but does not appear among the top 50 most frequent words of the other two textbooks. “Teacher” is the 10th most frequent word in Prospect 1 and ranks 15th and 38th in the other two textbooks, respectively. “Student” ranks 17th among the top 50 words of Prospect 1 and 9th in Prospect 2 but is absent from the Prospect 3 list. There are many other discrepancies when only the shared content words in the lists are taken into account. “Answer” in Prospects 1 and 3 is the 30th, while being the 25th in Prospect 2. Taking into account the content words shared by the three textbooks, an incongruent pattern can again be documented. “Answer” takes the rank of 31st in book 1, 25th in book 2, and 29th in book 3. “Listen” is likewise repeated inconsistently across the three textbooks; in Prospects 1, 2, and 3, it ranks 15th, 18th, and 17th. The words “ask”, “conversation”, and “talking” also show varying repetition and ranking within each textbook. Nonetheless, none of the abovementioned content words appears in the lists of the twenty most frequent nouns, verbs, and adjectives in the BNC proposed by Cook (2013) and presented in Table 4.7. To conclude, the textbooks under study suffer from poor presentation of both function words and content words, within and across the books, particularly when compared and contrasted with the BNC.

Figure 4.1: The Total Number of Tokens and Types in Prospects 1-3

Loads of Thanks for Your Attention and Patience
