Week 2- Team presentation

by Elizabeth Padilla, 24 February 2014


Transcript of Week 2- Team presentation

References
Mia's Speaker Notes
Week 2 - Team B-Squad Presentation

Does the administration of a test by individual classroom teachers affect the test’s reliability and validity? If so, how?
What types of assessment have evolved with language programs?
Ruth's Speaker Notes
Luis's Speaker Notes
Aida's Speaker Notes
Elizabeth's Speaker Notes

Childs, R. A., & Umezawa, L. (2009). When the teacher is the test proctor. Canadian Journal of Education, 32(3), 618-651. Retrieved from ProQuest Central.
Echevarria, J., Vogt, M., & Short, D. J. (2004). Making content comprehensible for English learners: The SIOP model (2nd ed.). Prentice-Hall, Inc.
Essentials for Proctoring Examinations. (2006). Ereach Testing & Remediation. Retrieved from uakron.edu
Goldberg, M. F. (2004, January). The test mess. Phi Delta Kappan, 85(5), 361-366. Retrieved from ProQuest Central.
Introduction to Issues in Language Assessment and Terminology. (2011). Retrieved from http://www.press.umich.edu/pdf/0472032011-intro.pdf

Tremendous pressure for testing companies and state education agencies to create narrow tests quickly and to score them rapidly, leading to more errors than are tolerable
Pressure to produce good scores or suffer a loss of funding, reputation, or even jobs will continue to result in unethical behavior and suspect data
Survey results on unethical test administration practices:
Providing hints
Pointing out mismarked items
Providing more time than allowed
Providing instruction during the test
Changing students' answers




Reliable (gives stable results on every application)
Valid (measures what it purports to measure)
Using a proctor reduces cheating; improves test reliability
Research shows that how a test is administered affects the results
Proctors trained on administering specific tests make the testing experience more comfortable
Students’ scores are better; fewer complaints

Yes, the administration of a test by individual classroom teachers affects the test’s reliability and validity. How?


Slide 1
According to Goldberg (2004), a reliable test gives stable results on every application, and a valid test measures what it purports to measure. Using those definitions, does the administration of a test by individual classroom teachers affect the test’s reliability and validity? Yes.
First, companies and entities that provide tests, like Ereach, note that using a proctor reduces cheating and therefore improves test reliability. Previous research shows that how a test is administered affects the results (Childs & Umezawa, 2009).
We have all taken the PRAXIS tests, and I think you would agree that how well a proctor is trained in administering a test can make the testing experience more comfortable. When you know how long you have for a test and how much time you have left, you can gauge your time in answering. Students’ scores are better, and there are fewer complaints.

Slide 2
“There is tremendous pressure for testing companies and state education agencies to create narrow tests quickly and to score them rapidly, leading to more errors than are tolerable. And the pressure to produce good scores or suffer a loss of funding, reputation, or even jobs will continue to result in unethical behavior and suspect data [by those administering the tests]” (Goldberg, 2004).
“In Pedulla, Abrams, Madaus, Russell, Ramos, and Miao's (2003) survey of more than 4,000 elementary and secondary teachers in the United States, teachers reported that the following "unethical test administration practices" had happened in their schools:
1. providing hints (7 to 15%, depending on the test's stakes for schools and students),
2. pointing out mismarked items (8 to 15%),
3. providing more time than allowed (12 to 19%),
4. providing instruction during the test (3 to 9%), and
5. changing students' answers (1 to 2%).
These practices were reported more frequently in states where tests are not used to make decisions about students and schools” (Childs & Umezawa, 2009).



Formal Assessment:
Standardized, "high-stakes" tests
Norm-referenced
Proficiency
Objective
Summative
Traditional tests
-Assessment refers to a variety of ways of collecting information on a learner's language ability or achievement.

-Standardized: provide a measure of students' proficiency using international benchmarks.

-Norm-referenced: is designed to measure global language abilities. The purpose of an NRT is to spread students out along a continuum of scores so that those with low abilities in a certain skill are at one end of the normal distribution and those with high scores are at the other end, with the majority of the students falling between the extremes (a standard-score sketch follows this list).

-Proficiency tests: assess the overall language ability of students at varying levels. These tests tell us how capable a person is in a particular language skill area (e.g., the TOEFL).

(Introduction to Issues in Language Assessment and Terminology, 2011)
-Objective: is scored by comparing a student's responses with an established set of acceptable / correct responses on an answer key. The scorer does not require particular knowledge or training in the examined area.

-Summative: is administered at the end of the course to determine if students have achieved the objectives set out in the curriculum.

-Traditional assessments: are given to the students by the teachers to measure how much the students have learned.

(Introduction to Issues in Language Assessment and Terminology, 2011)
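
How a norm-referenced score spreads students along that distribution can be shown in symbols. A minimal sketch (standard psychometric conventions, not formulas from the cited introduction; the numbers are hypothetical):

```latex
% A raw score x is located on the norm group's distribution (mean \mu,
% standard deviation \sigma) via the z-score; the percentile rank then
% follows from the standard normal CDF \Phi.
z = \frac{x - \mu}{\sigma}, \qquad \text{percentile rank} \approx 100\,\Phi(z)
% Hypothetical example: x = 82, \mu = 70, \sigma = 8 gives z = 1.5,
% roughly the 93rd percentile, near the high end of the continuum.
```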



(Introduction to Issues in Language Assessment and Terminology, 2011)

Validity Evidence
Content Related
Criterion Related
Construct Related
Wrong Interpretations
Types of Language Assessment:
Formal Assessment:
Policy Makers
Test Improvement = Educational Improvement
ELL Performance
Underperformance
Higher drop-out rate
Citizens have negative impressions
Motto: fix the schools and students, rather than the test
Exams Designed with ELLs in Mind
Better performance
Visual cues
Culturally neutral
Not language-intensive
Fairbairn: performance increases with visual and linguistic support
Fix the exam, rather than the students and school
Article Discussion: Why the States Aren't Measuring Up
According to Popham, there are three types of validity evidence: content related, criterion related, and construct related.
Content related: adequately represents the content of the curricular aim being measured.
Criterion related: accurately predicts a student's performance on an external criterion.
Construct related: "empirical evidence confirms that an inferred construct exists and that a given assessment procedure is measuring the inferred construct accurately" (Popham, 2011, p. 89).
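
Criterion-related evidence is usually summarized numerically. As a hedged sketch (a standard psychometric convention, not a formula quoted from Popham): the validity coefficient is the correlation between test scores X and scores Y on the external criterion:

```latex
% Validity coefficient: correlation between test scores X and the
% external criterion Y; values nearer 1 mean the test predicts the
% criterion more accurately.
r_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X\,\sigma_Y}
```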
Reliability = Consistency
* Stability: consistency of results among different testing occasions
* Alternate form: consistency of results among two or more different forms of a test
* Internal: consistency in the way an assessment instrument's items function
There are 3 types of reliability evidence

Stability reliability- Stability, as a form of reliability, refers to consistency of test results over time. To get a fix on how stable an assessment's results are over time, we usually test students on one occasion, wait a week or two, then retest them with the same instrument.

Alternate form reliability- deals with the question of whether two or more allegedly equivalent test forms are, in fact, equivalent. In the classroom, teachers rarely have reason to generate two forms of a particular assessment instrument.

Internal consistency- does not focus on the consistency of students' scores on a test. Rather, internal consistency deals with the extent to which the items in an educational assessment instrument are functioning in a consistent fashion.
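
Each kind of reliability evidence is typically reported as a coefficient. As a minimal sketch (standard psychometric formulas, assumed rather than drawn from Popham's text as quoted here): stability and alternate-form reliability are estimated by correlating the two sets of scores, much as the validity coefficient above correlates test and criterion, while internal consistency is commonly estimated with Cronbach's alpha:

```latex
% Cronbach's alpha for a k-item test: \sigma_j^2 is the variance of
% item j across students, \sigma_T^2 the variance of total test scores.
% Alpha rises as items covary, i.e., as they function consistently.
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} \sigma_j^2}{\sigma_T^2}\right)
```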

Classroom teachers are advised to become generally familiar with the key notions of reliability, but not to subject their own classroom tests to reliability analyses unless the tests are extraordinarily important.





Does the test measure what it is supposed to measure?

Reliability is an essential characteristic of a good test

Informal Assessment:
Classroom, "low-stakes"
Criterion-referenced
Achievement
Direct
Subjective
Formative
Alternative, authentic
(Introduction to Issues in Language Assessment and Terminology, 2011)
Informal Assessment-
Kid watching- to assess students' use of language as they interact, read, and write in a variety of informal settings throughout the year.
Anecdotal records- to provide an ongoing record of each student's performance in specific situations at set times.
Behavior checklists- to track students' development by noting which behaviors on a checklist are becoming part of students' learning repertoire.
Portfolio assessment- to document in a variety of ways how a student has developed as a reader and writer over the course of the year.
Conferencing- to provide opportunities for teacher and student to discuss and assess the student's development.
Peer assessment- to involve students in the evaluation process and to build their evaluative and interactive skills.
Self-assessment- to empower students by making them responsible for their own learning.

Assessments that have evolved with language programs may include traditional paper-and-pencil written testing as well as cross-comparisons of the native and target languages.
Coherent programs such as ELL enable the instruction, rigorous teaching, and effective learning of the target language (i.e., English) through simultaneous, back-and-forth comparative and synergistic cross-teaching that refers to the vocabulary and grammar of the native language. Some assessment tests will require functional and grammatical language comprehension ability, whereas others might call for in-depth critical analysis and synthesis of information in a short-answer essay to assess the comprehension of ELL students through their writing.
Assessment is a continuous process that you use constantly as you set goals, identify students' needs, and plan instructional programs. The evolution of assessment will constantly change how we understand types of assessment and how we select the best possible assessment tool for our students.
Reliability & Validity
What is a higher-order question?
What is your personal opinion about the reliability of tests proctored by the individual teacher?
Which unethical administration practice would you, as a teacher, most likely consider for ELL students?
How much will assessments evolve after we move to a 21st-century classroom?
Do you believe that informal assessments will be eliminated or changed? If yes, which ones? Explain.


Popham, W. J. (2011). Classroom Assessment: What Teachers Need to Know (6th ed.). Retrieved from The University of Phoenix eBook Collection database.
The Assessment of Thoughtful Literacy in NAEP: Why the States Aren't Measuring Up. (2009, February). The Reading Teacher, 62(5), 372-381. doi:10.1598/RT.62.5.1
Inclusive Achievement Testing for Linguistically and Culturally Diverse Test Takers: Essential Considerations for Test Developers and Decision Makers. (2009, Spring). Educational Measurement: Issues and Practice, 1-24. University of Phoenix Reserve Electronic Readings.
(“Inclusive Achievement Testing for Linguistically and Culturally Diverse Test Takers: Essential Considerations for Test Developers and Decision Makers,” 2009).
(Popham, 2011)
(“The Assessment of Thoughtful Literacy in NAEP: Why the States Aren't Measuring Up,” 2009).
The states' reading test scores may thus be invalid, as argued in Applegate and colleagues' article (“The Assessment of Thoughtful Literacy in NAEP: Why the States Aren't Measuring Up,” 2009).
Fairbairn and Fox argue that when it comes to high-stakes testing, ELLs are not included in the norm group (“Inclusive Achievement Testing for Linguistically and Culturally Diverse Test Takers: Essential Considerations for Test Developers and Decision Makers,” 2009).