Use of Test Scores to evaluate Students


on 21 March 2014


In the 1800s in the UK (cheating issues), under President Nixon (cheating again) and under Reagan (failed and didn't stay) (Gratz, 2009)
Use of Test Scores to evaluate Students
by: Lailaa Pienkos, Karl Poitras, Ryan Hubscher, Samantha Miraflor, Jerry Cerone, Marie Penelope Nezurugo, Marcin Garbulinski and Francesca Rodriguez-Abante
October 30th, 2013
McGill University
Using assessments for different purposes
promoting learning
guiding instructional decision making
diagnosing learning and performance problems
promoting self-regulation
determining what students have learned
Promoting learning
use assessments as:
- motivators
- mechanisms for review
- to influence cognitive processing
- learning experiences
- feedback
Guiding instructional decision making
1. To assess students’ knowledge and understanding before teaching a topic
example: a quick pretest can help determine a suitable point at which to begin instruction
2. To monitor students’ learning throughout a lesson or unit
This provides ongoing information about the appropriateness of the instructional objectives and the effectiveness of the instructional strategies
Diagnosing learning and performance problems
Promoting self-regulation
Determining what students have learned
Evaluation time!
Why evaluations are important
Reliability: consistent information about the knowledge, skills and abilities you are trying to measure.
How do we determine reliability in classroom assessments?
Test-retest reliability
Scorer reliability
Internal Consistency reliability
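As a rough illustration of test-retest reliability, the sketch below correlates the scores the same students obtained on two administrations of one assessment. The score lists and the function name are invented for illustration; a coefficient close to 1 would suggest the assessment gives consistent results.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / sqrt(var_a * var_b)

# Hypothetical scores for the same five students on two occasions.
first_attempt = [72, 85, 90, 64, 78]
second_attempt = [70, 88, 91, 60, 80]

reliability = pearson_r(first_attempt, second_attempt)
print(round(reliability, 2))
```

Scorer reliability could be sketched the same way, by correlating the marks two different scorers give to the same set of papers.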
Enhancing reliability of classroom Assessments
Define each task clearly
Administer the assessment in similar ways
Avoid assessing students’ learning when they are ill
Include several tasks in an assessment
Important Qualities of a Good Assessment (RSVP)

Standardization: similar content that is scored in the same way for everyone
Validity: does it measure what it is supposed to measure? (content, predictive and construct)
Content Validity
Predictive Validity
Construct Validity
Practicality: are the assessment instruments and procedures easy to use?
As a student, have you ever been assessed in a way you thought was unfair?

Assesses what students know and can do before or during instruction.
Assesses what students have achieved after instruction to make final decisions.
Standardized tests have been designed specifically to identify special academic, social and emotional needs of students.

Certain of these tests require training in their use and are often administered and interpreted by specialists.

Teacher-developed assessment instruments can provide diagnostic information that teachers can use to help students improve as well.

Self-regulation consists of:
Students must be aware of how well they are doing as they study and learn
Students must be able to assess their final performance accurately
Assessments not only used to evaluate students...
Assessments of students’ achievement are not only used to make major decisions about students; they are also used to make decisions about teachers and schools

Such high-stakes assessments are a source of considerable
Teachers almost certainly use one or more formal assessments to determine whether students have achieved instructional objectives. Such information is essential when assigning final grades

School counselors and administrators may use assessment results for making placement decisions
Keeping Students’ Test Anxiety at a Facilitative Level

What to Do:
Give students a practice or pre-test
Encourage students to do their best
Allow memory aids when the objectives do not require students to commit information to memory
Try to eliminate time limits
Be available to answer questions during the assessment
Use the results of several assessments to make decisions (e.g. to assign grades)

Table 12.3 (Omrod et al., 2010, 331)

Interpreting different types of scores

Table 12.3 (p.325)

Standard Score (Table 12.3, Omrod et al., 2010, 325)

How it is determined:

By determining how far the performance is from the mean with respect to standard deviation units

Used for:

Describing a student’s standing within the norm group

Potential problems:

Scores are not easily understood without prior knowledge of statistics
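A minimal sketch of how a standard score could be worked out, assuming the common z-score form (distance from the norm-group mean divided by the standard deviation). The norm-group scores and the function name are hypothetical.

```python
from statistics import mean, pstdev

def standard_score(raw, norm_scores):
    """Distance of a raw score from the norm-group mean, in standard deviation units."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)

# Hypothetical raw scores of the norm group.
norm_group = [50, 60, 70, 80, 90]

z = standard_score(84, norm_group)
print(round(z, 2))  # positive: above the mean; negative: below it
```

A score equal to the norm-group mean gives 0, which is why the result says nothing on its own to someone who does not know what a standard deviation is.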

Age/Grade Equivalent Score (Table 12.3, Omrod et al., 2010, 325)

How it is determined:

By equating a student’s performance to the average performance of students at a particular age or grade level.

Used for:

Explaining norm-referenced test performance to people unfamiliar with standard scores.

Potential problems:

Scores may be inappropriately used as a standard that all students must meet.

Scores are often inapplicable when achievement at the secondary level or higher is being assessed.
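One simplified way to picture a grade-equivalent score is to look up the grade whose average performance is closest to the student's raw score. The averages below are invented, and published tests typically use finer-grained norms than this nearest-match sketch.

```python
def grade_equivalent(raw, grade_averages):
    """Return the grade whose average score is closest to the raw score."""
    return min(grade_averages, key=lambda grade: abs(grade_averages[grade] - raw))

# Hypothetical average raw scores on one test, by grade level.
averages_by_grade = {3: 22.0, 4: 28.0, 5: 35.0, 6: 41.0}

print(grade_equivalent(34, averages_by_grade))
```

Note that the result only says whose average the score resembles; it does not say the student has mastered that grade's curriculum.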

Criterion-Referenced Score (Table 12.3, Omrod et al., 2010, 325)

How it is determined:

By comparing performance to one or more criteria or standards for success

Used for:

Determining whether specific instructional objectives have been achieved

Potential problems:

Criteria for assessing mastery of complex skills may be difficult to identify
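As a small sketch, a criterion-referenced decision can be reduced to checking performance against a preset standard. The 80% cut-off and the function name are assumptions chosen for illustration, not values from the source.

```python
def meets_criterion(points_earned, points_possible, cutoff=0.8):
    """True when the proportion of points earned reaches the mastery cut-off."""
    return points_earned / points_possible >= cutoff

print(meets_criterion(17, 20))  # 85% of the points
print(meets_criterion(15, 20))  # 75% of the points
```

The comparison is against the standard only; how other students performed plays no part in the result.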

Raw Score (Table 12.3, Omrod et al., 2010, 325)

How it is determined:

By calculating the number or percentage of correct responses or points earned

Used for:

Often used in teacher-developed assessment instruments

Potential problems:

Scores may be difficult to interpret without knowledge of how performance relates to either a specific criterion or a norm group.
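For illustration, a raw score can be computed by counting matches against an answer key. The answer key and responses below are made up.

```python
def raw_score(responses, answer_key):
    """Return (number correct, percentage correct) for one student."""
    correct = sum(1 for given, expected in zip(responses, answer_key) if given == expected)
    return correct, 100 * correct / len(answer_key)

# Hypothetical five-item multiple-choice quiz.
answer_key = ["b", "d", "a", "c", "a"]
responses = ["b", "d", "c", "c", "a"]

number_correct, percent_correct = raw_score(responses, answer_key)
print(number_correct, percent_correct)
```

On its own the result does not say whether the performance is good; that depends on a criterion or a norm group.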

Percentile Rank Score (Table 12.3, Omrod et al., 2010, 325)

How it is determined:

By determining the percentage of students at the same age or grade level who obtained lower scores

Used for:

Explaining norm-referenced test performance to people unfamiliar with standard scores

Potential problems:

Overestimate score differences near the mean

Underestimate differences at the extremes
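A minimal sketch of a percentile rank, following the definition above (the percentage of peers who obtained lower scores). The peer scores and the function name are hypothetical.

```python
def percentile_rank(score, peer_scores):
    """Percentage of peers at the same age or grade level who scored lower."""
    lower = sum(1 for peer in peer_scores if peer < score)
    return 100 * lower / len(peer_scores)

# Hypothetical scores of ten students at the same grade level.
peers = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_rank(85, peers))
```

A percentile rank is a percentage of people, not a percentage of items answered correctly.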

Keeping Students’ Test Anxiety at a Facilitative Level

What Not to Do:
Stress that students’ competence is being evaluated
Keep the nature of the assessment a secret until the day it is administered
Remind students that failing will have dire consequences
Insist that students memorize trivial facts
Give more questions than students have time to complete
Hover over students, watching them closely as they respond
Evaluate students based on one assessment

Table 12.3 (Omrod et al., 2010, 331)
Taking Students’ Diversity into Account
Developmental differences
Cultural Bias
Language differences
Students with Special needs

Assessing teacher performance and salary by students' grades
(and why it's flawed)

Logical and statistical side
Relative performance is difficult to assess and account for: disparities in students' levels will affect teachers unevenly
Real-life example:
Ray Emery: .922 save percentage (21 GP)
Ben Bishop: .920 save percentage (22 GP)
Emery's record: 17-1-0
Bishop's record: 11-9-1
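The gap in the example can be made concrete with a short calculation. This sketch reads the first figure as each goaltender's save percentage and the record as wins, losses and overtime losses; both readings, and the function name, are assumptions.

```python
def win_fraction(wins, losses, overtime_losses):
    """Share of decisions that ended in a win."""
    return wins / (wins + losses + overtime_losses)

# Individual figure and record for each goaltender, as listed above.
goalies = {
    "Emery": (0.922, (17, 1, 0)),
    "Bishop": (0.920, (11, 9, 1)),
}

for name, (individual_figure, record) in goalies.items():
    print(name, individual_figure, round(win_fraction(*record), 3))
```

Nearly identical individual figures sit next to very different win fractions, which is the point of the analogy: the outcome depends heavily on the group a person works with.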

Progression, as suggested by U of Kentucky's Bill James, is not linear; it moves up and down, trending either up or down over time, much like a wave.
[Figures: a straight line (not like this) and a wave (more like this)]
Moral side
It dehumanizes teachers and students, having no regard for how external factors can affect teacher/student performance.
The teaching corpus, and therefore the content, is highly impacted by what the testmakers believe is important, regardless of whether it is or not (at a local, global or personal level)
The ''test'' culture tends to be very math and science driven, i.e. economically driven. This tendency to try to form as many new economic agents as possible is denaturing the learning process.
Types of Standardized Tests
Been here done that
Achievement Tests
Assess how much students have learned from the things they have been taught in the classroom
Focus on curriculum
Content validity
Embraces memorization
Allow us to see how well students perform compared with the performance of students elsewhere
Ability Tests
Assess a general or specific capacity to learn
Also known as IQ tests
Used for prediction and to estimate how well students are likely to learn in the future
Everyday experiences rather than curriculum
Specific Aptitude Tests
Assess how well students are apt to perform in a particular area
Examples: art, music, or auto mechanics
Used by schools to select students for specific programs
Help with the counseling of students about future educational plans
''Merit''-based programs have already been developed in the past and have failed repeatedly (Gratz, 2009)
Other incentives have worked better to improve overall performance, such as paying more for ''tougher'' schools and taking more
Various Types of Assessments
"The process of observing a sample of students' behavior and drawing inferences about their knowledge and abilities." (Omrod, 2006)
Paper-pencil Assessment
Traditional Assessment
Standardized Tests
Informal Assessment
Formal Assessment
Let's look back at how far we've come
Teacher-Developed Test
Authentic Assessment
Performance Assessment
Let's look back at how far we've gone
Assessment, done right this time!
A Halloween Tale
Four girls were walking home from a Halloween party in 2002. They were walking by an old abandoned factory that stood next to a field. The factory was said to be haunted, and many people in the area refused to set foot inside the factory grounds. When they got to the middle of the field, one of the girls said it would be fun to explore the old factory. The other girls were scared at first, but eventually one of them agreed to do it just for fun so they could tell their friends at school about it. Two of the girls climbed over the fence and the other two girls waited outside for them. After about twenty minutes had passed, the two remaining girls started getting worried. Suddenly, they heard blood-curdling screams coming from inside the old factory. It sounded like their friends. Terrified, the two girls ran all the way home. The two girls who went into the factory were never seen again. Today, the factory still stands. They say that if you dare to enter the grounds on Halloween night, then you too will vanish, never to be seen again.

Name three types of assessments.
Formal, informal, paper-pencil, performance assessment, traditional, authentic, standardized, teacher-developed
Name four characteristics of a good assessment.
Why is progression not a good metric to evaluate teachers?
Because progression is not linear.

Daily observations.
Planned ahead of time;
Used to provide information for a specific instructional purpose;
Students can prepare for it in advance.
Students write down answers to questions;
Can involve questions, topics, or problems.
Perform a skill to demonstrate knowledge about the topic in question.
Evaluate basic skills which can be applied in the outside world.
Measuring the actual knowledge that is required from the students;
Applied in "real-life" context.
Tests constructed by test construction experts or test publishing companies;
Used in many different schools and classrooms to determine general progress.
Evaluate learning abilities specific to one's classroom.
Airasian, P. W. (1994). Classroom assessment (2nd ed.). New York: McGraw-Hill.

Bender, T. A. (1997). Assessment of subjective well-being during childhood and adolescence. In G. D. Phye (Ed.), Handbook of classroom assessment: Learning, achievement, and adjustment. San Diego, CA: Academic Press.

Omrod, J. E. (2010). Principles of educational psychology (2nd Canadian ed.). Toronto: Pearson Education Canada.