
# Fall12: INF1240 Surveys & Experiments

Lecture Slides, Week 6, Oct.15 2012

by Sara Grimes on 15 October 2012

#### Transcript of Fall12: INF1240 Surveys & Experiments

INF1240H: Research Methods

Surveys, Experiments & Quasi-Experiments

Quantitative research: sampling in ways that allow generalization:

distribution of population across "KNOWN" categories

everyone in the population has an equal and known chance of being included

i.e. probability sampling (can be submitted to statistical analysis using probability theory: dispersion, distribution, "normal", etc.)

4 Steps to framing your sample:

defining your target population

locating a sample frame

choosing random probability or non-random method

deciding on the sample size

VALIDITY, RELIABILITY: Defining your terms (+ e.g. wording your questionnaire)

Validity, e.g. construct validity: are you actually measuring what you claim to be measuring?

Reliability: can your study be reproduced?

Sample size: The relationship between sample size and sampling error isn't direct (sampling error shrinks roughly with the square root of the sample size, not in proportion to it). Choice of sample size depends on the desired confidence level (19/20).

For a "general population" (e.g. of Canada) a sample of 2000 will usually suffice. Larger samples would not yield significantly more accurate data.

Two major types of surveys:

Descriptive: Draws a picture (describes); documents current conditions or attitudes

Analytical: Tries to describe and explain WHY certain situations exist.

Survey Design and Administration

Surveys: Pros

Can investigate problems in realistic settings

Reasonably cost-effective, easy to orchestrate

Yields large amounts of data - possible to produce generalizable findings

Existing survey data & instruments can be of huge help (census, Stats Can, etc.)

Operationalizing a cause-effect relationship

Studies of determinacy - examine why things change over time; establish the degree of association between 2 independently measured variables (independent & dependent variable).

Experiments/Quasi-experiments - differ in terms of the setting, intervention (yes/no), control (group, sample), & type of relationship under study.

Designing Experiments:

If X causes a change in Y, how can I observe this relationship in such a way as to confidently say whether X significantly contributes to that change?

Experiments: Operationalization of "change"

Videogame Violence Debates (©2008 Gaygamer.net): A question of methodology?

Correlation vs. causality

Operationalization is crucial in surveys & experiments

Sampling

Generalization

NOT THE SAME THING

Experiments try to establish causality

Group Exercise

In groups of 3-4:

How would you "sample" for your own research projects?

How does this relate to how you are operationalizing your variables?

Normal Probability Distributions (bell curve)

Probability Based Sampling:

A statement that attempts to describe a pattern of behaviour or relationship which is, on average, correct, i.e. REPRESENTATIVE of the population as a whole.

Simple Random: e.g. generate phone numbers, draw out of a hat: each respondent is selected randomly, one at a time, independent of another and without replacement.

Systematic Random: Determine the number of entries and number of respondents to be selected, then divide. E.g. call every 60th name in the phonebook.

Stratified Random: If some characteristic(s) are known, can structure sampling to reflect total population (e.g. ensure 52% females, accurate representation of ethnic diversity, etc.)

Confidence Level: Measured through a Bell Curve. How confident you are that the sample(s) represents the whole. The standard in the social sciences is approx. 19/20, i.e. a confidence level of 95% that the sample is representative.

Confidence Interval: The range within which the true population value is likely to fall, given the sample result. Half of its width is the margin of error.

Margin of Error: The more we sample, the smaller each additional reduction in the margin of error becomes. That's why huge surveys (over 10,000) aren't needed, even when the population is large.

Population: That collection of individuals, communities, or nations (or subcultures, or target market) about which one wishes to make a general statement.

Sample Frame: The list from which you draw a sample.

Sample: A segment selected from a population; it is then interpreted (if probability based) to represent that population.

Quasi-experiments: should at least "say something about what is possible in ordinary settings and have some interest to practitioners" (Knight, p.75).

True experiments - "routinely criticized" for artificiality

Knight: "Arguably, the main difference between 'true' and 'quasi' experiments lies in the amount of success the researcher has - and wants to have - in designing...threats [e.g. contamination, learning effect] out of the study" (p.75)

Quasi-Experimental

Size Matters, but it's also relative

However, if representative subgroups are important, may need a bigger sample to ensure generalizability.

BUT - you cannot eliminate sampling error altogether; results are never 100% representative.

Sampling error: the extent to which the sample differs from the population it was drawn from. Based on sample size and extent of homogeneity/diversity of population.

Margin of error: the "amount" of random sampling error, + or -.

Pollsters try to aim for a margin of error of + or - 2.5%, on a sample size of 1500 respondents
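The pollster figure above can be checked with a small calculation. This is a minimal sketch, not something taken from the slides: it assumes the standard normal-approximation formula for a proportion from a simple random sample, a 95% (19/20) confidence level (z = 1.96), and the worst-case split p = 0.5. The function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumptions (not from the slides): simple random sample, normal
    approximation, z = 1.96 for a 95% confidence level, and p = 0.5
    as the most conservative (widest) case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a few sample sizes, including those mentioned in the lecture.
for n in (500, 1500, 2000, 10000):
    print(f"n = {n:>6}: +/- {margin_of_error(n) * 100:.1f}%")
```

Under these assumptions a sample of 1500 gives roughly + or - 2.5%, and going from 2000 to 10,000 respondents only narrows the margin by about one percentage point, which illustrates why very large surveys add little.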

Confidence level: e.g. 19/20: were the study to be replicated over & over, results would fall within the margin of error 95% of the time.

Independent Variable: The variable thought to be the “Cause” in a “cause-effect” relationship. It is a variable that has been chosen as a possible influence on variations in a dependent variable.

E.g. gender: a study that identifies “gender” as its independent variable is positing that gender is at the root of differences between male responses and female responses (opinions, attitudes, behaviours) on a particular question (dependent variable).

Multi-Stage Cluster Sampling

Community is randomly selected, then within this community, various units are randomly selected for study.

– E.g. Throw a dart on a map of Canada and visit wherever the dart falls – Randomly pick a city block. Go there, and randomly pick some houses to visit and survey.
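The probability-based methods in this transcript (simple random, systematic, stratified, multi-stage cluster) can be sketched in a few lines. Everything here is an illustrative assumption rather than part of the lecture: the sample frame of 6000 made-up people (52% tagged "F"), the 20 blocks of 30 houses, and the function names.

```python
import random

def simple_random(frame, k, rng):
    """Draw k units at random, without replacement."""
    return rng.sample(frame, k)

def systematic(frame, k, rng):
    """Pick a random start, then take every (len(frame) // k)-th unit."""
    step = len(frame) // k
    start = rng.randrange(step)
    return [frame[start + i * step] for i in range(k)]

def stratified(frame, k, key, rng):
    """Sample each stratum in proportion to its share of the frame."""
    strata = {}
    for unit in frame:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(frame))
        sample.extend(rng.sample(members, share))
    return sample

def two_stage_cluster(clusters, n_clusters, per_cluster, rng):
    """Randomly pick clusters, then randomly pick units inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in rng.sample(clusters[c], per_cluster)]

# Hypothetical sample frame: 6000 people, 52% tagged "F", 48% tagged "M".
frame = [(i, "F" if i < 3120 else "M") for i in range(6000)]
# Hypothetical clusters: 20 city blocks with 30 houses each.
blocks = {b: [f"block{b}-house{h}" for h in range(30)] for b in range(20)}

rng = random.Random(1240)  # fixed seed so the sketch is repeatable
print(len(simple_random(frame, 100, rng)))
print([u[0] for u in systematic(frame, 100, rng)][:3])
print(sum(1 for u in stratified(frame, 100, lambda u: u[1], rng) if u[1] == "F"))
print(two_stage_cluster(blocks, 3, 5, rng)[:2])
```

With 6000 entries and 100 respondents, the systematic step works out to every 60th entry, matching the phonebook example, and the stratified draw keeps the 52% share fixed instead of leaving it to chance.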

Convenience: Whoever is there at the time, place – whoever is available. Online surveys, mall intercepts, etc.

Quota: “Fancy” form of convenience – systematic selection based on pre-defined criteria (e.g. want to have an even number of men and women)

Snowball: Useful when trying to reach people who are hard to get – find one, ask them if they can connect you with someone similar (e.g. tattoo artists, supermodels, drug addicts)

Probability based: Every member of the "population" has an equal or known chance of being chosen for a study. Findings are representative, and respondents are randomly selected. External validity.

Non-probability based: People in the population do NOT have an equal chance to participate in the study. Not generalizable, not randomly selected, not representative. No external validity.

Surveys: Cons

Independent variables cannot be manipulated

Causal relationships can NOT be determined

Inappropriate wording and question order can create biased results

Some survey research is becoming increasingly difficult to conduct (low response rates - 30%)

Must meet 3 conditions:

the independent variable precedes the dependent variable in time;

no intervening variables or spurious linkages;

clear explanatory rationale of the direction of influence.

Dependent Variable: A variable thought to be influenced by other variables. It is the “Effect” in a potential cause-effect relationship.

E.g. Video game playing – If males are found to play more hours a week of video games than females, video game play is the dependent variable, thought to vary by gender.

Additional Variables:

Control Variables: A variable that is taken into account in exploring the relationship between an independent and dependent variable.

Three types:

Intervening variable: links the independent variable to a dependent variable – the “middle man”.

Conditional variable: a variable that accounts for a change in the relationship between an independent and dependent variable when the conditions change.

Source of spuriousness (or “confounding”) variable: Variable that’s viewed as a possible influence on both the independent and dependent variables, in such a way that it accounts for the relationship between them.

Independent Variable: Dependent Variable:

Agenda: Oct.15, 2012

Assignment 2 Due

Blog visit: http://1240blog.blogspot.ca/

Lecture: surveys, experiments & quasi-experiments

If time: in-class exercise
