**AQA PSYA4: Research Methods**

**Application of Scientific Method**

**Data Analysis and Reporting**

**Designing Investigations**

**Science**

**Validating new knowledge**

**Research methods and concepts**

**Reliability, validity and sampling**

**Ethical issues**

**Spearman's Rho**

**Chi Squared**

**Mann Whitney U Test**

**Wilcoxon T Test**

Evaluation of Scientific Methods

Can psychology claim to be a science?

Are the goals of science appropriate for psychology?

Reductionism and Determinism

Scientific research is desirable

Psychology uses scientific method

Lack of objectivity and control

Nomothetic vs Idiographic

Scientific methods haven't worked

Qualitative research

Evaluation

Peer Review

Peer Review and the Internet

Finding an Expert

Anonymity

Publication Bias

Preserving Status Quo

Can't correct old research

Experiments

Laboratory Experiments

Field Experiments

Natural Experiments

Experimental Design

Self-Reporting Methods

Observational Studies

Correlational Analysis

Case Studies

Other Research Methods

Human

Non-Human

Moral Justification

Existing constraints

Ethical Issues

Code of Conduct

Dealing with Ethical Issues

Reliability

Validity

Sampling Techniques

Experimental Research

Experimental Research

Observational Techniques

Observational Techniques

Self-Report Techniques

Self-Report Techniques

Opportunity Sample

Volunteer Sample

Random Sample

Stratified and Quota Samples

Snowball Sampling

Inferential Analysis, Probability and Significance

Descriptive and Inferential Statistics

Analysis and Interpretation of Qualitative data.

Allows psychologists to produce verifiable knowledge about behaviour that wasn't just commonsense or 'armchair psychology'.

Psychologists can generate models that can be falsified and conduct well-controlled experiments to test these models.

It has been questioned whether just using the scientific method turns psychology into a science.

Miller (1983)

Suggests using the scientific method is just 'dressing up' psychology as a science.

While psychologists may be using scientific tools, that doesn't make psychology a science; it is better described as a pseudoscience.

Hard to measure things objectively as the object of study reacts to the researcher, which can lead to problems such as experimenter bias or demand characteristics.

Heisenberg (1927)

Other sciences also face this problem.

Unable to measure a subatomic particle without the presence of the researcher and the act of measurement affecting it.

Called the uncertainty principle; similar to an experimenter effect, but in physics.

R.D. Laing

Claimed it was inappropriate to view a person experiencing distress as a chemical system gone wrong.

Treatment could only be successful if each person was treated as an individual.

Science takes a more nomothetic approach, trying to find similarities and group people together.

Psychological approaches to treating mental illnesses have had at best moderate success, suggesting that the goals of science may not always be appropriate.

Even more qualitative methods are still scientific in their aim to be valid. They are made objective by being compared with others as a means of verification.

Reductionist

Reduces complex phenomena to simple variables to allow the study of causal relationships between them.

Deterministic

Deterministic in search of causal relationships because otherwise scientific research could not be used as a method of understanding behaviour.

Assessment of scientific work by others who are qualified in the field.

To ensure that research is conducted and published to a high standard.

Three main purposes:

Allocation of research funding

Publication of research in scientific journals

Assessing the research rating of university departments.

With the increasing availability and fast pace of information on the internet it is hard to maintain the quality of information.

On many sites such as Wikipedia and Philica it is the reader who decides what is valid or not. Scientific research should really only be peer reviewed by people with appropriate qualifications, but on the internet anyone can add to the flow of information, true or not.

Conducted in a laboratory or laboratory-like setting.

Advantages

Disadvantages

High in internal validity

Replicability

Causational

Low in mundane realism

Can be low in external validity

Experimenter bias/demand characteristics.

Advantages

High mundane realism

High external validity

Experimenter effects reduced

Disadvantages

Demand characteristics

Lower internal validity

Ethical issues (deception)

Experiment conducted in a much more natural environment.

Experiment that takes advantage of naturally occurring IVs.

Correlational not causal.

Advantages

Disadvantages

Low demand characteristics

High in mundane realism

Time consuming

Sometimes few valid results produced

Hard to determine causation

Low internal validity

Independent Groups

Matched Pairs

Repeated Measures

Not always possible to find an appropriate expert to review a research proposal.

Smith (1999)

If an individual does not properly understand the material, this could lead to poor research being passed.

Anonymity usually practised so reviewers can be honest and objective.

However, anonymity can lead to rivals settling scores or trying to bury others' research, as social circumstances inevitably affect objectivity.

This has led to some journals practicing open reviewing to combat this.

Journals prefer to publish positive results; it has been suggested that this is because editors want to publish research with important implications.

Journals also tend not to publish replications of studies, which suggests journals are as bad as newspapers for wanting eye-catching stories.

Leads to publication bias that in turn leads to misrepresentation of facts.

Richard Horton

"The mistake, of course, is to have thought that peer review was any more than a crude means of discovering the acceptability - not the validity - of a new finding"

Science is generally resistant to change, so research that could cause a large shift in opinion is likely to be ignored rather than accepted.

Brooks (2010)

Points to peer-reviewed research that was subsequently debunked but is still used in debate in parliament.

Once research has been published the results remain in the public view even if they are later shown to be wrong or just poor research.

Different people in different conditions.

Advantages

Disadvantages

Advantages

Disadvantages

Participants matched to a person in the other condition based on various criteria.

Criteria could be things such as age, gender or IQ.

Advantages

Disadvantages

Each participant tested under all conditions.

No order effects

Fewer demand characteristics

Hard to make comparison

More people needed

Individual differences reduced

Lower order effects

Time consuming

People never totally similar

Easy to compare

Fewer people needed

Increased demand characteristics

Need counterbalancing

There are other methods such as:

Content analysis: a kind of observational study

Cross-cultural research: comparing the effects of different cultural practices on behaviour

Meta-Analysis: combines results of many studies to reach overall conclusions.

Detailed studies into a single individual, institution or event.

Generally longitudinal.

Uses information from a range of sources such as the person concerned and also from their family and friends.

Allows study of complex interaction between many factors.

Difficult to generalise from case studies.

Can be unreliable due to heavy use of recollection.

Correlation does not show causation.

Useful for identifying relationships between co-variables.

Other intervening variables could explain why the variables are linked.

May lack validity.

Interviews

Questionnaires

Structured

Unstructured

Open Questions

Closed Questions

Use sampling methods:

Time sampling

Event sampling

Naturalistic or controlled.

Observer bias possible.

Mutually exclusive behavioural categories to record instances of behaviour.

Questions developed as a response to previous answers.

Real-time, face-to-face.

Set of questions given to participants.

More repeatable

Provide qualitative data.

Rich insight

Harder to analyse

Easier to analyse

Provide quantitative data

Advantages

Advantages

Advantages

Advantages

Advantages

Disadvantages

Disadvantages

Disadvantages

Disadvantages

Disadvantages

People from target population that are most easily available.

Usually people who are geographically close.

Quick

Easy

100% attendance

Not representative

Sample bias

Participants volunteer to take part in study.

Immediate consent

Participants easy to get

Sample bias (volunteer bias)

Participants selected randomly from target population.

Quick

Easy

Potentially unbiased

Time consuming

Can be biased if people refuse participation

Advantages

Sample bias reduced

Disadvantages

Time consuming

May not get representative sample

Systematic Sample

Pick participants with a system

Identify a few suitable participants, ask them to point in direction of other possible candidates.

Useful when hard to find suitable participants (e.g. eating disorder studies)

Prone to bias

Strata identified within a population.

Predetermined number of participants taken from each group in proportion to its representation in the target population.

Stratified : Done by random techniques

Quota : Done using opportunity sampling.

Representative sample

Ecological validity

Generalisability

Time consuming

Large sample needed

Opportunity sample can lead to bias
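
The random, systematic and stratified techniques described above can be sketched in a few lines of Python. This is only an illustration: the function names, the Year 12/Year 13 population and the seed are hypothetical and do not come from the notes.

```python
import random

def random_sample(population, n, seed=None):
    """Random sample: every member has an equal chance of selection."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, step, start=0):
    """Systematic sample: pick every `step`-th member, beginning at `start`."""
    return population[start::step]

def stratified_sample(strata, n, seed=None):
    """Stratified sample: random selection from each stratum, in proportion
    to that stratum's share of the target population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Rounding can shift the overall sample size slightly for awkward proportions.
        k = round(n * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical target population: 60 Year 12 and 40 Year 13 students.
strata = {
    "Y12": [f"Y12-{i}" for i in range(60)],
    "Y13": [f"Y13-{i}" for i in range(40)],
}
population = strata["Y12"] + strata["Y13"]

simple = random_sample(population, 10, seed=1)
systematic = systematic_sample(population, step=10)
stratified = stratified_sample(strata, 10, seed=1)
print(len(simple), len(systematic), len(stratified))
```

A quota sample would follow the same proportions as `stratified_sample` but fill each group with whoever is most easily available rather than by random selection.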

Informed Consent

Deception

Right to Withdraw

Protection from Harm

Confidentiality

Privacy

Giving the participant all the information they need to make an informed decision about whether or not to participate in the study.

Can cause demand characteristics if the participant is told too much.

Issues if study could cause harm or distress and they haven't given full consent.

Deliberately misinforming participant.

Stops participant being able to give informed consent.

Giving participants the right to leave the study if they feel uncomfortable.

Important if participant has been deceived.

Could create sample bias.

Covers all physical and psychological harm.

Important that no permanent damage caused.

Hard to guarantee that no harm will be caused.

Participant has right to keep personal data secret.

Can be hard as often publishing research can make the identity of participants obvious.

Sometimes hard not to invade participants' privacy.

If behaviour is being observed in the participant's own home, privacy becomes a big issue.

Respect

Competence

Responsibility

Integrity

Standards of privacy, confidentiality and informed consent.

Respect for the dignity and worth of all persons.

Deception only acceptable when needed to protect the integrity of the research.

Maintain high standards.

Responsibility to protect from harm.

Responsibility to debrief participants to identify any unforeseen harm and arrange for assistance if needed.

Should be honest and accurate.

Should bring instances of misconduct to the attention of the BPS.

Informed Consent

Deception

Right to Withdraw

Protection from Harm

Confidentiality

Privacy

Presumptive consent:

Group of people similar to participants asked if they would be okay with the study.

Used to presume if the participants would/wouldn't agree to participate.

Asked to sign document agreeing to participate.

Right to withdraw given.

Need for deception judged by ethics committee.

Debriefed at end of study.

Participant can withdraw information after debrief.

Can withdraw if they feel uncomfortable.

May feel unable to, especially if they have been paid for participation.

Researchers try to avoid harmful situations.

Try to keep risk of harm lower than that of everyday life.

Participants kept anonymous.

If participants need to be able to be distinguished from each other, alias or number used.

Observation only acceptable with informed consent.

Allow participants private space as much as possible.

If a psychologist is deemed to have deviated from acceptable ethics, they may be disbarred from the BPS and therefore unable to practise psychology.

Several reasons for using animals in psychological research:

Offer greater control and objectivity in studies

Can be used when experiments would not be viable for humans (e.g. Harlow's monkeys)

Enough physiology and evolutionary past in common to be able to draw conclusions about humans from animal testing.

Sentient Beings

Speciesism

Animal Rights

There is evidence that animals other than primates are sentient and yet they are still used in research.

Some brain-damaged human individuals may not be sentient, yet they would not be used in research without consent.

There is evidence to suggest animals can respond to pain even if they may not be able to feel it.

Peter Singer (1990)

Discrimination on basis of species is no different from ageism or racism.

Gray (1991)

We have no duty of care towards animals and so speciesism is not equivalent to human isms such as racism.

Singer's view is utilitarian (Greater good)

Regan (1984)

Animals have right to be treated with respect and should not be used in research.

No circumstances under which animal research is acceptable.

Having rights is dependent upon having responsibilities in society; animals do not have responsibilities and so it can be said they do not have rights.

Animals (Scientific Procedures) Act (1986):

Requires animal research only takes place in licensed facilities.

To acquire license, must meet criteria:

Potential results important enough to justify the use of animals.

Cannot be done using other methods.

Minimum numbers of animals used.

Discomfort or suffering kept to minimum.

3 R's:

Reduction

Replacement

Refinement

Russell and Burch (1959)

Replicability

When an experiment is repeated, all conditions must be the same, or a change in result could be due to a changed condition.

Consistency

Two or more observers should get the same record.

Extent to which the observers agree is called inter-observer reliability.

Reliability of observations can be improved through training observers to use the coding system/behaviour checklist appropriately.

Internal Reliability: all questions should be measuring same thing.

External reliability: Measure of consistency.

Inter-interviewer reliability: Whether two interviewers produce the same outcome.

Split-half method: Compare performance on two halves of a questionnaire.

If test is assessing same thing in all questions then scores should have a close correlation between both halves.

Test-retest method: Given questionnaire and then given it again after an amount of time.

If measure is reliable outcome should be the same.
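
As a rough illustration of the split-half method, the sketch below splits each participant's answers into two halves and correlates the half totals. The odd/even split, the use of Pearson's r and the response data are all assumptions made for the example; the notes do not specify them.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)

def split_half_reliability(responses):
    """Correlate each participant's total on items 1, 3, 5... with their
    total on items 2, 4, 6... (one row of item scores per participant)."""
    first_half = [sum(row[0::2]) for row in responses]
    second_half = [sum(row[1::2]) for row in responses]
    return pearson_r(first_half, second_half)

# Hypothetical questionnaire: 4 participants, 4 items scored 1-5.
responses = [
    [1, 1, 2, 1],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [2, 2, 1, 2],
]
reliability = split_half_reliability(responses)
print(round(reliability, 2))
```

The same `pearson_r` helper could be reused for test-retest reliability by passing the scores from the first and second administrations of the questionnaire.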

Lab experiments aren't always low in external validity and field experiments aren't always high in it.

Validity depends more on how artificial and contrived an experiment is, and on how aware the participant is of being studied.

Mundane Realism

Generalisability

Internal Validity

External Validity

How applicable an experiment is to the real world.

How far the results can be applied to the general population.

The more representative the sample, the more generalisable the results.

High in internal validity if the experiment measures what it intended to measure.

Affected by internal validity.

External validity deals with how applicable the results are to different times (historical validity), other people and cultures (population validity) and other situations (ecological validity).

Observations cannot be valid if the coding system/behavioural checklist is flawed.

Observer bias: when what is observed is affected by expectations of observer.

Observational studies are likely to have high ecological validity due to more natural behaviour, though this is not always the case.

Face Validity

Concurrent Validity

Does the test look like it's measuring what it is intended to measure?

Established by comparing performance on new questionnaire with previously established test on same topic.

If performance has high correlation then this is evidence of high concurrent validity.

Probability

Inferential Tests

Descriptive

Inferential

Measures of Central Tendency

Measures of Dispersion

Graphs

Selecting the right test

Justifying your choice

Stating conclusions

Inferential tests allow researchers to work out the likelihood that a result occurred by chance.

There are different significance levels that psychologists can choose but usually a 5% significance level is chosen.

Significance levels are degrees of uncertainty.

A 5% significance level means there is a 5% chance of the results occurring even if there is no real association between the samples.

Help to draw conclusions about populations based on samples drawn from them.

Allows us to infer if a pattern is due to chance or not.

Observed and Critical Values

Observed Value

Critical Value

By using statistical tests we create a test statistic. The test statistic for any set of data is called the observed value as it is based on the observations made.

Critical values are the values compared against the observed value to determine if it is significant or not.

The critical value is the number the observed value must reach in order to be significant.

Finding the appropriate critical value:

Degrees of freedom (df)

One-tailed or two-tailed test

Significance level

In most cases this can be gained by looking at the number of participants in the study (N)

If directional hypothesis: One-tailed

If non-directional hypothesis: Two-tailed

Usually P=0.05 (5%)

Observed value either needs to be more or less than this value. (It should be stated under each table which)

Choosing which statistical test

Depends on research design and level of measurement.

Justifying choice of test

Levels of Measurement

How to choose which test to use:

Nominal Data? Yes: Chi-Squared. No: go to the next question.

Correlation? Yes: Spearman's Rho. No: go to the next question.

Independent Groups? Yes: Mann-Whitney U. No: Wilcoxon T.
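
The three questions used to choose a test (nominal data, correlation, independent groups) can be written as a small function. This is a minimal sketch; the function name and argument names are made up for illustration.

```python
def choose_test(nominal: bool, correlation: bool, independent_groups: bool) -> str:
    """Walk through the three questions in order and return the test name."""
    if nominal:
        return "Chi-Squared"
    if correlation:
        return "Spearman's Rho"
    if independent_groups:
        return "Mann-Whitney U"
    return "Wilcoxon T"

# Ordinal data, test of difference, repeated measures design.
print(choose_test(nominal=False, correlation=False, independent_groups=False))
```

The order of the checks matters: once the data are nominal the later questions are not asked.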

Identify level of measurement with reference to actual data.

State whether a test of correlation or difference is needed and justify this.

If a test of difference is used, state if it is independent groups or repeated measures and justify this.

Nominal

Ordinal

Interval

Data separated into categories

e.g. Tall, Short

Data is ordered in some way

Difference between each item not always the same

e.g. Ordering people by height

Data measured using units of equal intervals

e.g. Counting correct answers

Mean:

Median:

Mode:

Calculated by adding all values and dividing by the number of values.

Can be unrepresentative if there are extreme values.

Not suitable for nominal data.

Middle value of an ordered list.

Not affected by extreme values.

Not all values reflected in the median.

Not suitable for nominal data.

Most common value.

Not useful when there are several modes.

Only measure suitable for categorised data.

Suitable for other data types too.
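
A short sketch of the three measures of central tendency using Python's standard `statistics` module. The scores are hypothetical and include one extreme value to show how it pulls the mean but not the median.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 4, 5, 5, 5, 40]  # hypothetical scores; 40 is an extreme value

score_mean = mean(scores)        # all values added, divided by the number of values
score_median = median(scores)    # middle value of the ordered list
score_modes = multimode(scores)  # most common value(s); a list, as there may be several

print(score_mean, score_median, score_modes)

# The mode is the only one of the three that works for nominal categories.
heights = ["tall", "short", "tall", "tall"]
height_mode = multimode(heights)
print(height_mode)
```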

Range

Standard Deviation

Difference between highest and lowest value.

Easy to calculate.

Affected by extreme values.

Spread of data around the mean.

More precise as all data taken into account but extreme values not expressed.
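
The two measures of dispersion can be sketched as follows. The scores are hypothetical. The notes do not say whether the sample or population form of the standard deviation is intended, so both are shown.

```python
from statistics import pstdev, stdev

scores = [4, 5, 5, 6, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)  # highest value minus lowest value
sample_sd = stdev(scores)                # divides by n - 1 (sample estimate)
population_sd = pstdev(scores)           # divides by n (whole population)

print(value_range, round(sample_sd, 2), round(population_sd, 2))
```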

Bar Chart

Scattergram

Height of bar = frequency

Suitable for all levels of measurement.

Dot or cross shown for each pair of values.

Suitable for correlational data.

Bottom left to top right = positive correlation

Top left to bottom right = negative correlation

No pattern = No correlation

Are the data nominal, ordinal or interval?

Is a correlation involved or is there a difference between two data sets?

Is the design repeated measures or independent groups?

Spearman's

Chi-Squared

Mann-Whitney

Wilcoxon

Test of correlation needed as hypothesis stated a correlation.

Data are ordinal or interval.

Therefore the appropriate test is Spearman's Rho.

Data has been put into categories and therefore is nominal.

The results are independent in each cell and the expected frequencies in each cell are greater than 4.

The appropriate test is therefore chi-squared.

A test of difference is required because the hypothesis states a difference.

The design was independent groups and the data was ordinal or interval.

Therefore Mann-Whitney is a suitable test.

A difference test is required as the hypothesis states a difference between the two conditions.

The design is repeated measures or matched pairs and the data scores are interval or ordinal data.

Therefore, a Wilcoxon test is appropriate.

Key features of conclusions:

State observed value.

Say if this is greater or less than the critical value.

State whether the null hypothesis can be accepted or rejected.

For Spearman and Chi-Squared this means the observed value must be greater than or equal to the critical value to reject the null.

For Mann-Whitney and Wilcoxon the observed value must be less than or equal to the critical value.

Restate hypothesis you are accepting.
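
The comparison rule for each test can be captured in one helper. This is a sketch only: the function name is invented, the numbers in the usage lines are placeholders rather than values from a real critical values table, and the "or equal to" boundary follows the rule given for each individual test.

```python
GREATER_OR_EQUAL = {"Spearman's Rho", "Chi-Squared"}
LESS_OR_EQUAL = {"Mann-Whitney U", "Wilcoxon T"}

def is_significant(test: str, observed: float, critical: float) -> bool:
    """Return True if the null hypothesis can be rejected for this test."""
    if test in GREATER_OR_EQUAL:
        # For Spearman only the size of rho counts, not its sign.
        return abs(observed) >= critical
    if test in LESS_OR_EQUAL:
        return observed <= critical
    raise ValueError(f"Unknown test: {test}")

# Placeholder values: always look up the real critical value in the table.
print(is_significant("Spearman's Rho", observed=-0.7, critical=0.6))
print(is_significant("Mann-Whitney U", observed=9, critical=5))
```

For a directional hypothesis, the direction of the result still has to be checked separately before the alternative hypothesis is accepted.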

Summarising qualitative data

Inductive

An iterative process

General Principles

Validity and Reflexivity

1. Read and reread data, trying to understand meaning communicated and perspective of participants.

2. Break the data into meaningful units.

3. Assign label/code to each unit. These will form the basis of your categories.

Each unit may be given more than one code/label.

4. Combine the simpler codes into larger categories/themes.

5. Check can be made on categories by collecting new data and applying the categories to it. If the new data fits then the categories represent the data appropriately.

6. Final report should discuss the themes and include material to illustrate them.

7. Conclusions can be drawn, which may include new theories.

Reflexivity is the term used to describe the extent to which the process of research reflects a researcher's values and thoughts.

Reflexivity

Validity

May be demonstrated using triangulation.

Triangulation

Compares results from a variety of different studies of the same thing/person.

The studies will likely have used different methodology.

If the results agree this supports their validity, if they disagree this can lead to further research to increase understanding.

Difficult to summarise, cannot use measures of central tendency or spread.

Must instead try to identify repeated themes.

Most qualitative research aims to be inductive, finding theories FROM the data.

Less commonly used is a deductive approach. Used in triangulation, previously existing categories are used to categorise new data.

The data must be gone through repeatedly which takes a lot of time.

The main intention is to find a way to order the data that represents the participants' perspective.

Errors

Type 1

Type 2

Type 1 errors are false positives, i.e. the null hypothesis is rejected even when there is no difference.

These are more likely when the significance level is too lenient.

Type 2 errors are false negatives, i.e. the null hypothesis is accepted even when there is a difference.

These are more likely when the significance level is too stringent.

Procedure:

Step 1 : State alternative and null hypothesis

Step 2: Record data, rank each co-variable and calculate the difference.

Step 3: Find the observed value of rho (correlation coefficient)

Step 4: Find the critical value of rho

Step 5: State the conclusion

Null hypothesis:

There will be no correlation.

Alternative hypothesis:

There will be a correlation (non-directional)

Or

There will be a correlation in this direction (directional)

Rank each variable separately from low to high.

If there are two or more of the same number then calculate the mean of the ranks that those numbers would have been given and this is the rank for all of them.

Can be useful to use a table for all the values.

| Participant Number | Variable 1 | Rank A | Variable 2 | Rank B | Difference between A and B (d) | d^2 |
|---|---|---|---|---|---|---|

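Using the formula:

$$\rho = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$

and the values already calculated in the table, calculate the value of rho. (N is the number of participants and d is the difference between each pair of ranks.)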

If the hypothesis is directional, a one-tail test is used.

If the hypothesis is non-directional, a two-tailed test is used.

Use the table of critical values to find the critical value.

When comparing the observed value to the critical values only the number counts not the sign.

The sign just shows if the correlation is positive or negative.

If the observed value is greater than or equal to the critical value then the null hypothesis is rejected and the alternative hypothesis is accepted.

However, if the hypothesis states a positive correlation and a negative correlation is found then we still need to accept the null as it does not meet the criteria of the alternate.

Same the other way around.
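
Steps 2 and 3 of the Spearman procedure (ranking with tied ranks, then calculating rho) can be sketched as below. It assumes the standard formula rho = 1 - 6Σd² / (N(N² - 1)); the function names and the two lists of scores are hypothetical.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the mean of the ranks
    they would otherwise have been given."""
    ordered = sorted(values)
    ranks = []
    for value in values:
        first = ordered.index(value) + 1  # first rank position this value occupies
        ties = ordered.count(value)       # how many scores share this value
        ranks.append(first + (ties - 1) / 2)
    return ranks

def spearman_rho(variable_1, variable_2):
    """Observed value of rho from the summed squared rank differences."""
    rank_a = average_ranks(variable_1)
    rank_b = average_ranks(variable_2)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    n = len(variable_1)
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical co-variables for five participants.
variable_1 = [10, 20, 30, 40, 50]
variable_2 = [1, 3, 2, 5, 4]
rho = spearman_rho(variable_1, variable_2)
print(round(rho, 2))

# Tied scores: the two 5s would take ranks 3 and 4, so each gets 3.5.
print(average_ranks([5, 3, 5, 1]))
```

The observed value still has to be compared with the critical value from the table for the right N, tail and significance level.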

Repeated measures design.

Independent groups design.

The expected frequencies for each shouldn't be less than 5, so for a 2x2 table this would mean no less than 20 participants.

Procedure

Step 1: State alternative and null hypothesis

Step 2: Draw up contingency table

Step 3: Find observed value by comparing observed and expected frequencies for each cell.

Step 4: Add all values in final column

Step 5: Find the critical value of chi square

Step 6: State the conclusions

Null hypothesis: There will be no difference.

Alternative hypothesis:

There will be a difference (non-directional)

Or

There will be a difference in this direction (directional)

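Contingency table (Variable 1 across the columns, Variable 2 down the rows):

| | Column 1 | Column 2 | Totals |
|---|---|---|---|
| Row 1 | Cell A | Cell B | A + B |
| Row 2 | Cell C | Cell D | C + D |
| Totals | A + C | B + D | A + B + C + D |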

The value of chi-squared is the sum of the values in the last column.

Calculate degrees of freedom:

(Number of rows - 1) x (Number of columns - 1)

In this case:

df = (2-1) x (2-1) = 1

Look up the value on critical values table, whichever value corresponds to the degrees of freedom (right hand column) and the significance level (top rows) is the critical value.

If observed value is less than critical value then null hypothesis accepted. If observed value is more than or equal to critical value then we reject the null hypothesis and accept the alternative.

If the difference is not in the direction that the hypothesis suggested though, the null must still be accepted and the alternative rejected.
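
The chi-squared calculation can be sketched as follows. The notes do not give the cell formula, so this assumes the standard one: expected frequency = row total x column total / grand total, and each cell contributes (observed - expected)² / expected. The frequencies are hypothetical.

```python
def chi_squared(table):
    """Return (observed chi-squared value, degrees of freedom) for a
    contingency table given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    observed_value = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            observed_value += (observed - expected) ** 2 / expected

    # df = (number of rows - 1) x (number of columns - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return observed_value, df

# Hypothetical 2x2 table: cells A, B on the first row and C, D on the second.
table = [
    [10, 20],
    [20, 10],
]
chi2, df = chi_squared(table)
print(round(chi2, 2), df)
```

Here every expected frequency is 15, so the requirement that none falls below 5 is met.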

Procedure

Step 1: State alternative and null hypothesis

Step 2: Record the data in a table and allocate points.

Step 3: Find observed value of U

Step 4: Find the critical value of U

Step 5: State the conclusion

Null hypothesis: There will be no difference.

Alternative hypothesis:

There will be a difference (non-directional)

Or

There will be a difference in this direction (directional)

To give points, take each score individually and compare it to all the other scores of the OTHER group.

Every time a score in the other group is higher than the score you're looking at: 1 point

Every time a score in the other group is the same as the score you're looking at: 0.5 points

The observed value of U is the lower of the two groups' points totals.

In this case, that would be 5.5

N1 = Number of participants in group 1

N2 = Number of participants in group 2

The critical value tables for Mann-Whitney compare N1 against N2 so that's how you find the critical value.

Your N1 against your N2 in the table = the critical value for your test.

You need to pick the right table though: one-tailed (directional hypothesis) or two-tailed (non-directional hypothesis), and the right significance level, usually 5% unless stated otherwise.

If the observed value is less than or equal to the critical value then the difference is significant and so the null hypothesis can be rejected and the alternative hypothesis accepted.

If using a directional hypothesis though, difference must be the direction stated in the hypothesis in order to reject the null, otherwise the null must still be accepted even if the difference is significant as it is significant in the wrong direction.
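
The points method for finding U can be sketched as below. The function names and the two groups of scores are hypothetical; they are not the data behind the worked value quoted above.

```python
def points_for_group(group, other_group):
    """Total points for one group: each score earns 1 point for every
    higher score in the other group and 0.5 for every equal score."""
    total = 0.0
    for score in group:
        for other_score in other_group:
            if other_score > score:
                total += 1
            elif other_score == score:
                total += 0.5
    return total

def mann_whitney_u(group_1, group_2):
    """Observed U is the lower of the two groups' points totals."""
    return min(points_for_group(group_1, group_2),
               points_for_group(group_2, group_1))

# Hypothetical scores from an independent groups design.
group_1 = [7, 9, 10, 12]
group_2 = [3, 4, 7, 8]
u = mann_whitney_u(group_1, group_2)
print(u)
```

As a check, the two points totals always add up to N1 x N2 (here 1.5 + 14.5 = 16).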

Procedure

Step 1: State alternative and null hypothesis

Step 2: Record the data, calculate the differences between scores and rank

Step 3: Find the observed value of T

Step 4: Find the critical value of T

Step 5: State conclusion

Null hypothesis: There will be no difference.

Alternative hypothesis:

There will be a difference (non-directional)

Or

There will be a difference in this direction (directional)

Work out the difference between the results and then rank from low to high.

If there are tied ranks then work out the mean of the ranks that would have been given and use that.

Ignore signs, -1 and 1 count as the same difference.

If the difference is zero, omit this result.

T = the sum of the ranks of the less frequent sign.

In this case, the less used sign is minus.

So T = 1

which is the rank of the only minus difference, -1.

N = 4 (as one result was omitted)

Look up critical value in tables against N and under the correct tail and significance level.

If observed value of T is less than or equal to the critical value then the difference is significant and the null hypothesis can be rejected and the alternative hypothesis is accepted.

If using a directional hypothesis, the difference must be in the direction stated in order to reject the null and accept the alternative hypothesis; otherwise the null must be accepted and the alternative rejected.
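
Steps 2 and 3 of the Wilcoxon procedure can be sketched as below. The scores are hypothetical, chosen so that they give T = 1 and N = 4 as in the description above. The handling of an equal number of plus and minus signs (taking the smaller rank total) is an assumption, as the notes do not cover that case.

```python
def wilcoxon_t(condition_a, condition_b):
    """Return (observed T, N) for paired scores from two conditions."""
    # Differences of zero are omitted, which also reduces N.
    differences = [a - b for a, b in zip(condition_a, condition_b) if a != b]

    # Rank the differences from low to high, ignoring sign; ties share the mean rank.
    sizes = [abs(d) for d in differences]
    ordered = sorted(sizes)
    ranks = [ordered.index(s) + 1 + (ordered.count(s) - 1) / 2 for s in sizes]

    plus_ranks = [r for d, r in zip(differences, ranks) if d > 0]
    minus_ranks = [r for d, r in zip(differences, ranks) if d < 0]

    if len(plus_ranks) == len(minus_ranks):
        t = min(sum(plus_ranks), sum(minus_ranks))  # assumption for equal counts
    else:
        t = sum(min(plus_ranks, minus_ranks, key=len))  # less frequent sign
    return t, len(differences)

# Hypothetical repeated measures scores for five participants.
condition_a = [12, 15, 9, 14, 10]
condition_b = [10, 11, 10, 14, 7]
t, n = wilcoxon_t(condition_a, condition_b)
print(t, n)
```

The differences are 2, 4, -1, 0 and 3: the zero is omitted, and the single minus difference has rank 1.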