**1995**

**2000**

**2010**

**1990**

**2005**

**Data analysis techniques**

What is Analysis?

August 2, 1990- Iraq invades Kuwait

Stages of analysis

**2015**

Did you know?

Montana has about three times as many cows as it does people

February 14th, 2005- YouTube is Launched

Chad Hurley, Steve Chen and Jawed Karim are the founders

YouTube is a revolutionary website on which ordinary people can post videos for the world to see

April 19th, 2005- Pope Benedict XVI is elected

He was elected on the second day of the conclave

His birth name is Joseph Ratzinger

He was not actually inaugurated until April 24th

He resigned in 2013

January 22nd, 2006- Kobe Bryant Scores Big

Los Angeles Lakers player Kobe Bryant scores a career-high 81 points in a single game against the Toronto Raptors

This is the second most points ever scored by an NBA player in a single game

April 16th, 2007- Virginia Tech massacre

A gunman by the name of Seung-Hui Cho killed 32 people and injured 23 others

After doing so he committed suicide

At the time, this was the deadliest mass shooting in modern American history

June 14th, 2007- NBA Championship Upset

61st NBA Championship

Spurs vs. Cavaliers- the Spurs sweep the Cavaliers 4 games to 0

November 4th, 2008- Obama is Victorious

Barack Obama and Joe Biden defeat John McCain and Sarah Palin, and become the President and Vice President of the United States

Barack Obama is the first black president the United States has had

Obama is considered one of the most successful and most liked presidents that the United States has ever elected

April 2009- Swine Flu Discovered

Swine flu is discovered and it causes a widespread panic

People under 60 were at greater risk of catching it than older adults

Did you know?

If Bill Gates gave every single dollar of his fortune to the United States government, it would only cover the United States for 15 days.

May 2nd, 2011- Osama Bin Laden Is Shot

Osama Bin Laden was shot and killed by United States Navy SEALs in one of his compounds

This ends the hunt for one of the United States' most infamous terrorist leaders

December 14th, 2012- Sandy Hook Elementary Shooting

Adam Lanza went into a Newtown elementary school and shot and killed 26 people

20 of the fatalities were children

April 15th, 2013- Boston Marathon Bombing

Two terrorists attack the Boston Marathon with two bombs, killing 3 people and injuring 283

One of the terrorists was killed, and the other was captured

March 2014- Ebola Scare

Ebola starts to infect people in Africa, causing a widespread panic, especially in the United States

Future Predicted Events

May 2015: Shaye Metzger scores a perfect on the AP US History Exam

February 1st, 2015: Pittsburgh Steelers win the Super Bowl

November 2024: Shaye Metzger becomes the President of the United States, with his running mate Justin Imwalle, and his First Lady, Kate Upton

December 11th, 2038: Shaye Metzger celebrates his 40th birthday, and gets put into a cryogenic sleep to be sent into space

October 31st, 2130: Shaye Metzger makes a colony on Jupiter

September 56, 23,456.23 Shaye Metzger dies by an Alien Murderer

How do we analyse data?

After transcribing data .. go through 3 stages of qualitative data analysis

Qualitative data

Quantitative data

Transcribing

- Data reduction

- Displaying data

- Drawing conclusions and verifying data

Data reduction

Involves reducing large amounts of data into manageable chunks

Most common form of data analysis in data reduction = coding

Coding = organise raw data (sentences, phrases, words from questionnaires) into categories

Categories given clear valid heading and rule for inclusion

Rule for inclusion helps guide which data you place in each category

E.g. researching 'factors affecting talent development in football' .. a category could be called 'importance of parental tangible support' .. the rule for inclusion could be that the statement refers to concrete support given to the player by a parent (purchasing kit or transport to games)

Can be a positive or negative influence on the player's development
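
The idea of a category heading plus a rule for inclusion can be sketched in Python. This is only an illustration: the statements and keyword list are hypothetical, and keyword matching is a crude stand-in for the researcher's own judgement when coding by hand.

```python
# Each category has a heading and a rule for inclusion.
# Assumption: the rule is approximated by a list of keywords (hypothetical).
categories = {
    "Importance of parental tangible support": ("kit", "boots", "transport", "drove", "paid"),
}

def code_statements(statements, categories):
    """Place each raw statement under every category whose rule it satisfies."""
    coded = {heading: [] for heading in categories}
    uncoded = []
    for statement in statements:
        matched = False
        for heading, keywords in categories.items():
            # Rule for inclusion: the statement mentions concrete support.
            if any(word in statement.lower() for word in keywords):
                coded[heading].append(statement)
                matched = True
        if not matched:
            uncoded.append(statement)
    return coded, uncoded

# Hypothetical interview statements
statements = [
    "My dad drove me to every away game",
    "Mum paid for my first pair of boots",
    "I just enjoy training with my friends",
]
coded, uncoded = code_statements(statements, categories)
print(coded)
print(uncoded)
```

Statements left in `uncoded` would need a new category or a revised rule, which is where the researcher's judgement comes back in.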

Coding

Coding data is breaking it down into smaller parts .. then put it back together in parts that relate to each other .. before making sure categories are valid

Involves line by line analysis of data in minute detail and used to generate categories

Three stages of coding - open, axial and selective

Open coding

Data is broken down and examined .. aim is to identify all the key statements from interviews that relate to aims of your research and research problem

After identifying key statements .. start to put key points that relate to each other into categories .. but need to give each category a suitable heading

When you start to organise data under different categories .. coding process has begun

Axial coding

Next stage is to put data back together

Part of process involves re-reading data you have collected so you can make precise explanations about your areas of interest

You need to refine the categories you started during open coding

May develop new categories

To allow refinement of codes at this stage .. have to ask more questions about the categories and codes you have created

Some questions to consider

- Can I relate certain codes together under a more general code?

- Can I place codes into a particular order?

- Can I identify any relationships between different codes?

Selective coding

Final stage of coding and involves aiming to finalise categories (and codes) so you can group them together

When grouping .. you will use different diagrams to show how your categories link together

Key part - select a main category which will form the focal point of the diagram

Also need to look for data that contradicts previous research .. rather than only data that supports it

Helps you make better arguments and draw more conclusions based on your data

Other techniques

Also electronic packages used to analyse data (in early stages of research career it is unlikely that you will use these .. better to use manual analysis)

ATLAS.ti

NVivo

Displaying data

Different ways to display data

Way you display data will affect the argument or point you are trying to make

Different types of diagrams .. network .. Venn .. radial .. and cycle

Network diagrams

Show hierarchical relationships between different ideas

Venn diagram

Consist of two or more overlapping circles

Show how different topics relate to each other

Radial diagram

Also known as spider diagram .. illustrates a relationship where each item is linked to a central item

Diagram can be thought of as a simple organisation chart that starts from the centre rather than the top

Cycle diagrams

Shows the stages in a process as a continuous cycle

Process is shown as a circle with break at each stage .. and arrowhead to show direction of process

Drawing conclusions and verifying data

Conclusions must be valid and reliable

Two common techniques to achieve this are triangulation and member checking

Triangulation

Refers to using different data collection methods in the same study

E.g. you could use interviews and questionnaires .. or you could use the same interviews with different types of participant (athletes and coaches)

Alternatively .. ask different researchers to collect data and independently draw conclusions before checking their findings with each other

Member checking

During member checking .. you complete your data analysis and draw conclusions relating to the aims of the study

Then show the analysis to participants who took part in the research so they can confirm that you have understood and communicated everything correctly

If analysis is correct .. you can claim the data is valid

How can triangulation be a problem?

Quantitative data analysis techniques

Organising data

Different methods of organising data during quantitative analysis

Methods include range, rank order distribution, simple frequency distribution and grouped frequency distribution

Range

Range is the distance between the highest and lowest value collected

Calculate range by subtracting the lowest value from the highest value
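
As a quick illustration, the calculation can be written in Python. The scores are made up for the example.

```python
# Hypothetical high jump scores in centimetres (illustrative data only)
scores = [142, 155, 150, 161, 148]

def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

print(value_range(scores))  # 161 - 142 = 19
```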

Rank order distribution

Means placing your data into an ordered list from the lowest to the highest in a single column .. ensuring you include all the scores

It's used when the number of participants is less than or equal to 20
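
A minimal sketch of a rank order distribution in Python, using made-up scores. Repeated scores are kept, so every participant still appears in the column.

```python
# Hypothetical scores from five participants (illustrative data only)
scores = [12, 9, 15, 9, 11]

def rank_order(values):
    """Return all scores ordered from lowest to highest, duplicates included."""
    return sorted(values)

# Print as a single column
for score in rank_order(scores):
    print(score)
```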

Simple frequency distribution

Used when the number of participants is greater than 20 and when range is less than or equal to 20

Use with a table that has two columns .. one for raw data scores (x) and one for frequency scores (f)

Frequency column is number of times that particular score was achieved
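
The two-column table can be sketched in Python with `collections.Counter`. The function name and scores are illustrative only.

```python
from collections import Counter

def simple_frequency(values):
    """Return (x, f) rows: each raw score x and the number of times f it occurred."""
    counts = Counter(values)
    return [(x, counts[x]) for x in sorted(counts)]

# Hypothetical scores (a real data set here would have more than 20 participants)
scores = [3, 5, 3, 4, 5, 3]

print("x  f")
for x, f in simple_frequency(scores):
    print(x, "", f)
```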

Grouped frequency distribution

Quantitative research often works with ranges greater than 20

Similar to simple frequency distribution .. table still has columns x and f (x = groups of scores .. f = frequency)

To keep data on single sheet of paper .. use 10-20 groups of scores (ideal number is 15)

Need to decide interval size of each group (i = range/15)
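
A rough sketch of the grouping in Python. It assumes whole-number scores, rounds i up to the next whole number, and starts the first group at the lowest score. These choices are assumptions for the example, not part of the notes above.

```python
import math

def grouped_frequency(values, groups=15):
    """Return the interval size and a list of (group start, group end, f) rows."""
    lowest, highest = min(values), max(values)
    # i = range / 15, rounded up so every score falls inside a group
    interval = max(1, math.ceil((highest - lowest) / groups))
    table = []
    start = lowest
    while start <= highest:
        end = start + interval - 1
        f = sum(1 for v in values if start <= v <= end)
        table.append((start, end, f))
        start += interval
    return interval, table

# Hypothetical scores with a range of 60 (illustrative data only)
scores = [10, 12, 25, 25, 31, 38, 44, 47, 52, 59, 63, 70]

interval, table = grouped_frequency(scores)
print("interval size:", interval)
for start, end, f in table:
    print(f"{start}-{end}: {f}")
```

Rounding i up can leave one group more or fewer than 15, which still sits inside the 10-20 guideline.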

Displaying data

Different ways to display data .. graphs, histograms, bar charts and frequency graphs

Must understand distribution curves before statistical analysis

Normal, positively and negatively skewed

Normal - most of the examples in a data set are close to the average .. whilst few examples are at one extreme or the other

Characteristics:

- curve has single peak

- bell shaped

- average lies at centre of distribution .. and distribution is symmetrical around the mean

- two tails extend indefinitely and never touch the x axis

- shape determined by mean and standard deviation
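
The characteristics above can be checked against the standard formula for the normal curve. The sketch below uses an arbitrary mean of 50 and SD of 10 as example values.

```python
import math

def normal_density(x, mean, sd):
    """Height of the normal curve at x for a given mean and standard deviation."""
    exponent = -((x - mean) ** 2) / (2 * sd ** 2)
    return math.exp(exponent) / (sd * math.sqrt(2 * math.pi))

mean, sd = 50, 10  # example values only

# Single peak at the mean
print(normal_density(mean, mean, sd) > normal_density(mean + 5, mean, sd))
# Symmetrical around the mean
print(math.isclose(normal_density(mean - 15, mean, sd), normal_density(mean + 15, mean, sd)))
# Tails get close to the x axis but stay above it
print(normal_density(mean + 10 * sd, mean, sd) > 0)
```

Changing `mean` shifts the curve along the x axis, and changing `sd` makes it wider or narrower.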

Measures of central tendency and variability

Numbers that describe what is average or typical of the distribution

Measures include mean, median and mode (measures of central tendency) and standard deviation (measure of variability)
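
Python's built-in `statistics` module covers the three measures of central tendency. The scores are made up for the example.

```python
import statistics

# Hypothetical scores (illustrative data only)
scores = [4, 6, 6, 7, 9, 10]

mean = statistics.mean(scores)      # sum of scores divided by number of scores
median = statistics.median(scores)  # middle value of the ordered scores
mode = statistics.mode(scores)      # most frequently occurring score

print(mean, median, mode)
```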

Identification of outliers

Important concept = outliers

Results that are radically different from what you would consider as normal scores

Statistics drawn from data that contains outliers can be misleading
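
A small made-up example shows how one outlier can mislead. Adding a single extreme score moves the mean a long way, while the median barely changes.

```python
import statistics

# Hypothetical scores, with and without one extreme result (illustrative data only)
normal_scores = [11, 12, 12, 13, 14]
with_outlier = normal_scores + [60]

mean_shift = statistics.mean(with_outlier) - statistics.mean(normal_scores)
median_shift = statistics.median(with_outlier) - statistics.median(normal_scores)

print("mean moves by:", round(mean_shift, 2))
print("median moves by:", median_shift)
```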

Standard deviation

Number that indicates how much each of the values in the distribution deviates from the mean

If data points are all close to the mean .. then SD is close to zero

If many data points are far from the mean .. then SD is far from zero

If data values are all equal .. then SD is zero
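
The three cases can be seen with made-up data. This sketch uses `statistics.pstdev` (population SD), which is an assumption, since the notes do not say whether the population or sample formula is meant.

```python
import statistics

# Hypothetical data sets, all with a mean of 50 (illustrative data only)
equal_values = [50, 50, 50, 50]
close_to_mean = [49, 50, 50, 51]
far_from_mean = [10, 30, 70, 90]

for name, data in [("equal", equal_values), ("close", close_to_mean), ("far", far_from_mean)]:
    print(name, round(statistics.pstdev(data), 2))
```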

Data analysis

Statistical tests are a common form of data analysis

Statistics are time consuming

Two types of statistics: descriptive and inferential

DS - measures of central tendency and variability

IS - assess relationships or differences between data sets .. further divided into subgroups .. parametric tests and non parametric tests

Tests - selecting, types and explanations

Parametric test - t tests

Most common .. dependent t test and independent t test

When you complete t test and want to see if your result is significant or not .. need to know whether you are completing a one tailed test or two tailed test

Dependent t-test

Dependent t test (paired samples) examines significant differences between two sets of related scores .. such as whether the mean high jump scores of one group are different when measured pre and post training
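
A minimal sketch of the calculation, using the standard paired-samples formula: t is the mean of the differences divided by their standard error, with n - 1 degrees of freedom. The high jump scores are hypothetical, and the resulting t would still need comparing with a critical value from a t table.

```python
import math
import statistics

def dependent_t(pre, post):
    """Return (t, degrees of freedom) for two sets of related scores."""
    if len(pre) != len(post):
        raise ValueError("each participant needs a pre and a post score")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD divided by sqrt(n))
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical high jump scores (cm) pre and post training
pre = [150, 155, 160, 148, 152]
post = [153, 157, 166, 149, 156]

t, df = dependent_t(pre, post)
print(round(t, 2), df)
```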

Independent t-test

Most frequently used t-test .. used when you have two groups and are trying to discover if the mean scores of the two groups can be considered to be significantly different

Suitable when data is interval or ratio .. when groups are randomly assigned and when variance (spread) in both groups is equal
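
Because equal variance is assumed, the two group variances can be pooled. The sketch below uses the standard pooled-variance formula with made-up scores. As with any t test, the result still has to be compared with a critical value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups, equal variance assumed."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled = ((n_a - 1) * statistics.variance(group_a)
              + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error, df

# Hypothetical scores for two randomly assigned groups
group_a = [10, 12, 14, 16, 18]
group_b = [8, 10, 12, 14, 16]

t, df = independent_t(group_a, group_b)
print(round(t, 2), df)
```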

Parametric test - Pearson product moment correlation coefficient

Correlation is a measure of the relationship between two or more variables

Can be positive or negative .. and depends on the direction of the line when results are plotted

PPMCC test is suitable when you have interval or ratio data and are trying to identify a relationship between two variables

Test of association .. looks at whether two or more variables are related

Can be used two ways .. try to find out relationship between two variables .. or try to predict one score from another
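
The coefficient (r) can be worked out by hand in Python using the standard formula. The variable names and data are made up. r runs from -1 (perfect negative relationship) to +1 (perfect positive relationship).

```python
import math

def pearson_r(x, y):
    """Return the Pearson product moment correlation coefficient for paired scores."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of the products of each pair's deviations from the means
    products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return products / math.sqrt(sum_sq_x * sum_sq_y)

# Hypothetical data: hours of training per week and jump height (cm)
hours = [1, 2, 3, 4, 5]
height = [40, 43, 47, 48, 52]

print(round(pearson_r(hours, height), 3))
```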

Benefits of imagery?

Non parametric test

If data is non parametric .. t-tests cannot be used

Wilcoxon matched pairs signed ranks test is used in place of dependent t-test and Mann Whitney U test is used in place of independent t test

Wilcoxon matched pairs signed ranks test

Used when you are trying to find out if there is a significant difference between two scores taken from the same participant (or from matched participants)

Used when data is ordinal (ranked) .. and use following instructions

1) disregard any results for participants who scored the same in both conditions .. then count up the number of paired scores left .. this becomes the n score

2) calculate the difference between the two scores of each participant .. assigning plus or minus signs (d)

3) rank the differences by size, ignoring the plus or minus signs .. giving the smallest a rank of 1
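
The three steps listed so far can be sketched in Python with made-up scores. Differences are ranked by size with the sign ignored, and tied differences share the average of the ranks they cover. The tie handling is the usual convention and an assumption here, since the notes do not cover ties.

```python
def wilcoxon_first_steps(condition_a, condition_b):
    """Return (n, signed differences d, ranks) for paired scores."""
    # Step 2: difference for each participant, keeping the plus or minus sign
    diffs = [b - a for a, b in zip(condition_a, condition_b)]
    # Step 1: drop participants who scored the same in both conditions
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    # Step 3: rank by size of difference, smallest = rank 1
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend over any run of tied differences
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        shared_rank = (pos + end) / 2 + 1  # average rank for the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return n, diffs, ranks

# Hypothetical ordinal scores for five participants in two conditions
condition_a = [10, 12, 9, 14, 11]
condition_b = [13, 12, 8, 18, 13]

n, d, ranks = wilcoxon_first_steps(condition_a, condition_b)
print(n, d, ranks)
```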