Transcript of Untitled Prezi
The researcher's helping hand
Radiation shielding using Pterygoplichthys disjunctivus' bones and cartilages
Statistical Treatment to be used:
Analysis of Variance
The science of data. It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting numerical information (McClave et al., 2011).
measures of central tendency
measures of variability
measures of shape
measures of association
hypothesis testing for a single population
hypothesis testing for two populations
analysis of variance
Statistics in a
Sample: a portion of the whole that, if properly taken, is representative of the whole (Black, 2013).
Population: a set of units (usually people, objects, transactions or events) that we are interested in studying.
Parameter: a descriptive measure of the population.
Statistic: a descriptive measure of the sample.
Descriptive statistics includes statistical procedures that we use to describe the population we are studying.
Inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample.
Quantitative data are measurements that are recorded on a naturally occurring numerical scale.
Qualitative data are measurements that cannot be measured on a natural numerical scale; they can only be classified into one of a group of categories.
Ratio
The highest level of data measurement.
Same properties as interval data, but with an absolute zero.
(e.g. height, weight, time, temperature in kelvin)
Interval
The distances between consecutive numbers have meaning and the data are always numerical.
(e.g. Celsius and Fahrenheit scales)
Ordinal
Refers to quantities that have a natural ordering, such as the ranking of favorite sports, the order of people's places in a line, the order of runners finishing a race or, more often, the choice on a rating scale from 1 to 5.
Nominal
Refers to categorically discrete data such as the name of your school, the type of car you drive or the name of a book.
The mean (or the average) is equal to the sum of all the values in the data set divided by the number of values in the data set.
Population mean: μ
Sample mean: x̄
The median is the middle score for a set of data that has been arranged in order of magnitude.
The mode is the most frequent score in our data set.
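As a minimal sketch of the three measures above, the snippet below uses Python's standard `statistics` module. The `scores` list and the `central_tendency` helper are made up for illustration and are not part of the original presentation.

```python
from statistics import mean, median, multimode

# Hypothetical data set, used only for illustration.
scores = [2, 3, 3, 5, 7, 8, 14]

def central_tendency(values):
    """Return the mean, the median and every mode of a list of numbers."""
    return {
        "mean": mean(values),        # sum of the values divided by their count
        "median": median(values),    # middle value once the data are sorted
        "modes": multimode(values),  # most frequent value(s); a list, since ties are possible
    }

summary = central_tendency(scores)
print(summary)
```

Note how the single large value (14) pulls the mean above the median, which is why the median is usually preferred for skewed data.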
Percentiles
Divide a group of data into 100 parts.
Widely used in reporting test results.
Quartiles
Divide a group of data into four parts.
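The splits into 100 parts and into four parts are usually called percentiles and quartiles. A rough sketch with Python's standard `statistics.quantiles` follows; the data are hypothetical, and the cut points depend on the interpolation method (the library default, `"exclusive"`, is assumed here), so other software may report slightly different values.

```python
from statistics import quantiles

# Hypothetical test scores, used only for illustration.
scores = [2, 3, 3, 5, 7, 8, 14]

# n=4 returns the three cut points (Q1, Q2, Q3) that split the data into four parts.
quartiles = quantiles(scores, n=4)

# n=100 returns the 99 cut points that split the data into 100 parts.
percentiles = quantiles(scores, n=100)

print("Quartiles:", quartiles)
print("90th percentile:", percentiles[89])  # index 89 holds the 90th cut point
```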
Nominal - Mode
Ordinal - Median
Interval/Ratio (not skewed) - Mean
Interval/Ratio (skewed) - Median
when to use
The range of a data set is the difference between the largest and smallest data values.
Determines how close the data in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference of the scores from the mean.
The standard deviation is simply the square root of the variance. This is an
especially useful measure of variability when the distribution is normal or
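The range, variance and standard deviation described above can be sketched with the standard library. The data are hypothetical. `pvariance` and `pstdev` follow the "average squared difference" definition given here (dividing by n), while `variance` is the sample version (dividing by n - 1), shown only for comparison.

```python
from statistics import pstdev, pvariance, variance

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)    # largest value minus smallest value
population_variance = pvariance(data) # average squared difference from the mean
population_sd = pstdev(data)          # square root of the population variance
sample_variance = variance(data)      # same sum of squares, divided by n - 1

print(data_range, population_variance, population_sd, sample_variance)
```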
Measure of the degree of relatedness of the variables.
Pearson Product Moment Correlation
Measures the linear correlation of two (sample) variables.
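For illustration, the Pearson coefficient can be computed directly from its textbook definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the sample data are assumptions made for this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y rises exactly in step with x, so r is +1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Values near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship.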
Skewness: indicator used in distribution analysis as a sign of asymmetry and deviation from a normal distribution.
Skewness > 0 - Right skewed distribution - most values are concentrated on left of the mean, with extreme values to the right.
Skewness < 0 - Left skewed distribution - most values are concentrated on the right of the mean, with extreme values to the left.
Skewness = 0 - mean = median, the distribution is symmetrical around the mean.
Kurtosis - indicator used in distribution analysis as a sign of flattening or "peakedness" of a distribution.
Kurtosis > 3 - Leptokurtic distribution, sharper than a normal distribution, with values concentrated around the mean and thicker tails. This means high probability for extreme values.
Kurtosis < 3 - Platykurtic distribution, flatter than a normal distribution with a wider peak. The probability for extreme values is less than for a normal distribution, and the values are wider spread around the mean.
Kurtosis = 3 - Mesokurtic distribution - normal distribution for example.
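A small sketch of how the two shape measures can be computed from central moments. It assumes the simple population (moment-based) formulas, with kurtosis on the scale where a normal distribution scores 3; statistical packages often apply sample corrections or report "excess" kurtosis (kurtosis minus 3) instead. The function name and data are hypothetical.

```python
def shape_measures(values):
    """Return (skewness, kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return skewness, kurtosis

# Hypothetical data sets, used only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(shape_measures(symmetric))     # skewness 0, kurtosis below 3 (platykurtic)
print(shape_measures(right_skewed))  # positive skewness: long tail to the right
```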
Simple random sampling - The whole population is available.
Stratified sampling (random within target groups) - There are specific sub-groups to investigate (e.g. demographic groupings).
Systematic sampling (every nth person) - A stream of representative people is available (e.g. in the street).
Cluster sampling (all in limited groups) - Population groups are separated and access to all is difficult (e.g. in many distant cities).
Quota sampling (get only as many as you need) - You have access to a wide population, including sub-groups.
Proportionate quota sampling (in proportion to population sub-groups) - You know the population distribution across groups, and normal sampling may not give enough in minority groups.
Non-proportionate quota sampling (minimum number from each sub-group) - There is likely to be a wide variation in the studied characteristic within minority groups.
Purposive sampling (based on intent) - You are studying particular groups
Expert sampling (seeking 'experts') - You want expert opinion
Snowball sampling (ask for recommendations) - You seek similar subjects (e.g. young drinkers)
Modal instance sampling (focus on 'typical' people) - When sought 'typical' opinion may get lost in a wider study, and when you are able to identify the 'typical' group
Diversity sampling (deliberately seeking variation) - You are specifically seeking differences, eg. to identify sub-groups or potential conflicts
Snowball sampling (ask for recommendations) - You are ethically and socially able to ask and seek similar subjects.
Convenience sampling (use who's available) - You cannot proactively seek out subjects.
Judgment sampling (guess a good-enough sample) - You are an expert and there is no other choice.
Data types that can be analyzed with z-tests
data points should be independent from each other.
z-test is preferable when n is greater than 30.
the distributions should be normal if n is low; if n > 30, the distribution of the data does not have to be normal.
the variances of the samples should be the same (F-test).
all individuals must be selected at random from the population.
all individuals must have equal chance of being selected.
sample sizes should be as equal as possible but some differences are allowed.
Data types that can be analyzed with t-tests
data sets should be independent from each other except in the case of the paired-sample t-test
where n<30 the t-tests should be used
the distributions should be normal for the equal and unequal variance t-test (K-S test or Shapiro-Wilk)
the variances of the samples should be the same (F-test) for the equal variance t-test
all individuals must be selected at random from the population
all individuals must have equal chance of being selected
sample sizes should be as equal as possible but some differences are allowed
ANOVA is a statistical technique that performs a similar function to the t-test, but is capable of dealing with more than two levels or more than one factor.
If tcalc > ttab, we reject the null hypothesis
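To make the decision rule concrete, here is a hedged sketch of an equal-variance (pooled) two-sample t-test. The two groups are hypothetical, and the tabulated value 2.306 is the two-tailed critical t for α = 0.05 with 8 degrees of freedom, taken from a standard t-table. The absolute value of the calculated t is compared because the test is assumed to be two-tailed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df

# Hypothetical measurements for two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_calc, df = pooled_t(group_a, group_b)
t_tab = 2.306  # two-tailed critical value for alpha = 0.05, df = 8 (from a t-table)

reject_null = abs(t_calc) > t_tab
print(t_calc, df, reject_null)
```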
Chi-Square Goodness-of-Fit Test
Used to compare observed data with data we would expect to obtain according to a specific hypothesis.
The chi-square test is always testing the null hypothesis, which states that there is no significant difference between the expected and observed result.
Chi-square requires that you use raw counts (frequencies), not percentages or ratios.
For example, if, according to Mendel's laws, you expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males, then you might want to know about the "goodness of fit" between the observed and expected. Is the difference due to chance or to other factors?
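The offspring example can be worked through in a few lines. This sketch assumes the remaining 12 of the 20 offspring are female (expected 10) and uses the tabulated critical value 3.841 for α = 0.05 with 1 degree of freedom. The function name is made up for illustration.

```python
def chi_square_gof(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12]   # males, females (12 females implied by 20 offspring in total)
expected = [10, 10]  # a 1:1 ratio

chi2 = chi_square_gof(observed, expected)
critical = 3.841     # chi-square table value for alpha = 0.05, df = 1

print(chi2, chi2 > critical)
```

Here the statistic is (8 - 10)²/10 + (12 - 10)²/10 = 0.8, which is below 3.841, so the deviation from the expected 10 males is consistent with chance.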
Chi-Square Test for Independence
The test is applied when you have two categorical variables from a single population.
It is used to determine whether there is a significant association between the two variables.
For example, in an election survey, voters might be classified by gender (male or female) and voting preference (Democrat, Republican, or Independent). We could use a chi-square test for independence to determine whether gender is related to voting preference.
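A minimal sketch of the independence test for such a survey follows. The counts in `survey` are invented purely for illustration; expected counts are computed as (row total × column total) / grand total, and 5.991 is the tabulated critical value for α = 0.05 with 2 degrees of freedom.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts: rows = male, female; columns = Democrat, Republican, Independent.
survey = [
    [20, 30, 50],
    [30, 30, 40],
]

chi2, df = chi_square_independence(survey)
critical = 5.991  # chi-square table value for alpha = 0.05, df = 2
print(chi2, df, chi2 > critical)
```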
The runs test can be used to decide if a data set is from a random process.
Purpose: Detect Non-Randomness
Randomness is one of the key assumptions in determining if a univariate statistical process is in control.
If the randomness assumption is not valid, then a different model needs to be used.
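One common form of the runs test is the Wald-Wolfowitz test with a large-sample normal approximation; the sketch below assumes that variant, which the presentation does not specify. It works on a sequence with exactly two distinct symbols (for numeric data, values are typically first coded as above or below the median). The function name and data are hypothetical.

```python
from math import sqrt

def runs_test(sequence):
    """Return (number of runs, expected runs, z statistic) for a two-symbol sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    n = n1 + n2
    # A new run starts whenever a symbol differs from the one before it.
    runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / sqrt(variance)
    return runs, expected, z

# Hypothetical sequence that alternates perfectly, i.e. too regular to be random.
sequence = list("AB" * 10)
runs, expected, z = runs_test(sequence)

# |z| > 1.96 suggests non-randomness at the 5% level (two-tailed).
print(runs, expected, z, abs(z) > 1.96)
```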
Ratio or Interval
Ordinal or Nominal
The Wilcoxon signed-rank test is the nonparametric test equivalent to the dependent t-test. It is used to compare two sets of scores that come from the same participants.
For example, ANOVA is a parametric method, while Kruskal-Wallis is the corresponding non-parametric method, which has to be used when the assumption of normality is rejected before ANOVA is applied.
The Mann-Whitney U test is used to compare differences between two independent groups when the dependent variable is either ordinal or continuous, but not normally distributed.
It is often considered the nonparametric alternative to the independent t-test, although this is not always the case.
Unlike the independent-samples t-test, the Mann-Whitney U test allows you to draw different conclusions about your data depending on the assumptions you make about your data's distribution.
For example, you could use the Mann-Whitney U test to understand whether attitudes towards pay discrimination, where attitudes are measured on an ordinal scale, differ based on gender (i.e., your dependent variable would be "attitudes towards pay discrimination" and your independent variable would be "gender", which has two groups: "male" and "female").
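The U statistic itself is simple to compute: for every pair made of one value from each group, count which group has the larger value, with ties counting as one half. The sketch below assumes hypothetical ordinal ratings (1 to 5) and only computes U. Judging significance would additionally need a U table or a normal approximation, which is not shown.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): pairwise 'wins' for each group, ties counted as 0.5."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )
    u_b = len(a) * len(b) - u_a  # the two U values always sum to len(a) * len(b)
    return u_a, u_b

# Hypothetical attitude ratings on a 1-5 ordinal scale for two independent groups.
group_1 = [2, 3, 3, 4, 5]
group_2 = [1, 2, 2, 3, 3]

u_1, u_2 = mann_whitney_u(group_1, group_2)
print(u_1, u_2, min(u_1, u_2))  # the smaller U is the value usually looked up in tables
```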
This can occur when we wish to investigate any change in scores from one time point to another, or when individuals are subjected to more than one condition.
For example, you could use a Wilcoxon signed-rank test to understand whether there was a difference in smokers' daily cigarette consumption before and after a 6-week hypnotherapy programme.
Non-Parametric alternative to one-way ANOVA
A one-way ANOVA may yield inaccurate estimates of the P-value when the data are very far from normally distributed. The Kruskal-Wallis test does not make assumptions about normality.
Example: Possible differences in graded performance on three separate activities (e.g., final examination score, composite score for all homework problems, final project score) in a high school Logo programming language class.
The Friedman test is the non-parametric alternative to the one-way ANOVA with repeated measures.
A researcher wants to examine whether music has an effect on the perceived psychological effort required to perform an exercise session.
To test whether music has an effect on the perceived psychological effort required to perform an exercise session, the researcher recruited 12 runners who each ran three times on a treadmill for 30 minutes.
At the end of each run, subjects were asked to record how hard the running session felt on a scale of 1 to 10, with 1 being easy and 10 extremely hard. A Friedman test was then carried out to see if there were differences in perceived effort based on music type.
A nonparametric version of the Pearson product-moment correlation. Spearman's correlation coefficient, (also signified by rs) measures the strength of association between two ranked variables.
In a competition or contest, the Spearman rank correlation coefficient can indicate whether the judges agree with each other's views on the talent of the contestants (though they might award different numerical scores), in other words, whether the judges are unanimous.