Descriptive statistics summarize and describe a set of data.

Measures of Central Tendency

Measures of central tendency attempt to mark the center of a distribution. Three common measures of central tendency are the mean, median, and mode.

The mean is the average of all scores in the distribution (add all scores and divide by the number of scores).

The median is the central (middle) score in a distribution: write the scores in ascending or descending order and find the score in the middle. If there is an odd number of scores, there will be a single middle score; if there is an even number of scores, the median is the average of the middle two scores.

The mode is the score that appears most frequently. A distribution may have more than one mode; for example, a bimodal distribution has two scores that appear equally frequently and more frequently than any other score.

The mean is the most commonly used measure of central tendency, but its accuracy can be distorted by some extreme scores or outliers. When a distribution includes outliers, the median is often used as a better measure of central tendency.
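The three measures, and the outlier effect described above, can be sketched with Python's standard library `statistics` module. The scores here are made up for illustration; note how the single outlier drags the mean well below the median.

```python
from statistics import mean, median, mode

# Hypothetical exam scores; the single outlier (10) drags the mean down
scores = [85, 88, 90, 90, 92, 10]

print(mean(scores))    # about 75.8 -- distorted by the outlier
print(median(scores))  # 89 -- average of the two middle scores (88 and 90)
print(mode(scores))    # 90 -- the most frequent score
```

Without the outlier, the mean of the remaining five scores is 89, right next to the median; with it, the mean falls to roughly 75.8, which is why the median is often preferred for skewed data.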

Unless a distribution is symmetrical, it is skewed. When a distribution has an extreme score that is very high, it is positively skewed. When it has an extreme score that is very low, it is negatively skewed.

Measures of Variability

Measures of variability attempt to depict the diversity of the distribution. Common measures of variability include the range, variance, and standard deviation.

The range is the distance between the highest and lowest scores in a distribution.

The variance and standard deviation are closely related; the standard deviation is simply the square root of the variance. Both measures essentially capture the average distance of scores in the distribution from the mean: the higher the variance and standard deviation, the more spread out the distribution.
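These three measures can be computed directly with the standard library; the data below are invented for illustration. Note that `pvariance`/`pstdev` treat the data as the entire population (dividing by N), while `statistics.variance`/`stdev` divide by N − 1 for samples.

```python
from statistics import pvariance, pstdev

# Hypothetical distribution; its mean is 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)  # 9 - 2 = 7
variance = pvariance(scores)            # average squared distance from the mean = 4
std_dev = pstdev(scores)                # square root of the variance = 2.0
```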

Z Scores

Being able to compare scores from different distributions can sometimes be important. In order to do so, you must convert scores from the different distributions into measures called z scores.

Z scores measure the distance from the mean in units of standard deviation. Scores below the mean have negative z scores, while those above the mean have positive z scores.
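The conversion described above is a short formula: subtract the mean, then divide by the standard deviation. A minimal sketch with hypothetical scores:

```python
from statistics import mean, pstdev

# Hypothetical exam scores
scores = [70, 80, 90]
m = mean(scores)     # 80
s = pstdev(scores)   # about 8.16

# z score: distance from the mean in units of standard deviation
z_scores = [(x - m) / s for x in scores]
# 70 is below the mean (negative z); 80 is at the mean (z = 0); 90 is above (positive z)
```

Because z scores are standardized, a z of +1.2 on one test can be compared directly with a z of +1.2 on a completely different test.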

The normal curve is a theoretical bell-shaped curve for which the area under the curve lying between any two z scores has been predetermined.

approximately 68% of scores fall within one standard deviation of the mean

approximately 95% of scores fall within two standard deviations of the mean

almost 99% of the scores fall within three standard deviations of the mean

Percentiles indicate the percentage of scores that fall at or below a given score. Someone who scores at the 50th percentile (equal to or better than 50% of the people who took the test) has a z score of 0, and someone who scores at the 98th percentile has an approximate z score of +2.
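The 68–95–99 figures and the z-score/percentile link can be checked with `statistics.NormalDist` (Python 3.8+), whose `cdf` method gives the area under the curve at or below a z score:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within one and two standard deviations of the mean
within_one = std_normal.cdf(1) - std_normal.cdf(-1)  # roughly 0.68
within_two = std_normal.cdf(2) - std_normal.cdf(-2)  # roughly 0.95

# Percentile of a score: the area of the curve at or below its z score
percentile_of_z2 = std_normal.cdf(2)  # roughly 0.98, i.e. the 98th percentile
```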

Inferential Statistics

The purpose of inferential statistics is to determine whether findings can be applied to the larger population from which the sample was selected. The extent to which the sample differs from the population is known as sampling error.

Inferential statistics are used to help psychologists decide when their findings can be applied to the larger population.

Many inferential statistics exist, such as t-tests, chi-square tests, and ANOVAs (analyses of variance).

They all take into account both the magnitude of the difference found and the size of the sample. All of these tests yield a p value.

The p value gives the probability of obtaining the observed difference between groups by chance alone (that is, if the null hypothesis is true). The smaller the p value, the more significant the results.

Scientists have decided that a p value of 0.05 is the cutoff for statistically significant results. A p value of 0.05 means that a 5% chance exists that the results occurred by chance. A p value can never equal 0 because we can never be 100% certain that the results did not happen due to chance.
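One way to see what "due to chance" means is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. This is a minimal sketch with invented scores, not one of the named tests above (t-tests and ANOVAs compute p values analytically rather than by shuffling):

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the shuffles are reproducible

# Hypothetical scores for a treatment group and a control group
group_a = [78, 82, 85, 88, 90]
group_b = [70, 72, 75, 77, 80]
observed = mean(group_a) - mean(group_b)  # 9.8

# Shuffle the labels many times; the p value is the fraction of shuffles
# that produce a difference at least as extreme as the observed one
combined = group_a + group_b
trials = 10_000
extreme = 0
for _ in range(trials):
    random.shuffle(combined)
    diff = mean(combined[:5]) - mean(combined[5:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials  # well below 0.05 here: unlikely to be chance
```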

A p value can be computed for any correlation coefficient. The stronger the correlation and the larger the sample, the more likely the relationship will be statistically significant.

Statistical Measures

A frequency distribution is a graph that shows how many participants obtained each score or response; it can easily be turned into a line graph (called a frequency polygon) or a bar graph (called a histogram).

The y-axis always represents frequency, while whatever variable you are interested in studying is on the x-axis.
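Tallying a frequency distribution is exactly what `collections.Counter` does; the survey responses below are made up for illustration, and the loop prints a crude text histogram (categories on one axis, frequency as bar length):

```python
from collections import Counter

# Hypothetical survey responses (the variable that would go on the x-axis)
responses = ["dog", "cat", "dog", "bird", "dog", "cat"]

freq = Counter(responses)  # each category -> its frequency (the y-axis)

# A crude text histogram: one bar per category, longest bar first
for category, count in freq.most_common():
    print(f"{category:5} {'#' * count}")
```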

Correlations

A correlation measures the relationship between two variables.

If two things are positively correlated, the presence of one thing predicts the presence of the other (they change in the same way).

Negative correlation means that presence of one thing predicts the absence of the other (they change in opposite ways).

When no relationship exists between two things, no correlation exists.

Correlations may be either strong or weak.

The strength of a correlation can be computed by a statistic called the correlation coefficient. Correlation coefficients range from -1 to +1 (where -1 is a perfect negative correlation, and +1 is a perfect positive correlation).

Graphically, a correlation can be seen on a scatter plot. The sign of the correlation coefficient matches the direction of the line of best fit (or regression line), and the closer the points come to falling in a straight line, the stronger the correlation. When the line (drawn through the scatter plot so as to minimize the distance of all points from the line) slopes upward from left to right, it indicates a positive correlation; a downward slope is evidence of a negative correlation.

Both -1 and +1 denote equally strong correlations; the number 0 denotes the weakest possible correlation (no correlation), which means that knowing something about one variable tells you nothing about the other.

Statistical Hypotheses

There are two types of statistical hypotheses.

Null hypothesis: the null hypothesis, denoted by H₀, is usually the hypothesis that sample observations result purely from chance.

Alternative hypothesis: the alternative hypothesis, denoted by Hₐ or H₁, is the hypothesis that sample observations are influenced by some non-random cause (typically the IV).