
# Statistics

Ch. 1 and 2 Summary
by

## Ivanti Galloway

on 5 September 2012


#### Transcript of Statistics

What is Statistics?
How to Describe and Summarize Statistics
population: total set of observations that can be made
census: study that obtains data from every member of a population
sample: subset of a population
units: things in a population
subject: units that are people
variable: characteristic of a unit
dichotomous variable: variable with only two possible outcomes
quantitative variable: variable that assigns a number to each individual
statistic: number derived from a sample
parameter: number derived from the population (denoted with Greek letters)
random (sampling) error: difference between sample value and true population value
experiment: procedure which results in a measurement or observation
random experiment: experiment whose outcome depends on chance
execution/trial: each repetition of an experiment

Parameter vs. Statistic
population: parameter, an exact number
sample: statistic, an estimated number, so there can be error

Say there's a box of cards with either 1 or 0 written on each card. You pull out 25 cards in each random experiment to figure out how many 1s are in the box. You pull out 14 1s and 11 0s. This gives the sample percentage 14/25 = .56 = 56%. However, if the true population percentage of 1s is 60%, then 56% - 60% = -4% gives us our random sampling error.
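The arithmetic in the card example can be checked with a short Python sketch. The names `sampling_error` and `draws` are illustrative and not part of the original notes; the true percentage of 60% is the value assumed in the example.

```python
def sampling_error(sample, true_proportion):
    """Return the sample proportion and its difference from the true proportion."""
    sample_proportion = sum(sample) / len(sample)
    return sample_proportion, sample_proportion - true_proportion

# The draw from the example: 14 cards marked 1 and 11 cards marked 0.
draws = [1] * 14 + [0] * 11

p, error = sampling_error(draws, true_proportion=0.60)
print(f"sample percentage: {p:.0%}")           # 56%
print(f"random sampling error: {error:+.0%}")  # -4%
```

Because each card is 1 or 0, summing the draws counts the 1s directly.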
Suppose you performed this experiment 3 more times. The second time the random sampling error is +4%, the third time -8%, and the last +20%. The average comes out to:

$$\text{AV} = \frac{-4\% + 4\% - 8\% + 20\%}{4} = +3\%$$

The cancellation of negative and positive errors results in a small number. To get a "typical size" of a random sampling error, ignore the sign and take the mean of the absolute values (MA):

$$\text{MA} = \frac{4\% + 4\% + 8\% + 20\%}{4} = 9\%$$

MA is difficult to deal with theoretically because the absolute value function is not differentiable at 0. So generally the root mean square (RMS) is used:

$$\text{RMS} = \sqrt{\frac{(-4\%)^2 + (+4\%)^2 + (-8\%)^2 + (+20\%)^2}{4}} = \sqrt{124}\,\% \approx 11.14\%$$
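As a quick check of the three averages above, here is a minimal Python sketch. The variable names are illustrative, and the errors are written in percentage points.

```python
import math

# The four random sampling errors from the example, in percentage points.
errors = [-4, 4, -8, 20]

av = sum(errors) / len(errors)                              # plain average
ma = sum(abs(e) for e in errors) / len(errors)              # mean of absolute values
rms = math.sqrt(sum(e ** 2 for e in errors) / len(errors))  # root mean square

print(av, ma, round(rms, 2))  # 3.0 9.0 11.14
```

The RMS is larger than the MA here because squaring gives the one large error (20) more weight.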
The RMS of all possible random sampling errors is called the standard error (SE). If we use a random sample of size $n$ and its percentage $p$ to estimate the population percentage $\pi$ of a dichotomous population, we show that

$$SE = \sqrt{\frac{\pi(1-\pi)}{n}}$$

inferential statistic: estimate of an unknown parameter made by examining a random sample
sampling theory: examining the different samples that are possible and likely from a population with a known parameter

2.1 Variables and Data Sets
observations: actual values of variables
measurement: observations that are numbers
data set: a collection of observations or measurements of a variable forms a data set
* The goal of statistics is to use the information provided by a data set to study the population from which it came.
* arranging data into charts, tables, and graphs, along with computing various descriptive numbers about the data
* reasoning in an environment where one does not know, or cannot know, all the facts needed to reach a conclusion with complete certainty

2.2 Categorical Data
categorical variables: gender, religion, race, occupation, blood type
quantitative variables: height, weight, age, years of education

2.3 Ordinal Data
ordinal: categorical variables ranked in a meaningful order
nominal: categorical data that are not ordinal

2.3 example: grades of 20 students in an STA 290 class
D B D B F C A B B D
A B B C+ C C+ B A A B+
*ordinal data

| Grade | Tally | Frequency | Relative frequency |
|---|---|---|---|
| A | llll | 4 | .2 |
| B+ | l | 1 | .05 |
| B | llll ll | 7 | .35 |
| C+ | ll | 2 | .1 |
| C | ll | 2 | .1 |
| D+ | | 0 | 0 |
| D | lll | 3 | .15 |
| F | l | 1 | .05 |
| Total | | 20 | 1 |

median: divides the ordered list into two equal parts; here it falls halfway between the 10th and 11th grades, which are both B, so the median is B
mode: the most frequently occurring value, B

2.4 RATIO DATA
ratio data: quantitative data for which it's meaningful to form quotients
ex. a person's age
ratio vs. ordinal: The data set of grades from the previous chart is ordinal. If the grades were assigned point values (A=4, B=3, C=2, etc.), then the set would be ratio.
A ratio variable can be discrete or continuous.
Discrete data: There are gaps between possible values. A variable that can take only integer values is discrete.
Continuous data: The data can be described by points in an interval of the line. There are no gaps between possible values.

2.5 Frequency Tables and Histograms
The STA grades chart was an example of a tally and frequency table.
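The tally and frequency table for the 20 STA 290 grades can be rebuilt with a few lines of Python. This is only a sketch: the names `grades`, `order`, and `table` are illustrative, and the grade order (including the unused D+) is copied from the chart.

```python
from collections import Counter

# The 20 grades from the STA 290 example.
grades = ("D B D B F C A B B D "
          "A B B C+ C C+ B A A B+").split()

# Grades in ranked order, as listed in the chart.
order = ["A", "B+", "B", "C+", "C", "D+", "D", "F"]
counts = Counter(grades)  # a missing grade such as D+ counts as 0
n = len(grades)

# One row per grade: (grade, frequency, relative frequency).
table = [(g, counts[g], counts[g] / n) for g in order]
for grade, freq, rel in table:
    print(f"{grade:<3}{'l' * freq:<8}{freq:>3}{rel:>6.2f}")

# The mode is the grade with the largest frequency.
mode = max(order, key=lambda g: counts[g])
print("mode:", mode)  # mode: B
```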
Histograms are graphical displays based on the frequency table.
Unlike bar graphs, where the height of each bar is proportional to the frequency, a histogram's bars have areas proportional to the frequency.

[Figure: Histogram, with the values 2 through 7 on the horizontal scale]

The values 5 and 6 have no height on this graph but aren't left out of the horizontal scale.

The horizontal scale is uniform

The histogram fits the horizontal scale and not the other way around.

In this case the center of each bar is a possible value (2 through 7) and the boundaries occur at impossible values like 2.5.

2.6 Grouped Data and Sturges' Rule
For continuous data sets, frequency tables put the data in groups called intervals or classes.
Sturges' Rule: $2^{K-1} \geq n$, where $K$ is the number of classes and $n$ is the number of observations (*only a guide)
absolute density: d = f/w
Height of rectangle = Density = Frequency / Class Width
relative density: d% = d/n = f%/w
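A small Python sketch of these formulas follows. It assumes Sturges' rule is read as "pick the smallest number of classes K with 2^(K-1) ≥ n", which is one common reading and, as noted, only a guide. The function names and the class used in the example (5 of 20 observations in a class of width 10) are hypothetical.

```python
def sturges_classes(n):
    """Smallest K with 2**(K-1) >= n (assumed reading of Sturges' rule)."""
    k = 1
    while 2 ** (k - 1) < n:
        k += 1
    return k

def density(frequency, width):
    """Absolute density d = f / w: the height of the histogram rectangle."""
    return frequency / width

def relative_density(frequency, width, n):
    """Relative density d% = d / n, the same as relative frequency / width."""
    return density(frequency, width) / n

n = 20
k = sturges_classes(n)           # 6, since 2**4 = 16 < 20 <= 2**5 = 32
d = density(5, 10)               # hypothetical class: f = 5, w = 10
rd = relative_density(5, 10, n)
print(k, d, rd)
```

With densities as heights, each rectangle's area (height times class width) equals the class frequency, or the relative frequency when relative density is used.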
Height of rectangle = Relative Density = Relative Frequency / Class Width

2.7 Stem and Leaf Plot
way of organizing numbers that makes them easier to read

Data
12 45 43 25 13 22 63 29 34
56 14 34 11 23 13 39 25 23

Stem and Leaf Plot
1 | 1 2 3 3 4
2 | 2 3 3 5 5 9
3 | 4 4 9
4 | 3 5
5 | 6
6 | 3
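The plot can be generated from the data with a short Python sketch, taking the tens digit as the stem and the ones digit as the leaf. The variable names are illustrative.

```python
from collections import defaultdict

data = [12, 45, 43, 25, 13, 22, 63, 29, 34,
        56, 14, 34, 11, 23, 13, 39, 25, 23]

# Group the ones digits (leaves) under their tens digit (stem).
plot = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    plot[stem].append(leaf)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Sorting the data first means the leaves on each stem come out in ascending order.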
2.8 Five-Number Summary
Rank  1 2 3 4 5 6 7 8
Value 2 3 4 4 5 5 6 10

The values range from the minimum, min = 2, to the maximum, max = 10. These numbers can be used as a starting point for a five-number summary of a data set. The average rank is (1+2+...+8)/8 = 4.5, so there is no single value in the middle; the median is the average of the fourth and fifth values, (4+5)/2 = 4.5. The median separates the values into two equal parts:
lower-ranking values: {2,3,4,4}
higher-ranking values: {5,5,6,10}
If there is an odd number of values in the set, include the median in both parts.
The median of the lower half is called the first quartile, Q1 = (3+4)/2 = 3.5. The median of the higher half is called the third quartile, Q3 = (5+6)/2 = 5.5. Sometimes Q2 is used for the median.

The five-number summary includes the min, Q1, med, Q3, and max in ascending order. In this case: 2, 3.5, 4.5, 5.5, 10.
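The procedure described above (split the ordered values at the median, then take the median of each half, with the median counted in both halves when the number of values is odd) can be sketched in Python. The function names are illustrative. Note that library routines for quartiles may use different conventions and give slightly different values.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the halves method."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # for odd n, both halves include the median
    lower, upper = xs[:half], xs[n - half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

summary = five_number_summary([2, 3, 4, 4, 5, 5, 6, 10])
print(summary)  # (2, 3.5, 4.5, 5.5, 10)
```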
The range=max-min
Interquartile range (IQR) = Q3 - Q1

2.9 Box Plot
[Figure: box plot on a horizontal scale from 2 to 10]

2.10 The mean
The mean is the average of all the values in the set.
Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
Grouped mean: $\bar{x} = \sum x\,(f/n)$
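To see that the grouped mean agrees with the ordinary sample mean, here is a minimal Python sketch. The choice of data is an assumption for illustration: it reuses the eight values from the five-number summary example.

```python
from collections import Counter

# Illustrative data: the eight values from the five-number summary example.
values = [2, 3, 4, 4, 5, 5, 6, 10]
n = len(values)

# Sample mean: add every value and divide by n.
sample_mean = sum(values) / n

# Grouped mean: weight each distinct value x by its relative frequency f/n.
frequencies = Counter(values)
grouped_mean = sum(x * (f / n) for x, f in frequencies.items())

print(sample_mean, grouped_mean)
```

Both computations add up the same numbers; the grouped form only collects repeated values first.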

2.11 Variance
Data set 1 48 49 50 50 51 52
Data set 2 0 10 50 50 90 100
These data sets have the same mean, median, and mode. In order to describe a data set well we also need to measure its variability.
Mean absolute deviation: $\text{MAD} = \frac{1}{n}\sum_{i=1}^{n} |x_i - m|$
Sample standard deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
Population standard deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$
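The difference in variability between data set 1 and data set 2 shows up clearly when the measures are computed. The sketch below uses illustrative function names and assumes the center m in the mean absolute deviation is the mean (for these two data sets the mean and the median are both 50, so the choice does not matter).

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def mean_absolute_deviation(xs, m=None):
    """Average distance from m; m defaults to the mean (an assumption)."""
    m = mean(xs) if m is None else m
    return sum(abs(x - m) for x in xs) / len(xs)

def sample_std(xs):
    """Sample standard deviation, dividing by n - 1."""
    x_bar = mean(xs)
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1))

set1 = [48, 49, 50, 50, 51, 52]
set2 = [0, 10, 50, 50, 90, 100]

for name, xs in (("Data set 1", set1), ("Data set 2", set2)):
    print(name, mean_absolute_deviation(xs), round(sample_std(xs), 2))
# Data set 1 1.0 1.41
# Data set 2 30.0 40.5
```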