SPSS PROJECT

This project is designed to familiarize you with gathering data and then describing and analyzing it using SPSS.
by sondos sa on 23 December 2014

What is SPSS?
SPSS Statistics is a software package for data entry and statistical analysis, and for creating tables and graphs. It can handle large amounts of data and can perform all of the analyses covered in this project, and much more.
The SPSS Windows And Files
SPSS Windows
SPSS Statistics has three main windows, plus pull-down menus at the top. These allow you to see your data, see your statistical output, and see any programming commands you have written. Each window corresponds to a separate type of SPSS file.
1- Variable View
Contains descriptions of the attributes of each variable in the data file.
- Variable Name: The name of each SPSS variable in a given file must be unique.
- Variable Type: The kind of data to be recorded (e.g., strings of characters, numeric values, or special numbers like dates). The contents of the Variable Type dialog box depend on the selected data type.
- Column Width: The number of characters SPSS will allow to be entered for the variable.
- Decimals: The number of decimal places that SPSS will display for the variable.
- Label: A string of text that identifies in more detail what the variable represents.
- Values: For categorical data we often need to know which numbers represent which categories. To indicate how these numbers are assigned, one can add labels to specific values (e.g., 1 = male and 2 = female).
* Click in the Value field to type a specific numeric value.
* Click in the Label field to type the corresponding label.
* Click the Add button to add this pair of value and label to the list.
Value labels can be seen in the Data View by clicking the Value Labels icon in the toolbar, which switches between the numeric values and their labels.
- Columns: Determines how wide the variable column should be in Data View.
- Align: Indicates whether the information in the Data View should be left-justified, right-justified, or centered.
- Missing: We sometimes want to signal to SPSS that data should be treated as missing, even though some other numerical code is recorded instead of the data actually being missing.
- Measure: Describes the level of measurement (nominal, ordinal, or scale).
2- Data View
One of the primary ways of looking at a data file is in Data View, where each row is a case and each column is a variable.
Pull-Down Menus
* File Menu: From the File menu you can open existing files of several types, open a database file such as an Excel file, or read in a text file. You can also save any changes to the current file.
* Edit Menu: From the Edit menu you can cut, copy, paste, insert variables, insert cases, or use Find in the Data Editor window.
* Data Menu: The Data menu allows you to define variable properties, sort cases, merge files, split files, select cases, and use a variable to weight cases.
* Transform Menu: The Transform menu is where you will find the options to do computations on variables, to create new variables from existing ones, or to recode old variables.
* Analyze Menu: The Analyze menu is where all statistical analysis takes place, from descriptive statistics to regression analysis to nonparametric tests.
* Graphs Menu: The Graphs menu is where you can create high-resolution plots and graphs to be edited in the Chart Editor window, or create interactive graphs.
* Window Menu: From the Window menu you can change the active window. The window with a check mark is the active one; in this case it is the Data Editor window.
* Help Menu: The Help menu allows you to get help on topics in SPSS or to ask the Statistics Coach some basic questions.
3- Output Viewer
The Output Viewer collects your statistical tables and graphs, and gives you the opportunity to edit them before you save or print them.
We will analyze these data using SPSS.
Defining Variables And Entering Data
We need to create the variables first, then enter the data by hand.
For each string variable we change the type of the variable to "String". Then we assign labels to values, as mentioned before.
For example, we code "males" as "M" and "females" as "F".
Data Manipulation
Data files are not always organized in a form that meets specific needs, and we may wish to select specific subjects for analysis.
- To Compute The Average "Length" For The Females
First we need to recode the string variable "Gender" into a numeric variable.
* Transform > Recode into Different Variables… > Select the variable "Gender" > Click the Old and New Values… button > Recode M into 0 and F into 1 (as shown in figure) > Click the Continue button > Type the new variable name "NewGender" (in the Output Variable section) > OK.
From the Data menu we can select the cases we are looking for.
* Select Select Cases…
* Click the If condition is satisfied option.
* Click the If… button. Double-click the variable "NewGender", then write the condition ( NewGender = 1 ).
* Click the Continue button, then click the OK button.
- SPSS will exclude all cases except those that satisfy the condition ( NewGender = 1 ).
Now we can simply compute the average "Length" for the females.
* Analyze > Descriptive Statistics > Frequencies… > Double-click the variable "Length" > Click the Statistics… button > Select Mean > Continue > OK.
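The recode, select, and average steps above can be sketched outside SPSS as follows. This is only an illustration: the four records are hypothetical (the project's data file is not reproduced in the transcript), and the variable names simply mirror "Gender", "NewGender", and "Length".

```python
from statistics import mean

# Hypothetical records; the project's real data file is not shown in the transcript.
babies = [
    {"Gender": "M", "Length": 14.0},
    {"Gender": "F", "Length": 16.0},
    {"Gender": "F", "Length": 12.0},
    {"Gender": "M", "Length": 18.0},
]

# Recode into a different variable: M -> 0, F -> 1.
recode = {"M": 0, "F": 1}
for row in babies:
    row["NewGender"] = recode[row["Gender"]]

# Select Cases with the condition NewGender = 1 (females only).
females = [row for row in babies if row["NewGender"] == 1]

# Mean of Length over the selected cases.
female_mean_length = mean(row["Length"] for row in females)
print(female_mean_length)
```

Keeping the original "Gender" values and adding a new numeric column matches the idea of recoding into a different variable instead of overwriting the old one.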
Visual Binning
This facility lets you interactively create groups (bins, categories) from a continuous variable and visually control the process.
* From the Transform menu select Visual Binning…
* Select the variable "Length" to move it to the Variables to Bin: box.
* Click the Continue button.
* The Visual Binning window opens.
* Select the variable "Length" from the Scanned Variable List:
* Click the Make Cutpoints… button.
* Type ( 10 ) in the First Cutpoint Location: box, and ( 4 ) in the Width: box.
* The Number of Cutpoints: ( 3 ) and the Last Cutpoint Location: ( 18 ) will be filled in automatically.
* Click the Apply button.
* Click the Make Labels button, and then type "Binned" in the Binned Variable: box.
* Click the OK button.
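To see what these cutpoints do, here is a minimal sketch of the same binning rule. It assumes that each cutpoint is the included upper end of its bin (a value equal to 10 falls in the first bin), which is my reading of the dialog and is not stated in the slides; the function name `bin_index` is hypothetical.

```python
from bisect import bisect_left

first_cutpoint = 10
width = 4
number_of_cutpoints = 3

# Cutpoints at 10, 14 and 18, as in the Make Cutpoints dialog above.
cutpoints = [first_cutpoint + i * width for i in range(number_of_cutpoints)]

def bin_index(value):
    """Return the bin number, assuming upper endpoints are included:
    1: value <= 10, 2: 10 < value <= 14, 3: 14 < value <= 18, 4: value > 18."""
    return bisect_left(cutpoints, value) + 1

print(cutpoints)
print([bin_index(v) for v in (9, 10, 11, 14, 15, 18, 19)])
```

Three cutpoints always produce four bins, which is why the binned variable has one more category than the number of cutpoints.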
Frequency Analysis
Frequency analysis is a descriptive statistical method that shows the number of occurrences of each response chosen by the respondents.
- To Make a Frequency Table, Compute a Central Tendency or Dispersion For The "Length"
* Click the Analyze menu, point to Descriptive Statistics, and select Frequencies…
* Select the variable "Length" to move it to the Variable(s): list box.
* Select the Display frequency tables check box.
* Click the Statistics… button. Select Mean, Median, Mode, Variance and Standard deviation.
* Click the Continue button, then click the OK button.
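The quantities requested in these steps can be reproduced with a short sketch. The values in `lengths` are hypothetical stand-ins for the "Length" variable, and the variance and standard deviation use the sample (n - 1) formulas.

```python
from collections import Counter
from statistics import mean, median, mode, stdev, variance

# Hypothetical values standing in for the "Length" variable.
lengths = [12, 14, 14, 15, 16, 18, 14, 20]

# Frequency table: value, count, percent of cases.
frequency_table = Counter(lengths)
for value in sorted(frequency_table):
    count = frequency_table[value]
    print(value, count, round(100 * count / len(lengths), 1))

# Central tendency and dispersion (sample variance and standard deviation).
summary = {
    "mean": mean(lengths),
    "median": median(lengths),
    "mode": mode(lengths),
    "variance": variance(lengths),
    "std_dev": stdev(lengths),
}
print(summary)
```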
The Simple Bar Graph
It is a graph that displays the data using vertical bars of various heights to represent frequencies.
- To Make a Bar Graph For The Variable "Length"
* Click the Graphs menu, select Chart Builder…
* Select Bar, and then select the Simple Bar.
* Drag the variable "Length" and drop it on the X-axis.
* Click the OK button.
Crosstabs Analysis
Cross tabulation (or crosstabs for short) is a statistical process that summarizes categorical data to create a contingency table.
- To Perform Crosstabs Analysis To Know How Many Females Have A Brown Eye Color
* Click the Analyze menu, point to Descriptive Statistics, and select Crosstabs…
* Select the variable "Gender" to move it to the Row(s): list box, and the variable "EyeColor" to move it to the Column(s): list box.
* Click the OK button.
We conclude that 5 of the 14 females have a brown eye color.
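A contingency table is just a count of cases for every (row, column) pair. The sketch below illustrates that idea; the six cases are hypothetical and do not reproduce the project's counts.

```python
from collections import Counter

# Hypothetical (Gender, EyeColor) pairs; not the project's data.
cases = [
    ("F", "Brown"), ("F", "Blue"), ("F", "Brown"),
    ("M", "Brown"), ("M", "Green"), ("M", "Blue"),
]

# Each cell of the contingency table counts one (row, column) combination.
table = Counter(cases)

genders = sorted({gender for gender, _ in cases})
eye_colors = sorted({color for _, color in cases})

print("Gender", *eye_colors)
for gender in genders:
    print(gender, *(table[(gender, color)] for color in eye_colors))

female_brown = table[("F", "Brown")]
female_total = sum(table[("F", color)] for color in eye_colors)
print(female_brown, "of", female_total, "females have brown eyes")
```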
Chi-square test
The chi-square test is a quantitative measure used to determine whether a relationship exists between two categorical variables.
H0: No relationship exists between the two variables.
H1: A relationship exists between the two variables.
- In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
We will often "reject the null hypothesis" when the p-value turns out to be less than a certain significance level.
- To perform crosstabs analysis with the Chi-square test to determine whether a relationship exists between the variable "Gender" and the variable "EyeColor" at the 5% significance level (95% confidence level):
* Click the Analyze menu, point to Descriptive Statistics, and select Crosstabs…
* Select the variable "Gender" to move it to the Row(s): list box, and the variable "EyeColor" to move it to the Column(s): list box.
* Click the Statistics… button. Select the Chi-square check box.
* Click the Continue button, then click the OK button.
H0: No relationship exists between "Gender" and "EyeColor".
H1: A relationship exists between "Gender" and "EyeColor".
p-value = 0.648
0.648 > 0.05
The p-value is greater than the significance level, so we fail to reject the null hypothesis. We conclude that there is no evidence of a relationship between "Gender" and "EyeColor" at the 5 percent level of significance.
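For readers who want to see where the test statistic comes from, here is a minimal sketch of the Pearson chi-square computation on a contingency table. The 2 x 2 counts are hypothetical, the function name `chi_square` is my own, and the p-value itself (which comes from the chi-square distribution with the returned degrees of freedom) is not computed here.

```python
def chi_square(observed):
    """Return (statistic, degrees_of_freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical 2 x 2 table of counts (rows: two genders, columns: two eye colors).
observed = [[10, 20], [20, 10]]
statistic, df = chi_square(observed)
print(round(statistic, 3), df)
```

A larger statistic means the observed counts are further from what independence would predict, which corresponds to a smaller p-value.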
Stacked Bar Graph
A stacked bar graph is used to compare the parts to the whole. The bars in a stacked bar graph are divided into categories. Each bar represents a total.
- To make a stacked bar graph between the "Gender" and the "Weight"
* Click the Graphs menu, select Chart Builder…
* Select Bar, and then select the Stacked Bar.
* Drag the variable "Gender" and drop it on the X-axis.
* Drag the variable "Weight" and drop it on Stack: set color (at the upper right corner).
* Click the OK button.
One-Sample T Test
The basic idea of the one-sample t test is a comparison between the average of the sample (observed average) and the population (expected average).
H0: The difference between the observed and expected mean is 0.
H1: The difference between the observed and expected mean is not 0.
We will often "reject the null hypothesis" when the p-value turns out to be less than a certain significance level.
- To perform the one-sample t test to compare the mean "Length" with 15 (expected mean).
* Click the Analyze menu, point to Compare Means, and select One-Sample T Test…
* Select the variable "Length" to move it to the Test Variable(s): list box.
* Enter the expected mean ( 15 ) in the Test Value box.
* Click the OK button.
p-value = 0.013
0.013 < 0.05
We reject H0. We conclude that the average "Length" of the sampled population is statistically significantly different from 15 at the 5 percent level of significance.
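The statistic behind this test is the distance between the sample mean and the test value, measured in standard errors. A minimal sketch, with hypothetical sample values and a hypothetical helper name `one_sample_t`; converting t to a p-value needs the t distribution and is left to SPSS here.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return (t, degrees_of_freedom) for H0: population mean equals test_value."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - test_value) / standard_error
    return t, n - 1

# Hypothetical values standing in for "Length"; 15 is the test value from the text.
lengths = [10, 12, 14, 16]
t, df = one_sample_t(lengths, 15)
print(round(t, 3), df)
```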
Confidence Interval
A confidence interval gives a lower and an upper limit for the mean.
- To Calculate The Confidence Interval For The Variable "DaysInHospital"
* Click the Analyze menu, point to Compare Means, and select One-Sample T Test…
* Select the variable "DaysInHospital" to move it to the Test Variable(s): list box.
* Leave the Test Value box at 0, so that the interval reported is for the mean itself.
* Click the OK button.
The confidence interval for the variable "DaysInHospital" is (4.04 , 5.22).
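The usual t-based interval for a mean is the sample mean plus or minus a critical t value times the standard error. The sketch below assumes that formula; the data are hypothetical, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(sample, t_critical):
    """Return (lower, upper) = mean -/+ t_critical * s / sqrt(n)."""
    n = len(sample)
    margin = t_critical * stdev(sample) / sqrt(n)
    centre = mean(sample)
    return centre - margin, centre + margin

# Hypothetical values standing in for "DaysInHospital" (n = 10, so df = 9).
days = [3, 4, 5, 4, 6, 5, 4, 5, 6, 4]
T_CRITICAL_95_DF9 = 2.262  # table value, an assumption of this sketch

lower, upper = confidence_interval(days, T_CRITICAL_95_DF9)
print(round(lower, 2), round(upper, 2))
```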
Paired-Samples T Test
It is used to test whether an observed difference between two means is statistically significant, for normally distributed data.
H0: The mean difference between two samples is equal to 0.
H1: The mean difference between two samples is not equal to 0 (or, for a one-sided test, greater than or less than 0).
We will often "reject the null hypothesis" when the p-value turns out to be less than a certain significance level.
- To perform the paired-samples t test to compare the mean "Length" with the mean "DaysInHospital".
* Click the Analyze menu, point to Compare Means, and select Paired-Samples T Test…
* Select the variables "Length" and "DaysInHospital" to move them to the Paired Variables: list box.
* Click the OK button.
H0: The mean difference between "Length" and "DaysInHospital" is equal to 0.
H1: The mean difference between "Length" and "DaysInHospital" is not equal to 0.
p-value = 0.000 (as displayed by SPSS, that is, p < 0.001)
0.000 < 0.05
The p-value is less than the significance level, so we reject the null hypothesis. We conclude that the mean difference between "Length" and "DaysInHospital" is significantly different from 0 at the 5 percent level of significance.
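A paired-samples t test is a one-sample t test on the case-by-case differences. A minimal sketch of the statistic, with hypothetical paired values and a hypothetical helper name `paired_t`; the p-value lookup is not included.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, degrees_of_freedom) for H0: the mean paired difference is 0."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical paired measurements for the same four cases.
length = [15, 16, 14, 18]
days_in_hospital = [4, 6, 5, 5]

t, df = paired_t(length, days_in_hospital)
print(round(t, 3), df)
```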
Independent-Samples T Test
An independent-samples t test is an inferential statistical test that determines whether there is a statistically significant difference between the means in two unrelated groups.
We will often "reject the null hypothesis" when the p-value turns out to be less than a certain significance level.
H0: There is no statistically significant difference between the two groups on the dependent variable.
H1: There is a statistically significant difference between the two groups on the dependent variable.
- To perform the independent-samples t test to compare the mean "Length" between the two "Gender" groups.
* Click the Analyze menu, point to Compare Means, and select Independent-Samples T Test…
* Select the variable "Length" to move it to the Test Variable(s): list box.
* Select the variable "Gender" to move it to the Grouping Variable: box (use the recoded numeric variable "NewGender", since the groups below are defined by the codes 0 and 1).
* Click the Define Groups… button. Enter ( 0 ) in the Group 1 box, and ( 1 ) in the Group 2 box.
* Click the Continue button, and then click the OK button.
H0: The mean "Length" for females and males are equal.
H1: The mean "Length" for females and males are different.
p-value = 0.982
0.982 > 0.05
The p-value is greater than the significance level, so we fail to reject the null hypothesis. We conclude that there is no statistically significant difference between the female and male mean "Length" at the 5 percent level of significance.
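As an illustration of what is being compared, the sketch below computes the equal-variances (pooled) t statistic for two independent groups. The group values and the helper name `independent_t` are hypothetical, and SPSS also reports an unequal-variances version that is not shown here.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t, degrees_of_freedom) using the pooled-variance formula."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Hypothetical "Length" values for the groups coded 0 (male) and 1 (female).
males = [14, 16, 18]
females = [13, 15, 17]

t, df = independent_t(males, females)
print(round(t, 3), df)
```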
Simple Linear Regression
Simple linear regression is a method to determine the relationship between a dependent variable (Y) and one independent variable (X).
The linear equation for simple regression is as follows:
Y = a + bX
a = intercept
b = slope
Scatter Plot
A scatter plot helps determine whether there is a linear relationship between the variables.
** We will insert a new variable "Nweight" which contains the babies' numerical weight (as shown in figure).
- To make a scatter plot for the variable "Length" against the variable "Nweight"
* Click the Graphs menu, select Chart Builder…
* Select Scatter/Dot, and then select the Simple Scatter plot.
* Drag the variable "Length" and drop it on the Y-axis, and drag the variable "Nweight" and drop it on the X-axis.
* Click the OK button.
Adding a straight line to the scatter plot
- TO ADD A LINE TO THE SCATTER PLOT
* Double-click the chart in the Output Viewer window to open the Chart Editor window.
* Right-click the chart, and select Add Fit Line at Total.
* Once the fit line appears on the chart, close the Chart Editor window.
Predicting Value Of Dependent Variable
- To run a simple regression analysis, to predict the baby's length if his/her weight is 20 kg
* Click the Analyze menu, point to Regression, and select Linear…
* Select the variable "Length" to move it to the Dependent: box, and the variable "Nweight" to move it to the Independent(s): box.
* Click the Statistics… button. Select the Estimates check box.
* Click the Continue button, and then click the OK button.
a = -18.811
b = 1.772
Predicting the baby's length if his/her weight is 20 kg
The values of a and b should be substituted in the linear equation with X = 20.
- TO PREDICT THE BABY'S LENGTH USING THE COMPUTE FUNCTION
* Click the Transform menu, and select Compute Variable…
* In the Target Variable: box, type [ Predicted ]
* In the Numeric Expression: box, type [ -18.811 + 1.772 * 20 ]
* Click the OK button.
The predicted Length is 16.63
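The same substitution can be checked with a few lines of code. The coefficients are the ones reported above; the function name `predict_length` is just a label for this sketch.

```python
# Coefficients reported in the regression output above.
a = -18.811  # intercept
b = 1.772    # slope

def predict_length(weight):
    """Evaluate the fitted line Y = a + b * X for a given weight X."""
    return a + b * weight

predicted = predict_length(20)
print(round(predicted, 2))  # 16.63
```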
Coefficient Of Determination
In statistics, the coefficient of determination, denoted R² and pronounced R squared, indicates how well data points fit a line or curve, giving a value between 0 and 1.
- TO COMPUTE R SQUARED
* Click the Analyze menu, point to Regression, and select Linear…
* Select the variable "Length" to move it to the Dependent: box, and the variable "Nweight" to move it to the Independent(s): box.
* Click the Statistics… button. Select the R squared change check box.
* Click the Continue button, and then click the OK button.
R squared = .888, so there is a strong relationship between the length and the weight.
Correlation Coefficient
The correlation coefficient is a measure of the linear correlation (dependence) between two variables X and Y, giving a value between −1 and +1.
- TO COMPUTE THE CORRELATION COEFFICIENT
* Click the Analyze menu, point to Descriptive Statistics, and select Crosstabs…
* Select the variable "Length" to move it to the Row(s): list box, and the variable "Nweight" to move it to the Column(s): list box.
* Click the Statistics… button. Select the Correlations check box.
* Click the Continue button, and then click the OK button.
R = .942
R is very close to +1, so x and y have a strong positive linear correlation: as the values of x increase, the values of y also increase.
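For a simple regression with one independent variable, R squared is the square of the correlation coefficient. Squaring the rounded R = .942 gives about .887, which agrees with the reported R squared = .888 up to rounding. The sketch below computes Pearson's r from its definition on hypothetical data; the helper name `pearson_r` is my own.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y over the product of their spreads."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical weights (x) and lengths (y); not the project's data.
weights = [16, 18, 19, 20, 22]
lengths = [10, 13, 15, 16, 21]

r = pearson_r(weights, lengths)
print(round(r, 3), round(r ** 2, 3))

# Check on the rounded value reported in the text.
print(round(0.942 ** 2, 3))
```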
The Population Parameters Confidence Interval
In statistics, a confidence interval (CI) is a type of interval estimate of a population parameter and is used to indicate the reliability of an estimate.
- To Calculate The Confidence Interval For The Parameters a and b
* Click the Analyze menu, point to Regression, and select Linear…
* Select the variable "Length" to move it to the Dependent: box, and the variable "Nweight" to move it to the Independent(s): box.
* Click the Statistics… button. Select the Confidence intervals check box.
* Click the Continue button, and then click the OK button.
The CI for a is ( -24.051 , -13.572 ) and for b is ( 1.513 , 2.031 ).
One-Way ANalysis Of VAriance (ANOVA)
Analysis of variance is a statistical method used to test for differences between two or more means.
H0: All of the population means are equal.
H1: At least one mean is different.
The ANOVA test procedure produces an F-statistic, which is used to calculate the p-value. As described before, if p < .05, we reject the null hypothesis.
- To Perform ANOVA Test To Compare The Mean "Length" Among The Three Groups In "Weight"
** We shall first recode the string variable "Weight" into a numeric variable.
* Transform > Recode into Different Variables… > Select the variable "Weight" > Click the Old and New Values… button > Recode Light (L) into 0, Medium (M) into 1 and Heavy (H) into 2 (as shown in figure) > Click the Continue button > Type the new variable name "NewWeight" (in the Output Variable section) > OK.
- TO RUN THE ANOVA TEST
* Click the Analyze menu, point to Compare Means, and select One-Way ANOVA…
* Select the variable "NewWeight" to move it to the Factor: box.
* Select the variable "Length" to move it to the Dependent List: box.
* Click the OK button.
ANOVA Table
H0: The mean "Length" is equal for all groups in "Weight".
H1: At least one of the mean "Length" values for the groups in "Weight" is different.
p-value = 0.090
0.090 > 0.05
The p-value is greater than the significance level, so we fail to reject the null hypothesis. We conclude that there are no statistically significant differences in the mean "Length" among the different groups at the 5 percent level of significance.
Binomial Test
In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories.
H0: The number of observations in each category is equal to that predicted by theory (for example, a biological theory).
H1: The observed data are different from the expected.
We reject the null hypothesis when the p-value is less than .05.
- To run the binomial test to check whether the number of males is equal to the number of females
The variable "Gender" was previously recoded into the variable "NewGender".
* Click the Analyze menu, point to Nonparametric Tests, and select Binomial…
* Select the variable "NewGender" to move it to the Test Variable List: box.
* Type ( .50 ) in the Test Proportion: box.
* Click the OK button.
H0: The numbers of females and males are equal.
H1: The numbers of females and males are different.
p-value = 1.000
1.000 > 0.05
The p-value is greater than the significance level, so we fail to reject the null hypothesis. We conclude that there is no statistically significant difference between the number of females and the number of males at the 5 percent level of significance.
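An exact binomial test adds up binomial probabilities in the tails. The sketch below uses one common two-sided convention (double the smaller tail, capped at 1); whether SPSS uses exactly this convention is an assumption here, and the counts of 14 and 13 are hypothetical, chosen only to show how a p-value of 1 can arise when the two groups are nearly equal.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_test_two_sided(successes, n, p=0.5):
    """Two-sided exact p-value: twice the smaller tail probability, capped at 1."""
    lower_tail = sum(binomial_pmf(i, n, p) for i in range(0, successes + 1))
    upper_tail = sum(binomial_pmf(i, n, p) for i in range(successes, n + 1))
    return min(1.0, 2 * min(lower_tail, upper_tail))

# Hypothetical counts: 14 cases in one category out of 27 in total.
p_value = binomial_test_two_sided(14, 27)
print(round(p_value, 3))
```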
The Cumulative Distribution Function
The cumulative distribution function describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x.
- To Compute P ( X ≤ 7 ) with binomial distribution, n = 27, p = 0.6
* Click the Transform menu, select Compute Variable…
* Type "prob" in the Target Variable: box.
* Select CDF & Noncentral CDF from the Function group: list box.
* Double-click Cdf.Binom in the Functions and Special Variables: list box; the expression will appear in the Numeric Expression: box.
* Type ( 7 ) in place of quant, ( 27 ) in place of n, and ( 0.6 ) in place of prob.
* Click the OK button.
P ( X ≤ 7 ) = 0.00 (to two decimal places)
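The binomial CDF is the sum of the binomial probabilities from 0 up to the requested value. A minimal sketch with the same arguments as in the steps above (7, 27, 0.6); the function name `binomial_cdf` is just a label for this sketch.

```python
from math import comb

def binomial_cdf(quant, n, prob):
    """P(X <= quant) for X ~ Binomial(n, prob), summed term by term."""
    return sum(
        comb(n, k) * prob ** k * (1 - prob) ** (n - k)
        for k in range(quant + 1)
    )

p = binomial_cdf(7, 27, 0.6)
print(p)            # a very small probability
print(round(p, 2))  # 0.0
```

The probability is small but not exactly zero; it only shows as 0.00 because the display is limited to two decimal places.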
The Probability Distribution Function
In statistics, a probability distribution assigns a probability to each measurable subset of the possible outcomes of a random experiment, survey, or procedure of statistical inference.
- To compute the sum of x³ · P(X = x), where x represents the variable "Baby", with binomial distribution, n = 27, p = 0.6
** First we need to insert a new row.
* Right-click case 1 > Select Insert Cases > let the variable "Baby" start with 0.
* Click the Transform menu, select Compute Variable…
* Type "PDF" in the Target Variable: box.
* Drag the variable "Baby" into the Numeric Expression: box, click the ( ** ) button and type the power ( 3 ), then click the ( * ) button.
* Select PDF & Noncentral PDF from the Function group: list box.
* Double-click Pdf.Binom in the Functions and Special Variables: list box; the expression will appear in the Numeric Expression: box.
* Drag the variable "Baby" in place of quant, and type ( 27 ) in place of n and ( 0.6 ) in place of prob.
* Click the OK button.
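As a rough check of what the Compute Variable expression produces, the sketch below evaluates x cubed times the binomial probability of x for every x from 0 to 27 and then adds the results, which is what the summing step that follows does over all rows. Reading the expression as Baby ** 3 * PDF.BINOM(Baby, 27, 0.6) is my interpretation of the steps, and the assumption that "Baby" runs from 0 to 27 is not stated in the slides.

```python
from math import comb

def binomial_pmf(x, n, prob):
    """P(X = x) for X ~ Binomial(n, prob)."""
    return comb(n, x) * prob ** x * (1 - prob) ** (n - x)

n, prob = 27, 0.6

# One value per row, assuming the variable "Baby" takes the values 0, 1, ..., 27.
pdf_column = [x ** 3 * binomial_pmf(x, n, prob) for x in range(n + 1)]

# Sum of the new variable over all rows.
total = sum(pdf_column)
print(round(total, 2))
```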
** Compute the sum for the new variable "PDF"
* Analyze > Descriptive Statistics > Frequencies… > Double-click the variable "PDF" > Click the Statistics… button > Select Sum.
References
http://my.ilstu.edu/mshesso/SPSS/tutorial.html
http://www.slideshare.net/itstraining80/spss-statistics-how-to-use-spss#
http://cstpr.colorado.edu/students/envs_5120/essential_stat_ch9.pdf
http://people.ysu.edu/gchang/SPSSE/SPSSOneSampleTTest.pdf
http://www2.cob.ilstu.edu/longfel/TO%20CALCULATE%20CONFIDENCE%20INTERVALS%20USING%20SPSS.doc
http://www.faculty.sfasu.edu/cobledean/biostatistics/lecture4/pairedsamplehypothesis.pdf
http://en.wikipedia.org/wiki/Confidence_interval
http://www.slideshare.net/shoffma5/t-test-for-two-independent-samples
Finally
I wish I'd had a chance to say a proper thanks to every single person who has ever read this project.
And special thanks to Dr. Lamia Balhadji.
Sondos Husamuddin Sagor