Stata

by sunny zhao on 16 November 2017

Transcript of Stata

Stata
* comments
display math_equation_you_want_to_calculate

summarize var
tabstat var1 var2, by(group) statistics(n mean variance semean sd min p25 median p50 p75 max range iqr)

* discrete data
tab var1 var2, row col

* & = and
count if conditional1 & conditional2
* | = or
list if conditional1 | conditional2
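As a rough illustration of the commands above, the sketch below runs them on the auto dataset that ships with Stata. The dataset and its variables (make, mpg, weight, price, foreign, rep78) are assumptions chosen for the example and do not appear in the original slides.

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Use display as a calculator
display (3 + 4) / 2

* Summary statistics for one variable
summarize mpg

* Selected statistics for two variables, split by group
tabstat mpg weight, by(foreign) statistics(n mean sd median iqr)

* Two-way table with row and column percentages
tab foreign rep78, row col

* Count observations that meet both conditions
count if mpg > 25 & foreign == 1

* List observations that meet either condition
list make mpg price if mpg > 30 | price < 3500
```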


summarize
Tests:
One Sample Test
Two Samples Test
Poisson - Discrete
Normal - Continuous
* probability of observing a value less than x from a Normal(mean, sd) distribution
display normal((x-mean)/sd)
* probability of observing a value greater than x from a Normal(mean, sd) distribution
display 1-normal((x-mean)/sd)
* probability of observing a value between x and y from a Normal(mean, sd) distribution
display normal((y-mean)/sd) - normal((x-mean)/sd)
* value such that the probability of falling below that value is prob, from a Normal(mean, sd) distribution
display invnormal(prob)*sd + mean
// Here prob is a probability, a number 0<=prob<=1
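A small worked example may make the Normal recipes concrete. The mean of 100 and sd of 15 are hypothetical values chosen for illustration; the approximate results in the comments follow from standard Normal tables.

```stata
* Hypothetical example: X ~ Normal(mean = 100, sd = 15)
local mean = 100
local sd = 15

* P(X < 115): z = 1, so roughly 0.8413
display normal((115 - `mean') / `sd')

* P(X > 115): the complement, roughly 0.1587
display 1 - normal((115 - `mean') / `sd')

* P(85 < X < 115): within one sd of the mean, roughly 0.6827
display normal((115 - `mean') / `sd') - normal((85 - `mean') / `sd')

* Value with 97.5% of the distribution below it: roughly 129.4
display invnormal(0.975) * `sd' + `mean'
```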
Binomial - Discrete
* prob. of observing exactly x successes from n trials with probability of success p
display binomialp(n,x,p)
* prob. of observing x or fewer successes from n trials with probability of success p
display binomial(n,x,p)
* prob. of observing x or more successes from n trials with probability of success p
display binomialtail(n,x,p)
* prob. of observing between x and y successes (inclusive) from n trials with probability of success p
display binomial(n,y,p) - binomial(n,x-1,p)
* prob. of observing less than x or greater than y successes from n trials with probability of success p
display binomial(n,x-1,p) + binomialtail(n,y+1,p)
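The binomial recipes can be checked by hand on a small case. The sketch below assumes n = 10 trials and success probability 0.5 (hypothetical values), for which the probabilities are simple fractions of 1024.

```stata
* Hypothetical example: n = 10 trials, success probability 0.5

* P(X = 5) = 252/1024, roughly 0.2461
display binomialp(10, 5, 0.5)

* P(X <= 5) = 638/1024, roughly 0.6230
display binomial(10, 5, 0.5)

* P(X >= 5) = 638/1024 as well, by symmetry
display binomialtail(10, 5, 0.5)

* P(3 <= X <= 7) = 912/1024, roughly 0.8906
display binomial(10, 7, 0.5) - binomial(10, 2, 0.5)

* P(X < 3 or X > 7) = 112/1024, roughly 0.1094
display binomial(10, 2, 0.5) + binomialtail(10, 8, 0.5)
```

The last two results sum to 1, which is a quick sanity check that the x-1 and y+1 adjustments are in the right place.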
* ci for means (normally distributed)
ci means var, level(95)
* ci for means (poisson distributed)
ci means var, poisson level(95)
* ci for proportion
ci proportions var, level(95)
* ci for variance
ci variances var, level(95)
* ci for standard deviation
ci variances var, sd level(95)
ci
cii
Two Sample

One Sample
Distributions:
Binomial Distribution
Normal Distribution
Poisson Distribution
Student's T Distribution
Chi-squared Distribution
Aims:
have probability, want value
have value, want probability
note: which side? left or right
x = random variable
p = probability
Chi-square - Continuous
* Want Prob.(p) that X <= x
p = chi2(df, x)
* Want Prob.(q) that X >= x
q = chi2tail(df, x)
* Want Value(x) that P(X <= x) = p
x = invchi2(df, p)
* Want Value(x) that P(X >= x) = q
x = invchi2tail(df, q)
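To show how the left-tail and right-tail chi-squared functions relate, here is a minimal sketch with 1 degree of freedom (a hypothetical choice), using the familiar 5% critical value of about 3.84.

```stata
* Hypothetical example: chi-squared distribution with df = 1

* Upper 5% critical value, roughly 3.84
display invchi2tail(1, 0.05)

* The same value from the left-tail inverse, since p = 1 - q
display invchi2(1, 0.95)

* Round trip: right-tail probability at 3.84, roughly 0.05
display chi2tail(1, 3.84)

* Left-tail probability at 3.84, roughly 0.95
display chi2(1, 3.84)
```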
T - Continuous
* Want Prob.(p) that X <= t
p = t(df, t)
* Want Prob.(q) that X >= t
q = ttail(df, t)
* Want Value(t) that P(X <= t) = p
t = invt(df, p)
* Want Value(t) that P(X >= t) = q
t = invttail(df, q)
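The same left/right pairing applies to Student's t. The sketch below assumes 10 degrees of freedom (hypothetical), where the two-sided 95% critical value is about 2.228.

```stata
* Hypothetical example: Student's t distribution with df = 10

* Value with 2.5% in the right tail, roughly 2.228
display invttail(10, 0.025)

* The same value from the left-tail inverse
display invt(10, 0.975)

* Round trip: right-tail probability at 2.228, roughly 0.025
display ttail(10, 2.228)

* Left-tail probability at 2.228, roughly 0.975
display t(10, 2.228)
```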
Poisson - Discrete
* Prob. that X = k
pk = poissonp(mean, k)
* Want Prob.(p) that X <= k
p = poisson(mean, k)
* Want Prob.(q) that X >= k
q = poissontail(mean, k)
* Want mean such that P(X <= k) = p
mean = invpoisson(k, p)
* Want mean such that P(X >= k) = q
mean = invpoissontail(k, q)
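A short sketch of the Poisson functions, assuming a mean of 2 events (a hypothetical value). Note that the inverse functions return the mean, not a count, so feeding a cumulative probability back in should recover the mean.

```stata
* Hypothetical example: Poisson distribution with mean 2

* P(X = 0) = exp(-2), roughly 0.1353
display poissonp(2, 0)

* P(X <= 1) = 3*exp(-2), roughly 0.4060
display poisson(2, 1)

* P(X >= 2), the complement of the line above, roughly 0.5940
display poissontail(2, 2)

* Round trip: the mean for which P(X <= 1) equals the value above, i.e. 2
display invpoisson(1, poisson(2, 1))
```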
graph & chart
hist var, bin(#)
stem var, lines(#)

* box plot: by() gives separate charts, over() gives one chart
graph box var1, by(var2)
graph box var1, over(var2)

graph hbox var

* bar chart of count or percentage
graph bar (count) var, over(var)
graph bar (percent) var, over(var)

* spine plot
ssc install spineplot
spineplot var1 var2, percent
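The graph commands can be tried on Stata's bundled auto dataset. The dataset and the variables mpg, foreign, and rep78 are assumptions for illustration; spineplot is left out here because it needs a separate install from SSC.

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Histogram with 10 bins, and a stem-and-leaf display with 2 lines per stem
histogram mpg, bin(10)
stem mpg, lines(2)

* Box plots: one panel per group, then all groups in a single chart
graph box mpg, by(foreign)
graph box mpg, over(foreign)

* Horizontal box plot
graph hbox mpg

* Bar charts of counts and percentages of nonmissing mpg by repair record
graph bar (count) mpg, over(rep78)
graph bar (percent) mpg, over(rep78)
```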
distributions * (value + probability)
sunnyzhaosifang@gmail.com
This file will be updated, please visit: http://prezi.com/ei58xetk9gpi/?utm_campaign=share&rc=ex0share&utm_medium=copy
Any Feedback is welcome! :)
LinkedIn: https://www.linkedin.com/in/sunnyzhaosifang
Facebook: https://www.facebook.com/sunnyzhaosifang
Seeking a position as statistician, data analyst, data scientist, or similar.
Sunny Zhao (Sifang)
More for Binomial
* Prob. that X = k
pk = binomialp(n, k, pi)
* Want Prob.(p) that X <= k
p = binomial(n, k, pi)
* Want Prob.(q) that X >= k
q = binomialtail(n, k, pi)
* Want Value(pi) that P(X <= k) = p
pi = invbinomial(n, k, p)
* Want Value(pi) that P(X >= k) = q
pi = invbinomialtail(n, k, q)
More for Normal
* Want Prob.(p) that X <= x
p = normal((x-mean)/sd)
* Want Value(x) that P(X <= x) = p
x = invnormal(p)*sd + mean
Confidence Interval (CI)
* (mean + proportion + variance + standard deviation)
confidence interval
ci - compute from dataset
cii - compute from summary statistics - immediate form of ci
For:
mean
proportions
variance
standard deviation
* ci for means (normally distributed)
cii means #obs #mean #sd, level(95)
* ci for means (poisson distributed)
cii means #exposure #events, poisson level(95)
* ci for proportion
cii proportions #obs #succ, level(95)
* ci for variance
cii variances #obs #variance, level(95)
* ci for standard deviation
cii variances #obs #sd, sd level(95)
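The immediate form takes summary statistics typed directly on the command line. The numbers below are hypothetical, and the subcommand syntax (cii means, cii proportions, cii variances) assumes a reasonably recent Stata release.

```stata
* All summary statistics below are hypothetical values for illustration

* Mean: 50 observations, sample mean 21.3, sample sd 5.8
cii means 50 21.3 5.8, level(95)

* Poisson mean: exposure of 1000 units, 25 events observed
cii means 1000 25, poisson level(95)

* Proportion: 100 observations, 40 successes
cii proportions 100 40, level(95)

* Variance: 50 observations, sample variance 33.64
cii variances 50 33.64, level(95)

* Standard deviation: 50 observations, sample sd 5.8
cii variances 50 5.8, sd level(95)
```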
by group
sort group
by group: ci means var
by group: ci proportions var
by group: ci variances var
by group: ci variances var, sd
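For the dataset form, here is a minimal sketch on Stata's bundled auto dataset. The dataset and the variables mpg and foreign are assumptions for illustration, and the subcommand syntax assumes a reasonably recent Stata release.

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Confidence intervals for the whole sample
ci means mpg, level(95)
ci proportions foreign, level(95)
ci variances mpg, level(95)
ci variances mpg, sd level(95)

* The same intervals within each group; bysort sorts and applies by in one step
bysort foreign: ci means mpg
bysort foreign: ci variances mpg, sd
```

ci proportions expects a 0/1 variable, which is why foreign is used for the proportion example.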
ANOVA, Chi_square
Regression and Correlation
Z procedure for one mean
ztest var==value, sd(sigma)
T procedure for one mean
ttest var==value
Decision: P-value (0.0979) > alpha (0.05), so we fail to reject the null hypothesis.

Conclusion: We do not have sufficient evidence to conclude that the true mean of (context of the problem) is different from 5 days.

Reject H0 if p-value < alpha
Do not reject H0 if p-value > alpha

Decision: p-value = 0.6822 > alpha = 0.05, so we fail to reject H0.

Conclusion: We do not have sufficient evidence to conclude that the true average (data usage) is less than 5 GB.
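The decision rule above can be sketched in Stata as follows. The auto dataset, the null value of 20, and the known sigma of 6 are all hypothetical choices for illustration; the sketch relies on ttest leaving its p-values in r(p) (two-sided), r(p_l) (lower one-sided), and r(p_u) (upper one-sided).

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Z test of H0: mean mpg = 20, treating sigma = 6 as known (hypothetical)
ztest mpg == 20, sd(6)

* One-sample t test of the same hypothesis, sigma unknown
ttest mpg == 20

* Apply the decision rule to the two-sided p-value from ttest
local alpha = 0.05
if r(p) < `alpha' {
    display "Reject H0 at alpha = `alpha'"
}
else {
    display "Fail to reject H0 at alpha = `alpha'"
}
```

For a one-sided alternative such as "less than", the same rule would be applied to r(p_l) instead of r(p).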
Non-Parametric
TESTs (one sample + Two Samples)
* (mean + proportion + variance)
Mean
Variance
Proportion
For:
mean
proportion
variance