Applications of statistics to medicine and the health sciences, including epidemiology, public health, forensic medicine, and clinical research.

Why do we need to know?

Types of Data

(results of observations)

Role of Researcher

Prepare raw data (master table)

Prepare statistical requests

Read and interpret statistical output

Evaluate the statistical methods used

Variability of living creatures!

The Research Process: an Eight-Step Model

Phase I: deciding what to research

Step I: formulating a research problem

(FINER)

Phase II: planning a research study

Step II: conceptualizing a research design

Step III: constructing an instrument for data collection

Step IV: selecting a sample

Step V: writing a research proposal

Phase III: conducting a research study

Step VI: collecting data

Step VII: processing and displaying data

Step VIII: writing a research report

Age

Temperature

Blood Pressure

Quantitative data

Depends on measuring the quantities

Numerically represented (age, length, weight, etc.)

Have clearly and uniformly defined units

Continuous vs. Discrete

Race

Stages of CRC

Being a Hypertensive

Qualitative data

Depends on describing the characteristics (sex, race, occupation, severity, different types of scores)

Nominal

Having a disease or not

Sex

Marital status

0 or 1

Ordinal Data

Stage of CRC

Rating Scale

The first, third and fifth person in a race

Quantitative data

Age = 37.5 years

Temperature = 38.1 °C

BMI = 35.3 kg/m2

Qualitative data

Age = Adult

Temperature = Fever

BMI = obese

Transformation of Data
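The slides above show the same three measurements first as numbers and then as categories. A minimal Python sketch of that transformation follows. The cut-offs (adult from 18 years, fever from 38.0 °C, obese from a BMI of 30 kg/m2) and the function names are assumptions chosen for illustration; the deck does not state them.

```python
def age_group(age_years: float) -> str:
    # Assumed cut-off: 18 years and older counts as an adult.
    return "Adult" if age_years >= 18 else "Child"


def temperature_status(temp_celsius: float) -> str:
    # Assumed cut-off: 38.0 °C or higher counts as fever.
    return "Fever" if temp_celsius >= 38.0 else "No fever"


def bmi_category(bmi: float) -> str:
    # Assumed cut-offs: commonly used adult BMI bands, in kg/m2.
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal weight"
    if bmi < 30:
        return "Overweight"
    return "Obese"


# The three quantitative values shown on the slides.
print(age_group(37.5), temperature_status(38.1), bmi_category(35.3))
```

The transformation only works in one direction: once a value is stored as "Obese", the original 35.3 kg/m2 cannot be recovered.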

Measures of central tendency & Dispersion

Quantitative data

"Normally distributed"

Arithmetic mean (Mean)

sum of values (x) divided by the number of observations (n)

Normally distributed data

"Truths about the general nature of reality"

Dispersion from the mean

SD

The larger the SD, the greater the variation, and vice versa

More sensitive than the range (affected by every value)

There is no range for SD

The value of SD has nothing to do with the goodness of data

Range is NOT the mean ± SD

When the SD is more than ½ of the mean, the data are most likely not normally distributed

Extension to qualitative variables!

Proportion

Frequency Table
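For qualitative data, the summary is a frequency table of counts and proportions. The sketch below builds one with the standard library; the marital-status observations are made up for illustration and do not come from the deck.

```python
from collections import Counter

# Hypothetical marital-status observations, made up for illustration.
observations = ["Single", "Married", "Married", "Divorced", "Married",
                "Single", "Widowed", "Married", "Single", "Married"]

counts = Counter(observations)
total = len(observations)

# Frequency table: category -> (count, proportion of all observations).
table = {category: (n, n / total) for category, n in counts.most_common()}

for category, (n, proportion) in table.items():
    print(f"{category:<10}{n:>3}{proportion:>8.0%}")
```

The proportions always add up to 1, which is a quick check that no observation was dropped.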

SEM

The sample mean (m) is just an estimate of the unknown population mean “M”

How well does the former represent the latter?

Sample size ?

The standard error of the mean (SEM) = SD / √n
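A short sketch of the SEM formula may help show why sample size matters. The blood pressure readings below are hypothetical, and `standard_error` is an illustrative helper name, not something defined in the deck.

```python
import math
import statistics


def standard_error(values):
    # SEM = SD / sqrt(n), using the sample SD (n - 1 in the denominator).
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
small_sample = [118, 126, 131, 122, 140, 135, 128, 124]
large_sample = small_sample * 4  # the same values repeated: four times the n

print(round(standard_error(small_sample), 2))
print(round(standard_error(large_sample), 2))
```

With a similar SD, quadrupling the sample size roughly halves the SEM, because n sits under a square root.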

0, 1, 2, 5, 6, 25, 25, 29, 32, 41, 94

Mean = 26

0, 24, 24, 25, 25, 25, 25, 26, 28, 28, 30

Mean = 26
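The two lists above share the same mean but are spread very differently, which is what the SD captures. The sketch below uses the ten non-zero values from each list. Leaving out the leading 0 is an assumption, made because the stated mean of 26 matches those ten values.

```python
import statistics

# Values from the two lists above; the leading 0 in each list is left out
# (assumption: the stated mean of 26 matches the ten remaining values).
skewed = [1, 2, 5, 6, 25, 25, 29, 32, 41, 94]
tight = [24, 24, 25, 25, 25, 25, 26, 28, 28, 30]

for name, values in (("skewed", skewed), ("tight", tight)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    print(f"{name}: mean = {mean}, SD = {sd:.1f}, SD > mean/2: {sd > mean / 2}")
```

Both means are 26, but the SD is about 27.8 for the first list and 2.0 for the second. The first list also trips the "SD greater than half the mean" warning sign for non-normal data.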

Standard Deviation (SD)

A measure of how spread out numbers are.

Square root of the Variance.

Variance

The average of the squared differences from the Mean.

Why Squared?

Degrees of freedom
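The steps behind the variance and the SD can be written out by hand. The sketch below is one way to do it, using the ten non-zero values of the second list shown earlier in the deck (dropping the leading 0 is an assumption). It also shows why the deviations are squared and where the degrees of freedom enter.

```python
import math

values = [24, 24, 25, 25, 25, 25, 26, 28, 28, 30]
n = len(values)
mean = sum(values) / n

deviations = [x - mean for x in values]

# Raw deviations cancel out to zero, which is one reason they are squared.
sum_of_deviations = sum(deviations)
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / n    # average squared deviation
sample_variance = sum_of_squares / (n - 1)  # n - 1 degrees of freedom
sample_sd = math.sqrt(sample_variance)      # SD = square root of the variance

print(sum_of_deviations, population_variance, sample_variance, sample_sd)
```

Dividing by n - 1 instead of n gives a slightly larger variance (4.0 rather than 3.6 here). That is the usual choice when the data are a sample and the mean itself was estimated from them.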

Quantitative data

"Non-normally distributed"

Median

Middle value of a ranked array

Suitable for non-normal data

Non-Normal Data

Dispersion

0, 1, 2, 5, 6, 25, 25, 29, 32, 41, 94

Median = 25

0, 24, 24, 25, 25, 25, 25, 26, 28, 28, 30

Median = 25
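The median suits non-normal data because extreme values barely move it. The sketch below uses the ten non-zero values of the first list above (leaving out the leading 0 is an assumption; the median is 25 either way). The outlier of 940 is a made-up value for illustration.

```python
import statistics

# Ten non-zero values from the first list above.
values = [1, 2, 5, 6, 25, 25, 29, 32, 41, 94]

print(statistics.mean(values), statistics.median(values))

# Replace the largest value with a made-up extreme outlier.
with_outlier = values[:-1] + [940]
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

The mean jumps from 26 to above 100, while the median stays at 25.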

Centiles

Quartiles

Deciles

Percentiles

Used together with the median

Quartiles

Inter-quartile range: (IQR)

It runs from the 1st quartile (Q1) to the 3rd quartile (Q3), covering the 2nd and 3rd quarters of the ranked data

It represents the middle ½ of data
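A small sketch of the IQR, using the ten non-zero values from the two lists shown earlier (dropping the leading 0 is an assumption). Quartile conventions differ between textbooks and software; the "inclusive" method of Python's `statistics.quantiles` is used here as one common choice, not as the deck's own method.

```python
import statistics


def iqr(values):
    # Quartile conventions differ between textbooks and packages; the
    # "inclusive" method is one common choice (an assumption here).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


skewed = [1, 2, 5, 6, 25, 25, 29, 32, 41, 94]
tight = [24, 24, 25, 25, 25, 25, 26, 28, 28, 30]

# IQR next to the full range (max - min) for each list.
print(iqr(skewed), max(skewed) - min(skewed))
print(iqr(tight), max(tight) - min(tight))
```

Unlike the full range, the IQR ignores the lowest and highest quarters of the data, so a single extreme value such as 94 does not inflate it.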

Visualization of data

Qualitative (Categorical) data

Continuous quantitative data

Practice

Compared with normotensive and prehypertensive Hispanic women, hypertensive participants were older and had less than a high school education (Table 1). They also had a higher number of cardiovascular risk factors, such as family history of diabetes, stroke, and/or myocardial infarction, treated hypercholesterolemia, treated diabetes, and history of CVD. Hypertensive Hispanic women had a higher BMI (30.3±6.0 kg/m2) than prehypertensive (29.0±5.4 kg/m2) and normotensive participants (27.3±5.4 kg/m2). Finally, hypertensive Hispanic women were not currently smoking, did not engage in moderate to strenuous activity, and were more likely to be a nondrinker/past drinker.

Statistical inference

95% Confidence Interval

Increase your Confidence Level

Increase your Sample Size

1. Confidence interval

Measures of disease frequency

Incidence

A rate

Time in denominator

Over Time ( Cohort or RCT )

Person-years of follow-up

Prevalence

Measured at one point in time

Cross-sectional

Frequency

Occurrence of a repeating event per unit time.

Cumulative Risk

A proportion
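The difference between a rate and a proportion is easiest to see with numbers. The sketch below borrows the figures from one of the Practice questions in this deck (1000 women followed for 0.5 years, 10 new cases). It assumes, as a simplification, that every woman contributes the full 0.5 years of follow-up. The function names are illustrative.

```python
def incidence_rate(new_cases, person_years):
    # A rate: events per unit of person-time at risk.
    return new_cases / person_years


def cumulative_risk(new_cases, people_at_risk):
    # A proportion: share of the people at risk who develop the outcome.
    return new_cases / people_at_risk


women = 1000
follow_up_years = 0.5
cases = 10

# Simplifying assumption: every woman is followed for the full 0.5 years.
person_years = women * follow_up_years

rate = incidence_rate(cases, person_years)
risk = cumulative_risk(cases, women)

print(f"Incidence rate: {rate * 1000:.0f} per 1000 person-years")
print(f"Cumulative risk over {follow_up_years} years: {risk:.1%}")
```

The incidence rate has time in its denominator, while the cumulative risk is a plain proportion that only makes sense together with the follow-up period it refers to.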

Absolute & Relative Risk

Absolute risk difference

The difference between the observed risks (proportions of individuals with the outcome of interest) in the two groups

Relative Risk

Risk Ratio

The ratio of the risk of an event in the two groups

Odds Ratio

The ratio of the odds of an event
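The three measures defined above can be sketched in a few lines. The example uses the response proportions from the Practice question in this deck (40% on treatment, 10% on placebo); the helper `odds` is an illustrative name.

```python
def odds(risk):
    # Odds = probability of the event / probability of no event.
    return risk / (1 - risk)


risk_treated = 0.40  # 40% respond on treatment
risk_placebo = 0.10  # 10% respond on placebo

absolute_risk_difference = risk_treated - risk_placebo
risk_ratio = risk_treated / risk_placebo
odds_ratio = odds(risk_treated) / odds(risk_placebo)

print(round(absolute_risk_difference, 2), round(risk_ratio, 2), round(odds_ratio, 2))
```

The odds ratio (6) is further from 1 than the risk ratio (4) here. The two are close only when the outcome is rare in both groups.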

Practice

If 40% of a treated group has a positive response versus just 10% of the placebo group, what are the risk ratio and odds ratio (for treatment vs. placebo)?

In a study that enrolled 1000 women for 0.5 years, 10 women developed breast cancer. What was the incidence rate for breast cancer?

In a study that enrolled 1000 women for 0.5 years, 10 women developed breast cancer. What was the cumulative risk of breast cancer?

What measure of disease frequency can be calculated from cross-sectional studies?

What measure of disease frequency can be calculated from case-control studies?

Biology is not physics!

What do we deal with?

Fixed events!

Random events,

Probability!

Statistics is "internal" to medicine!

Diagnosis, prognosis and treatment

Statistical analysis neither makes medical judgment nor prescribes

Planning for services and carrying out clinical trials

Cannot be quantified

Summarizing Data

Global picture

Analyze results using statistical tests

To make “decisions”

To communicate with each other and with our patients.

Confidence interval

When it comes to reality: the normal distribution!

CI of subject: use SD

CI of mean: use SEM
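The distinction between the two intervals can be sketched with the BMI figures from the practice passage in this deck (30.3 ± 6.0 kg/m2). The sample size of 100 is hypothetical, since the passage does not give one. The multiplier 1.96 assumes normally distributed data.

```python
import math

mean_bmi = 30.3  # kg/m2, from the practice passage in this deck
sd_bmi = 6.0     # kg/m2, from the same passage
n = 100          # hypothetical sample size, not given in the passage
z = 1.96         # multiplier for 95% under a normal distribution

# Range expected to hold about 95% of individual subjects: mean ± 1.96 SD.
subject_range = (mean_bmi - z * sd_bmi, mean_bmi + z * sd_bmi)

# 95% CI for the mean: mean ± 1.96 SEM, where SEM = SD / sqrt(n).
sem = sd_bmi / math.sqrt(n)
mean_ci = (mean_bmi - z * sem, mean_bmi + z * sem)

print([round(v, 1) for v in subject_range])
print([round(v, 1) for v in mean_ci])
```

The first interval describes where individual subjects fall and does not shrink with more data. The second describes how precisely the mean is known, and it narrows as n grows.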

p-value

**Introduction to Medical Statistics**

Ahmed Elgebaly

How to choose the right statistical test?

Medical Statistics

"The Art of Prediction"

We muddle through life making choices based on incomplete information

Statistics is all about quantifying uncertainty!

MBBCH student, Faculty of Medicine, Al-Azhar University

RSDD of Medical Research Society "MRS"

Senior Researcher at MRGE

International peer-reviewed articles and book chapters

Email: Ahmedelgebaly94@gmail.com

Statistics

Descriptive Statistics

Our aim is to get simple measurements from the crude set of data

Inferential Statistics

The science of drawing statistical conclusions from specific data.

1. Descriptive Statistics

Any summary measurement has a center and a spread

Normal Distribution

"Truths about the general nature of reality"

When data tend to cluster around a central value with no bias left or right

The "Bell Curve"

Many things closely follow a normal distribution (blood pressure, height, etc.)

Data are normally distributed when:

1. mean = median = mode

2. the curve is symmetric about the center

3. 50% of values are less than the mean and 50% are greater than the mean
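These properties can be checked numerically with Python's `statistics.NormalDist`. The distribution below (mean 120 mmHg, SD 10 mmHg) is a hypothetical blood pressure example, not data from the deck.

```python
from statistics import NormalDist

# Hypothetical systolic blood pressure distribution (mmHg): mean 120, SD 10.
bp = NormalDist(mu=120, sigma=10)

# For a normal distribution, mean, median and mode coincide.
print(bp.mean, bp.median, bp.mode)

below_mean = bp.cdf(120)                 # share of values below the mean
within_1_sd = bp.cdf(130) - bp.cdf(110)  # mean ± 1 SD
within_2_sd = bp.cdf(140) - bp.cdf(100)  # mean ± 2 SD

print(f"{below_mean:.1%} {within_1_sd:.1%} {within_2_sd:.1%}")
```

Half of the values lie below the mean, roughly 68% lie within one SD of it, and roughly 95% lie within two SDs. That last figure is why "mean ± 2 SD" is used as a 95% range.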

The science of drawing statistical conclusions from specific data.

Sampling Error ("Standard Error")

Factors that control this error:

Variation within the sample (SD)

Sample size

Based on your data

Is it right to say “The mean for the population equals ……”?

“The mean for the population lies between….”

We are just sampling, so we have a margin of error

In medical research, we are dealing with inductive, not deductive, reasoning!

Our goal is to capture the true effect (because we are just sampling).

Our sample is the best estimate, but we will always have a margin of error.

Confidence Interval

"We are pretty sure that our population lies within this range"

Estimating a confidence interval is one of the most effective approaches in statistical inference.

A confidence interval tries to include the true mean within a range (because I cannot give you a single estimate and say this is the true effect).

A 95% CI should include the true effect 95% of the time

How to improve your confidence level?

95% CI ≈ mean ± 2 SE (more precisely, 1.96 SE)
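The formula can be applied directly to raw data. The sketch below uses hypothetical temperature readings and the normal multiplier 1.96 (the slide rounds it to 2). For a sample this small, a t-distribution multiplier would usually be preferred, so treat the numbers as illustrative. `ci95_mean` is a helper name introduced here.

```python
import math
import statistics


def ci95_mean(values, z=1.96):
    # 95% CI = mean ± z * SE, where SE = SD / sqrt(n).
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * se, mean + z * se


# Hypothetical body temperatures (°C), made up for illustration.
temps = [36.6, 36.8, 37.1, 36.9, 37.0, 36.7, 37.2, 36.8, 36.9, 37.0]

low, high = ci95_mean(temps)
print(f"mean = {statistics.mean(temps):.2f}, 95% CI = {low:.2f} to {high:.2f}")

# The same values repeated four times: a larger n narrows the interval.
low4, high4 = ci95_mean(temps * 4)
print(f"n x 4: 95% CI = {low4:.2f} to {high4:.2f}")
```

A larger sample gives a narrower interval at the same confidence level. Asking for a higher confidence level with the same data means a larger multiplier and therefore a wider interval.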

2. Hypothesis Testing

Could these observations have occurred by chance?

Types of Hypothesis

The null hypothesis is that the observations are purely due to chance

The alternative hypothesis is that the observations are due to a real effect

P-Value

If the null hypothesis is true, what is the probability of observing results at least this extreme?

It is not the probability that the null hypothesis is true.

P-Value

The smaller the p-value, the stronger the evidence against the null hypothesis

A significance level of less than 0.05
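One way to make the definition concrete is an exact binomial calculation. The trial below is hypothetical: 16 of 20 patients improve, and the null hypothesis is assumed to be that each patient improves with probability 0.5 by chance alone. The function name and the one-sided form are choices made for illustration.

```python
from math import comb


def binomial_p_value(successes, n, p_null=0.5):
    # One-sided p-value: probability of seeing at least this many successes
    # if the null hypothesis (success probability = p_null) is true.
    return sum(
        comb(n, k) * p_null ** k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical trial: 16 of 20 patients improve, where 50% would be
# expected to improve by chance alone.
p = binomial_p_value(16, 20)
print(f"p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

The result answers "how surprising would this be if only chance were at work?" It does not give the probability that the null hypothesis itself is true.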

Learning Objectives

Bio-medical Statistics: What & Why?

Descriptive Statistics

Statistical Inference

Everything we know is only some kind of approximation