PREVENT / Pharmacoepidemiology: A Short Introduction

I review examples of studies from around the world and the opportunities at the University of Helsinki (HY)

Jari Haukka

on 19 September 2013


Transcript of PREVENT / Pharmacoepidemiology: A Short Introduction

Data on prescription drug purchases can be obtained from the registers of Kela (the Social Insurance Institution of Finland).

Date and cause of death are available from the cause-of-death statistics of Statistics Finland (Tilastokeskus).
Hospital care episodes are available from THL (the National Institute for Health and Welfare).
Data from HUSLAB and other units of HUS (the Hospital District of Helsinki and Uusimaa) can be used in clinical pharmacoepidemiology research.
Biobanks made possible by the new law will bring more data into research use.
19 September 2013
Jari Haukka, Hjelt Institute

I review examples of studies from around the world
What, why, and how?

Pharmacoepidemiology is the study of the use and the effects of drugs in large numbers of people.

Clinical pharmacoepidemiology can be thought of as studying the effects of drugs in a large group of patients.
Abundant measurement data can be assumed to be available on these patients, because at some point they have been treated in, for example, specialized medical care.
Combining an epidemiological study design with detailed clinical data creates the conditions for good research.

What does (clinical) pharmacoepidemiology study?
National and HUS data repositories as sources of data for clinical pharmacoepidemiology
An example that sheds light on the problems of pharmacoepidemiological research and of interpreting its results
Topics covered
Only a small proportion of the patients are included in this analysis
A greater than 5% probability of receiving the studied treatment
National and HUS data repositories
Goal: To Obtain Valid and Precise Information on Association Between Exposure and Disease Using a Minimum of Resources
Research question involves a prevention, treatment, or causal factor.

Moderate or large effect expected.
Trial not ethical or feasible.
Trial too expensive.
Research question involves a prevention or treatment.
Small effect expected.
Ethical and feasible.
Money is available.
Little known about disease.
Evaluate many exposures.
Disease is rare.
Disease has long induction and latent period.
Exposure data are expensive.
Underlying population is dynamic.
Little known about exposure.
Evaluate many effects of an
Exposure is rare
Underlying population is fixed.
Disease has short induction and latent period.
Current exposure.
Want high-quality data.
Disease has long induction and latent period.
Historical exposure.
Want to save time and money.
Select study design
Examine rates of disease in relation to a population-level factor.
Population-level factors include summaries of individual population members, environmental measures, and global measures.
Study groups are usually identified by place, time, or a combination of the two.
Limitations include the ecological fallacy and lack of information on important
Advantages include low cost, a wide range of exposure levels, and the ability to examine contextual effects on health.
Examine association at a single point in time, and so measure exposure prevalence in relation to disease prevalence.
Cannot infer temporal sequence between exposure and disease if exposure is a changeable characteristic.
Other limitations may include preponderance of prevalent cases of long duration and healthy worker survivor effect.
Advantages include generalizability and low cost.
Ecologic studies
A classical ecologic study examines the rates of disease in relation to a factor described on a population level. Thus, “the units of analysis are populations or groups of people rather than individuals.”

The lack of individual-level information leads to a limitation of ecologic studies known as the “ecological fallacy” or “ecological bias.” The ecological fallacy means that “an association observed between variables on an aggregate level does not necessarily represent the association that exists at the individual level.”

In other words, one cannot necessarily infer the same relationship from the group level to the individual level.
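A small numerical sketch of the ecological fallacy, written in plain Python. All numbers are invented for illustration: three hypothetical regions in which the outcome falls as exposure rises within each region, while the regional averages rise together.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def mean(values):
    return sum(values) / len(values)

# Invented data: (exposure, outcome) pairs for individuals in three regions.
regions = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Ecologic analysis: one data point (mean exposure, mean outcome) per region.
group_x = [mean([x for x, _ in people]) for people in regions.values()]
group_y = [mean([y for _, y in people]) for people in regions.values()]
ecologic_r = pearson(group_x, group_y)

# Individual-level analysis within each region.
within_r = {
    name: pearson([x for x, _ in people], [y for _, y in people])
    for name, people in regions.items()
}

print("between regions:", ecologic_r)
print("within regions:", within_r)
```

In this toy data the region-level correlation is +1 while the correlation inside every region is -1, so the aggregate association says nothing reliable about individuals.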
Ecological fallacy
We study the following paper:

Ahern, Thomas P., Lars Pedersen, Maja Tarp, Deirdre P. Cronin-Fenton, Jens Peter Garne, Rebecca A. Silliman, Henrik Toft Sørensen, and Timothy L. Lash. 2011. “Statin Prescriptions and Breast Cancer Recurrence Risk: A Danish Nationwide Prospective Cohort Study.” Journal of the National Cancer Institute. doi:10.1093/jnci/djr291. http://jnci.oxfordjournals.org/content/early/2011/08/01/jnci.djr291.abstract.

Please answer the following questions:
How was the study population defined?
How was the statin exposure defined (yes/no, amount)?
Was any dose-response relationship checked?
Which limitations of the study do you think are the most important?
Case Study: Cohort Study
We study the following paper:

Please answer the following questions:
How was the study population defined?
Did choice of model have any effect on results?
If there were differences, how would you interpret them?
Which research question is, in your opinion, most relevant (p. 267, left column)?

1) Estimate the average treatment effect in a population whose distribution of risk factors is equal to that for the t-PA-treated patients only.

2) Estimate the average effect of treatment in the entire study population, that is, for patients who were and were not treated with t-PA.
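Questions 1 and 2 correspond to what is often called the average treatment effect in the treated and the average treatment effect in the whole population. A minimal sketch of why the two can differ when the effect is not uniform, assuming just two hypothetical strata; every number, and the helper `weighted_effect`, is invented for illustration and is not taken from Kurth et al.

```python
# Hypothetical strata (all numbers invented for illustration):
# n       = number of patients in the stratum
# treated = number of those patients who received the treatment
# effect  = stratum-specific risk difference (treated minus untreated)
strata = [
    {"n": 900, "treated": 90, "effect": 0.10},   # rarely treated, treatment harmful
    {"n": 100, "treated": 90, "effect": -0.05},  # usually treated, treatment beneficial
]

def weighted_effect(strata, weight_key):
    """Average the stratum-specific effects, weighting by the chosen count."""
    total = sum(s[weight_key] for s in strata)
    return sum(s[weight_key] * s["effect"] for s in strata) / total

# Question 1: standardize to the risk-factor distribution of the treated.
effect_in_treated = weighted_effect(strata, "treated")
# Question 2: standardize to the whole study population.
effect_in_everyone = weighted_effect(strata, "n")

print(effect_in_treated, effect_in_everyone)
```

Here the two targets give clearly different answers (about 0.025 versus 0.085) from the same stratum-specific effects, so the choice of research question matters before any model is fitted.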
Case Study: Cohort Study
Case Study: Case-control study
Kurth, Tobias, Alexander M. Walker, Robert J. Glynn, K. Arnold Chan, J. Michael Gaziano, Klaus Berger, and James M. Robins. 2006. “Results of Multivariable Logistic Regression, Propensity Matching, Propensity Adjustment, and Propensity-based Weighting under Conditions of Nonuniform Effect.” American Journal of Epidemiology 163 (3) (February 1): 262-270. doi:10.1093/aje/kwj047.
Epidemiological Study Designs
Directed Acyclic Graph
How many DAGs?

Number of nodes: number of possible DAGs
1: 1
2: 3
3: 25
4: 543
5: 29281
6: 3781503
7: ~1 000 000 000
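These counts can be reproduced with Robinson's recurrence for the number of DAGs on n labeled nodes. The sketch below is an illustration in Python, not code from the presentation; the function name `count_labeled_dags` is made up here.

```python
from math import comb

def count_labeled_dags(max_nodes):
    """Return [a(0), ..., a(max_nodes)], the number of DAGs on n labeled nodes.

    Robinson's recurrence:
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k), a(0) = 1.
    """
    counts = [1]  # a(0) = 1: the empty graph
    for n in range(1, max_nodes + 1):
        total = 0
        for k in range(1, n + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(n, k) * 2 ** (k * (n - k)) * counts[n - k]
        counts.append(total)
    return counts

for n, count in enumerate(count_labeled_dags(7)):
    if n > 0:
        print(n, count)
```

The growth is super-exponential, which is why searching over all possible structures quickly becomes infeasible as variables are added.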
Why use a DAG:
DAG terms
d-separation, or the directed global Markov condition
Observational (Markov) equivalence
A path p is said to be d-separated (= blocked) by a set of nodes Z if and only if either of the following holds:
1) p contains a chain “i -> m -> j” or a fork “i <- m -> j” such that the middle node m is in Z, or
2) p contains a collider “i -> m <- j” such that the middle node m is not in Z and no descendant of m is in Z.

A set Z is said to d-separate X from Y if Z blocks every path from a node in X to a node in Y.
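A minimal sketch of a d-separation check in Python. Instead of enumerating paths it uses the equivalent moralized ancestral graph criterion: keep only the nodes of interest and their ancestors, connect co-parents, drop edge directions, and test whether the conditioning set cuts every remaining connection. The edge list for the smoking example (S -> L, S -> B, L -> X, L -> C, B -> C) is an assumption, since the transcript does not include the drawn graph.

```python
from collections import deque
from itertools import combinations

def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen

def d_separated(parents, x, y, z):
    """True if the conditioning set z d-separates node x from node y."""
    z = set(z)
    keep = ancestral_set(parents, {x, y} | z)
    # Moralize the ancestral subgraph: link each child to its parents,
    # "marry" co-parents, and ignore edge directions.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pars = list(parents.get(child, ()))
        for p in pars:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for p, q in combinations(pars, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)
    # x and y are d-separated iff every undirected path between them hits z.
    queue, visited = deque([x]), {x}
    while queue:
        node = queue.popleft()
        if node == y:
            return False
        for nxt in neighbours[node] - z - visited:
            visited.add(nxt)
            queue.append(nxt)
    return True

# Assumed structure, parents listed per node:
# S -> L, S -> B, L -> X, L -> C, B -> C
parents = {"L": ["S"], "B": ["S"], "X": ["L"], "C": ["L", "B"]}

print(d_separated(parents, "L", "B", set()))       # False: open fork through S
print(d_separated(parents, "L", "B", {"S"}))       # True: S blocks the fork
print(d_separated(parents, "L", "B", {"S", "C"}))  # False: conditioning on collider C
```

Under the assumed graph, lung cancer and bronchitis are not d-separated by the empty set, become d-separated once smoking is conditioned on, and are connected again if cough is also conditioned on.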
d-separation when E
E=Ø: no

S=smoker, L=lung cancer
B=bronchitis, X=pos. X-ray
C= cough
Markov equivalence
Many Bayesian networks may represent the same statements of conditional independence. Such networks are statistically indistinguishable and are called Markov equivalent. All equivalent networks share the same underlying undirected graph (called the skeleton) but may differ in the direction of edges that are not part of a collider (v-structure).
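The skeleton-plus-colliders characterization can be turned into a short check. The sketch below, in Python, treats a DAG as a set of (parent, child) pairs and assumes both graphs are defined on the same variables; the function names are made up for this illustration.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Colliders a -> m <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges_1, edges_2):
    """Same skeleton and same v-structures means Markov equivalent."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))

chain = {("A", "B"), ("B", "C")}     # A -> B -> C
fork = {("B", "A"), ("B", "C")}      # A <- B -> C
collider = {("A", "B"), ("C", "B")}  # A -> B <- C

print(markov_equivalent(chain, fork))      # True
print(markov_equivalent(chain, collider))  # False
```

The chain and the fork cannot be told apart from observational data alone, whereas the collider implies a different independence pattern and is therefore distinguishable.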
Observational eq. DAGs
A set of variables Z satisfies the back-door criterion relative to an ordered pair of variables (Xi, Xj) in a DAG G if:
1) no node in Z is a descendant of Xi, and
2) Z blocks every path between Xi and Xj that contains an arrow into Xi.

If Z satisfies the back-door criterion relative to the pair (X, Y), then the causal effect of X on Y is:
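The formula itself is not present in the transcript. The standard back-door adjustment formula (Pearl), which is presumably what the slide showed, is given here for discrete Z, with do(x) denoting an intervention that sets X to x:

```latex
P(y \mid \mathrm{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z)
```

In words: estimate the association between X and Y within each level of Z, then average over the distribution of Z in the whole population.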

Such a set Z is sufficient to control for confounding (Greenland et al. 1999)
Back door
"node" or "edge"
"link" or "arc"
Describe conditional independence between variables

Only relevant variables in the DAG
Qualitative description
Conditional independence in directed graphs. The three archetypal situations in the definition of d-separation. In the chain and the fork, conditioning on the middle node makes the others independent. In a collider, X and Z are marginally independent, but become dependent once Y is known.
Markowetz and Spang BMC Bioinformatics 2007 8(Suppl 6):S5 doi:10.1186/1471-2105-8-S6-S5
DAGs can be used in model selection
Which variables should be taken into account when controlling for confounding
M- and Z-bias
Three basic structures
Miguel A. Hernan, Sonia Hernandez-Diaz, and James M. Robins, “A Structural Approach to Selection Bias,” Epidemiology 15, no. 5 (2004): 615-625.

Florian Markowetz and Rainer Spang, “Inferring cellular networks - a review,” BMC Bioinformatics 8, no. 6 (2007): S5.

Jenni Ilomäki et al., “Relationship between alcohol consumption and myocardial infarction among ageing men using a marginal structural model,” The European Journal of Public Health (March 11, 2011), http://eurpub.oxfordjournals.org/content/early/2011/03/11/eurpub.ckr013.abstract.
Now, draw a DAG for your own study!
Onyebuchi A Arah, “The role of causal reasoning in understanding Simpson’s paradox, Lord’s paradox, and the suppression effect: covariate selection in the analysis of observational studies,” Emerging Themes in Epidemiology 5 (2008): 5.
Directed acyclic graph (DAG) showing that birth weight (BW) has a direct effect as well as an indirect effect via current weight (CW) on blood pressure
DAG showing a scenario where birth weight (BW) has a causal effect on and shares a common cause – current weight (CW) – with blood pressure (BP). That is, the relationship between BW and BP is confounded by CW.
# Example of a descriptive DAG

# WHO data of health measurements in Europe
# (http://www.who.int/whosis/en/)
# GNIncome: Gross national income per capita (PPP international $)
# Phys: Physicians density (per 10 000 population)
# Mort.CHD: Age-standardized mortality rate for cardiovascular diseases (per 100 000 population)
# Healt.GDP: Total expenditure on health as percentage of gross domestic product
# Hosp.Beds: Hospital beds (per 10 000 population)
# HALE.all: Healthy life expectancy (HALE) at birth (years) both sexes

> head(tmp.data,3)
        GNIncome Phys Mort.CHD Healt.GDP Hosp.Beds HALE.all
Albania     6000   12      537       6.2        30       61
Armenia     4950   37      498       4.7        44       61
Austria    36040   37      204       9.9        76       71
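# Two different structure-learning algorithms from the bnlearn package
library(bnlearn)
# Note: recent bnlearn releases may expect restart/perturb inside maximize.args = list(...)
tmp.ex1 <- mmhc(tmp.data, perturb = 500, restart = 20)
tmp.ex2 <- rsmax2(tmp.data)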
Two algorithms - two structures
How to connect data and DAG?
One option is to use different algorithms for "learning" the structure of Bayesian networks from the data
Some algorithms are implemented in R (http://www.rproject.org), e.g. in the packages "bnlearn" and "deal"
Marco Scutari, “Learning Bayesian Networks with the bnlearn R Package,” Journal of Statistical Software 35, no. 3 (2010): 1–22.
DAG from (Ilomäki et al. 2011)
Presented by

Jari Haukka, PhD
Sr. lecturer
Hjelt Institute
University of Helsinki