# Lecture 2 Delivered

by

## Robert Walker

on 27 August 2018

#### Transcript of Lecture 2 Delivered

Data Analysis, Modelling, and Decision-making

Data
Data: n. (pl.) more than one realization of measured characteristic(s) of an element/unit
Probability Distributions
Probability: n. the numeric value representing the chance, likelihood, or possibility that a particular event will occur.
"The logic of science." -- E. T. Jaynes
Statistical Inference
Statistical inference: n. the theory, methods, and practice of forming judgments about the parameters of a population and the reliability of statistical relationships, typically on the basis of random sampling.
Evidence-based Management
Evidence-based management: the combination of research and relevant data in making managerial decisions.
Graphical Summary
Single Variable
Box-and-whisker plot
Histogram
Dot Plot
Density plot
Pie/bar chart
Numerical Summary
Single Variable
Metric summary
Percentile summary

Two variables
Covariance/correlation
Cross-tabulation
Linear Regression
Slope(s)
Intercept
R-squared
Probability Theory
Discrete Distributions
Binomial Distribution (n,p)
Poisson Distribution (lambda)
Hypergeometric (n,k,N,K)
Geometric (k,p)
Continuous Distributions
Normal Distribution (mu,sigma)
Uniform Distribution (a,b)
Exponential Distribution (lambda)
Hypothesis Testing
Expense ratios are higher for intermediate government bond funds.
Confidence Intervals
With 95% confidence, the difference in average expense ratios is 0.012 to 0.163 higher for intermediate government funds.
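One way an interval of this kind can be computed is sketched below. It assumes a large-sample normal approximation for the difference in two sample means with unequal variances; the interval quoted above may have been obtained differently (for example with a t distribution). The function name and the toy numbers are hypothetical and are not the BondFunds data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_in_means_ci(sample_a, sample_b, confidence=0.95):
    """Approximate confidence interval for mean(sample_a) - mean(sample_b).

    Assumption: both samples are large enough for a normal approximation.
    """
    diff = mean(sample_a) - mean(sample_b)
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se


# Toy illustration with made-up numbers (not expense ratios from the file).
group_a = [1, 2, 3, 4, 5]
group_b = [0, 1, 2, 3, 4]
lower, upper = diff_in_means_ci(group_a, group_b)
print(f"95% CI for the difference in means: ({lower:.3f}, {upper:.3f})")
```

If the interval excludes zero, the data are consistent with a real difference between the two groups at the chosen confidence level.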
Sound argumentation
Relevance of evidence
Bayes' Rule and the role of belief.
Probabilistic statements occupy the role -- in science -- of stating uncertainty in a common language.
Bayesian decision-making is the essence of evidence-based management.
Decision criteria and transparency
Careful attention to process renders clarity in the decision-making process.
Problem definition and careful attention to sequence and logical structure permit disagreement in an explicable framework, e.g. we cannot dispute the facts but we may well dispute their relevance.
Two variables
Mosaic plots
Scatterplots
Samples
Populations
Is the treatment worth the expense?
Little evidence to suggest that car seats are effective for two- to six-year-olds.

Data
Pertinent evidence, examples, and facts.
Warrants
The reason that the evidence supports the claim.
Backing
Rebuttals
Qualifiers
Claim
A statement of opinion to be supported.
Stephen Toulmin's Model of Argumentation
Backing: Evidence supporting the application of a warrant.
Rebuttals: Mitigating factors in the disqualification of warrants.
Qualifiers: Limits on the strength or applicability of a warrant.
The most significant cause of death among American children.
Use data to investigate the efficacy of child safety seats.
Claim: Car safety seats do not improve outcomes for children aged 2 through 6
Generalization warrants:
The evidence from a sample implies truth in a population.

Composition warrants:
The evidence contains signs, clues, symptoms, or components of the claim.

Authority warrant:
The evidence is linked to authoritative source interpretations.

Analogy warrant:
Evidence is connected by analogy, event, or precedent.

Causality warrant:
The evidence is caused by, or is a result of, the claim.

Principle warrant:
The evidence is indicative of a broader, relevant principle.
Just the facts on the size of the problem and the "cures"
The data on all reported crashes with fatalities.
Experimental evidence does not show the advantage of car seats either.
Robert W. Walker, Ph. D.
Associate Professor of Quantitative Methods
BondFunds.xls

A random sample of 184 bond funds.
Fund Number: Identification number unique to each fund.
Type: Intermediate government / short term corporate
Assets: In millions of dollars
Fees: Sales charges (yes or no)
Expense ratio: Ratio of expenses to net assets
Return 2009: Twelve-month return in 2009
3-year return: Annualized returns 2007-2009
5-year return: Annualized returns 2005-2009
Risk: Risk-of-loss classification for the mutual fund.
Data can be acquired from http://www.willamette.edu/~rwalker/GSM5103/data/BEARXSP500.xlsx
The elements of a sound argument are statements of logic replete with bounding conditions.
This is also true of the statements of statistics.
And of science -- when probability enters.
Definitions
Arithmetics
Multiplication
Union and intersection/joint probability
Conditional probability and independence
Bayes' Rule
P(Fees=Yes|Type=IG)=34/87
P(Type=IG|Fees=Yes)=34/54
P(Type=IG and Fees=Yes)=34/184
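The three probabilities above can be checked against each other with the multiplication rule and Bayes' Rule. The sketch below uses only the counts implied by those fractions (184 funds, 87 intermediate government, 54 charging fees, 34 both); the variable names are illustrative.

```python
from fractions import Fraction

# Counts implied by the fractions stated above.
n_total = 184        # funds in the sample
n_ig = 87            # intermediate government funds
n_fees = 54          # funds that charge fees
n_ig_and_fees = 34   # intermediate government funds that charge fees

p_ig = Fraction(n_ig, n_total)                      # P(Type=IG)
p_fees = Fraction(n_fees, n_total)                  # P(Fees=Yes)
p_joint = Fraction(n_ig_and_fees, n_total)          # P(Type=IG and Fees=Yes)
p_fees_given_ig = Fraction(n_ig_and_fees, n_ig)     # P(Fees=Yes|Type=IG)
p_ig_given_fees = Fraction(n_ig_and_fees, n_fees)   # P(Type=IG|Fees=Yes)

# Multiplication rule: P(A and B) = P(B|A) * P(A).
multiplication_rule_holds = (p_fees_given_ig * p_ig == p_joint)

# Bayes' Rule: P(IG|Fees) = P(Fees|IG) * P(IG) / P(Fees).
bayes = p_fees_given_ig * p_ig / p_fees

# Independence would require P(A and B) = P(A) * P(B).
independent = (p_joint == p_ig * p_fees)

print("Multiplication rule holds:", multiplication_rule_holds)
print("Bayes' Rule reproduces P(Type=IG|Fees=Yes):", bayes == p_ig_given_fees)
print("Type and Fees independent in this sample:", independent)
```

Exact fractions are used instead of floats so that the equalities can be tested without rounding error.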
What do we know?
These programs cost money per participant.
They recover money in unpaid benefits now and over time.
The failure rate is low.
Generally, the benefits (\$860 on average) are less than cost.

BUT
What are the goals? What quantities are we minimizing/maximizing?
How do we translate the quantities that we can measure and assumptions about those that we can't into decisions?

In USA Today (March 18, 2012)
Drug testing welfare applicants nets little.
Net effect is minimal if not costly. Must assume saved benefits.
No drug test, no welfare: Program protects taxpayer dollars.
Even if costly, funds are only disbursed to those with negative tests.

Remaining Uncertainties
How do we account for those not taking the test?
Some drop out from fear and some because they do not need TANF, but the difference matters for the accounting.
For state employees, if performance metrics can't establish reasonable suspicion, then there may be better uses of resources.
An Extended Example using Juries [Resources > Probability > Juries.xlsx]

Problem setup: A murder

Three pieces of evidence:
Blood type: P(E1|NG)=0.45
Fingerprints: P(E2|NG)=0.21
DNA: P(E3|NG)=0.017

Bayes' rule gives us: P(G|E1,E2,E3)

It all depends on the Prior.....
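A minimal sketch of this calculation follows. It rests on two assumptions that are not stated in the slides: the three pieces of evidence are conditionally independent given guilt and given innocence, and a guilty defendant matches every piece of evidence with probability 1. The function name and the priors tried are hypothetical; the spreadsheet may set the problem up differently.

```python
def posterior_guilt(prior, p_evidence_given_not_guilty, p_each_given_guilty=1.0):
    """Return P(G | E1, ..., Ek) by Bayes' rule.

    Assumptions (not from the slides): evidence items are conditionally
    independent, and each has probability `p_each_given_guilty` under guilt.
    """
    likelihood_ng = 1.0
    likelihood_g = 1.0
    for p in p_evidence_given_not_guilty:
        likelihood_ng *= p                 # P(E1..Ek | NG)
        likelihood_g *= p_each_given_guilty  # P(E1..Ek | G)
    numerator = likelihood_g * prior
    return numerator / (numerator + likelihood_ng * (1.0 - prior))


# P(E|NG) for blood type, fingerprints, and DNA, as given above.
evidence = [0.45, 0.21, 0.017]

# Hypothetical priors, chosen only to show how much the answer moves.
for prior in (0.5, 0.01, 0.001, 0.000001):
    print(f"prior = {prior:<8} posterior = {posterior_guilt(prior, evidence):.4f}")
```

The same evidence gives very different posteriors depending on whether the prior treats the defendant as one of two plausible suspects or as one person in a large population.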
Drug test:
Two outcomes: + and -
Two statuses: User and non-user
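With two outcomes and two statuses, Bayes' Rule gives the probability that someone who tests positive is actually a user. The sketch below is illustrative only: the prevalence, sensitivity, and specificity values are hypothetical placeholders, not figures from the lecture or from any testing program.

```python
def p_user_given_positive(prevalence, sensitivity, specificity):
    """Return P(User | +) by Bayes' Rule.

    prevalence  = P(User)
    sensitivity = P(+ | User)
    specificity = P(- | Non-user)
    """
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs: 5% of applicants are users and the test is
# 95% sensitive and 95% specific.
result = p_user_given_positive(prevalence=0.05, sensitivity=0.95, specificity=0.95)
print(f"P(User | +) = {result:.3f}")
```

When users are rare, false positives from the large non-user group can make up a sizeable share of all positive tests, even for an accurate test.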
Claim: Car seats do not improve crash outcomes for 2- to 6-year-olds.
Evidence: No difference in the death outcome from broad observational data.
Warrant: Fatal crashes are just like others and the death outcome is just like injuries in relation to car seats.
Probability is:
(1) a priori (known),
(2) empirical (data), or
(3) subjective.
What do we learn given the data?
Because what we are learning from data is almost always a probability distribution. How likely is a variable to take on particular values?