# Data collection and representation

by

## Michael Bawden

on 16 November 2016

#### Transcript of Data collection and representation

Data Analysis
Collecting data
The investigative techniques we will look at for collecting data are census, sampling and observation.

Research and record the meanings and differences between these data collection techniques.
CATEGORICAL VARIABLE
A categorical variable is a variable whose values are categories.

Examples: blood group is a categorical variable; its values are A, B, AB or O. So too is the construction type of a house; its values might be brick, concrete, timber or steel.

Categories may have numerical labels. For example, for the variable postcode the category labels would be numbers like 3787, 5623, 2016, etc., but these labels have no numerical significance: it makes no sense to use them to calculate the average postcode in Australia.
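The postcode point can be tried out in code. The sketch below is only an illustration with made-up observations (the lists `blood_groups` and `postcodes` are assumptions, not data from the presentation). It shows that counting categories is a sensible summary, while averaging numerical labels gives a number that means nothing.

```python
from collections import Counter
from statistics import mean

# Made-up observations of a categorical variable (blood group).
blood_groups = ["A", "O", "B", "O", "AB", "A", "O"]

# Counting how often each category occurs is a meaningful summary.
blood_group_counts = Counter(blood_groups)
most_common_group, most_common_count = blood_group_counts.most_common(1)[0]
print(blood_group_counts)
print(most_common_group, most_common_count)

# Postcodes look like numbers but are only labels, so store them as text.
postcodes = ["3787", "5623", "2016", "3787"]
postcode_counts = Counter(postcodes)
print(postcode_counts)

# Python will average the labels if they are converted to numbers,
# but the result does not describe any real place or quantity.
meaningless_average = mean([int(p) for p in postcodes])
print(meaningless_average)
```

Storing the postcodes as text is a deliberate choice here: it makes accidental arithmetic on the labels harder.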

CENSUS
A census is an attempt to collect information about the whole population.

CONTINUOUS VARIABLE
A continuous variable is a numerical variable that can take any value that lies within an interval. In practice, the values taken are subject to the accuracy of the measurement instrument used to obtain these values.

Examples include height, reaction time to a stimulus and systolic blood pressure.
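The remark about measurement accuracy can be illustrated with a small sketch. The heights below are invented values and `record` is a hypothetical helper, assumed only for this example. It shows how the same continuous quantity is recorded differently depending on the precision of the instrument.

```python
# Hypothetical "true" heights in centimetres (made-up values).
true_heights_cm = [172.3449, 158.9871, 165.5012]


def record(value, decimal_places):
    """Return the value as an instrument with this precision would record it."""
    return round(value, decimal_places)


# A ruler marked in millimetres records one decimal place of a centimetre.
nearest_mm = [record(h, 1) for h in true_heights_cm]

# A ruler marked only in centimetres records whole centimetres.
nearest_cm = [record(h, 0) for h in true_heights_cm]

print(nearest_mm)
print(nearest_cm)
```

The variable is still continuous in both cases; only the recorded values are limited by the instrument.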

DATA
Data is a general term for a set of observations and measurements collected during any type of systematic investigation.

Primary data is data collected by the user. Secondary data is data collected by others. Sources of secondary data include web-based data sets, the media, books, scientific papers, etc.

NUMERICAL VARIABLES
Numerical variables are variables whose values are numbers, and for which arithmetic processes such as adding and subtracting, or calculating an average, make sense.
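As a rough illustration of arithmetic making sense for a numerical variable, the sketch below summarises the number of children in a family. The list `children_per_family` is made-up data assumed for the example.

```python
from statistics import mean, median

# Made-up counts of children in eight families.
children_per_family = [0, 2, 1, 3, 2, 2, 4, 1]

total = sum(children_per_family)      # adding the values is meaningful
average = mean(children_per_family)   # so is calculating an average
middle = median(children_per_family)  # and finding the middle value

print(total, average, middle)
```

Note that the average need not be a whole number even when every observation is a count.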

A discrete numerical variable is a numerical variable, each of whose possible values is separated from the next by a definite 'gap'. The most common discrete numerical variables have the counting numbers 0, 1, 2, 3, … as possible values. Others are prices, measured in dollars and cents.

Examples include the number of children in a family or the number of days in a month.

POPULATION
A population is the complete set of individuals, objects, places, etc., that we want information about. A census is an attempt to collect information about the whole population.

SAMPLE
A sample is part of a population. It is a subset of the population, often randomly selected for the purpose of estimating the value of a characteristic of the population as a whole.

For instance, a randomly selected group of eight-year-old children (the sample) might be selected to estimate the incidence of tooth decay in eight-year-old children in Australia (the population).
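The idea of estimating a population characteristic from a random sample can be sketched as follows. Everything here is synthetic: the population size, the 40% rate and the sample size of 200 are assumptions chosen for illustration, not real figures about tooth decay.

```python
import random

# Fixed seed so the sketch gives the same sample each time it is run.
rng = random.Random(2016)

# Synthetic population: 10,000 children, 4,000 of them flagged as having
# tooth decay. These numbers are invented, not real survey data.
population = [True] * 4000 + [False] * 6000
true_proportion = sum(population) / len(population)

# Randomly select 200 children without replacement (the sample).
sample = rng.sample(population, 200)

# Use the sample proportion to estimate the population proportion.
estimate = sum(sample) / len(sample)

print(f"population proportion: {true_proportion:.3f}")
print(f"sample estimate:       {estimate:.3f}")
```

In a real investigation the population proportion is unknown, which is why the sample estimate is needed. It is known here only because the population was constructed by hand.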

VARIABLE (STATISTICS)
A variable is something measurable or observable that is expected to change either over time or between individual observations.

Examples of variables in statistics include the age of students, their hair colour or a playing field's length or its shape.

http://www.abs.gov.au/browse?opendocument&ref=topBar
Australian Bureau of Statistics
METALANGUAGE GLOSSARY
Frequency histograms and polygons
Frequency distribution tables
Box plots
Worksheets
Dot plots
Shape of data distribution
Software
Excel to R Studio
http://www.wolframalpha.com/widgets/view.jsp?id=25d70df1dbf954506a4f3015a26d03ea
Widgets
Rolling dice
Formulas
Uniform or rectangular: all bars are the same height

Normal or Gaussian: symmetric, with mean = median = mode

Positive skew (skewed right): the tail points toward the positive numbers

Negative skew (skewed left): the tail points toward the negative numbers

Bimodal: 2 modes. The larger is the major mode and the other is the minor mode. The difference between the major and minor modes is called the amplitude. The least frequent value between the modes is called the antimode.

Multimodal: 2 or more modes
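The shape descriptions above can be explored with a short sketch. It rests on two assumptions that are not from the presentation: comparing the mean with the median is used as a rule of thumb for the direction of skew (it can mislead on unusual data sets), and `describe_shape` is a hypothetical helper name. `statistics.multimode` reports only the values tied for the highest frequency, so it will not pick up a smaller minor mode.

```python
from math import isclose
from statistics import mean, median, multimode


def describe_shape(values):
    """Return a rough skew label and the list of most frequent values."""
    centre_mean = mean(values)
    centre_median = median(values)
    if isclose(centre_mean, centre_median):
        skew = "roughly symmetric"
    elif centre_mean > centre_median:
        # A long right tail pulls the mean above the median.
        skew = "positive skew (skewed right)"
    else:
        # A long left tail pulls the mean below the median.
        skew = "negative skew (skewed left)"
    return skew, multimode(values)


# Made-up data sets, one for each shape.
symmetric = [1, 2, 3, 3, 3, 4, 5]
right_tail = [1, 1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
left_tail = [1, 7, 11, 12, 13, 13, 14, 14, 14, 15, 15]
two_peaks = [1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 7]

for data in (symmetric, right_tail, left_tail, two_peaks):
    print(describe_shape(data))
```

Plotting the data as a histogram or dot plot remains the more reliable way to judge shape; the function only gives a quick numerical hint.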
Videos
e.g. A bivariate, multimodal distribution