Data Management Portfolio

Discrete and Continuous Data

Discrete data: data for which only a finite number of values is possible, or for which there is a gap on the number line between possible values. Examples: integers, population of space caterpillars.

Continuous data: data that can take on any real value in its range. Examples: physical measurements, height of a tree.

Graphical Displays of Data

Pie Charts

A pie chart is a circle divided into sectors whose areas are proportional to the quantities represented. A pie chart should have a title and a legend, and the angles of the sectors should be drawn accurately. The chart above also has labels, indicating the data used to create the graph.

Bad Pie Charts

This chart is terrible because it is slanted and in three dimensions. This makes sections in the front look larger because the perspective adds area. Item C is also "exploded" out of the pie chart, making it look larger and more important. The graph is missing a title.

This chart causes aneurysms for the following reasons:

It is the wrong chart to use. Pie charts should be used for data that altogether makes up a whole, while these statistics do not cover anything of the sort.

It is extremely overcrowded. It is not possible to draw any conclusions about the last 95 users indicated, and colours are reused.

Conclusions

Pie charts should only be used for data indicating percentages of a whole. Examples: Opinions of a Population, Characteristics of a Population, Distribution of Income.
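The requirement that sector areas be proportional to the quantities represented comes down to giving each category its share of 360 degrees. Here is a minimal sketch of that arithmetic. The function name `sector_angles` and the poll numbers are made up for illustration and are not taken from any chart in this portfolio.

```python
def sector_angles(values):
    """Return the central angle, in degrees, of each pie chart sector."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    # A sector's share of the full circle equals its share of the whole.
    return {label: 360.0 * amount / total for label, amount in values.items()}


# Hypothetical poll results (illustration only).
poll = {"Scenery": 40, "Music": 30, "Snacks": 20, "Company": 10}
angles = sector_angles(poll)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

Because the angles always add up to a full circle, the calculation only makes sense when the categories together make up a whole, which is the same condition stated in the conclusions.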

What should not be done with pie charts:

Displaying anything outside of percentages of a whole.

Overcrowding a chart.

Accentuating one or more portions of the pie chart, which makes them seem more important and biases the result. In many cases, "exploding" a pie chart makes it more confusing and the results more difficult to understand.

Slanting the pie chart or drawing it in 3D, which makes portions seem bigger or smaller simply based on their orientation.

Comparing data across two or more pie charts.

Pictographs

A pictograph is a graph that uses pictures or symbols to represent variable quantities. A pictograph should have a title and a legend that explains the scale of the symbols, that is, how much each symbol represents.

Bad Pictographs

This pictograph simply scales the picture of the frog larger in order to show an increase in population. This makes the increase appear much bigger than it actually is, because the picture grows in width as well as in height. Because it just scales the picture, it is also lacking a legend.

This graph uses two different pictures to compare the two values. This is a problem because the picture of the oranges is fuller and appears to have a greater area than the picture of the apples. Another problem is that the legend does not explain anything significant like scale (1 fruit represents approx. 25 million fruits produced), but rather that apples are apples and oranges are oranges. The pictures in the legend can barely be made out.

Conclusions

Pictographs can be used in circumstances where a bar graph is appropriate and the variable deals with something that can be visualized. Examples of data that could use pictographs: Population of Deer in Parks, Number of Visitors to Top 10 Websites.
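The problem with the scaled frog can be put in numbers. When a picture is enlarged in both width and height by the same factor, its area grows by the square of that factor. The sketch below compares that with the alternative of repeating one fixed-size symbol. Both function names and the example numbers are hypothetical.

```python
def apparent_area_ratio(value_ratio):
    """Area ratio of a picture whose width and height are both scaled by value_ratio."""
    return value_ratio ** 2


def symbols_needed(value, value_per_symbol):
    """Number of identical symbols (possibly fractional) that represent value."""
    return value / value_per_symbol


# Hypothetical case: the population doubles.
print(apparent_area_ratio(2))    # the scaled picture covers 4 times the area
# Hypothetical case: 250 frogs shown with one symbol per 100 frogs.
print(symbols_needed(250, 100))  # 2.5, drawn as two full symbols and one cropped in half
```

Repeating a fixed-size symbol keeps the drawn area proportional to the value, which is why cropping a symbol is preferred over resizing it.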

What to avoid while making a pictograph:

Using a pictograph where it is not appropriate.

Using different symbols for different variables. This might change the area or appearance; instead, use one symbol that encompasses all the variables.

Scaling an image to indicate a different value. This changes the area quadratically instead of linearly. To indicate fractional values of a symbol, crop a portion out.

Line Graphs: A graph where points are plotted on a grid and a line or curve of best fit is added.

Broken Line Graphs: A graph where points are plotted on a grid and are joined with line segments between the points.

Bar Graph

Stacked Bar Graph

Histogram

Scatter Plot

This is one pie chart from a magazine, summarizing the results of a poll. The graph does not show bias and is an example of a pie chart done properly. However, there is one problem that may arise from the way the data was collected. The poll asked a fairly open-ended question, "What's your favourite part of a long road trip?", and gave only 4 possible answers. It does not accurately represent the part of the population that enjoys driving or something other than those four options.

This is a basic bar graph that includes axis labels, a title, discrete independent variables, and accurately scaled rectangles. The enrolment values could be displayed as data values, ignoring the y-axis scale completely. The bars could be sorted by increasing value for easy reference, though with so few variables this is not even necessary. The idea that Asteroid College and The Slime University have the highest enrolment rates is easily established.

Graphical displays are visual displays which can be used to demonstrate an idea about data graphically.

Note: Pictographs should rarely be used and are easily replaced with bar graphs.

This line graph could have been replaced with a bar graph or a histogram. However, a line graph is a good choice here. Since a trend line can be used, information can be extrapolated from the graph.
For example, it appears hatching is becoming less and less successful for caterpillars, and if the trend continues, by the year 2305 successful hatching rates will be at 20%.

In this particular pie chart, the statistics are not precise because exact population is difficult to measure. However, meaningful information can still be extracted: most of the space caterpillar population is within the inner planets and asteroid field. Only about 1/4 are elsewhere. This may not be so easily visible in a bar graph or table.

Graphs should usually display meaningful data, with a clearly visible trend or idea that can be extracted.

In this scatter plot, there is no correlation between the two variables, which is the idea that should be extracted. It shows that the data does not support the idea that increasing antenna length of space caterpillars leads to increased bloodthirstiness. The trend line and R squared value indicate that there truly is no correlation between the variables. Notice that the scatter plot has an appropriate title and the necessary axis labels, and that the points used to indicate the values are simple. Sometimes it is necessary to remove outliers; however, this did not need to be done in this case. A legend was also not necessary.

This scatter plot has the problem of being overcrowded. Data labels in this case should be abbreviations of the state names.

If you ask me, you should never use a pictograph.

Conclusions

Scatter plots should be used with data that has two variables. There should be axis labels and a title. Intervals should be labelled on both axes. Scatter plots could be used with data like: Height vs Weight of a Grade Two Class, or Tree Age and Height on a Property.
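The trend line and R squared value mentioned for the antenna length scatter plot can be computed by hand. Here is a minimal sketch using an ordinary least-squares straight line. The function name `linear_fit` and both data sets are invented for illustration, and a spreadsheet or graphing program may offer other types of trend line.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x-values and y-values must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: the fraction of the variation in y explained by the line.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data with a perfect linear trend.
strong = linear_fit([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
# Hypothetical data with no linear trend at all.
none = linear_fit([1, 2, 3, 4, 5], [3, 1, 4, 1, 3])
print(strong)
print(none)
```

An R squared value near 1 means the line describes the data well, while a value near 0, as in the second data set, is the numerical version of "there is no correlation".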

What to avoid with scatter plots:

Overcrowding a scatter plot. Using too many points (or dots that are too big) can obscure an idea or trend in the plot.

Outliers, which could waste space or skew the results.

Notice this is not a scatter plot; the data used in broken line graphs cannot have two y-values for one x-value. This particular broken line graph has markers to indicate the data points.

Line graphs can simply be scatter plots with trend lines. They are a less specific type of graph than broken line graphs, in which the independent variable has to be something like time, where each value occurs only once. Data that could be used in a line graph: Height of Boy During High School, Number of Cars in Kitchener.

It is important to use the type of trend line that best represents the trend in the data. Usually, using a computer to plot the trend line will give an accurate result. As with scatter plots, outliers should be excluded to retain accuracy. The y-axis should start at zero in order for the graph not to be misleading.

This bar graph, taken from a magazine, suffers from a few problems. Values on the y-axis are highlighted, as if once an arbitrary 350 ppm concentration of atmospheric carbon dioxide is reached, all the horrible things unnecessarily included in the chart space will magically start occurring. More unnecessary pictures and text occupy the space. The y-axis starts at 270 ppm instead of at 0, further biasing the data.

Bad Broken Line Graphs

Line Graphs

Conclusions

A broken line graph is a specific type of line graph in which the data points are joined by lines to interpolate the values in between the points. This means that each value of the independent variable must have only one corresponding dependent value on the graph.
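The effect of the y-axis starting at 270 ppm in the carbon dioxide graph discussed above can be estimated with a short calculation. The sketch below compares how tall two bars are drawn relative to each other for a given baseline. The 270 ppm baseline and the 350 ppm value come from the graph description, while the 300 ppm value and the function names are assumptions chosen for illustration.

```python
def drawn_height_ratio(smaller, larger, baseline=0.0):
    """Ratio of the drawn heights of two bars when the y-axis starts at baseline."""
    if baseline >= smaller:
        raise ValueError("baseline must be below the smaller value")
    return (larger - baseline) / (smaller - baseline)


def percent_change(old, new):
    """Change from old to new as a percentage of old."""
    return 100.0 * (new - old) / old


earlier_ppm, later_ppm = 300.0, 350.0  # 300 ppm is a hypothetical value
truncated = drawn_height_ratio(earlier_ppm, later_ppm, baseline=270.0)
from_zero = drawn_height_ratio(earlier_ppm, later_ppm)
print(f"axis from 270 ppm: second bar drawn {truncated:.2f} times as tall")
print(f"axis from 0 ppm: second bar drawn {from_zero:.2f} times as tall")
print(f"actual increase: {percent_change(earlier_ppm, later_ppm):.1f}%")
```

With these numbers the truncated axis draws the second bar well over twice as tall for an increase of under 20%, which is the kind of exaggeration that starting the axis at zero, or plotting percent change instead, avoids.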

As with line graphs:

No three-dimensional effects should be used; they are misleading.

The y-axis should start at zero if possible in order not to bias the data. If correlation is difficult to spot with this scale, perhaps percent change or change per year could represent the trend in the data.

A bar graph is a graph with two variables; the independent variable should be discrete. The data is represented by rectangles of varying height.

A stacked bar graph is a bar graph in which the dependent variable has values that can be further differentiated. These values can be represented by different colours of stacked or adjacent rectangles.

Which graph should be used? This can be answered by simply asking what the idea behind the graph is. There are two ideas the graphs can try to make clear to the reader. They can graphically indicate either which race is more popular or which gender is more prevalent at each race.

In the first graph, the fact that the Slime Cup is the most popular race is easily apparent. It is also possible to tell that the Pink Platypus and Suicidal Speed are not balanced in terms of gender. However, it is not possible to tell something like this for the Memorial Trophy, because comparing two similar rectangles with different starting positions is difficult for the human brain.

In the second graph, it is very easy to tell which gender is more prevalent in every race; however, it is difficult to tell which race is more popular. Suicidal Speed seems to be the most popular even though it is actually the Slime Cup.

This bar graph, taken from a magazine, seems to have a few problems. Because the bars have their own data values included, it is not necessary to include a y-axis. In addition to this, the categories for the independent variable seem very ambiguous. The title is "Number of companies finding savings from emission reduction activities", while one of the categories is "emission reductions".
Since the idea behind the graph is that the number of these companies is growing, this could more easily be shown using a stacked bar graph.

This is a histogram instead of a traditional bar graph because the independent variable is continuous. Intervals were chosen to classify the data. Histograms are useful for displaying the distributions of populations, as in this graph, and then forming conclusions from the characteristics of the distribution. Bar graphs are used for comparing the values of variables in order to determine the relationship between the variables.

Normal bar graphs can be used for discrete independent variables. Histograms should be used with continuous independent variables, where the data is classified into bins and the frequency of each bin is plotted.

Stacked bar graphs should be used with discrete data that has more than one value for each independent variable.
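Classifying continuous data into bins, as a histogram requires, can be sketched as follows. The function name `histogram_counts`, the choice of equal-width intervals that include their lower edge, and the sample heights are all assumptions made for illustration.

```python
def histogram_counts(values, start, width, bins):
    """Count the values falling in each of `bins` equal-width intervals.

    Interval i covers start + i*width up to, but not including, start + (i+1)*width.
    """
    counts = [0] * bins
    for value in values:
        index = int((value - start) // width)
        if not 0 <= index < bins:
            raise ValueError(f"{value} lies outside the chosen intervals")
        counts[index] += 1
    return counts


# Hypothetical heights of students, in centimetres.
heights = [142.5, 148.0, 151.2, 155.9, 158.3, 160.0, 163.4, 171.8]
counts = histogram_counts(heights, start=140, width=10, bins=4)
for i, count in enumerate(counts):
    low = 140 + 10 * i
    print(f"{low} to {low + 10} cm: {count}")
```

Each count becomes the height of one rectangle, and because the intervals touch, the rectangles of a histogram are drawn without gaps between them.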

Possible data that could be used with each of the graphs:

Normal bar graph: Number of Customers to Top Ten Restaurants, Scores of Participants in Contest.

Histograms: Heights of Students in a Class, Weight Distribution of Rhino Population.

Stacked bar graph: Percentage of Factory Devoted to Producing Certain Models of Cars over Past Ten Years.

What should be avoided:

3-D effects or perspective with the bars of the graph. They may mislead the reader or crowd the graph area.

Crowding labels on the x-axis of the graph. If there are more than 8 or 10 variables, it is suggested to use a graph with a horizontal bar orientation.

Stacked bar graphs should be used with caution; it may be difficult to tell a change in height of a section of a bar with a different starting point.

Stem and Leaf Plot

A stem and leaf plot is a quick way of storing data that also reveals some characteristics of the data. A stem and leaf plot should be used with numerical data that has two or more digits. It is easily organized in a table using the first digit (or more) as the stem, and a leaf to store the last digit of each individual data value. A stem and leaf plot can also be used as a display of data because conclusions can easily be drawn. Space caterpillars have a large range of slime capacity (9 L to 51 L), yet having a 20 L to 40 L capacity is most common. A stem and leaf plot is a fairly simple tool, so there is not much that can be done to mislead people. However, it should be noted that while it is useful for quick storage on paper, most software cannot recognize it as data and use it for creating elaborate graphs.

Box and Whisker Graph

A box and whisker graph can be used for showing the distribution of data and drawing conclusions about it. The picture is fairly self-explanatory about how a box and whisker graph is made. A set of numerical data is taken, and the median and the lower and upper quartile values are calculated. These and the upper and lower extremes of the data are used to graph the box and whisker. A box and whisker graph can be used together with other graphs in order to reveal more information about an idea. By itself it can show some important characteristics of the distribution of the data without the use of something complicated and large, like a histogram.
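Both displays described above can be built from a sorted list of values. The sketch below groups whole numbers into stems and leaves, and computes the five values a box and whisker graph needs. The slime capacities are invented to match the 9 L to 51 L range mentioned above. The quartiles are taken as the medians of the lower and upper halves of the data, which is one common classroom convention, so software that uses a different quartile rule may give slightly different values.

```python
from statistics import median


def stem_and_leaf(values):
    """Group whole numbers by tens (stem) and list their units digits (leaves)."""
    plot = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot.setdefault(stem, []).append(leaf)
    return plot


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]
    # For an odd count, the middle value belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Hypothetical slime capacities in litres.
capacities = [9, 14, 21, 23, 27, 28, 32, 35, 36, 44, 51]
plot = stem_and_leaf(capacities)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
summary = five_number_summary(capacities)
print(summary)
```

The box is drawn from the lower quartile to the upper quartile with a line at the median, and the whiskers extend to the minimum and maximum.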

Examples of data that could be used with box and whisker graphs: Lengths of Cats' Tails, Weight of Walruses.

Examples of data that can be used with stem and leaf plots: Capacity of Water Bottles in Bin, Age of People Attending Concert.

Thank you! Come again!

Works Cited

"Number Added Together in Addition Problem." ISAT Math Word List. N.p., n.d. Web. 04 Jan. 2013.

"Apu Nahasapeemapetilon." Wikipedia. Wikimedia Foundation, 01 Feb. 2013. Web. 04 Jan. 2013.

"Aviz." Value Of Info Vis. N.p., n.d. Web. 04 Jan. 2013.

"Can Beavers Save Us?" Alternatives Journal Sept.-Oct. 2012: 6. Print.

"Darts." Gallery of Data Visualization. N.p., n.d. Web. 04 Jan. 2013.

"In Praise of Shock and Awe." Clioviz. N.p., n.d. Web. 04 Jan. 2013.

Jen. "BOX-AND-WHISKER PLOT." Jen's Box-and-Whisker Plot Instructions. N.p., n.d. Web. 04 Jan. 2013.

Klass, Gary. "Good Charts." Good Charts. N.p., 2002. Web. 04 Jan. 2013.

McConnachie, D. "Corporates on Climate." Alternatives Journal Jan.-Feb. 2013: 11. Print.

"Poll." CAA Magazine Fall 2012: 3. Print.

Matej Goc


# Graphical Displays of Data

For my Data Management Course.
