The Technology Context – B101
Statistical and graphical summary of data
Academic year 2011/12
Objectives:
- Practices for using charts
- Compare the usage of several types of charts
- Describe data sets using common statistical calculations
- Some common issues when using statistics
Using graphics in presentations
People tend to remember visual elements in this order.
Using graphics in presentations
Colors can be associated with emotions, ideas or cultural values.
- Green: growth, movement
- Blue: calm, institutional
- Purple: spiritual
- Red: power, energy, danger
Diagrams and graphs illustrate data and/or relationships
Line graphs show time and data relationships.
Domestic Material Consumption and GDP (1990 to 2005)
Source: UK Department for Environment, Food and Rural Affairs. (2006). Sustainable development indicators in your pocket 2006. London: Author.
Diagrams and graphs illustrate data and/or relationships
Compare data with bar charts.
Waste by Sector (1998–9 to 2002–3)
Source: UK Department for Environment, Food and Rural Affairs. (2006). Sustainable development indicators in your pocket 2006. London: Author.
Waste Disposal (1998–9 to 2002–3)
Source: UK Department for Environment, Food and Rural Affairs. (2006). Sustainable development indicators in your pocket 2006. London: Author.
Diagrams and graphs illustrate data and/or relationships
Pie charts are useful to show the composition of data.
Source: South West Water. (n.d.). Domestic water use by activity 2002/3. Retrieved from http://www.swwater.co.uk/media/listimages/i/7/Water_use__pie.gif
Diagrams and graphs illustrate data and/or relationships
Diagrams often show scientific relationships.
Periodic Table of Elements
Source: Folkman, S. (n.d.). Periodic table. Retrieved from http://www.neng.usu.edu/mae/faculty/stevef/info/PTable/a_PeriodicTable.gif
Diagrams and graphs illustrate data and/or relationships
Diagrams often show process relationships.
Flowchart of Nested Decisions
Diagrams and graphs illustrate data and/or relationships
Organisational charts show people relationships.
Source: Kids Turn Central. (n.d.). My family tree. Retrieved from http://www.kidsturncentral.com/clipart/genbears/familytree2sample.gif
Suggestions for using charts
- Use a chart title that indicates, at a minimum, what data was collected and when.
- Use a scale that is meaningful for the data.
- Use labels that are simple but sufficiently descriptive of the data.
- Indicate units of measurement to clarify the data.
- Consider using shapes or arrows to highlight parts of a chart.
- Revise charts as you would revise text.
What are some useful descriptions about this data set?
| Percentage of Households with Internet Access at Home (2005) |
| | National | No children | With children | Rural | Suburban | City |
| Czech Republic | 19 | 11 | 33 | 17 | 18 | 22 |
| Finland | 54 | 46 | 79 | 51 | 56 | 59 |
| Germany | 62 | 56 | 82 | 62 | 61 | 62 |
| Greece | 22 | 20 | 27 | 16 | 15 | 29 |
| Ireland | 47 | 41 | 58 | 47 | 49 | 46 |
| Italy | 39 | 33 | 51 | 34 | 38 | 41 |
| Netherlands | 78 | 70 | 92 | 75 | 78 | 80 |
| Spain | 36 | 32 | 44 | 23 | 34 | 42 |
| Sweden | 73 | 66 | 89 | 72 | 83 | 67 |
| United Kingdom | 60 | 55 | 73 | 67 | 64 | 58 |
Source: European Communities (2005). Eurostat information society statistics. Retrieved from http://ec.europa.eu/eurostat
There are 10 elements in this data set
There are 6 variables in this data set
There are 10 observations in this data set
There are 60 data values in this data set
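The counts of elements, variables, observations and data values can be checked with a short sketch. It assumes the table is held as a Python dictionary, one row per country; the names `variables` and `internet_access` are illustrative only.

```python
# Column headings are the variables; each country row is one observation.
variables = ["National", "No children", "With children", "Rural", "Suburban", "City"]
internet_access = {
    "Czech Republic": [19, 11, 33, 17, 18, 22],
    "Finland": [54, 46, 79, 51, 56, 59],
    "Germany": [62, 56, 82, 62, 61, 62],
    "Greece": [22, 20, 27, 16, 15, 29],
    "Ireland": [47, 41, 58, 47, 49, 46],
    "Italy": [39, 33, 51, 34, 38, 41],
    "Netherlands": [78, 70, 92, 75, 78, 80],
    "Spain": [36, 32, 44, 23, 34, 42],
    "Sweden": [73, 66, 89, 72, 83, 67],
    "United Kingdom": [60, 55, 73, 67, 64, 58],
}

n_observations = len(internet_access)   # one per country
n_variables = len(variables)            # one per column
n_data_values = sum(len(row) for row in internet_access.values())

print(n_observations, n_variables, n_data_values)  # 10 6 60
```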
3 types of data values
This data set contains cardinal data values, because they can be ranked and the differences between them can be compared in a meaningful way.
In contrast, nominal data values cannot be ranked or compared.
In between are ordinal data values, which can be ranked, but the differences between them cannot be compared.
2 types of data sets
This data set contains cross-sectional data on households with Internet access at home in 2005.
In contrast, time series data are collected repeatedly for the same variables at multiple points in time.
Purpose of data collection
This data set contains observational data, which were not collected in response to any controlled conditions.
In contrast, experimental data are collected for specific variables to measure responses to controlled conditions.
Measures of location
For the countries in the survey, an average of 62.8 percent of households with children had Internet access at home in 2005.
The average value is the arithmetic mean of all data values for one variable.
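As a quick check of the 62.8 figure, here is a minimal sketch that adds up the "With children" column and divides by the number of countries. The list name `with_children` is illustrative.

```python
# "With children" column, one value per country (percent of households).
with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]

# Arithmetic mean: sum of the values divided by how many there are.
mean = sum(with_children) / len(with_children)

print(round(mean, 1))  # 62.8
```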
| Percentage of Households with Internet Access at Home (2005) |
| | National | No children | With children | Rural | Suburban | City |
| Greece | 22 | 20 | 27 | 16 | 15 | 29 |
| Czech Republic | 19 | 11 | 33 | 17 | 18 | 22 |
| Spain | 36 | 32 | 44 | 23 | 34 | 42 |
| Italy | 39 | 33 | 51 | 34 | 38 | 41 |
| Ireland | 47 | 41 | 58 | 47 | 49 | 46 |
| United Kingdom | 60 | 55 | 73 | 67 | 64 | 58 |
| Finland | 54 | 46 | 79 | 51 | 56 | 59 |
| Germany | 62 | 56 | 82 | 62 | 61 | 62 |
| Sweden | 73 | 66 | 89 | 72 | 83 | 67 |
| Netherlands | 78 | 70 | 92 | 75 | 78 | 80 |
Source: European Communities (2005). Eurostat information society statistics. Retrieved from http://ec.europa.eu/eurostat
Measures of location
For the countries in the survey, the median value is 65.5 percent for households with children that had Internet access at home in 2005.
For data sets with an odd number of elements, the median is the middle data value when all values are arranged from lowest to highest.
For data sets with an even number of elements, the median is the average of the two middle data values when all values are arranged from lowest to highest.
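The odd and even cases can be sketched as one small function. This is an illustrative implementation of the rule described above, not a prescribed method; the function name `median` is an assumption.

```python
def median(values):
    """Median following the odd/even rule described on this slide."""
    ordered = sorted(values)          # lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd: take the middle value
        return ordered[mid]
    # even: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]
print(median(with_children))  # 65.5, the average of 58 and 73
```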
Measures of location
The mode of a data set is the value that occurs most frequently. If there are two modes, the data set is said to be bimodal. If there are more than two modes, the data set is said to be multimodal.
This data set does not have a modal value because every value occurs only once per variable.
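One way to look for modes is to count how often each value occurs. The sketch below uses `collections.Counter` from the Python standard library; the helper name `modes` and the choice to return an empty list when every value occurs only once are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # every value occurs only once
        return []
    return sorted(v for v, c in counts.items() if c == highest)

with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]
print(modes(with_children))          # [] (no modal value)
print(modes([1, 2, 2, 3, 3, 4]))     # [2, 3] (bimodal)
```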
Measures of location
The Nth percentile of a data set is a value that is equal to or greater than N percent of all values in the data set. To calculate it:
- Arrange the data values from lowest to highest.
- Calculate the index using this formula: index = (N / 100) * (number of elements)
- If the index is not a whole number, round it up. The Nth percentile is the data value found at the position indicated by the index.
- If the index is a whole number, the Nth percentile is the average of the values at the positions indicated by the index and index + 1.
The 50th percentile is the same as the median value, which is 65.5 percent for households with children in this data set.
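The steps above can be sketched as follows. This follows the slide's rule only; statistical software often uses other interpolation rules and may give slightly different results. The function name `percentile` is illustrative, and the sketch assumes N is a whole number between 1 and 99 so that both positions exist.

```python
def percentile(values, n):
    """Nth percentile using the index rule from this slide (positions start at 1)."""
    if not (isinstance(n, int) and 0 < n < 100):
        raise ValueError("n must be a whole number between 1 and 99")
    ordered = sorted(values)                      # lowest to highest
    # index = (n / 100) * number of elements, kept as whole part and remainder
    whole, remainder = divmod(n * len(ordered), 100)
    if remainder:                                 # not a whole number: round up
        position = whole + 1
        return ordered[position - 1]
    # whole number: average the values at positions index and index + 1
    return (ordered[whole - 1] + ordered[whole]) / 2

with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]
print(percentile(with_children, 50))  # 65.5, same as the median
print(percentile(with_children, 25))  # 44 (index 2.5 rounds up to position 3)
```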
Measures of variability
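The range is a quick and easy way to indicate the variation in a data set. It is the difference between the highest and lowest values, so it is very sensitive to outlying values.
The range for households with children in this data set is 65 (92 minus 27).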
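A minimal sketch of the range calculation for the "With children" column; the variable names are illustrative.

```python
with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]

# Range: highest value minus lowest value.
data_range = max(with_children) - min(with_children)

print(data_range)  # 65
```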
Measures of variability
Standard deviation is a common, but more complex, measure of variability. The formula to calculate standard deviation is slightly different depending on whether the data set is a complete population or a sample subset.
The standard deviation is 23.5 percentage points for households with children that had Internet access at home in 2005, using the sample formula.
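The difference between the two formulas can be seen with Python's standard `statistics` module, where `stdev` divides by the number of elements minus one (sample) and `pstdev` divides by the number of elements (population). This is a sketch for checking the slide's figure; treating the ten countries as a sample is an assumption suggested by the 23.5 result.

```python
import statistics

with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]

sample_sd = statistics.stdev(with_children)        # divides by n - 1
population_sd = statistics.pstdev(with_children)   # divides by n

print(round(sample_sd, 1))      # 23.5
print(round(population_sd, 1))  # 22.3
```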
Measures of variability
For data that are approximately normally distributed, about 68% of data values will be within one standard deviation of the average value.
Measures of variability
For data that are approximately normally distributed, about 95% of data values will be within two standard deviations of the average value.
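The 68% and 95% figures are rules of thumb, so it can help to count how many values actually fall inside each band. The sketch below does this for the "With children" column using the sample standard deviation; the helper name `share_within` is illustrative. With only ten values, the shares come out at 60% and 100% rather than 68% and 95%.

```python
import statistics

with_children = [33, 79, 82, 27, 58, 51, 92, 44, 89, 73]
mean = statistics.mean(with_children)   # 62.8
sd = statistics.stdev(with_children)    # about 23.5 (sample formula)

def share_within(k):
    """Fraction of values within k standard deviations of the mean."""
    inside = [v for v in with_children if abs(v - mean) <= k * sd]
    return len(inside) / len(with_children)

print(share_within(1))  # 0.6
print(share_within(2))  # 1.0
```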
Using statistics
The data sample should be representative of the population being measured.
In particular, watch out for data from self-selecting samples.
For example, a survey taken at a football game to find out how many people know the name of the England team captain would not be representative of the general population.
Using statistics
Choosing to compare absolute numbers or percentages can change the emphasis.
For example, the score of a football game is Tottenham 2 and Chelsea 1.
We could say that Tottenham scored 1 more goal than Chelsea, or that Tottenham scored 100% more than Chelsea.
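Both statements come from the same two numbers, as this small sketch shows. The function names are illustrative, and the percentage is measured relative to the second team's score, so it is undefined when that score is zero.

```python
def absolute_difference(a, b):
    """How many more a is than b."""
    return a - b

def percent_difference(a, b):
    """How much more a is than b, as a percentage of b (b must not be 0)."""
    return (a - b) / b * 100

tottenham, chelsea = 2, 1
print(absolute_difference(tottenham, chelsea))  # 1
print(percent_difference(tottenham, chelsea))   # 100.0
```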
Using statistics
Average or median values may not be useful descriptions of a data set.
| | 20 |
| | 21 |
| | 21 |
| Median | 22 |
| | 22 |
| Average | 31 |
| | 80 |
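The gap between the median and the average in this example can be reproduced with the standard `statistics` module. The sketch assumes that all seven numbers in the table, including the rows labelled Median and Average, are data values; that reading is consistent with a median of 22 and an average of 31. It also shows how both measures shift when the single outlying value of 80 is dropped.

```python
import statistics

# Assumed reading of the table: seven data values, one far from the rest.
values = [20, 21, 21, 22, 22, 31, 80]

print(statistics.median(values))  # 22
print(statistics.mean(values))    # 31

# Dropping the outlying value moves the average much more than the median.
without_outlier = [v for v in values if v != 80]
print(statistics.median(without_outlier))          # 21.5
print(round(statistics.mean(without_outlier), 1))  # 22.8
```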
Using statistics
Average or median values may not be useful descriptions of a data set.
The median is 110 and the average is 108 for the data set below.
Using statistics
Reverse implication is often false.
For example, households in densely populated areas are more likely to have Internet access at home.
However, not all households with Internet access at home are located in densely populated areas.
Using statistics
Implication is not the same as causation.
For example, households with children are more likely to have Internet access at home.
However, having children does not cause a household to have an Internet connection at home.