
S2: Data

Histogram of income by age in two cities

Data is everywhere and increasingly drives many aspects of our day-to-day lives. Here we explain the different types of data that can be collected and some ways of illustrating this data.

Definitions

\(\mathbf{Population}\): the total group of individuals or items under consideration.

\(\mathbf{Sample}\): a group of individuals or items chosen from the population.

\(\mathbf{Data}\): the information collected from the sample or population.

\(\mathbf{Statistic}\): a number calculated from the sample data.

\(\mathbf{Parameter}\): a number calculated from the population data.
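The difference between a parameter and a statistic can be sketched in a few lines of Python. The heights below are made-up values used purely for illustration, and the variable names (`population`, `sample`, `parameter`, `statistic`) are assumptions rather than anything defined in this material.

```python
import random
from statistics import mean

# Hypothetical population: heights (cm) of all 10 members of a small club.
population = [158, 162, 165, 167, 170, 171, 174, 176, 180, 185]

# Parameter: a number calculated from the whole population.
parameter = mean(population)

# Sample: a group of individuals chosen from the population.
random.seed(1)  # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)

# Statistic: a number calculated from the sample only.
statistic = mean(sample)

print("Population mean (parameter):", parameter)
print("Sample mean (statistic):", statistic)
```

The parameter is fixed for a given population, while the statistic will usually change if a different sample is drawn.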

Types of data

Data may be either qualitative (categorical) or quantitative (numerical).

\(\mathbf{Qualitative\,Data}\) (classified or labeled).

Data is put into non-numerical categories. Blood type, religion and cause of death are all examples of qualitative data.

\(\mathbf{Quantitative\,Data}\) (counted or measured).

There are two types of quantitative data.

\(\mathit{Discrete\,Data:}\) data obtained by counting, so it can take only separate, distinct values (usually whole numbers); for example, the number of children in a family.

\(\mathit{Continuous\,Data:}\) data obtained by measuring, so it can take any value within a range; for example, height.
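As a rough illustration, the three kinds of data often end up stored as different kinds of values in a program. The sketch below is only a rule of thumb under that assumption: the record, its field names and the `data_type` helper are all hypothetical, and a number used purely as a label (a postcode, say) would still be qualitative data.

```python
# Hypothetical record for one person; the field names are illustrative only.
person = {
    "blood_type": "O+",   # qualitative: a label, not a number
    "children": 2,        # quantitative, discrete: a count
    "height_cm": 172.5,   # quantitative, continuous: a measurement
}

def data_type(value):
    """Rule of thumb: labels are qualitative, counts discrete, measurements continuous."""
    if isinstance(value, str):
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    raise TypeError(f"unsupported value: {value!r}")

for name, value in person.items():
    print(f"{name}: {data_type(value)}")
```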

Graphical representation of data

Qualitative (categorical) data is often represented by a bar chart or a pie chart.

Quantitative (numerical) data is often represented by a frequency bar chart called a histogram.

Examples

  1. The table shows the percentage of Australian imports from various countries. This data can be represented on a pie chart so that comparisons are easier.

Pie chart of imports to Australia

| Country | Imports (%) |
| --- | --- |
| China | 22 |
| Japan | 20 |
| South Korea | 8 |
| India | 8 |
| USA | 5 |
| UK | 4 |
| New Zealand | 4 |
| Others | 29 |

  2. A group of school students were surveyed to find the number of children in their families. This data can be represented using a histogram.

| No. of children | Frequency |
| --- | --- |
| 1 | 13 |
| 2 | 21 |
| 3 | 11 |
| 4 | 4 |
| 5 | 3 |
| 6 | 1 |
| 7 | 1 |
| Total | 54 |

Histogram of number of children in family
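The two tables above can be turned into chart quantities with a little arithmetic. The following is a minimal sketch, not the method used to draw the figures on this page: it converts each import percentage into a pie-chart sector angle (its share of 360 degrees) and prints a rough text histogram of the family sizes. The variable names are illustrative assumptions.

```python
# Percentage of Australian imports by country (from the imports table).
imports = {
    "China": 22, "Japan": 20, "South Korea": 8, "India": 8,
    "USA": 5, "UK": 4, "New Zealand": 4, "Others": 29,
}

# The percentages should account for the whole pie.
assert sum(imports.values()) == 100

# Each sector's angle is its share of a full 360-degree circle.
sector_angles = {country: pct / 100 * 360 for country, pct in imports.items()}

for country, angle in sector_angles.items():
    print(f"{country:<12} {angle:6.1f} degrees")

# Number of children per family and how often each value occurred
# (from the children-per-family table).
children = {1: 13, 2: 21, 3: 11, 4: 4, 5: 3, 6: 1, 7: 1}
total = sum(children.values())  # number of students surveyed

# A rough text histogram: one '#' per student.
for count, freq in children.items():
    print(f"{count} | {'#' * freq} ({freq})")
```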

Exercise 1

Label each of the following as either a categorical or a numerical variable. For the numerical variables, label each as either discrete or continuous.

  1. Hair colour
  2. A person's religion
  3. A person's height
  4. Number of children in a family
  5. The weights of babies born on a particular day
  6. The number of crimes committed in Victoria each week
  7. The distance traveled to work by the employees of a large company
  8. The make of car driven by students at RMIT

Answers

  1. Categorical
  2. Categorical
  3. Numerical – continuous
  4. Numerical – discrete
  5. Numerical – continuous
  6. Numerical – discrete
  7. Numerical – continuous
  8. Categorical

Exercise 2

Represent the data in Example 1 as a bar graph.

Bar graph of imports to Australia from various countries

Exercise 3

A group of employees recorded the time that it took them to travel to work on a particular day (see table below). Represent this data using a histogram.

| Time (minutes) | Frequency |
| --- | --- |
| 0 - < 15 | 2 |
| 15 - < 30 | 12 |
| 30 - < 45 | 23 |
| 45 - < 60 | 9 |
| 60 - < 75 | 3 |
| 75 - < 90 | 1 |
| Total | 50 |

Histogram of time to travel to work
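The class intervals in the table are read as "from the lower value up to, but not including, the upper value", so a time of exactly 30 minutes belongs to the 30 - < 45 class. The sketch below shows one way to tally raw times into such classes. The raw times are hypothetical (they are not the survey data behind the table), and `WIDTH`, `intervals` and `tally` are illustrative names.

```python
# Class intervals as in the table: lower <= t < upper, each 15 minutes wide.
WIDTH = 15
intervals = [(lower, lower + WIDTH) for lower in range(0, 90, WIDTH)]

def tally(times):
    """Count how many times fall in each class interval."""
    freq = {interval: 0 for interval in intervals}
    for t in times:
        for lower, upper in intervals:
            if lower <= t < upper:
                freq[(lower, upper)] += 1
                break
    return freq

# Hypothetical raw travel times (minutes); NOT the survey data behind the table.
times = [5, 15, 29.9, 30, 44, 45, 60, 89]
freq = tally(times)

# One row per class, with one '#' per recorded time.
for (lower, upper), count in freq.items():
    print(f"{lower:>2} - < {upper:<2} | {'#' * count}")
```

Because the data is continuous, the bars of the resulting histogram sit side by side with no gaps between neighbouring classes.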

Images on this page by RMIT, licensed under CC BY-NC 4.0

