Interpreting data: boxplots and tables

by The Open University


1.2.1 Data sets in different tabular forms

In much of your statistical work, you will begin with a data set, often presented in the form of a table, and use the information in the table to produce diagrams and/or summary statistics that help in interpreting the data. In practice, however, much of the interpretation can be done directly from an appropriate table of data, or by re-presenting the data in a different tabular form. Dealing with data in tables is the subject of this section and the next. By the time you have finished, you should be able to produce tables that make particular aspects of the data more obvious.

Example 2.1 Lung cancer deaths in South Australia

Table 2.1 contains raw data on the incidence and mortality for lung cancer in South Australia in 1981.

Table 2.1 Lung cancer in South Australia, 1981: population sizes, cases and deaths by age group and sex

Age group  Male population  Female population  Male cases  Female cases  Male deaths  Female deaths
0–4                  47589              45273           0             0            0              0
5–9                  53814              50672           0             0            0              0
10–14                58561              55645           0             0            0              0
15–19                59408              57756           0             0            0              0
20–24                58443              57249           0             0            0              0
25–29                54341              53376           0             0            1              0
30–34                53456              52978           1             0            1              0
35–39                42113              41988           0             2            0              0
40–44                35648              35547           2             5            3              3
45–49                32911              31799           8             2           10              2
50–54                36485              35333          38             8           26              8
55–59                35192              35555          61            18           43              8
60–64                28131              30868          67            16           57             15
65–69                24419              27390          88            15           69             17
70–74                16613              21402          60            21           61             21
75–79                 9958              14546          46            10           46              9
80–84                 4852               9749          24             6           23              4
85+                   2790               7477           7             2            8              3
Source: O'Neill, T. J., Tallis, G. M. and Leppard, P. (1985) The epidemiology of a disease using hazard functions. Australian Journal of Statistics, 27, 283–297.

A table like Table 2.1 may be adequate for someone who is merely taking a quick look at the data, perhaps prior to carrying out an analysis, but it is not the best way of presenting the figures to most readers. The objectives in producing a table that is actually being used to communicate information are to make the data immediately clear, and to facilitate picking out important patterns in them with the minimum of effort. To this end, there are several guidelines for producing tables which should be borne in mind.

Guidelines for tables

  1. Labelling of rows and columns should be clear and unambiguous.

  2. A table should contain the minimum amount of information needed to communicate its message. This may involve splitting the data into several simpler tables or pooling cells.

  3. It may be appropriate to simplify the numbers in a table to aid speedy comprehension.

  4. Useful summary statistics or calculation results should be added, where appropriate, to help communicate the message.

These guidelines will be followed in relation to Table 2.1 to see what changes they suggest.
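
As a rough illustration of what guidelines 2 to 4 can mean in practice, the following Python sketch pools the age groups of Table 2.1 into four broader bands and replaces the raw counts of deaths by death rates per 100,000 population, rounded to one decimal place. The choice of bands, the names TABLE_2_1, BANDS and pooled_death_rates, and the decision to look at deaths rather than cases are all assumptions made for this example; they are not taken from the course text, and other poolings may be equally sensible.

```python
# Rows of Table 2.1: (age group, male population, female population,
#                     male cases, female cases, male deaths, female deaths)
TABLE_2_1 = [
    ("0-4", 47589, 45273, 0, 0, 0, 0),
    ("5-9", 53814, 50672, 0, 0, 0, 0),
    ("10-14", 58561, 55645, 0, 0, 0, 0),
    ("15-19", 59408, 57756, 0, 0, 0, 0),
    ("20-24", 58443, 57249, 0, 0, 0, 0),
    ("25-29", 54341, 53376, 0, 0, 1, 0),
    ("30-34", 53456, 52978, 1, 0, 1, 0),
    ("35-39", 42113, 41988, 0, 2, 0, 0),
    ("40-44", 35648, 35547, 2, 5, 3, 3),
    ("45-49", 32911, 31799, 8, 2, 10, 2),
    ("50-54", 36485, 35333, 38, 8, 26, 8),
    ("55-59", 35192, 35555, 61, 18, 43, 8),
    ("60-64", 28131, 30868, 67, 16, 57, 15),
    ("65-69", 24419, 27390, 88, 15, 69, 17),
    ("70-74", 16613, 21402, 60, 21, 61, 21),
    ("75-79", 9958, 14546, 46, 10, 46, 9),
    ("80-84", 4852, 9749, 24, 6, 23, 4),
    ("85+", 2790, 7477, 7, 2, 8, 3),
]

# Hypothetical pooling (an assumption for this sketch): each band is
# (label, index of first row, index one past the last row).
BANDS = [("0-39", 0, 8), ("40-59", 8, 12), ("60-74", 12, 15), ("75+", 15, 18)]


def pooled_death_rates(rows, bands):
    """Return {band label: (male rate, female rate)} in deaths per 100,000."""
    rates = {}
    for label, start, stop in bands:
        chunk = rows[start:stop]
        # Guideline 2: pool cells by summing over the age groups in the band.
        male_pop = sum(r[1] for r in chunk)
        female_pop = sum(r[2] for r in chunk)
        male_deaths = sum(r[5] for r in chunk)
        female_deaths = sum(r[6] for r in chunk)
        # Guidelines 3 and 4: add a rounded, derived summary figure.
        rates[label] = (
            round(100_000 * male_deaths / male_pop, 1),
            round(100_000 * female_deaths / female_pop, 1),
        )
    return rates


if __name__ == "__main__":
    print("Age band  Male rate  Female rate  (deaths per 100,000)")
    for band, (male, female) in pooled_death_rates(TABLE_2_1, BANDS).items():
        print(f"{band:<8}  {male:>9}  {female:>11}")
```

With this particular pooling, for example, the 75+ band contains 77 male deaths among 17,600 men, a rate of 437.5 per 100,000. A table of four rows and two columns of rates is much quicker to read than Table 2.1, at the cost of hiding the detail within each band.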

Except for third party materials and otherwise stated (see terms and conditions), this content is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Licence