The last value will always be equal to the total for all observations, since all frequencies will already have been added to the previous total.

[u]Discrete or continuous variables[/u]

Variables in any calculation can be characterized by the value assigned to them. A discrete variable consists of separate, indivisible categories.

No values can exist between one value of a discrete variable and the next. For example, if you were to record class attendance from day to day, you might find that the class has 29 students on one day and 30 students on another.

However, it is impossible for student attendance to be between 29 and 30. (There is simply no room to observe any values between these two values, as there is no way of having 29 and a half students.)

Not all variables are characterized as discrete. Some variables (such as time, height and weight) are not limited to a fixed set of indivisible categories. These variables are called continuous variables, and they are divisible into an infinite number of possible values.

For example, time can be measured in fractional parts of hours, minutes, seconds and milliseconds. So, instead of finishing a race in 11 or 12 minutes, a jockey and his horse can cross the finish line at 11 minutes and 43 seconds.

It is essential to know the difference between the two types of variables in order to properly calculate their cumulative frequency.

Example 1 – Discrete variables

The total rock climber count of Lake Louise, Alberta was recorded over a 30-day period. The results are as follows:

31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42

a) Use these discrete variables to:

set up a stem and leaf plot (see the section on stem and leaf plots) with additional columns labelled Frequency, Upper value and Cumulative frequency

figure out the frequency of observations for each stem

find the upper value for each stem

calculate the cumulative frequency by adding the numbers in the Frequency column

record all the results in the plot

a) The number of rock climbers ranges from 4 to 65. In order to produce a stem and leaf plot, the data are best grouped in class intervals of 10.

Each interval can be located in the Stem column. The numbers within this column represent the first number within the class interval. (For example, Stem 0 represents the interval 0–9, Stem 1 represents the interval 10–19, and so forth.)

The Leaf column lists the final digit of each observation that lies within the class interval. For example, in Stem 2 (interval 20–29), the three observations, 23, 24 and 26, are represented as 3, 4 and 6.

The Frequency column lists the number of observations found within a class interval. For example, in Stem 5, nine leaves (or observations) were found; in Stem 1, there are only two.

Use the Frequency column to calculate cumulative frequency.

First, add the number from the Frequency column to its predecessor. For example, in Stem 0, we have only one observation and no predecessors. The cumulative frequency is one.

1 + 0 = 1

However, in Stem 1, there are two observations. Add these two to the previous cumulative frequency (one), and the result is three.

1 + 2 = 3

In Stem 2, there are three observations. Add these three to the previous cumulative frequency (three) and the total (six) is the cumulative frequency for Stem 2.

3 + 3 = 6

Continue these calculations until you have added up all of the numbers in the Frequency column.

Record the results in the Cumulative frequency column.
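The running total described in these steps can be sketched in a few lines of Python. This is only an illustration and is not part of the original exercise. The list name `frequencies` is an assumption; its values are the number of observations counted in Stems 0 to 6 of Example 1.

```python
from itertools import accumulate

# Number of observations in Stems 0 to 6 of Example 1 (the Frequency column).
frequencies = [1, 2, 3, 5, 6, 9, 4]

# accumulate() adds each frequency to the running total of its predecessors.
cumulative = list(accumulate(frequencies))

for stem, (f, cf) in enumerate(zip(frequencies, cumulative)):
    print(f"Stem {stem}: frequency {f}, cumulative frequency {cf}")
```

The last cumulative frequency equals the total number of observations (30), which is a quick check that no stem was missed.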

The Upper value column lists the observation (variable) with the highest value in each of the class intervals.

For example, in Stem 1, the two leaves 8 and 9 represent the observations 18 and 19. The upper value of these two observations is 19.

Table 1. Cumulative frequency of daily rock climber counts recorded in Lake Louise, Alberta, 30-day period

Stem | Leaf | Frequency | Upper value | Cumulative frequency
0 | 4 | 1 | 4 | 1
1 | 8 9 | 2 | 19 | 1 + 2 = 3
2 | 3 4 6 | 3 | 26 | 3 + 3 = 6
3 | 1 5 5 7 9 | 5 | 39 | 6 + 5 = 11
4 | 0 1 2 3 5 9 | 6 | 49 | 11 + 6 = 17
5 | 0 1 1 2 4 4 5 6 7 | 9 | 57 | 17 + 9 = 26
6 | 0 2 3 5 | 4 | 65 | 26 + 4 = 30
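As a cross-check on Table 1, the same columns can be derived directly from the raw counts. The sketch below is one possible way to do it, not the method used to produce the table; the function name `stem_and_leaf_table` and the variable `counts` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict

# Daily rock climber counts from Example 1 (30 observations).
counts = [31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57,
          37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42]

def stem_and_leaf_table(data):
    """Return rows of (stem, leaves, frequency, upper value, cumulative frequency)."""
    groups = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)  # e.g. 26 -> stem 2, leaf 6
        groups[stem].append(leaf)

    rows = []
    running_total = 0
    for stem in sorted(groups):
        leaves = groups[stem]  # already in ascending order
        running_total += len(leaves)
        upper_value = stem * 10 + leaves[-1]
        rows.append((stem, leaves, len(leaves), upper_value, running_total))
    return rows

table = stem_and_leaf_table(counts)
for stem, leaves, freq, upper, cum in table:
    print(stem, " ".join(map(str, leaves)), freq, upper, cum)
```

Note that this sketch only produces rows for stems that contain at least one observation; an empty class interval would need to be added by hand.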

The following information can be gained from either graph or table:

on 11 of the 30 days, 39 people or fewer climbed the rocks around Lake Louise

on 13 of the 30 days, 50 or more people climbed the rocks around Lake Louise
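Both statements can be verified against the raw data. In terms of the cumulative frequencies, the first figure is the cumulative frequency for Stem 3 (11), and the second is the total minus the cumulative frequency for Stem 4 (30 - 17 = 13). The short Python check below is an illustration only; the variable names are assumptions.

```python
# Daily rock climber counts from Example 1 (30 observations).
counts = [31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57,
          37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42]

# Count the days that satisfy each condition directly.
days_39_or_fewer = sum(1 for c in counts if c <= 39)
days_50_or_more = sum(1 for c in counts if c >= 50)

print(f"{days_39_or_fewer} of {len(counts)} days had 39 climbers or fewer")
print(f"{days_50_or_more} of {len(counts)} days had 50 climbers or more")
```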

When a continuous variable is used, both calculating the cumulative frequency and plotting the graph require a slightly different approach from that used for a discrete variable.

Example 2 – Continuous variables

For 25 days, the snow depth at Whistler Mountain, B.C. was measured (to the nearest centimetre) and recorded as follows:

242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246, 260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257.

a) Use the continuous variables above to:

set up a frequency distribution table

find the frequency for each class interval

locate the endpoint for each class interval

calculate the cumulative frequency by adding the numbers in the Frequency column

record all results in the table

b) Use the information gathered from the frequency distribution table to plot a cumulative frequency graph.

a) The snow depth measurements range from 209 cm to 266 cm. In order to produce the frequency distribution table, the data are best grouped in class intervals of 10 cm each.

In the Snow depth column, each 10-cm class interval from 200 cm to 270 cm is listed.

The Frequency column records the number of observations that fall within a particular interval. This column represents the observations in the Tally column, only in numerical form.

The Endpoint column functions much like the Upper value column of Example 1, with the exception that the endpoint is the highest number in the interval, regardless of the actual value of each observation. For example, in the class interval of 210 to < 220, the actual values of the two observations are 217 and 219. But, instead of using 219, the endpoint of 220 is used.

The Cumulative frequency column lists the total of each frequency added to its predecessor.

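Table 2. Snow depth measured at Whistler Mountain, B.C., 25-day period

Snow depth (cm) | Frequency | Endpoint | Cumulative frequency
- | - | 200 | 0
200 to < 210 | 1 | 210 | 1
210 to < 220 | 2 | 220 | 3
220 to < 230 | 3 | 230 | 6
230 to < 240 | 5 | 240 | 11
240 to < 250 | 7 | 250 | 18
250 to < 260 | 5 | 260 | 23
260 to < 270 | 2 | 270 | 25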
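The frequencies, endpoints and cumulative frequencies in Table 2 can be reproduced from the raw measurements. The following Python sketch is one way to do so, assuming each class interval includes its lower bound and excludes its endpoint; the function name `frequency_table` and its parameters are hypothetical.

```python
# Snow depth measurements (cm) from Example 2 (25 observations).
depths = [242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246,
          260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257]

def frequency_table(data, start, stop, width):
    """Return rows of (lower bound, endpoint, frequency, cumulative frequency)."""
    rows = []
    running_total = 0
    for lower in range(start, stop, width):
        endpoint = lower + width
        # An observation belongs to the interval if lower <= x < endpoint.
        freq = sum(1 for x in data if lower <= x < endpoint)
        running_total += freq
        rows.append((lower, endpoint, freq, running_total))
    return rows

snow_table = frequency_table(depths, start=200, stop=270, width=10)
for lower, endpoint, freq, cum in snow_table:
    print(f"{lower} to < {endpoint}: frequency {freq}, cumulative frequency {cum}")
```

Because the endpoint is excluded from its interval, a depth of exactly 260 cm is counted in the interval 260 to < 270, not in 250 to < 260.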

b) Because the variable is continuous, the endpoints of each class interval are used in plotting the graph. The plotted points are joined to form an ogive.

Remember, the cumulative frequency (number of observations made) is labelled on the vertical y-axis and any other variable (snow depth) is labelled on the horizontal x-axis as shown in Figure 2.
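The points to plot for the ogive pair each endpoint (x-axis) with its cumulative frequency (y-axis), starting from the lower bound of the first interval at a cumulative frequency of zero. The Python sketch below only builds that list of points and does not draw the graph; the function name `ogive_points` is an assumption made for this illustration.

```python
from itertools import accumulate

# Endpoints and frequencies of the class intervals in Example 2.
endpoints = [210, 220, 230, 240, 250, 260, 270]
frequencies = [1, 2, 3, 5, 7, 5, 2]

def ogive_points(first_lower_bound, endpoints, frequencies):
    """Return (snow depth, cumulative frequency) pairs for the ogive."""
    # No observation lies below the first lower bound, so the curve starts at 0.
    points = [(first_lower_bound, 0)]
    points.extend(zip(endpoints, accumulate(frequencies)))
    return points

points = ogive_points(200, endpoints, frequencies)
for depth, cum in points:
    print(f"x = {depth} cm, y = {cum}")
```

Joining these points in order with straight lines gives the ogive. The y-values never decrease, since each one adds a non-negative frequency to the previous total.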

The following information can be gained from either graph or table:

on none of the 25 days was the snow depth less than 200 cm

on one of the 25 days, the snow depth was less than 210 cm

on two of the 25 days, the snow depth was 260 cm or more

Other cumulative frequency calculations

Another calculation that can be obtained using a frequency distribution table is the relative frequency distribution. This method is defined as the percentage of observations falling in each class interval. Relative frequency can be found by dividing the frequency of each interval by the total number of observations. (For more information, see Frequency distribution in the chapter entitled Organizing data.)
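As an illustration, the relative frequency of each class interval in Example 2 can be computed by dividing the interval's frequency by the total number of observations (25). The Python sketch below uses the frequencies from Table 2; the variable names are assumptions.

```python
# Frequencies of the class intervals in Example 2 (200 to < 210, ..., 260 to < 270).
frequencies = [1, 2, 3, 5, 7, 5, 2]
total = sum(frequencies)  # 25 observations

# Relative frequency as a proportion and as a percentage of all observations.
relative = [f / total for f in frequencies]
percent = [100 * f / total for f in frequencies]

for f, p in zip(frequencies, percent):
    print(f"frequency {f}: {p:.0f}% of observations")
```

For example, the interval 240 to < 250 contains 7 of the 25 observations, or 28%. The percentages across all intervals add up to 100.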

A frequency distribution table can also be used to calculate cumulative percentage. This method of frequency distribution gives us the percentage of the cumulative frequency, as opposed to the percentage of just the frequency.
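Using the snow depth data from Example 2 as an illustration, the cumulative percentage for each interval is its cumulative frequency divided by the total number of observations, multiplied by 100. The Python sketch below is a minimal example with assumed variable names.

```python
from itertools import accumulate

# Endpoints and frequencies of the class intervals in Example 2.
endpoints = [210, 220, 230, 240, 250, 260, 270]
frequencies = [1, 2, 3, 5, 7, 5, 2]
total = sum(frequencies)  # 25 observations

# Running total of the frequencies, then each total as a percentage of all observations.
cumulative = list(accumulate(frequencies))
cumulative_percentage = [100 * cf / total for cf in cumulative]

for endpoint, cp in zip(endpoints, cumulative_percentage):
    print(f"snow depth below {endpoint} cm: {cp:.0f}% of days")
```

The final cumulative percentage is always 100, just as the final cumulative frequency always equals the total number of observations.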