To study the frequency distributions of a variable one needs to be familiar with a few preliminary concepts. These are explained below.
Qualitative and Quantitative Characters
The characters on which information is collected from a group of individuals are of two types, viz., quantitative and qualitative.
Quantitative characters are those which can be expressed in numerical terms, for example, height, weight, house rent, temperature etc. Quantitative characters are technically termed variables or numerical variables. Data on quantitative characters are called quantitative data. For example, data on the weights of a group of students, or on family size for a group of families, are quantitative data.
Qualitative characters are those which cannot be expressed in numerical terms. They can be described only by their quality, and can only be classified under different heads or categories. Examples are religion, sex, hair-color, nationality, ‘cloudiness of sky’, ‘performance of students in a test’ etc. They are technically called attributes. Since these characters place individuals in different categories, they are also called categorical variables. Data on qualitative characters are called qualitative data. For example, data on the religion of the people of a village, when they are classified into categories such as Hindu, Muslim, Christian and Others, are qualitative data.
Discrete and Continuous Variables
When a variable can take only some discrete or isolated values within its range of variation, it is known as a discrete variable. For example, the number of children per family in a village, the number of misprints per page of a book, the number of telephone calls, the number of students in a class etc. are discrete variables. Data on discrete variables are known as discrete data. Discrete variables can be sub-divided into two further types, viz., finite discrete variables and infinite discrete variables. For example, the number of telephone calls is an infinite discrete variable, since it has no fixed upper limit, while the number of students in a class is a finite discrete variable.
The above definition of a discrete variable may lead one to believe that a discrete variable cannot assume fractional values. Is this true? No. A discrete variable can assume fractional values. For example, the proportion of children in a group of 10 people is a variable whose possible values are 0, 0.1, 0.2, 0.3, ..., 0.9, 1. Hence, it should be noted that discrete variables can assume fractional values as well as integral values.
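The proportion example can be checked with a short sketch. It assumes a group of exactly 10 people, as in the text; the names `group_size` and `possible_proportions` are illustrative only. Exact fractions are used so that the values are not affected by floating-point rounding.

```python
from fractions import Fraction

group_size = 10  # number of people in the group, as in the example above

# The number of children can only be 0, 1, ..., 10, so the proportion
# of children can only be one of these isolated values.
possible_proportions = [Fraction(k, group_size) for k in range(group_size + 1)]

# Show the values in decimal form.
print([float(p) for p in possible_proportions])

# A value lying between two neighbouring proportions, such as 1/4,
# can never occur, which is what makes the variable discrete.
print(Fraction(1, 4) in possible_proportions)
```

The list contains only 11 values, so the variable is discrete (and finite) even though most of its values are fractional.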
A variable is said to be continuous if it can assume any numerical value within its range of variation. For example, daily temperatures of a place in a month, blood pressure of a group of people, weight, height, IQ of a student etc. are continuous variables. Data on continuous variables is known as continuous data.
Frequency Type and Non-frequency Type Data
Quantitative and qualitative data are said to be Frequency Type data when we are simply interested in knowing how frequently each of the different values of a variable, or each particular category, occurs in the data set. Once we have obtained the data, classified into different categories, we may in this case totally forget which particular figure relates to which particular individual, i.e., the identity of the individuals is not important at all. In fact, we are now interested in the characteristics of the groups formed by the individuals rather than the characteristics of the individuals themselves.
Data where the identity of each of the individual values has to be kept in view are called Non-frequency Type data. For example, one may like to know how the population of a country changes over time. For this, one keeps records of the population figures for different years, arranged chronologically. Here, the individual values are the population figures corresponding to the different points in time, and the identity of each and every one of these values is important for drawing a conclusion on how the population changes over time.
By simple series data, or ungrouped data, we mean data recorded without any definite systematic arrangement. By the frequency of a particular value of the variable, or of a particular category (depending on the type of data), we mean the number of times that value or category occurs in a given data set. When the different values of the variable, arranged in order, or the different categories are written together with their corresponding frequencies, generally in the form of a table, the result is called a frequency distribution. In other words, a frequency distribution is a statement of all possible values of the variable together with their respective frequencies. A frequency distribution:
- Summarizes and condenses a large mass of data.
- Exhibits how the total frequency is distributed over different values of the variable or over the different categories of the variable.
- Tells us the values the variable can take and how often it takes those values.
- Reflects the pattern of the variation of the variable.
- Highlights the important characteristics of the distribution, such as central tendency, dispersion, moments, skewness and kurtosis.
Frequency Distribution of an Attribute
Recall that an attribute is a qualitative character, e.g., performance in a test. Hence, to construct the frequency distribution of an attribute, it is natural to take a separate class for each distinct form of the attribute, arranged in order if required.
Let us consider the following example regarding the grades obtained by 30 students in an examination:
A E D B C A E D D B E A C D E
B C A B A C A D A B A A D E B
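Table No. 1: Frequency distribution of the grades obtained by 30 students.

| Grade | Frequency |
|-------|-----------|
| A     | 9         |
| B     | 6         |
| C     | 4         |
| D     | 6         |
| E     | 5         |
| Total | 30        |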
We take the classes A, B, C, D, E, one for each distinct form of the attribute. We count the number of As present in the given data and find that there are 9 A grades. So, corresponding to A in the table, we write 9 in the frequency column. We do the same for all the other classes of the attribute. One way to check whether we have missed any observation is to add all the frequencies and see whether they add up to the total number of observations. In this case the total number of observations is 30 students, so the total frequency should be 30. From the table we see that 9+6+4+6+5=30. So, we may conclude that no observation has been missed or counted twice.
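The tally described above can also be carried out programmatically. The following is a minimal sketch using `collections.Counter` from the Python standard library; the variable names `grades`, `frequency` and `total` are illustrative, and the data are the 30 grades listed in the example.

```python
from collections import Counter

# Grades of the 30 students, copied from the example above.
grades = (
    "A E D B C A E D D B E A C D E "
    "B C A B A C A D A B A A D E B"
).split()

# Count how many times each grade (class of the attribute) occurs.
frequency = Counter(grades)

# Print the classes in order, one row per grade.
for grade in sorted(frequency):
    print(f"{grade}: {frequency[grade]}")

# Check: the frequencies must add up to the number of observations.
total = sum(frequency.values())
print(f"Total: {total}")
```

The final line serves the same purpose as the manual check: if the total differs from the number of observations, some value has been missed or counted twice.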
- Indicate which of the following are qualitative characters and which are quantitative characters:
  - Number of persons in a family
  - Colour of cars in a parking lot
  - Marital status of the members of a club
  - Rent paid by different tenants in a housing complex
  - Monthly phone bill
- Indicate which of the following are discrete variables and which are continuous variables:
  - Weight of a new born baby
  - Weekly earnings of the workers of a factory
  - Number of students in a class
  - Age (in years) of a group of boys
  - Number of families in a village
- A sample of 15 Higher Secondary students were asked about their future plans regarding their choice of honours subject at degree level. Their responses are as under:
  Physics Mathematics Economics Chemistry Physics
  Chemistry Chemistry Mathematics Chemistry Economics
  Chemistry Physics Economics Statistics Chemistry
  Make a frequency distribution for the above data.