Step 5: Normalisation
Avoid adding up apples and pears
Indicators are expressed in a variety of statistical units, ranges or scales. First, they must be adjusted for dimensions such as size, population or income, and smoothed through time against cyclical variability. Next, they need to be put on a common basis to avoid adding up apples and pears. Normalisation serves this purpose.
Selecting a suitable normalisation method for the problem at hand is not trivial and deserves special care. The method should take into account the properties of the data and the objectives of the composite indicator. Issues that could guide the selection include whether hard or soft data are available, whether exceptional behaviour needs to be rewarded or penalised, whether information on absolute levels matters, whether benchmarking against a reference country is requested, and whether the variance in the indicators needs to be accounted for. For example, in the presence of extreme values, normalisation methods based on the standard deviation or on distance from the mean are preferred. The choice of method also needs special care if the composite indicator values per country must be comparable over time.
Two types of transformations that are sometimes applied to the raw data prior to normalisation are truncation and functional form. The choice of trimming the tails of the indicators’ distributions is supported by the need to avoid having extreme values overly dominate the result and, partially, to correct for data quality problems in such extreme cases. The functional transformation is applied to the raw data to represent the significance of marginal changes in its level. In most cases, the linear functional form is applied to all variables, de facto. This approach is suitable if changes in the indicator’s values are important in the same way, regardless of the level. If changes are more significant at lower levels of the indicator, the functional form should be concave down (e.g. log or the nth root). If changes are more important at higher levels of the indicator, the functional form should be concave up (e.g. exponential or power).
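The two pre-normalisation transformations described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function names, the nearest-rank percentile rule and the 5%/95% default cut-offs are assumptions chosen for the example.

```python
import math

def truncate(values, lower_pct=0.05, upper_pct=0.95):
    """Clip values to the given lower/upper percentiles (nearest-rank rule).

    The percentile rule and the default cut-offs are illustrative assumptions.
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[round(lower_pct * (n - 1))]
    hi = ordered[round(upper_pct * (n - 1))]
    return [min(max(v, lo), hi) for v in values]

def log_transform(values):
    """Concave-down form: changes at low levels count for more (needs values > 0)."""
    return [math.log(v) for v in values]

def power_transform(values, exponent=2):
    """Concave-up form (for exponent > 1): changes at high levels count for more."""
    return [v ** exponent for v in values]

# One extreme value (1000) no longer dominates after truncation.
raw = list(range(10)) + [1000]
trimmed = truncate(raw, lower_pct=0.1, upper_pct=0.9)
```

With the log form, the same absolute change of 10 units weighs more at a low level (from 10 to 20) than at a high level (from 100 to 110), which is the behaviour the concave-down case is meant to capture.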
Commonly used methods for normalising indicators include the following:
- Ranking of indicators across countries
- Standardisation (or z-scores)
- Re-scaling (min-max)
- Distance to a reference country
- Categorical scales
- Indicators above or below the mean
- Method for cyclical indicators
- Percentage of differences over consecutive time points
The simplest normalisation approach is to rank each indicator across countries. The main advantages of this approach are its simplicity and its insensitivity to outliers. The disadvantages are the loss of information on absolute levels and the impossibility of drawing any conclusion about the size of differences in performance.
For time-dependent studies, the ranking is carried out at each point in time. Therefore one can follow country performance in terms of relative positions (rankings). However, it is not possible to follow the absolute performance of each country across time: perhaps a country improves from one year to the next, yet its ranking deteriorates as other countries improve faster.
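A minimal sketch of ranking at each point in time is given below. The country labels and values are invented for illustration, and the convention that rank 1 goes to the highest value (with ties sharing the better rank) is an assumption.

```python
def rank_countries(values):
    """Map each country to its rank for one indicator (1 = highest value).

    Tied countries share the better rank; this tie rule is an assumption.
    """
    ordered = sorted(values.values(), reverse=True)
    return {country: ordered.index(v) + 1 for country, v in values.items()}

# Hypothetical data: country A improves from 10 to 11,
# but B and C improve faster.
year_1 = {"A": 10, "B": 8, "C": 6}
year_2 = {"A": 11, "B": 13, "C": 12}

ranks_1 = rank_countries(year_1)
ranks_2 = rank_countries(year_2)
```

In this example country A's raw value rises while its rank drops from first to third, which is exactly the case where rankings hide absolute performance.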
In standardisation (z-scores), the average value and the standard deviation across countries are calculated for each indicator. The normalised indicator value for a country is then the difference between its raw indicator value and the average, divided by the standard deviation.
This type of normalisation is the most commonly used because it converts all indicators to a common scale with an average of zero and a standard deviation of one. The average of zero means that it avoids introducing aggregation distortions stemming from differences in the indicators' means. The scaling factor is the standard deviation of the indicator across the countries. Thus, an indicator with extreme values will intrinsically have a greater effect on the composite indicator. This might be desirable if the intention is to reward exceptional behaviour, that is, if an extremely good result on a few indicators is thought to be better than a lot of average scores. With this approach, the range (minimum, maximum) differs among the normalised indicators.
For time-dependent studies, in order to assess country performance across years, the average across countries and the standard deviation across countries are calculated for a reference year (usually the initial time point).
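The z-score calculation, including the reference-year variant, might look like the sketch below. The function name and the `reference` parameter are illustrative, and the use of the population standard deviation (`pstdev`) rather than the sample one is an assumption, since the text does not say which is meant.

```python
from statistics import mean, pstdev

def z_scores(values, reference=None):
    """Standardise values using the mean and standard deviation of `reference`.

    If `reference` is omitted, the statistics come from `values` themselves.
    For time-dependent studies, pass the reference-year values instead.
    """
    ref = values if reference is None else reference
    mu = mean(ref)
    sigma = pstdev(ref)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]

# Hypothetical reference year: mean 5, standard deviation 2.
year_0 = [2, 4, 4, 4, 5, 5, 7, 9]
z_year_0 = z_scores(year_0)

# A later year is scored against the reference-year statistics.
z_later = z_scores([7, 9], reference=year_0)
```

Scoring later years against the fixed reference-year mean and standard deviation keeps the scale constant, so a change in a country's score reflects a change in its own value rather than a shift in the cross-country distribution.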
In re-scaling (min-max), each indicator for a given country at a given time is normalised as the difference between the raw indicator value and the minimum value, divided by the range (maximum minus minimum).
This method uses the range rather than the standard deviation. All normalised indicators have an identical range [0, 1]. A disadvantage of the method is that the minima and maxima could be unreliable outliers and have a distorting effect on the normalised indicator. On the other hand, for indicator values lying within a small interval, this method increases the effect of the indicator on the composite indicator.
For time-dependent studies, the minimum and maximum values for an indicator are often selected for the entire time frame. However, this method cannot be applied as it stands if new data at a later time point fall outside the selected range. In such cases, to maintain comparability between the existing and the new data, the composite indicator would have to be recalculated for all years.
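A minimal sketch of range-based re-scaling follows. The function name and the optional `lo`/`hi` parameters (standing for a minimum and maximum fixed over the whole time frame) are assumptions made for the example.

```python
def min_max(values, lo=None, hi=None):
    """Re-scale values to [0, 1] using (value - min) / (max - min).

    `lo` and `hi` may be fixed in advance, e.g. the minimum and maximum
    over the entire time frame; otherwise they are taken from `values`.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max([10, 15, 20])

# A new observation above the fixed maximum falls outside [0, 1],
# which signals that the bounds need to be revised.
out_of_range = min_max([25], lo=10, hi=20)
```

The second call shows why new data beyond the selected range is a problem: the result leaves the [0, 1] interval, and restoring comparability means recomputing all years with updated bounds.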
The distance-to-reference method divides the indicator value for a given country at a given point in time by the value of a reference country at an initial time. Using this denominator, the normalisation takes into account the evolution of indicators across time; alternatively, one can use a denominator that changes across time.
The reference could be a target to be reached in a given time frame. For example, if CO2 emissions are used as an indicator in a case study, then the Kyoto Protocol 8% reduction target could be used as the reference value. The reference could also be an external benchmarking country. For example, the United States and Japan are benchmark countries for the composite indicators built in the frame of the EU Lisbon agenda. The reference country could alternatively be the average country within the group of countries considered in the analysis. Here, the average country is given the value 1, and the other countries receive scores depending on their distance from the average country. Indicator values that are greater than 1 upon normalisation indicate countries with above-average performance. The reference country could also be the group leader (‘distance from the best performer’). The value 1 is given to the leading country and the others are given percentage points away from the leader. The disadvantage is that this approach is based on extreme values, which could be unreliable outliers.
Alternatively, instead of using the simple ratio between an indicator value and a reference one, one could first subtract the reference value from the raw datum and then divide by the reference value. In this case, the normalised indicator values are centred on zero, instead of on 1 as in the former case.
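Both variants, the simple ratio and the version centred on zero, are sketched below. The function names and the three-country data set are hypothetical; the reference value is passed in explicitly so that it can be a target, a benchmark country, the group average or the group leader.

```python
def ratio_to_reference(values, reference):
    """Simple ratio: the reference itself scores 1."""
    return {country: v / reference for country, v in values.items()}

def centred_distance(values, reference):
    """(value - reference) / reference: the reference itself scores 0."""
    return {country: (v - reference) / reference for country, v in values.items()}

# Hypothetical indicator values for three countries.
values = {"A": 50, "B": 100, "C": 150}

# Reference = the average country (here 100).
average = sum(values.values()) / len(values)
vs_average = ratio_to_reference(values, average)
centred = centred_distance(values, average)

# Reference = the group leader ('distance from the best performer').
vs_leader = ratio_to_reference(values, max(values.values()))
```

With the average as reference, country C scores 1.5 on the ratio and 0.5 on the centred version; with the leader as reference, C scores 1 and the others fall below it.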
Each indicator is assigned a categorical score, which can be either numerical (e.g. one, two, three) or qualitative (e.g. ‘fully achieved’, ‘partly achieved’, ‘not achieved’). Sometimes, the scores are based on the percentiles of the distribution of the indicator value across the countries. For example, the top 5% of the countries receive a score of 100, countries in the 85th - 95th percentiles receive 80 points, in the 65th - 85th percentiles receive 60 points, in the 35th - 65th percentiles receive 50 points, in the 15th - 35th percentiles receive 40 points, in the 5th - 15th percentiles receive 20 points, and, finally, the bottom 5% of the countries receive 0 points.
Categorical scales have the advantage that any small change in the indicator value (e.g. in time) will not affect the normalised value. This, however, can be a disadvantage in other cases, because a large amount of information about the variance between countries in the normalised indicators is lost. Another disadvantage is that, if there is little variation within the original values, the percentile banding forces the categorization on the data, irrespective of the distribution of the underlying data. One possible solution to this is to adjust the percentile brackets across the individual indicators in order to obtain normalised categorical indicators with almost normal distributions.
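The percentile banding from the example above could be implemented along these lines. This is a sketch under stated assumptions: the percentile-rank definition (share of other countries with a strictly lower value), the treatment of values that sit exactly on a band boundary, and the function names are all choices made for illustration.

```python
# (lower percentile bound, score), checked from the top band downwards.
BANDS = [(95, 100), (85, 80), (65, 60), (35, 50), (15, 40), (5, 20), (0, 0)]

def score_from_percentile(percentile):
    """Return the band score for a percentile rank in [0, 100]."""
    for lower_bound, score in BANDS:
        if percentile >= lower_bound:
            return score
    raise ValueError("percentile must be non-negative")

def categorical_scores(values):
    """Score each country by the percentile band of its indicator value.

    Assumes at least two countries. Percentile rank is taken as the share
    of the other countries with a strictly lower value (an assumption).
    """
    n = len(values)
    scores = []
    for v in values:
        below = sum(1 for other in values if other < v)
        scores.append(score_from_percentile(100 * below / (n - 1)))
    return scores

before = categorical_scores([1, 2, 3, 4, 5])
after = categorical_scores([1, 2, 3.1, 4, 5])  # small change in one value
```

The two calls return the same scores, illustrating that a small change in an indicator value leaves the categorical score unchanged as long as the country stays in the same band.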
This type of normalisation distinguishes among values that are above, close to, or below, an arbitrarily defined percentage threshold around the mean. The normalised value is 1 if the indicator value is above the percentage threshold around the mean, -1 if it is below, and 0 otherwise. This means that the threshold creates a neutral region around the mean, where the normalised value is zero. This aims at reducing the sharp discontinuity (from -1 to +1) that would exist across the mean value, to two minor discontinuities (from -1 to 0 and from 0 to +1) that exist across the thresholds. A larger number of thresholds could be created at different distances from the mean value. However, in that case, this type of normalisation would overlap with the approach based on categorical scales.
The advantage of the method is its simplicity and its robustness to the presence of outliers. The disadvantages are the arbitrariness of the threshold level and the loss of absolute level information. For example, assume that the value of a given indicator for country A is 3 times (300%) above the mean calculated across all the countries, and the value for country B is 25% above the mean, with a threshold of 20% around the mean. Both country A and B are then counted equally as ‘above average’.
For time-dependent studies, in order to assess country performance across years, the average across countries is calculated for a reference year (usually the initial time point). An indicator that moves from significantly below the mean to significantly above the mean in the consecutive year will have a positive effect on the composite indicator.
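The threshold rule can be sketched as follows. The function name, the `reference_mean` parameter (standing for a mean fixed at the reference year) and the sample values are assumptions; the sketch also assumes a positive mean so that the ratio to the mean is meaningful.

```python
from statistics import mean

def above_below_mean(values, threshold=0.2, reference_mean=None):
    """Return 1, 0 or -1 for values above, within or below the band around the mean.

    `threshold` is the share of the mean defining the neutral region
    (0.2 means 20%). Assumes the mean is positive.
    """
    mu = mean(values) if reference_mean is None else reference_mean
    result = []
    for v in values:
        ratio = v / mu
        if ratio > 1 + threshold:
            result.append(1)
        elif ratio < 1 - threshold:
            result.append(-1)
        else:
            result.append(0)
    return result

# Hypothetical values against a reference-year mean of 100:
# 300% above the mean, 25% above, 10% above, 30% below.
flags = above_below_mean([400, 125, 110, 70], threshold=0.2, reference_mean=100)
```

The first two countries receive the same score of 1 although their distances from the mean differ greatly, which is the loss of absolute-level information noted above.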
Most institutes conducting business tendency surveys select a set of survey series and combine them into cyclical composite indicators. This is done in order to reduce the risk of false signals, and to better forecast cycles in economic activities.
When indicators are in the form of time series, each series is normalised by subtracting its mean over time and then dividing by the mean of the absolute deviations from that mean.
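For a single time series, this normalisation amounts to the short sketch below. The function name and the sample series are illustrative assumptions.

```python
from statistics import mean

def cyclical_normalise(series):
    """Subtract the mean over time, then divide by the mean absolute deviation."""
    mu = mean(series)
    mean_abs_dev = mean([abs(x - mu) for x in series])
    return [(x - mu) / mean_abs_dev for x in series]

# Hypothetical survey series: mean 5, mean absolute deviation 2.
normalised = cyclical_normalise([2, 4, 6, 8])
```

After normalisation every series has a mean of zero and a mean absolute deviation of one, so series with large cyclical amplitude do not dominate the combined indicator.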
Each indicator value for a country at a given time point is normalised by subtracting the corresponding value of the previous time point and dividing by the indicator value. The normalised value represents the percentage growth with respect to the previous year instead of the absolute level. This method can be applied if indicators are available for a number of years.
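A minimal sketch of this calculation is given below. It reads "the indicator value" in the denominator as the value at the current time point, which is an interpretation of the text rather than something it states outright; dividing by the previous value instead would give the conventional growth rate. The function name and the sample series are hypothetical.

```python
def percentage_differences(series):
    """Percentage difference between consecutive time points.

    Denominator: the current-period value (an assumption about the text).
    Returns one value fewer than the input, since the first point has
    no predecessor.
    """
    return [
        100 * (current - previous) / current
        for previous, current in zip(series, series[1:])
    ]

# Hypothetical indicator values for three consecutive years.
changes = percentage_differences([100, 125, 100])
```

Because the first time point has no predecessor, at least two years of data are required, and a series of n values yields n - 1 normalised values.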