Normalization (statistics)

From Wikipedia, the free encyclopedia

In statistics and applications of statistics, normalization can have a range of meanings.[1] In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of adjusted values into alignment. In the case of normalization of scores in educational assessment, there may be an intention to align distributions to a normal distribution. A different approach to normalization of probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment.
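As a rough illustration of quantile normalization, the sketch below replaces each value by the mean of the values that share its rank across all columns. It assumes equal-length columns and breaks ties by position; the function name `quantile_normalize` and the sample data are hypothetical, chosen only for this example.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (ties broken by position)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")

    # Reference distribution: mean of the k-th smallest value across columns.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / len(columns)
                 for k in range(n)]

    result = []
    for col in columns:
        # order[k] is the index of the k-th smallest element of this column.
        order = sorted(range(n), key=lambda i: col[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = reference[rank]
        result.append(normalized)
    return result


# Two measures recorded on very different scales (made-up numbers).
a = [5.0, 2.0, 3.0, 4.0]
b = [40.0, 10.0, 60.0, 20.0]
qa, qb = quantile_normalize([a, b])
print(qa)
print(qb)
```

After the transformation both columns contain exactly the same set of values, so their empirical quantiles coincide, while the rank order within each column is preserved.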

In another usage in statistics, normalization refers to the creation of shifted and scaled versions of statistics, where the intention is that these normalized values allow the comparison of corresponding normalized values for different datasets in a way that eliminates the effects of certain gross influences, as in an anomaly time series. Some types of normalization involve only a rescaling, to arrive at values relative to some size variable. In terms of levels of measurement, such ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios).
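The distinction between ratio and interval measurements can be seen with temperature, used here as an assumed example: Celsius is an interval scale, while kelvin is a ratio scale. The short sketch below shows that a difference survives the change of scale, but a ratio does not. The helper `celsius_to_kelvin` and the readings are illustrative only.

```python
def celsius_to_kelvin(t_c):
    """Convert a Celsius temperature to kelvin."""
    return t_c + 273.15


warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences agree on both scales, so distances are meaningful.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios disagree: "twice as warm" in Celsius is an artifact of the zero point.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```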

In theoretical statistics, parametric normalization can often lead to pivotal quantities – functions whose sampling distribution does not depend on the parameters – and to ancillary statistics – pivotal quantities that can be computed from observations, without knowing parameters.
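As a standard textbook illustration (not taken from the text above), suppose X_1, ..., X_n are independent draws from a normal distribution with mean μ and standard deviation σ, with sample mean X̄ and sample standard deviation S. Normalizing the sample mean then gives two pivotal quantities:

```latex
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1),
\qquad
T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}
```

Neither distribution depends on μ or σ, although computing Z requires both parameters and computing T requires μ. In a location family, the sample range X_(n) − X_(1) is a familiar ancillary statistic: it is computed from the observations alone, and its distribution does not depend on the location parameter.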

Examples

There are different types of normalizations in statistics – nondimensional ratios of errors, residuals, means and standard deviations, which are hence scale invariant – some of which may be summarized as follows. Note that in terms of levels of measurement, these ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios). See also Category:Statistical ratios.

| Name | Formula | Use |
|---|---|---|
| Standard score | $\frac{X - \mu}{\sigma}$ | Normalizing errors when population parameters are known. Works well for populations that are normally distributed.[2] |
| Student's t-statistic | $\frac{\hat{\beta} - \beta_0}{\operatorname{s.e.}(\hat{\beta})}$ | The departure of the estimated value of a parameter from its hypothesized value, normalized by its standard error. |
| Studentized residual | $\frac{\hat{\varepsilon}_i}{\hat{\sigma}_i} = \frac{X_i - \hat{\mu}_i}{\hat{\sigma}_i}$ | Normalizing residuals when parameters are estimated, particularly across different data points in regression analysis. |
| Standardized moment | $\frac{\mu_k}{\sigma^k}$ | Normalizing moments, using the standard deviation $\sigma$ as a measure of scale. |
| Coefficient of variation | $\frac{\sigma}{\mu}$ | Normalizing dispersion, using the mean $\mu$ as a measure of scale, particularly for positive distributions such as the exponential distribution and Poisson distribution. |
| Min-max feature scaling | $X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$ | Feature scaling is used to bring all values into the range [0, 1]. This is also called unity-based normalization. This can be generalized to restrict the range of values in the dataset between any arbitrary points $a$ and $b$, using for example $X' = a + \frac{(X - X_{\min})(b - a)}{X_{\max} - X_{\min}}$. |
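Three of these normalizations can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, assuming the population mean and standard deviation are known for the standard score and that the data are not all equal for min-max scaling; the function names and the sample data are hypothetical.

```python
from statistics import fmean, pstdev


def standard_score(xs, mu, sigma):
    """Standard score (x - mu) / sigma for known population parameters."""
    return [(x - mu) / sigma for x in xs]


def coefficient_of_variation(xs):
    """Population standard deviation divided by the mean."""
    return pstdev(xs) / fmean(xs)


def min_max_scale(xs, a=0.0, b=1.0):
    """Rescale xs linearly so the minimum maps to a and the maximum to b."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [a + (x - lo) * (b - a) / (hi - lo) for x in xs]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Treat the data as the whole population, so mu and sigma are "known".
mu, sigma = fmean(data), pstdev(data)
z = standard_score(data, mu, sigma)
cv = coefficient_of_variation(data)
unit = min_max_scale(data)               # range [0, 1]
symmetric = min_max_scale(data, -1, 1)   # generalized range [a, b] = [-1, 1]

print(z)
print(cv)
print(unit)
print(symmetric)
```

For this data set the mean is 5 and the population standard deviation is 2, so the coefficient of variation is 0.4.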

Note that some other ratios, such as the variance-to-mean ratio, are also used for normalization, but are not nondimensional: the units do not cancel, and thus the ratio has units, and is not scale-invariant.
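The lack of scale invariance can be checked numerically. The sketch below, with made-up lengths and hypothetical function names, expresses the same measurements in metres and in centimetres: the variance-to-mean ratio changes by the conversion factor, while the coefficient of variation stays the same.

```python
from statistics import fmean, pstdev, pvariance


def variance_to_mean_ratio(xs):
    """Population variance divided by the mean (carries the units of xs)."""
    return pvariance(xs) / fmean(xs)


def coefficient_of_variation(xs):
    """Population standard deviation divided by the mean (dimensionless)."""
    return pstdev(xs) / fmean(xs)


lengths_m = [1.0, 2.0, 3.0, 4.0]
lengths_cm = [100.0 * x for x in lengths_m]  # same lengths, different unit

vmr_m = variance_to_mean_ratio(lengths_m)
vmr_cm = variance_to_mean_ratio(lengths_cm)
cv_m = coefficient_of_variation(lengths_m)
cv_cm = coefficient_of_variation(lengths_cm)

print(vmr_m, vmr_cm)  # differ by a factor of 100
print(cv_m, cv_cm)    # agree up to rounding
```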

Other types

Other non-dimensional normalizations that can be used with no assumptions on the distribution include:

  • Assignment of percentiles. This is common on standardized tests. See also quantile normalization.
  • Normalization by adding and/or multiplying by constants so values fall between 0 and 1. This is used for probability density functions, with applications in fields such as quantum mechanics in assigning probabilities to |ψ|².
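Both items can be sketched briefly. The percentile rank below uses one of several conventions in use (the percentage of scores less than or equal to the given score), and the second function rescales the squared magnitudes of a discrete set of complex amplitudes by a constant so that they sum to 1. The function names, scores, and amplitudes are assumptions made for this example.

```python
def percentile_rank(scores, x):
    """Percentage of scores that are less than or equal to x."""
    return 100.0 * sum(1 for s in scores if s <= x) / len(scores)


def normalize_amplitudes(amplitudes):
    """Turn complex amplitudes into probabilities |psi|^2 that sum to 1."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one amplitude must be nonzero")
    return [w / total for w in weights]


scores = [55, 70, 70, 85, 90]
rank_70 = percentile_rank(scores, 70)

# Unnormalized amplitudes for three discrete states.
probs = normalize_amplitudes([1 + 1j, 2, 1j])

print(rank_70)
print(probs, sum(probs))
```

Each resulting probability lies between 0 and 1, and together they sum to 1 up to floating-point rounding.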

References

  1. Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9 (entry for normalization of scores).
  2. Freedman, David; Pisani, Robert; Purves, Roger (February 20, 2007). Statistics: Fourth International Student Edition. W.W. Norton & Company. ISBN 9780393930436.