Statistical Analysis: an Introduction using R/R/Factors

Categorical variables in R are stored as a special vector object known as a factor. This is not the same as a character vector filled with a set of names (don't get the two mixed up). In particular, R has to be told that each element can only be one of a number of known levels (e.g. Male or Female). If you try to place a data point with a different, unknown level into the factor, R will complain: it issues a warning and stores a missing value (NA) instead. When you print a factor to the screen, R will also list the possible levels that the factor can take (this may include levels that are not present in the data).
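As a minimal sketch of the difference between a character vector and a factor, the following uses the base R function factor() on made-up data. The variable names and the extra level "Unknown" are assumptions chosen purely for illustration.

```r
# Hypothetical data: the names and values here are made up for illustration.
sex_chr <- c("Male", "Female", "Female", "Male")

# Tell R which levels are allowed; "Unknown" is allowed but never used.
sex <- factor(sex_chr, levels = c("Female", "Male", "Unknown"))

is.character(sex_chr)  # TRUE: a plain character vector
is.factor(sex)         # TRUE: a factor with a fixed set of levels

# Printing shows the values and then all possible levels,
# including the unused level "Unknown".
print(sex)

# Assigning a value that is not a known level triggers a warning,
# and the element is stored as NA.
sex[2] <- "Other"
is.na(sex[2])          # TRUE
```

The character vector sex_chr would accept any string without complaint, which is why the two types should not be confused.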

The function factor() creates a factor and defines the available levels. By default the levels are taken from the distinct values in the vector. Actually, you don't often need to use factor(), because when reading data in from a file, R can convert text to factors automatically (see ../R/Data frames); this conversion was the default before R 4.0.0 and has to be requested with stringsAsFactors = TRUE in later versions. You may still need to convert an existing vector explicitly, for example with as.factor(). Internally, R stores the levels as numbers from 1 upwards, but it is not always obvious which number corresponds to which level, and it should not normally be necessary to know.
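The following sketch shows how the levels and the internal numbers can be inspected, assuming a small made-up vector of blood groups (the data and variable names are hypothetical). It uses only base R functions: factor(), as.factor(), levels() and as.integer().

```r
# Hypothetical data for illustration only.
blood_chr <- c("B", "A", "O", "A")

# With no 'levels' argument, the levels are the sorted distinct values.
blood <- factor(blood_chr)
levels(blood)        # "A" "B" "O"

# The internal codes number the levels from 1 upwards.
as.integer(blood)    # 2 1 3 1

# Supplying the levels explicitly changes which number goes with which level.
blood2 <- factor(blood_chr, levels = c("O", "A", "B"))
as.integer(blood2)   # 3 2 1 2

# as.factor() converts an existing vector using the default levels.
identical(as.factor(blood_chr), blood)  # TRUE
```

Because the codes depend on the order of the levels, it is usually safer to work with the level names than with the underlying numbers.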

Ordinal variables, that is, factors in which the levels have a natural order, are known to R as ordered factors. They can be created in the same way as a normal factor, but with the additional argument ordered = TRUE.
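A short sketch of an ordered factor, assuming a made-up dose variable with three levels (the data and names are hypothetical). The order is the one given in the levels argument, and ordered = TRUE tells R to treat it as meaningful.

```r
# Hypothetical data for illustration only.
dose_chr <- c("low", "high", "medium", "low")

# The order of 'levels' defines the ordering: low < medium < high.
dose <- factor(dose_chr,
               levels = c("low", "medium", "high"),
               ordered = TRUE)

is.ordered(dose)   # TRUE
print(dose)        # levels are shown as low < medium < high

# Comparisons are meaningful for ordered factors.
dose < "high"      # TRUE FALSE TRUE TRUE
max(dose)          # high
```

The same comparisons on an unordered factor would give NA with a warning, since R has no basis for ranking the levels.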