We create factors using the factor() function. This function takes a vector of values (typically numeric or character) where each unique value represents a group and converts this input vector into a factor.
[1] foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
[20] bar
Levels: bar foo
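The calls that created the objects used in this section are not shown here. As a minimal sketch, the following would produce a factor that prints like the output above, assuming the input is a character vector of 20 alternating "foo" and "bar" values. The names x, f1, y and f2 match the ones used later in the section, but the exact construction of each input vector is an assumption, not the original code.

```r
# Assumed character input: 20 alternating values, starting with "foo"
y <- rep(c("foo", "bar"), times = 10)
f2 <- factor(y)
f2

# Assumed integer input with three groups (hypothetical; original not shown)
x <- rep(1:3, length.out = 20)
f1 <- factor(x)
f1
```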
It may not seem like the factor() function did much in the examples above, but that’s far from true. Regardless of the type of input data, the factor() function always gives the resulting factor object the same three features:
Factors have their own class: “factor”.
Factors have a “levels” attribute that stores the names of the groups represented by the factor.
Factors use integer vectors to represent the grouping implied by the input vector.
Regardless of what type the input vector has, the data underlying the new factor will be an integer vector.
# Factors have their own class
class(x)
[1] "integer"
class(f1)
[1] "factor"
class(y)
[1] "character"
class(f2)
[1] "factor"
# Factors have a levels attribute
attributes(x)
NULL
attributes(f1)
$levels
[1] "1" "2" "3"
$class
[1] "factor"
attributes(y)
NULL
attributes(f2)
$levels
[1] "bar" "foo"
$class
[1] "factor"
# Factors use integer vectors to map each observation to a group
typeof(x)
[1] "integer"
typeof(f1)
[1] "integer"
typeof(y)
[1] "character"
typeof(f2)
[1] "integer"
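To make the integer representation concrete, the sketch below exposes the codes stored inside a factor and shows that each code is an index into the levels. The short input vector is an assumed example chosen for brevity, not the one used earlier in the text.

```r
# Assumed example input (not from the original text)
y <- c("foo", "bar", "foo", "bar")
f2 <- factor(y)

# Integer codes: each value is an index into levels(f2)
codes <- as.integer(f2)
codes
# [1] 2 1 2 1

# Indexing the levels by the codes recovers the original values
recovered <- levels(f2)[codes]
identical(recovered, y)
# [1] TRUE
```

Here "bar" sorts before "foo", so "bar" is level 1 and "foo" is level 2.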
Factor Levels
The levels name the categories represented by the factor. By default, the factor() function uses the sorted unique values of the input vector, converted to character strings, to name the levels of the resulting factor.
levels(f1)
[1] "1" "2" "3"
levels(f2)
[1] "bar" "foo"
We can assign our own level names through the labels argument.
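As a minimal sketch of the labels argument, the example below renames the two groups of an assumed character vector. The label strings "Group B" and "Group F" are made up for illustration. Labels are matched by position to the default (sorted) levels, so the first label replaces "bar" and the second replaces "foo".

```r
# Assumed example input (not from the original text)
y <- c("foo", "bar", "foo", "bar")

# Labels are matched positionally to the sorted levels: "bar", then "foo"
f3 <- factor(y, labels = c("Group B", "Group F"))

levels(f3)
# [1] "Group B" "Group F"

as.character(f3)
# [1] "Group F" "Group B" "Group F" "Group B"
```

Because the matching is positional, it is worth checking the default level order with levels() before supplying labels.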