y <- 1:4
length(y)
[1] 4
R vectors are dimensionless objects. Unlike the mathematical column vectors with which you may be familiar, R vectors don’t have rows or columns, but they do have length. We can check the length of a vector using the length() function.
Arithmetic in R works element-wise when applied to vectors. That is, the requested operation is performed between matched elements of the vectors. In this way, R vectors behave much like arrays do in most other programming languages.
[1] 2 2 2 2
[1] 3 4 5 6
[1] -1 0 1 2
[1] 0.5 1.0 1.5 2.0
[1] 2 4 6 8
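The five output lines above are consistent with element-wise arithmetic between y and a second length-four vector. A minimal sketch, assuming a hypothetical vector x of four 2s (the name x and its definition are assumptions, not taken from the original):

```r
y <- 1:4
x <- rep(2, 4)  # hypothetical second vector: 2 2 2 2

x       # 2 2 2 2
y + x   # 3 4 5 6
y - x   # -1 0 1 2
y / x   # 0.5 1.0 1.5 2.0
y * x   # 2 4 6 8
```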
If two vectors involved in an element-wise operation have different lengths, R will recycle the elements of the shorter vector until each element in the longer vector can be matched with an element from the shorter vector.
Can you see how R is applying recycling to calculate the elements of w?
In this example, y has four elements, but z has only two. So, R re-uses the two elements in z to define matches for all four elements in y and specify the four differences needed to define the elements of w.
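The recycling described above can be sketched as follows. The values assigned to z are hypothetical, chosen only for illustration; the subtraction follows the text's description of w as a vector of differences.

```r
y <- 1:4
z <- c(1, 3)  # hypothetical values; the text only says that z has two elements

# z is recycled to 1 3 1 3 before the element-wise subtraction
w <- y - z
w  # 0 -1 2 1

# The same result with the recycling written out by hand
y - c(1, 3, 1, 3)
```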
If the length of the longer vector is divisible by the length of the shorter vector, R will recycle silently (i.e., without any messages or warnings). So, you need to be vigilant and make sure you don’t accidentally trigger recycling when you don’t want it.
In cases where the length of the shorter vector doesn’t evenly divide the length of the longer vector, R will still execute the requested operation and apply recycling, but it will also return a warning.
[1] 2 4 4 6
Warning in a + c: longer object length is not a multiple of shorter object
length
[1] 2 4 6 8 6
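The output above is consistent with the following sketch. The names a and c appear in the warning message; the name b and all three definitions are inferred from the printed results and should be read as assumptions.

```r
a <- 1:4
b <- 1:2  # assumed: length 2 divides length 4, so recycling is silent
c <- 1:5  # assumed: length 4 does not divide length 5, so R warns
          # (c is also R's combine function; calls like c(1, 2) still work)

a + b  # 2 4 4 6
a + c  # 2 4 6 8 6, with the "not a multiple" warning
```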
At this point, if you have any predilection toward strictly typed programming languages, all this recycling business might sound like insanity, but there’s a method to the madness. Let’s consider a few examples that demonstrate how recycling can actually be very helpful in day-to-day programmatic data analysis.
The following code looks like it should implement scalar multiplication, and the outcome looks like the result of a scalar multiplication. But R doesn’t have scalars, and vector arithmetic in R doesn’t follow standard linear algebraic rules, so our intuitive reading of this expression can’t be correct.
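The expression under discussion is presumably a product of 3 and y. A minimal sketch, assuming y is still 1:4, that sets it beside the explicitly recycled version:

```r
y <- 1:4

3 * y                  # 3 6 9 12

# The same product with the length-one vector extended to length four by hand
rep(3, length(y)) * y  # 3 6 9 12

# Even a bare number is a vector of length one
length(3)              # 1
is.vector(3)           # TRUE
```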
In fact, the preceding code is actually doing element-wise multiplication of a length-one vector, 3, and a length-four vector, y, and recycling the elements of the length-one vector to resolve the length difference. In this example, recycling provides two benefits:
The following three expressions follow the same pattern as the last example but provide more compelling evidence for the real-world usefulness of recycling. These expressions don’t represent valid operations under the conventional rules of linear algebra, and they wouldn’t be valid commands in a highly structured programming language like C. Yet, the intended operations are intuitively obvious, and R will use recycling to execute the operations we expect.
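The original expressions are not reproduced here. The three below are hypothetical stand-ins in the same spirit, assuming y is 1:4; in each one, a length-one vector is recycled against y.

```r
y <- 1:4

y + 10   # add 10 to every element: 11 12 13 14
10 / y   # divide 10 by every element
y^2      # square every element: 1 4 9 16
```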
Logical comparisons between vectors test some logical relation between the vectors and return a logical vector that encodes the answer to whatever TRUE/FALSE question the logical relation tested. These comparisons follow the same rules as vector arithmetic:
[1] FALSE FALSE FALSE TRUE
[1] FALSE TRUE FALSE TRUE
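Results like the two above can be produced by comparisons such as the following. The specific tests are assumptions, chosen to reproduce the printed output with y equal to 1:4; they are not taken from the original.

```r
y <- 1:4

# Compare every element of y with a single value (3 is recycled)
y > 3             # FALSE FALSE FALSE TRUE

# Combine two element-wise tests with the element-wise OR operator
y == 4 | y == 2   # FALSE TRUE FALSE TRUE
```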
We need to be especially mindful of the vector operation rules when applying logical comparisons to vectors. When defining logical tests, intuition can easily fail us. For example, it would be natural to read the following code as implementing the same test as the previous expression, but the two results are clearly different. Why?
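A test of the following form is one plausible candidate for the expression in question. It is an assumption, inferred from the results printed later in this section, and again takes y to be 1:4.

```r
y <- 1:4

# Reads like "is y equal to 4 or 2?"
y == c(4, 2)      # FALSE TRUE FALSE FALSE

# The element-wise OR of two separate tests, for comparison
y == 4 | y == 2   # FALSE TRUE FALSE TRUE
```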
As you’ve probably surmised, recycling is causing the discrepancy. We can clarify the differences between the two tests by extending the shorter vectors to manually implement recycling.
[1] FALSE TRUE FALSE TRUE
[1] FALSE TRUE FALSE FALSE
Now, the two tests don’t appear very similar, and their different outcomes aren’t particularly surprising.
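The manual extension described above can be sketched as follows, assuming y is 1:4 and that the two tests are y == 4 | y == 2 and y == c(4, 2). Both are inferences from the printed results rather than the original code.

```r
y <- 1:4

# y == 4 | y == 2, with each length-one vector extended to length four
y == c(4, 4, 4, 4) | y == c(2, 2, 2, 2)   # FALSE TRUE FALSE TRUE

# y == c(4, 2), with the length-two vector extended to length four
y == c(4, 2, 4, 2)                        # FALSE TRUE FALSE FALSE
```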