SAPA Project Blog

A weekly review of a randomly chosen article.

Subset a Vector in R

The ‘subset()’ function in base R is, in my experience, the cleanest method for trimming large data frames and matrices. While there are many ways to accomplish this chore, I find ‘subset()’ to be the most readable and intuitive. Those qualities are imperative for minimizing errors and for writing code that others can follow (especially when “others” may include those who are not useRs). If you’re not using it already, it’s worth making the switch (see ?subset).

That said, making it work with vectors is annoying! There are logical reasons for this of course, but it just won’t do what you want if you pass ‘select’ to ‘subset()’ as you would with a data frame or matrix.

Here’s the example:

Use the ‘letters’ character vector
letters
# Get the second half
subset(letters, select = c(n:z))
Here’s what you get:
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
Error in subset.default(letters, select = c(n:z)) :
  argument "subset" is missing, with no default
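
The error makes sense once you see that, for a plain vector, ‘subset()’ dispatches to ‘subset.default()’, which expects a logical ‘subset’ argument and has no ‘select’ argument at all. As a minimal sketch (the name second_half is only for illustration), the second half of the alphabet can still be pulled out by supplying a logical condition:

```r
# subset.default() wants a logical vector the same length as x.
# match("n", letters) gives the position of "n", so the condition
# is TRUE from "n" onward.
second_half <- subset(letters, seq_along(letters) >= match("n", letters))
second_half
```

It works, but it is a long way from the tidy ‘select = n:z’ style you get with data frames.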

After some consternation, I stumbled across this gem submitted by Marc Schwartz.

Function to subset vectors in R
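subset.vector <- function(x, select) {
  # Map each element of x to its position, so names evaluate to indices
  nl <- as.list(seq_along(x))
  names(nl) <- x
  # Evaluate the unquoted selection (e.g. n:z) against that lookup list
  vars <- eval(substitute(select), nl)
  # Return the selected elements (not the full vector)
  x[vars]
}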
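
To see why this works, it may help to run the key steps by hand. The sketch below is my own illustration rather than part of Marc’s function: the names idx and picked are hypothetical, and quote() stands in for what substitute() captures inside the function.

```r
x <- letters

# Build a lookup list in which each element's name is a letter
# and its value is that letter's position in x.
nl <- as.list(seq_along(x))
names(nl) <- x
nl$n  # 14
nl$z  # 26

# Evaluating the unevaluated expression against the list turns
# n:z into the index range 14:26.
idx <- eval(quote(c(n:z)), nl)

# Ordinary positional indexing does the rest.
picked <- x[idx]
picked
```

Because the trick relies on names, it presumably behaves best when the elements of the vector are unique and are valid R names.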

There are other options, but this function conveniently uses the same approach that base ‘subset()’ uses for the ‘select’ argument with data frames.
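
For comparison, here is a rough sketch of a few of those other options using only base R indexing. The variable names are hypothetical, and these are just the alternatives I would reach for, not an exhaustive list.

```r
# Positional range: look up where "n" and "z" sit, then slice.
second_half <- letters[match("n", letters):match("z", letters)]

# Drop elements by value with a logical mask.
without_jxf <- letters[!letters %in% c("j", "x", "f")]

# setdiff() does the same here, but note that it also removes duplicates.
without_jxf_alt <- setdiff(letters, c("j", "x", "f"))
```

These are perfectly serviceable, though none reads quite as cleanly as a ‘select’ expression.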

Use it like this:
subset.vector(letters, select = c(n:z))
# or
subset.vector(letters, select = -c(j, x, f))

That’s it. Consternation vanquished.