fun with map ( expr ) %>% reduce

Ever since I discovered the rlang function ‘expr’ I have become mildy obsessed with the idea of writing code which writes code. This will be the first of probably many posts where I experiment with combining either iteration or recursion with non-standard evaluation. However, before launching into anything genuinely useful, some background is required:

R is an ‘eagerly evaluated’ language, which means if you write:

a <- 50

and then just write a and hit enter you get:

a
##  50

This becomes problematic when you want one variable, let’s called it a, to hold the name of another variable, b. If you write:

a <- b
## Error in eval(expr, envir, enclos): object 'b' not found

You get an error, because r is eagerly trying to evaluate b, so that it can pass b’s value to a. That’s not r’s fault, that’s just the way it was built.

Base r does have functions for preventing this, but I’m not even going to talk about those, since the functions from the rlang package are just better, so let’s talk about those instead. with rlang you can write:

library(rlang)
a <- expr(b)
a
## b

This is nice. We haven’t even defined b yet, but we’re now allowed to tell r that a is b. rlang also has a function for telling r to go get rid of the ‘expr’ and look underneath to find out what’s in b. The function is eval_tidy(). Let’s assign a value to b first, so that we won’t get an error.

b <- "hey here"
eval_tidy(a)
##  "hey here"

Additionally, rlang provides a function for ‘unquoting’ i.e. which is essentially like evaluating but only one step down. To demonstrate this, let’s build a slightly bigger expression:

c <- expr(d)
d <- " Tom"
e <- expr(paste(!!a, !!c, sep = ","))
e
## paste(b, d, sep = ",")
eval_tidy(e)
##  "hey here, Tom"

Ok, so you get the idea. We can build bigger expressions by unquoting smaller ones inside them. This is where things start to get interesting. One of the things I really hate writing is nested ifelse statements. It’s easy to make a typo and it just feels repetitive. Using map, expr and reduce we can write this function, which will write them for us.

# we'll need the purrr package, for it's map and reduce functions and if_else from dplyr
library(dplyr, warn.conflicts = FALSE)
library(purrr, warn.conflicts = FALSE)

build_ifelse <- function(mapping_list, col) {

col <- enexpr(col)

mapping_list %>% map(~ list(expr(!!col %in% !!.[]), .[] ) ) %>%
{ c("other", .) } %>%
reduce(~ expr(if_else(!!(.y[]), !!(.y[]), !!.x )))

}

If we define a mapping that we want to build our if_else statement upon (I always use if_else, rather than ifelse, not becaues it’s faster, but because you know what type you’ll get out)

# define a mapping
i = 1
numbers <- list()
pot <- seq(from = 1, to = (26*3), by = 1)
while (i <= 26) {
numbers[[i]] <- sample(pot, 3)
pot <- setdiff(pot, numbers %>% unlist)
i <- i + 1
}

mapping <- list(letters, numbers) %>%
transpose

# use the fuction to build the if_else statement
mapping_exp <- build_ifelse(mapping, number)

# have a look at it
mapping_exp
## if_else(number %in% c(32, 30, 56), "z", if_else(number %in% c(42,
## 61, 51), "y", if_else(number %in% c(12, 19, 27), "x", if_else(number %in%
##     c(43, 33, 31), "w", if_else(number %in% c(36, 49, 57), "v",
##     if_else(number %in% c(69, 60, 5), "u", if_else(number %in%
##         c(41, 21, 26), "t", if_else(number %in% c(72, 63, 6),
##         "s", if_else(number %in% c(77, 8, 46), "r", if_else(number %in%
##             c(39, 47, 54), "q", if_else(number %in% c(14, 38,
##         48), "p", if_else(number %in% c(58, 76, 16), "o", if_else(number %in%
##             c(9, 23, 45), "n", if_else(number %in% c(52, 62,
##         68), "m", if_else(number %in% c(78, 74, 18), "l", if_else(number %in%
##             c(28, 2, 3), "k", if_else(number %in% c(10, 1, 64
##         ), "j", if_else(number %in% c(66, 17, 71), "i", if_else(number %in%
##             c(20, 35, 50), "h", if_else(number %in% c(4, 24,
##         37), "g", if_else(number %in% c(40, 73, 7), "f", if_else(number %in%
##             c(75, 34, 44), "e", if_else(number %in% c(29, 53,
##         25), "d", if_else(number %in% c(55, 59, 67), "c", if_else(number %in%
##             c(11, 22, 70), "b", if_else(number %in% c(65, 15,
##         13), "a", "other"))))))))))))))))))))))))))

I think it’s fair to say that’s a bit of code you wouldn’t really want to write manually!

# make a tbl with only numbers to test it on
number_tbl <- tibble(number = sample(1:120, 10))

# use the expression to find the right letter for each number
number_tbl %>% mutate(letter = !!mapping_exp)
## # A tibble: 10 x 2
##    number letter
##     <int> <chr>
##  1     30 z
##  2     56 z
##  3     70 b
##  4     84 other
##  5     68 m
##  6     10 j
##  7     52 m
##  8     48 p
##  9     79 other
## 10     28 k

Some provisos here: 1) Really you should use quo, not expr inside functions, because quo tracks its environment, and the fact that expr doesn’t can get you into a lot of trouble. So, the function should really be written:

build_ifelse1 <- function(mapping_list, col) {

col <- enquo(col)

mapping_list %>% map(~ list(quo(!!col %in% !!.[]), .[] ) ) %>%
{ c("other", .) } %>%
reduce(~ quo(if_else(!!(.y[]), !!(.y[]), !!.x )))

}

The reasons I didn’t do it that way, for the purposes of this article is because the expression, when using quo is harder to read.

# use the fuction to build the if_else statement
mapping_exp1 <- build_ifelse1(mapping, number)

# have a look at it
mapping_exp1
## <quosure>
## expr: ^if_else(^(^number) %in% <dbl: 32, 30, 56>, "z", ^if_else(^(^number)
##            %in% <dbl: 42, 61, 51>, "y", ^if_else(^(^number)
##             %in% <dbl: 12, 19, 27>, "x", ^if_else(^(^number)
##             %in% <dbl: 43, 33, 31>, "w", ^if_else(^(^number)
##             %in% <dbl: 36, 49, 57>, "v", ^if_else(^(^number)
##             %in% <dbl: 69, 60, 5>, "u", ^if_else(^(^number)
##             %in% <dbl: 41, 21, 26>, "t", ^if_else(^(^number)
##             %in% <dbl: 72, 63, 6>, "s", ^if_else(^(^number)
##             %in% <dbl: 77, 8, 46>, "r", ^if_else(^(^number)
##             %in% <dbl: 39, 47, 54>, "q", ^if_else(^(^number)
##             %in% <dbl: 14, 38, 48>, "p", ^if_else(^(^number)
##             %in% <dbl: 58, 76, 16>, "o", ^if_else(^(^number)
##             %in% <dbl: 9, 23, 45>, "n", ^if_else(^(^number)
##             %in% <dbl: 52, 62, 68>, "m", ^if_else(^(^number)
##             %in% <dbl: 78, 74, 18>, "l", ^if_else(^(^number)
##             %in% <dbl: 28, 2, 3>, "k", ^if_else(^(^number) %in% <dbl: 10,
##             1, 64>, "j", ^if_else(^(^number) %in% <dbl: 66, 17, 71>, "i",
##           ^if_else(^(^number) %in% <dbl: 20, 35, 50>, "h", ^if_else(^(^number)
##              %in% <dbl: 4, 24, 37>, "g", ^if_else(^(^number)
##               %in% <dbl: 40, 73, 7>, "f", ^if_else(^(^number)
##               %in% <dbl: 75, 34, 44>, "e", ^if_else(^(^number)
##               %in% <dbl: 29, 53, 25>, "d", ^if_else(^(^number)
##               %in% <dbl: 55, 59, 67>, "c", ^if_else(^(^number)
##               %in% <dbl: 11, 22, 70>, "b", ^if_else(^(^number)
##               %in% <dbl: 65, 15, 13>, "a", "other"))))))))))))))))))))))))))
## env:  0x557e4e781a90

Yeah, I can’t read that either.

Proviso 2) As one of my colleagues pointed out; for this particular example, you can do it more easily with a join:

library(tidyr, warn.conflicts = FALSE)
tibble(letter = letters, number = numbers) %>%
unnest %>%
{ left_join(number_tbl, .)}
## Joining, by = "number"
## # A tibble: 10 x 2
##    number letter
##     <dbl> <chr>
##  1     30 z
##  2     56 z
##  3     70 b
##  4     84 <NA>
##  5     68 m
##  6     10 j
##  7     52 m
##  8     48 p
##  9     79 <NA>
## 10     28 k

Which would have saved some time, but then you wouldn’t have had the pleasure of seeing that massive if_else statement the computer wrote for you, so, pros and cons I guess. I’m sure there must be some case where you couldn’t unnest, so you’d need the if_else.