The %.% operator in dplyr allows one to put functions together without lots of nested parentheses. The flanking percent signs are R’s way of denoting infix operators; you might have used %in%, which corresponds to the match function, or %*%, which is matrix multiplication. The %.% operator is also called chain, and what it does is rearrange the call to pass its left-hand side on as the first argument of the right-hand side function. As noted in the documentation, this makes function calls read from left to right instead of from the inside out. Yesterday we took a simulated data frame, called data, and calculated some summary statistics. We could put the entire script together with %.%:
library(dplyr)
library(reshape2)  # provides melt()

data %.%
  melt(id.vars=c("treatment", "sex")) %.%
  group_by(sex, treatment, variable) %.%
  summarise(mean(value))
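To make the "rearranging the call" idea concrete, here is a minimal sketch of a chain-like operator in base R. The name %then% is made up for this illustration, and this is not how dplyr actually implements %.%; it only shows the general trick of inserting the left-hand side as the first argument of the right-hand call.

```r
# Toy chain operator (hypothetical name, not dplyr's implementation):
# take the unevaluated right-hand call and insert lhs as its first argument.
`%then%` <- function(lhs, rhs) {
  rhs_call <- substitute(rhs)
  # Allow a bare function name such as `summary` instead of `summary()`
  if (is.name(rhs_call)) rhs_call <- as.call(list(rhs_call))
  parts <- as.list(rhs_call)
  # f(a, b) becomes f(lhs, a, b); named arguments keep their names
  new_call <- as.call(c(list(parts[[1]], quote(lhs)), parts[-1]))
  # Evaluate with lhs bound to the piped value, everything else from the caller
  eval(new_call, envir = list(lhs = lhs), enclos = parent.frame())
}

x <- c(4, 9, 16)

# Inside-out version
nested <- round(mean(sqrt(x)), 1)

# Left-to-right version
chained <- x %then% sqrt() %then% mean() %then% round(1)

stopifnot(identical(nested, chained))
```

Custom infix operators are left-associative in R, which is why a long chain evaluates step by step from the left.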
I haven’t figured out what would be the best indentation here, but I think this looks pretty okay. Of course it works for non-dplyr functions as well, but they need to take the input data as their first argument.
data %.%
  lm(formula=response1 ~ factor(sex)) %.%
  summary()
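The lm example works even though lm's first formal argument is formula, not data. Because formula is given by name, R's argument matching puts the first unnamed argument into the next free slot, which is data. A small sketch with the built-in mtcars data set (an assumption for illustration; the post's simulated data frame is not reproduced here):

```r
# What the chain effectively calls: data frame first, formula by name
fit_chain_style <- lm(mtcars, formula = mpg ~ factor(cyl))

# The way the call is usually written
fit_usual <- lm(mpg ~ factor(cyl), data = mtcars)

# Both calls fit the same model
stopifnot(isTRUE(all.equal(coef(fit_chain_style), coef(fit_usual))))
```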
As mentioned, dplyr is not the only package that has something like this, and according to a comment from Hadley Wickham, future dplyr will use the magrittr package instead, a package that adds piping to R. So let’s look at magrittr! The magrittr %>% operator works much the same way, except it allows one to put "." where the data is supposed to go. This means that the data doesn’t have to be the first argument to the function. For example, we can do this, which would give an error with dplyr:
library(magrittr)

data %>%
  lm(response1 ~ factor(sex), .) %>%
  summary()
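Here is a self-contained version of the same idea that can be run without the simulated data. It assumes the magrittr package is installed and swaps in the built-in mtcars data set, so the variable names differ from the ones above.

```r
library(magrittr)

# The dot marks where the left-hand side goes, here the second argument of lm
fit_piped <- mtcars %>% lm(mpg ~ factor(cyl), .)

# Equivalent ordinary call, for comparison
fit_plain <- lm(mpg ~ factor(cyl), data = mtcars)

stopifnot(isTRUE(all.equal(coef(fit_piped), coef(fit_plain))))

# The fitted model can be piped onwards like any other value
fit_summary <- fit_piped %>% summary()
```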
Moreover, Conrad Rudolph has used the operators %.%, %|>% and %|% in his own package for functional composition, chaining and piping. And I’m sure he is not the only one; there are several more packages that bring more new ways to define and combine functions into R. I hope I will revisit this topic when I’ve gotten used to it and decided what I like and don’t like. This might be confusing for a while with similar and rather cryptic operators that do slightly different things, but I’m sure it will turn out to be a useful development.
Thanks for the post. I see you said there will be a move to incorporate magrittr in the future. Hadley has said this at least one other time I know of as well. I wonder if that means that if you used %.% in your code it will be broken? Will magrittr support the %.% operator as well? Important questions for development reasons.
I don’t have any inside information, just read it in the comments section of the linked post 🙂 So if you really need to know, ask Hadley. I’m planning to stick to plyr for most things until dplyr has grown a bit older.
Yeah, I assumed so. I was mostly wondering, and hoping Hadley would see and respond.
I am not certain, but I believe that dplyr will keep an alias %.% for %>%, at least for a while before deprecating %.% (if at all).
@mrtnj: In deciding "what you like and don’t like", you might like to check out some of the experimental features in the "tee" branch of magrittr, https://github.com/smbache/magrittr/tree/tee, where e.g. there is a tee/pipe operator à la http://en.wikipedia.org/wiki/Tee_(command).
You can do e.g.
data %>% lm(Y~X, .) %T>% plot %>% summary
Note that plot doesn’t usually return anything useful, but here %T>% makes sure the lhs is passed on.
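A small sketch of the difference between %>% and %T>%, assuming a magrittr version that exports %T>% and using the built-in mtcars data. The report function is a made-up stand-in for plot: it has a side effect and returns NULL.

```r
library(magrittr)

# Hypothetical side-effect helper: prints a message and returns NULL
report <- function(fit) {
  message("model fitted on ", nobs(fit), " rows")
  invisible(NULL)
}

# With the ordinary pipe, the fit is lost after report()
with_pipe <- mtcars %>% lm(mpg ~ wt, .) %>% report()

# With the tee operator, report() runs for its side effect only
# and the fit itself is passed on to summary()
with_tee <- mtcars %>% lm(mpg ~ wt, .) %T>% report() %>% summary()

stopifnot(is.null(with_pipe))
stopifnot(inherits(with_tee, "summary.lm"))
```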
Oh, that looks fun! 🙂 By the way, I’m sure magrittr will end up in a lot of my daily R use, dplyr related or not. Thank you!
Glad to hear it. As always, let us know if you have feedback.