10 New R Things I Picked Up at useR!2019

useR! 2019 is in the books. Though the 2019 edition of the biggest R show on the planet is over, it will likely take me months to digest all the inspiration. Step 1 en route to do so is this blog post which highlights 10 things I picked up from Toulouse.

1. “I came for the software and stayed for the community.”

The above David Smith quote nutshells my top take away from the trip to France. Though the gathering of R gurus, their padawans and witnesses exceeded expectations in many regards, the community was what impressed me the most. The Toulouse conference caused an inclusive, familiar vibe around Pierre Baudis conference center all week long (which is an achievement of its own given the sheer size of the conference).

There is little doubt that good vibe does not just happen these days. In particular not in a community that lives in the troll pested online world to a large degree. But just like good referees, those who must have been nurtering the community’s culture for years, stayed in the background and made sure a diverse variety of attendees felt welcome. Doing so without competing for attention with the topic of the conference itself is what’s special. Chapeaux.

FWIW: I am not the only one who felt the need express how special the atmosphere was. Robin Lovelace has also just posted his impressions from useR!2019. And we got another thing in common: Like several others, Robin also stayed on the ground to get to Toulouse despite the fact that the southern France town is not exactly his nearest neighbor.

2. How to never forget that you wanted to fix these things later - the todor package

My other take aways are a bit more random as they mostly introduce packages. It’s hard to single out packages or talks, so my choice often depends on whether I knew these packages before.

So give it up for the todor package! When we are determined to fix a certain problem or implement a feature, we often fight distraction by postponing a chance to patch a thing we stumbled upon along the way. Often we just add a comment like:

  • FIXME
  • TODO
  • CHANGED
  • IDEA
  • HACK
  • NOTE
  • REVIEW
  • BUG
  • QUESTION
  • COMBAK
  • TEMP

todor is a small but useful package to help you find such a NOTE in a file, package or folder later on:

library(todor)
todor_file()
todor()
todor_package()

3. Are you teaching (or onboarding employees)? Check out datasciencebox.org

It’s hard to summarize a good summary. So I’ll leave you with what I just tweeted a while ago and hope it makes you visit datasciencebox.org.

4. Add R Studio addins to give your packages an extra polish

Even moderately avid R Studio users will have to admit that R Studio continues to have a massive impact when it comes to bringing new users to R. Thanks to the rstudioapi your package can benefit from the exact same tailwind. Addins, macros, snippets and custom shortcuts make life of developers more convenient and the R experience of package endusers much smoother.

Though customizing R Studio is definitely easy to do, a good kickstart like Colin Fay’s workshop always helps. If you couldn’t attend it, I recommend to work through the examples in particular. Working through won’t take long and Colin’s examples are definitely instructive.

Slides

Repo

Example of answers to the exercises

5. There are R conferences beyond useR!: eRum and uRos

Good news if you did not attend useR!2019, saw all the buzz on twitter and elsewhere and felt like you missed something: There will be another useR! useR!2020 in St. Louis is in the making.

Plus there is eRum which takes place in Europe whenever useR! is on another continent. So, if you live in the old world and prefer to travel by train, eRum seems to be a pretty cool option as well. It will take place in Milano in May 2020.

And if you work in official statistics or anything close to it, uRos in Vienna](https://www.r-project.ro/conference2020.html) is for you. uRos is obviously more of a special interest gathering and way smaller than the big R conferences but (just like Vienna itself) always worth your consideration.

6. Feature-based time series analysis - the feasts package

If there is one name that’s been synonymous to forecasting with R, it’s Rob J. Hyndman. Among other things, he and his team presented the idea of feature extraction from time series and how such an extraction can be implemented using the tidyverse. Features can be seen as properties of time series that allow for a comparison of time series. So, how seasonal is your series? How do certain series perform given their features?

Here’s a short reproducible example using build in time series:

library(feasts)
library(tsibble)
library(ggplot2)
library(colorspace)

tourism %>%
  features(Trips, feature_set(tags = "stl")) %>%
  ggplot(aes(x = trend_strength,
             y = seasonal_strength_year,
             col = Purpose)) +
  geom_point() +
  scale_color_discrete_sequential(palette = "Viridis") +
  facet_wrap(vars(State)) + 
  theme_bw()

We can immediately see from the graph that holidays are more seasonal than other travel, and that Western Australia (WA) has the strongest trends.

More feature based examples can be found in the slides and online on tidyverts.org.

7. The colorspace package

If you’ve ever tried to create a publication ready graph that contains more than 2 or 3 colors you know: fiddling around with colors on your own can be time consuming and frustrating. Some colors are too hard to distinguish (let alone, if you are colorblind), some combinations look like Chinatown at night.

The colorspace project got you covered. Colorspace is the ultimate no-nonsense approach to colors and color palettes in R. The approach is comprehensive and the documentation is very instructive. This does not come as a surprise: the list of names behind the package around maintainer Achim Zeileis reads like the Ocean’s Eleven of the R world. His talk gave interesting insights into how various three dimensional representation of colors, e.g., HCL vs RGB actually work, but also offered practical, application minded advice.

What I liked best in that regard was the grouping of color palettes into qualitative color palettes that only change the type of color (hue), sequential single hue palettes, that change the intensity of the same color, sequential multi hue color palettes that change both, color and intensity and diverging palettes that go into two directions, starting from the middle.

library(colorspace)
hcl_palettes(plot = TRUE)

There’s even a shiny based colorpicker to compose your very own color palette.

The documentation and the additional material are tremendous as well:

Ah, and on an endnote: if you’re on R 3.6 or beyond you wont even need the package to create a simple palette. The hcl.colors function ships with recent versions of R!

8. OSRM is great and there is an R package for it

“Modern C++ routing engine for shortest path in road networks” That’s how project OSRM (Open Streetmap Routing Machine) describes itself on their website. And guess what: there’s an R package for it. So what do with it?

Megan Beckett asked “What can I see from my AirBnB with my running shoes on” and lost no time answering this question in her lightning talk. For those of you who enjoyed the idea, but found the lightning talk a bit too lightning, I’ve tried to adapt Megan’s Toulouse example to my Zurich office.

Because Megan works on Linux she probably ran the data preparation locally and only used docker for the OSRM server in the end. Doing the same on OSX or even Windows might be a bit tricky and spoil the fun. So here’s how to do the preparation also in fully dockerized fashion.

But before you start hacking, get your area from Open Street Map and use the export function. I suggest to use the API offered in the export context menu. Rename the map file you received to map.xml and put it in a separate folder. Then start the osrm-backend docker container to do the preparation from the folder (that’s the $pwd argument in your docker run command) where your map.xml is located on the host.

docker run -t -v $(pwd):/data osrm/osrm-backend osrm-extract /data/map.xml -p /opt/foot.lua
docker run -t --rm -v $(pwd):/data osrm/osrm-backend osrm-contract /data/map.xml.osrm

This may take a bit depending on the size of your export, the profile (there are foot.lua and car.lua). Once you’re done, simply fire up the OSRM server in order to use its API from R.

docker run -t -i -p 5000:5000 -v $(pwd):/data osrm/osrm-backend osrm-routed /data/map.xml.osrm

Now you’re finally able to do the fun leaflet part.

library(osrm)
library(leaflet)
library(tidyverse)
# OSRM SERVER
options(osrm.server = "http://127.0.0.1:5000/")
  
loc <- c(8.54621,47.37779)


iso <- osrmIsochrone(loc, breaks = seq(from = 0, to = 30, by = 5),res=400)


iso@data$run_times <- factor(paste(iso@data$min, "to", iso@data$max, "min"),
                             levels = c("0 to 5 min", "5 to 10 min", "10 to 15 min", "15 to 20 min", "20 to 25 min", "25 to 30 min"))
# Colour palette for each area
factpal <- colorFactor(rev(heat.colors(6)), iso@data$run_times)
leaflet(data = iso) %>% 
  setView(lng = 8.54621, lat = 47.37779, zoom = 13) %>%
  addTiles() %>% 
  addMarkers(lng = 8.54621, lat = 47.37779, popup = "My Office") %>%
  addPolygons(fill=TRUE, stroke=TRUE, color = "black",
              fillColor = ~factpal(iso@data$run_times),
              weight=0.5, fillOpacity=0.3,
              data = iso, popup = iso@data$run_times,
              group = "Driving Time") %>% 
  # Legend
  addLegend("bottomright", pal = factpal, values = iso@data$run_times,   title = "Running Time")


The original example can be found on Megan’s git repository.

9. It is important to think about security in R and it’s hilariously entertaining.

I can only spoil this Eddie Izzardesque stand up performance by Colin Gillespie if I try to describe it. So wait for the video to be online.

In the meantime check out the slides (and the intro video in particular).

10. There are 6(!) very diverse keynotes online

Even if you made it to Toulouse you haven’t seen all the keynotes probably. The good news is, the R Consortium has its owntube channel and you should definetely add to its 3600+ subscribers if you haven’t done so. From Julia Steward Lowndes opening keynote R for better science in less time to Julien Cornebisse’s AI for good in R and Python Ecosystem all talks are either online already or will be published soon.

And you should definitely not miss :) Julie Jossie’s Missing value tour in R. Joe Cheng talked about Shiny’s Holy Grail: Interactivity with reproducibility and Martin Morgan is on the record speaking about How Bioconductor advances science while contributing to the R language and community. And to complete a perfect 50-50 split among male and female keynote speakers Bettina Grün gave her talk on Tools for Model-Based Clustering in R.

I can imagine it will take some time, but every ordinary full talk has been recorded as well and will hopefully be online soon.

Avatar
Matt Bannert
gut checks stack. makes public data public. runs on rap & open source.

I am interested in data science devOps for official statistics, open source, time series, rstats, Python and SQL. I contribute to RAdwords, tstools, timeseriesdb and kofdata.

Related