I’m very excited to learn about vroom, RStudio’s latest tidyverse offering. It imports data much faster than existing R solutions.
Check out the following benchmark that provides a comparison across a handful of similar functions and interactions between various libraries.
The speed is already a game-changer, but the following features sweeten the deal:
Similar to readr
vroom shares many features with readr, including nearly all of readr's parsing features for delimited and fixed-width files.

Reading multiple files
Native support for reading from multiple files and connections. It reads sets of files with the same columns into one table.

Delimited files
Automatically guesses the delimiter of a file.

Compressed files
Automatically reads and writes zip, gzip, bz2 and xz compressed files with the standard file extensions.

Remote files
Read files from the internet by passing the URL of the file to vroom().

Reading and writing from pipe connections
Provides efficient input and output from pipe() connections, which is useful, for example, for pre-filtering large inputs.

Column selection
The col_select feature makes it easy to select columns to retain or omit. It supports selection helpers and renaming too, including helper functions to repair names.
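A few of the features above can be tried out with a short sketch. It is only an illustration under stated assumptions: the temporary file names, the split of the built-in mtcars data, and the renamed column miles_per_gallon are all made up for this example and do not come from the original article.

```r
library(vroom)

# Illustrative data: mtcars with its row names turned into a "model" column.
cars <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# Hypothetical file names inside a fresh temporary directory.
demo_dir <- tempfile("vroom-demo-")
dir.create(demo_dir)
part1   <- file.path(demo_dir, "cars-1.csv")
part2   <- file.path(demo_dir, "cars-2.csv")
zipped  <- file.path(demo_dir, "cars-all.csv.gz")

# Write two comma-delimited files with the same columns,
# plus one gzip-compressed file (compression is chosen from the .gz extension).
vroom_write(cars[1:16, ], part1, delim = ",")
vroom_write(cars[17:32, ], part2, delim = ",")
vroom_write(cars, zipped, delim = ",")

# Reading multiple files: both parts end up in one table.
# No delim is given, so vroom guesses it; id adds a column with the source path.
all_cars <- suppressMessages(vroom(c(part1, part2), id = "source"))

# Column selection: keep two columns and rename one of them while reading.
picked <- suppressMessages(
  vroom(c(part1, part2), col_select = list(model, miles_per_gallon = mpg))
)

# Compressed files: the .gz file is read back like any other file.
from_gz <- suppressMessages(vroom(zipped))

print(dim(all_cars))
print(names(picked))
```

The calls are wrapped in suppressMessages() only to hide vroom's column specification message. Remote files and pipe() connections are passed to vroom() in the same way as the file paths here, but they are left out because they depend on network access and on the shell tools available.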
vroom significantly speeds up my workflow, making it my default tool for importing files into R. You can find the original article here.