When I first started writing R scripts, I struggled to decide which packages I should use to keep my code efficient and effective. After a couple of years of practice and searching, I found some good ones. In this post, I am going to share the ones I think are awesome.
The first one is from Analytics Vidhya, a simple picture:
Another list of recommended packages comes from Garrett Grolemund's "Quick list of useful R packages".
To load data
foreign – Want to read a SAS data set into R? Or an SPSS data set? Foreign provides functions that help you load data files from other programs into R.
R can handle plain text files – no package required. Just use the functions read.csv, read.table, and read.fwf. If you have even more exotic data, consult the CRAN guide to data import and export.
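As a minimal sketch of the base-R route, the snippet below reads the same small table with each of the three functions. The inline text and the column names are made up for illustration; in practice you would pass a file path instead of `text =` or `textConnection()`.

```r
# Comma-separated text: read.csv assumes a header row and "," as separator.
csv_text <- "name,score\nann,90\nbob,85"
df_csv <- read.csv(text = csv_text)

# Whitespace-separated text: read.table needs header = TRUE spelled out.
tab_text <- "name score\nann 90\nbob 85"
df_tab <- read.table(text = tab_text, header = TRUE)

# Fixed-width text: 3 characters for the name, 2 for the score.
fwf_con <- textConnection(c("ann90", "bob85"))
df_fwf <- read.fwf(fwf_con, widths = c(3, 2),
                   col.names = c("name", "score"))
close(fwf_con)  # the connection was opened by us, so we close it ourselves

str(df_csv)
str(df_tab)
str(df_fwf)
```

All three calls should give a two-row data frame with a `name` and a `score` column.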
To manipulate data
dplyr – Essential shortcuts for subsetting, summarizing, rearranging, and joining together data sets. dplyr is our go-to package for fast data manipulation.
stringr – Easy to learn tools for regular expressions and character strings.
lubridate – Tools that make working with dates and times easier.
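To give a feel for how these three packages fit together, here is a small sketch. The `orders` data frame and its column names are invented for the example, and it assumes a dplyr version recent enough to support the `.groups` argument (1.0 or later).

```r
library(dplyr)
library(stringr)
library(lubridate)

# Hypothetical data: messy customer names and dates stored as text.
orders <- data.frame(
  customer = c(" Ann", "bob ", "ann", "Bob"),
  placed   = c("2021-01-15", "2021-01-20", "2021-02-03", "2021-02-10"),
  amount   = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

monthly <- orders %>%
  mutate(
    customer    = str_to_title(str_trim(customer)),  # stringr: tidy the names
    order_month = month(ymd(placed))                 # lubridate: parse, take month
  ) %>%
  group_by(customer, order_month) %>%                # dplyr: split by group
  summarise(total = sum(amount), .groups = "drop") %>%
  arrange(customer, order_month)

print(monthly)
```

After cleaning, " Ann" and "ann" fall into the same group, so each customer ends up with one total per month.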
To visualize data
ggvis – Interactive, web based graphics built with the grammar of graphics.
rgl – Interactive 3D visualizations with R
htmlwidgets – A fast way to build interactive (JavaScript-based) visualizations with R. Packages that implement htmlwidgets include:
- leaflet (maps)
- dygraphs (time series)
- DT (tables)
- DiagrammeR (diagrams)
- networkD3 (network graphs)
- threejs (3D scatterplots and globes).
googleVis – Lets you use Google Chart tools to visualize data in R. Google's motion chart grew out of Gapminder's Trendalyzer, the graphing software Hans Rosling made famous in his TED talk.
To model data
mgcv – Generalized Additive Models
randomForest – Random forest methods from machine learning
multcomp – Tools for multiple comparison testing
vcd – Visualization tools and tests for categorical data
glmnet – Lasso and elastic-net regression methods with cross validation
survival – Tools for survival analysis
caret – Tools for training regression and classification models
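As one illustration from this group, the sketch below fits a generalized additive model with mgcv, which ships with standard R installations. The choice of the built-in `mtcars` data and of this particular formula is only an assumption for the demo, not a recommended model.

```r
library(mgcv)

# Smooth effect of horsepower, linear effect of weight on fuel economy.
fit <- gam(mpg ~ s(hp) + wt, data = mtcars)
summary(fit)

# Predict for a hypothetical car with 110 hp weighing 2,600 lbs (wt = 2.6).
pred <- predict(fit, newdata = data.frame(hp = 110, wt = 2.6))
pred
```

The `s()` wrapper is what marks a term as a smooth; terms without it are fitted as ordinary linear effects.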
To report results
shiny – Easily make interactive, web apps with R. A perfect way to explore data and share findings with non-programmers.
R Markdown – The perfect workflow for reproducible reporting. Write R code in your markdown reports. When you run render, R Markdown will replace the code with its results and then export your report as an HTML, PDF, or MS Word document, or an HTML or PDF slideshow. The result? Automated reporting. R Markdown is integrated straight into RStudio.
xtable – The xtable function takes an R object (like a data frame) and returns the LaTeX or HTML code you need to paste a pretty version of the object into your documents. Copy and paste, or pair up with R Markdown.
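The xtable workflow described above can be sketched roughly as follows. The three `mtcars` columns and the caption are arbitrary choices for the example; `print.results = FALSE` is used so the markup is returned as a string rather than only printed.

```r
library(xtable)

# Build a table object from the first rows of a built-in data frame.
tab <- xtable(head(mtcars[, c("mpg", "cyl", "wt")], 3),
              caption = "First rows of mtcars")

# Render the same object as LaTeX and as HTML markup.
latex_code <- print(tab, type = "latex", print.results = FALSE)
html_code  <- print(tab, type = "html",  print.results = FALSE)

cat(html_code)
```

The resulting strings can be pasted into a document, or the `print()` call can sit inside an R Markdown chunk with `results = "asis"`.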
For Spatial data
maps – Easy to use map polygons for plots.
ggmap – Download street maps straight from Google Maps and use them as a background in your ggplots.
For Time Series and Financial data
zoo – Provides the most popular format for saving time series objects in R.
xts – Very flexible tools for manipulating time series data sets.
quantmod – Tools for downloading financial data, plotting common charts, and doing technical analysis.
To write high performance R code
Rcpp – Write R functions that call C++ code for lightning fast speed.
data.table – An alternative way to organize data sets for very, very fast operations. Useful for big data.
parallel – Use parallel processing in R to speed up your code or to crunch large data sets.
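For the parallel package, a minimal sketch might look like this. The `slow_square` function and the choice of two workers are assumptions made for the demo; a real job would pick the worker count from `detectCores()` and do heavier work per task.

```r
library(parallel)

# A deliberately slow toy function standing in for real work.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Start two worker processes; this style of cluster works on all platforms.
cl <- makeCluster(2)

# Spread the inputs over the workers and always shut the cluster down,
# even if one of the tasks fails.
res <- tryCatch(
  parLapply(cl, 1:8, slow_square),
  finally = stopCluster(cl)
)

unlist(res)
```

`parLapply()` returns a list in the same order as its input, so it can usually replace an `lapply()` call without other changes.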
To work with the web
XML – Read and create XML documents with R
jsonlite – Read and create JSON data tables with R
httr – A set of useful tools for working with http connections
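As a small example of the JSON side, the sketch below converts a data frame to JSON with jsonlite and reads it back. The data frame is invented for the example.

```r
library(jsonlite)

people <- data.frame(
  id   = 1:2,
  name = c("ann", "bob"),
  stringsAsFactors = FALSE
)

# Data frame -> JSON text (one JSON object per row).
json <- toJSON(people)
print(json)

# JSON text -> data frame again.
back <- fromJSON(json)
str(back)
```

The round trip gives back a data frame with the same columns, which is what makes jsonlite convenient for web APIs that return arrays of records.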
To write your own R packages
devtools – An essential suite of tools for turning your code into an R package.
testthat – testthat provides an easy way to write unit tests for your code projects.
roxygen2 – A quick way to document your R packages. roxygen2 turns inline code comments into documentation pages and builds a package namespace.
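To show what the last two tools look like in practice, here is a sketch of one documented function with a unit test. The function `c_to_f` is hypothetical; in a real package the function would live under `R/` and the test under `tests/testthat/`, and roxygen2 would turn the `#'` comments into a help page.

```r
library(testthat)

#' Convert Celsius to Fahrenheit
#'
#' @param celsius Numeric vector of temperatures in degrees Celsius.
#' @return Numeric vector of temperatures in degrees Fahrenheit.
#' @export
c_to_f <- function(celsius) {
  celsius * 9 / 5 + 32
}

# A testthat unit test: each expectation fails loudly if it does not hold.
test_that("c_to_f converts known temperatures", {
  expect_equal(c_to_f(0), 32)
  expect_equal(c_to_f(100), 212)
  expect_equal(c_to_f(c(0, 100)), c(32, 212))
})
```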
And my list is:
Data Type Processing
- dplyr – Next generation tools for working with data frames
- data.table – R’s data.table package extends data.frame
- jsonlite – A Robust, High Performance JSON Parser and Generator for R
- stringr – A fresh approach to string manipulation in R
- fuzzyjoin – Join tables together on inexact matching
Database Management
- RODBC – An ODBC database interface
- DBI – A database interface (DBI) for communication between R and RDBMSs
- RMySQL – R interface to MySQL and MariaDB
- ROracle – R interface to Oracle
- RPostgreSQL – R interface to PostgreSQL
- rmongodb – R driver for MongoDB
- rredis – R driver for Redis
- RCassandra – R direct interface (not Java) to Apache Cassandra
- RHive – An R extension facilitating distributed computing via Apache Hive
- RNeo4j – Neo4j Driver for R
Data Visualization
- ggplot2 – An implementation of the Grammar of Graphics in R
- ggalt – Extra Coordinate, Geoms, Statistical Transformations & Scales for ‘ggplot2’
- ggtree – Visualization and annotation of phylogenetic trees
- ggplot2 Extensions – ggplot2 official extension mechanism
- lattice – The lattice add-on package is an implementation of Trellis graphics for R
- extrafont – Tools for using fonts in R graphics
- showtext – Using Fonts More Easily in R Graphs
- gganimate – Create easy animations with ggplot2
- misc3d – A collection of miscellaneous 3d plots, including isosurfaces
Report & Export from R
- rapport – An R package that facilitates creation of reproducible report templates
- rmarkdown – Supports dozens of static and dynamic output formats
- slidify – Generate reproducible html5 slides from R markdown
- ReporteRs – An R package for creating Microsoft Word and PowerPoint documents
- bookdown – Tools to write HTML, PDF, ePub, and Kindle books
Web Processing Tools
- shiny – Easy interactive web applications with R
- RCurl – General network (HTTP, FTP, etc.) client interface
- XML – Reading and creating XML (and HTML) documents (including DTDs)
- Rfacebook – Facebook API for R
Parallel & Distributed Computing
- parallel – High-Performance and Parallel Computing with R, multicore and snow.
- SparkR – R front-end for Spark
- DistributedR – Distributed R is a scalable high-performance platform for R
Machine Learning
- AnomalyDetection – Twitter’s Anomaly Detection
- ahaz – Regularization for semiparametric additive hazards regression
- bigrf – Big Random Forest Learning
- C50 – Decision Tree Learning
- caret – Classification and Regression Learning
- e1071 – Misc Functions of the Department of Statistics, Probability Theory Group
- forecast – Forecasting with ARIMA, ETS, STLM, TBATS, and neural network models
- h2o – Deep learning, random forests, GBM, KMeans, PCA, GLM
- LogicReg – Logic Regression Learning
- maptree – Mapping, pruning, and graphing tree models
- mboost – Model-Based Boosting
- randomForest – Breiman and Cutler’s random forests
- rattle – Graphical User Interface for Data Mining in R
- rpart – Recursive Partitioning and Regression Trees
- RSNNS – Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
- svmpath – The SVM path algorithm
- varSelRF – Use random forest to select features
- xgboost – eXtreme Gradient Boosting Tree model, good speed and performance
Natural Language Processing
- text2vec – Fast Text Mining Framework for Vectorization and Word Embeddings
- openNLP – Apache OpenNLP interface
- NLP – Basic classes and methods for Natural Language Processing
- topicmodels – Topic modeling interface to the C code developed by David M
- SnowballC – Snowball stemmers based on the C libstemmer UTF-8 library
- quanteda – An R package for the Quantitative Analysis of Textual Data
R Development Tools
- Package Development List – Instructions of R Package Development
- devtools – Tools to make an R developer’s life easier
- testthat – Unit testing for R code
- roxygen2 – Process source code and comments to produce Rd files
- packrat – Dependency management system for R.
- installr – Functions for installing software from within R
- import – An Import Mechanism For R
- modules – Replacing packages: An alternative module system for R
- Rocker – R configurations for Docker
- drat – Creation and use of R repositories on GitHub or other repos
- lintr – Static Code Analysis for R
- staticdocs – Create R’s HTML docs
HTML Widgets
- d3heatmap – A D3.js-based heatmap htmlwidget for R
- DT – An R interface to the DataTables library
- scatterD3 – Interactive scatter plots based on D3.js
Interface with Other Programming Languages
- Rcpp – Seamless R and C++ Integration
- rJava – Interface to Java
- jvmr – Interface to Java and Scala
- rJython – Interface to Python/Jython
- rPython – Call Python in R
- runr – Call Julia and Bash in R
- RJulia – Call Julia in R
- RinRuby – A library for Ruby
- R.matlab – Read and write MATLAB files
- RcppOctave – Interface to Octave and Matlab
- RSPerl – Interface to Perl
- rpy2 – Python Interface to R