class: right, middle, title-slide

.title[
# Building an R package
]
.subtitle[
## Become an R developer
]
.author[
### Nicolas Casajus & Aurélie Siberchicot
]
.date[
### December 2024
]

---

## What's an R Package?
In R, the fundamental unit of shareable code is the package. A package bundles together `code`, `data`, `documentation`, and `tests`, and is easy to share with others.

.right[**_Hadley Wickham_**]

--

<br />

An R package:

- is a collection of `well-documented functions`
- makes your work more `reproducible`
- makes your code `useful` for you and for others

--

<br />

As of today (2024-12-05):

- **21734** packages are available on the `CRAN`
- **2289** packages on `Bioconductor`

---

## Must-read resources

---

## Recommended environment

---

## Development workflow

---

## Creating the structure

<!-- - Using **RStudio** -->
A **package name** can only contain `letters`, `numbers`, and periods (`.`). It must start with a letter and cannot end with a period.
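As a minimal sketch, the syntactic rules for a package name (ASCII letters, numbers and periods only, at least two characters, starting with a letter and not ending with a period, as stated in *Writing R Extensions*) can be checked with a regular expression. The helper `is_valid_pkg_name()` is hypothetical and only tests the syntax, not whether the name is already taken.

```r
# Hypothetical helper (not part of usethis or available):
# checks only the syntactic rules for an R package name.
is_valid_pkg_name <- function(name) {
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

is_valid_pkg_name("mypkg")   # TRUE
is_valid_pkg_name("my.pkg")  # TRUE
is_valid_pkg_name("my_pkg")  # FALSE: underscores are not allowed
is_valid_pkg_name("1pkg")    # FALSE: must start with a letter
is_valid_pkg_name("mypkg.")  # FALSE: cannot end with a period
```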
Check that the chosen name is not already in use with `available::available("packagename")`.

--

<br />

``` r
## Alternatively ----
usethis::create_package("/absolute/path/to/the/package/name")
```

---

## Package structure

```
.
├── (.git)                # Git repository files
├── (.gitignore)          # Files ignored by git
│
├── (mypkg.Rproj)         # RStudio files
│
├── .Rbuildignore         # List of non-standard package files
│
├── R/                    # Folder to store (only) R functions
│   ├── myfun-1.R         # A first R function file
│   └── myfun-2.R         # A second R function file
│
├── man/                  # Folder to store R functions documentation (automatically edited)
│   ├── my_fun_1.Rd
│   └── my_fun_2.Rd
│
├── DESCRIPTION           # Package metadata
│
└── NAMESPACE             # Automatically edited
```

---

## Writing an R function
Let's create a first function `moyenne()` ("moyenne" is French for "mean").

--

<br />

- First, we will create a new empty file in the folder **R/**.

``` r
usethis::use_r("moyenne")
```

--

<br />

- Now we can implement our function `moyenne()`.

``` r
moyenne <- function(x) sum(x) / length(x)
```
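As a quick sanity check (a sketch, not part of the package itself), the one-liner can be compared with base R's `mean()` on a few inputs.

```r
moyenne <- function(x) sum(x) / length(x)

x <- c(2, 4, 6, 8)

moyenne(x)                            # 5
all.equal(moyenne(x), mean(x))        # TRUE
all.equal(moyenne(1:10), mean(1:10))  # TRUE
```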
Resources: `Tidyverse style guide`

---

## Time to document

.pull-rightt[
- Specially-structured comments **preceding** each function definition
- Lightweight syntax, easy to write and to read
- Syntax: `#' @field value`
- Keeps function definition and documentation in the same file
- Automatically writes the `.Rd` files and the **NAMESPACE**
]

--

Each `roxygen` header starts with these two fields:

``` r
#' @title Short title of the function (one line)
#'
#' @description A longer description of what the function does (several lines)
```

--
The keywords `@title` and `@description` can be omitted.

``` r
#' Short title of the function (one line)
#'
#' A longer description of what the function does (several lines)
```

---

## Time to document
If your function has `input parameters`, each one must be documented.

``` r
#' @param param_1 description
#' @param param_2 description
```

--

In our function `moyenne()`:

``` r
#' @param x a numeric vector
```

--

<br />
If your function `returns` an object, use the keyword `@return`.

``` r
#' @return What the function returns.
```

--

In our function `moyenne()`:

``` r
#' @return A `numeric` representing the arithmetic mean of `x`.
```

--

If your function returns nothing, use the keyword `@return` as follows:

``` r
#' @return No return value.
```

---

## Time to document
Add a section `@examples` to show how to use your function:

``` r
#' @examples
#' x <- 1:10
#' moyenne(x)
```
Finally, if you want users to be able to call your function directly, add the `@export` tag.

``` r
#' @export
```

---

## Time to document

Back to the R file of our function.

``` r
#' Compute the arithmetic mean
#'
#' This function computes the arithmetic mean of a numeric variable.
#'
#' @param x a numeric vector
#'
#' @return A `numeric` representing the arithmetic mean of `x`.
#'
#' @export
#'
#' @examples
#' x <- 1:10
#' moyenne(x)
moyenne <- function(x) sum(x) / length(x)
```

---

## Generating the documentation

.pull-lefttt[
It's time to generate the corresponding `.Rd` file from this `roxygen` header.
All `.Rd` files are stored in the `man` folder.

``` r
devtools::document()
```

```
✓ Writing 'man/moyenne.Rd'
✓ Writing 'NAMESPACE'
```
]

--

<br />
In addition to creating the `man/moyenne.Rd` file, `devtools::document()` has updated the `NAMESPACE`.
This file lists which functions are exported, i.e. directly usable once the package is loaded (this file also deals with external dependencies).

``` r
# Generated by roxygen2: do not edit by hand

export(moyenne)
```

---

## Testing the function

.pull-lefttt[
Before going any further, we have to try our function. So we will load our package (and **NOT** source the function).

``` r
devtools::load_all()
```
]

--

Now we can use our function:

``` r
moyenne(c(1, 2))
```

```
## [1] 1.5
```

--

What about `NA`?

``` r
moyenne(c(1, 2, NA))
```

```
## [1] NA
```

***Hum...***

---

## Modifying the function

Our function does not seem to work properly.
We need to update the code to deal with `NA` values.

--

``` r
moyenne <- function(x) {
  x <- na.omit(x)
  sum(x) / length(x)
}
```

--

<br />

Let's test the function again:

``` r
## Reload the function ----
devtools::load_all()

## Testing the function ----
moyenne(c(1, 2))
## [1] 1.5

## Testing the function (with NA) ----
moyenne(c(1, 2, NA))
## [1] 1.5
```

---

## Modifying the function

That's better, but... if `x` contains `NA` values, this implementation silently removes them without informing the user. Instead, we will let the user choose whether to remove `NA` values.

--

<br />
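For comparison, base R's `mean()` leaves the same choice to the user through its `na.rm` argument, although by default it returns `NA` rather than stopping with an error. A short illustration:

```r
x <- c(1, 2, NA)

mean(x)                # NA: missing values propagate by default
mean(x, na.rm = TRUE)  # 1.5: missing values are removed first
anyNA(x)               # TRUE: a quick way to detect missing values
```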
Let's add an additional parameter to our function: `na_rm`, with a default value of `FALSE`. If `x` contains `NA` values and `na_rm = FALSE`, an error is thrown. If `na_rm = TRUE`, `NA` values are removed and the computation can be done.

--

``` r
moyenne <- function(x, na_rm = FALSE) {

  if (any(is.na(x))) {

    if (na_rm) {
      x <- na.omit(x)
    } else {
      stop("Argument 'x' contains NA values. Use 'na_rm = TRUE' to remove missing values.")
    }
  }

  sum(x) / length(x)
}
```

---

## Modifying the function

Let's test the function again.

``` r
## Reload the function ----
devtools::load_all()

## Testing the function ----
moyenne(x = c(1, 2))
## [1] 1.5

moyenne(x = c(1, 2, NA))
## Error in moyenne(x = c(1, 2, NA)):
##   Argument 'x' contains NA values. Use 'na_rm = TRUE' to remove missing values.

moyenne(x = c(1, 2, NA), na_rm = TRUE)
## [1] 1.5
```

---

## Update documentation

``` r
#' Compute the arithmetic mean
#'
#' This function computes the arithmetic mean of a numeric variable.
#'
#' @param x a numeric vector (can contain `NA` values).
#'
#' @param na_rm a logical value indicating whether `NA` values should be
#'   stripped before the computation proceeds. Default is `FALSE`.
#'
#' @return A `numeric` representing the arithmetic mean of `x`.
#'
#' @details An error is thrown if `x` contains `NA` values and `na_rm`
#'   is `FALSE` (default behaviour).
#'
#' @export
#'
#' @examples
#' moyenne(x = c(1, 2))
#'
#' \dontrun{
#' moyenne(x = c(1, 2, NA))  # error
#' }
#'
#' moyenne(x = c(1, 2, NA), na_rm = TRUE)
moyenne <- function(x, na_rm = FALSE) { ... }
```

Then update the corresponding `.Rd` file and the `NAMESPACE`.

``` r
devtools::document()
```

---

## And so on...

---

## Package metadata

Before we go any further, we need to edit some information about our package, using the `DESCRIPTION` file:

- what defines our package (name, title, description, version, authors)?
- who can use our package and how (licence)?
- who should be contacted if there is a problem or question (the maintainer has the `cre` role)?

.pull-lefttt[
``` r
usethis::edit_file("DESCRIPTION")
```

```
Package: mypkg
Title: A Minimal but Complete R Package
Version:
Authors@R:
    person(given = "Nicolas",
           family = "Casajus",
           role = c("aut", "cre", "cph"),
           email = "",
           comment = c(ORCID = "0000-0002-5537-5294"))
Description: Illustrates the main structure and components of an R Package
    with respect to the CRAN submission policies <>.
License: GPL (>= 2)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
```
]
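The `DESCRIPTION` file uses the Debian control file format, so it can be read with base R's `read.dcf()`. A minimal sketch, assuming a temporary file with a few illustrative fields rather than a real package:

```r
# Write a few illustrative fields to a temporary file
desc_file <- tempfile()
writeLines(
  c("Package: mypkg",
    "Title: A Minimal but Complete R Package",
    "License: GPL (>= 2)",
    "Encoding: UTF-8"),
  desc_file
)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(desc_file)

colnames(desc)
desc[1, "License"]
```

In a real package, pointing `read.dcf()` at the `DESCRIPTION` file itself gives the same kind of matrix.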
Resources: `DESCRIPTION file` and `Choose a license`

---

## Check package

.pull-lefttt[
``` r
devtools::check()
```

```
── R CMD check results ─────────────────────────────────── mypkg ───────
Duration: 11.6s

> checking R code for possible problems ... NOTE
  moyenne: no visible global function definition for ‘na.omit’
  Undefined global functions or variables:
    na.omit
  Consider adding
    importFrom("stats", "na.omit")
  to your NAMESPACE file.

0 errors ✓ | 0 warnings ✓ | 1 note x
```
]

--

<br />
**1 note**: Let's talk about package dependencies!

---

## Package dependencies

```
> checking R code for possible problems ... NOTE
  moyenne: no visible global function definition for ‘na.omit’
  Undefined global functions or variables:
    na.omit
  Consider adding
    importFrom("stats", "na.omit")
  to your NAMESPACE file.
```

<br />
So we need to import the function `na.omit()` from the package `{stats}`. In the `roxygen` header, import:

- either the whole package, with the tag `@import`:

``` r
#' @import stats
```

- or only specific functions from a package, with the tag `@importFrom`:

``` r
#' @importFrom stats na.omit
```

---

## Package dependencies
In your R code, call external functions with `::` so that the origin of each function is explicit.<br />
Here `stats::na.omit()`

``` r
moyenne <- function(x, na_rm = FALSE) {

  if (any(is.na(x))) {

    if (na_rm) {
      x <- stats::na.omit(x)
    } else {
      stop("Argument 'x' contains NA values. Use 'na_rm = TRUE' to remove missing values.")
    }
  }

  sum(x) / length(x)
}
```

---

## Package dependencies
**Do not forget** to update the `NAMESPACE` (and the `.Rd` files) with `devtools::document()`.

```
export(moyenne)
importFrom(stats,na.omit)
```

--
The `NAMESPACE` controls what happens when our package is loaded, not when it is installed. Installation is driven by the `DESCRIPTION` file, so dependencies must also be declared there.

---

## Package dependencies

.pull-lefttt[
Let's add the dependency `stats` in the `DESCRIPTION` file.

``` r
usethis::use_package("stats", type = "Imports")
```

<br/>

```
Package: mypkg
Title: A Minimal but Complete R Package
Version:
Authors@R:
    person(given = "Nicolas",
           family = "Casajus",
           role = c("aut", "cre", "cph"),
           email = "",
           comment = c(ORCID = "0000-0002-5537-5294"))
Description: Illustrates the main structure and components of an R Package
    with respect to the CRAN submission policies <>.
License: GPL (>= 2)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
Imports:
    stats
```
]

---

## Dependencies types

- `Depends`: packages listed in this field are `installed` when your package is installed and are `attached` when your package is attached. **NEVER** use `Depends` and always use `Imports` (except for special cases).

--

- `Imports`: packages listed in this field are `installed` when your package is installed but are `not attached` when your package is attached (they are just loaded). **ALWAYS** use this method.

--

- `Suggests`: packages listed in this field are `not installed` when your package is installed. Your package can use these packages, but doesn't require them (e.g. to run tests, build vignettes, etc.).

--

<br/>
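Because a package listed in `Suggests` may be absent on the user's machine, code that relies on it usually checks for it first with `requireNamespace()`. A minimal sketch, where the helper name `check_suggested()` is hypothetical:

```r
# Hypothetical helper: stop with a clear message when a suggested
# package is missing, otherwise return TRUE invisibly.
check_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is required for this feature. ",
      "Install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check passes silently
check_suggested("stats")
```

A function using a suggested package would call such a check first and then use the `pkg::fun()` syntax.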
Resources: Wickham & Bryan, R Packages, **Chap 9**, **Chap 10** and **Chap 11**

---

## Check package

``` r
devtools::check()
```

```
── R CMD check results ─────────────────────────────────── mypkg ───────
Duration: 11.6s

0 errors ✓ | 0 warnings ✓ | 0 notes ✓
```

---

## Install package

.pull-lefttt[
``` r
## Install package ----
devtools::install()
```
]

--
Now we can use our package.

``` r
## Load and attach the package ----
library("mypkg")

## Use the package ----
moyenne(c(1, 2))
```

``` r
## Use the package (without attaching it) ----
mypkg::moyenne(c(1, 2))
```

---

class: inverse, center, middle

## To go further...

---

## Advanced tests

- Testing is a vital part of package development
- But until now we have only tried our code informally and on the fly
- Problem: this is time-consuming and repetitive, and later changes can break the code without being noticed
Package `{testthat}`

.pull-rightt[
- Provides many functions to write unit tests
- Formal automated testing
- Makes explicit how your code should behave
- Makes your code more robust
]

``` r
usethis::use_testthat()
```

<br />
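A minimal sketch of what a test file could contain, assuming the final version of `moyenne()` with its `na_rm` argument. In a package, this file would live in `tests/testthat/` (for example created with `usethis::use_test("moyenne")`) and `moyenne()` would come from the package itself. It is redefined here only to keep the sketch self-contained.

```r
library(testthat)

# Redefined here for illustration only; in a package this comes from R/
moyenne <- function(x, na_rm = FALSE) {
  if (any(is.na(x))) {
    if (na_rm) {
      x <- stats::na.omit(x)
    } else {
      stop("Argument 'x' contains NA values. Use 'na_rm = TRUE' to remove missing values.")
    }
  }
  sum(x) / length(x)
}

test_that("moyenne() computes the arithmetic mean", {
  expect_equal(moyenne(c(1, 2)), 1.5)
  expect_equal(moyenne(c(1, 2, NA), na_rm = TRUE), 1.5)
})

test_that("moyenne() fails on NA values by default", {
  expect_error(moyenne(c(1, 2, NA)), "contains NA values")
})
```

Once such files exist, `devtools::test()` runs them all, and `devtools::check()` runs them as part of the package checks.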
Resources: **R Packages Chap 13**

---

## Add a vignette

.pull-rightt[
- A tutorial for your package
- Shows how to use your package
- Written in `R Markdown`
]

``` r
usethis::use_vignette("mypkg")
```

<br />
Resources: **R Packages Chap 17**

---

## And...

- Deploy on **GitHub**: `usethis::use_github()`
- Add a **README** (and badges): `usethis::use_readme_rmd()`
- Add a **NEWS** file: `usethis::use_news_md()`
- Add a **Website** with `pkgdown`
- Add a **Logo** with `hexSticker`
- CI/CD with **GitHub Actions**: `usethis::use_github_action_*()`
- Check your package with `rhub`
- ~~Add a **DOI** with HAL or Zenodo~~ CRAN provides a DOI for your package
- Archive your package on Software Heritage

<br />

- Submit to CRAN
  - `R CMD check pkg --as-cran`
  - on the current version of R-devel
  - on at least two OS
  - clean report: `0 errors | 0 warnings | 0 notes`

<br />

The ultimate resource: **Writing R Extensions**