This vignette demonstrates how to convert between SingleCellExperiment objects and the AnnData format using the {anndataR} package, so that data stored in .h5ad files can be read into and written from a SingleCellExperiment workflow.

Check out ?anndataR for a full list of the functions provided by this package.

Introduction

SingleCellExperiment is a widely used class for storing single-cell data in R, especially within the Bioconductor ecosystem. {anndataR} enables conversion between SingleCellExperiment objects and AnnData objects, allowing you to leverage the strengths of both the scverse and Bioconductor ecosystems.

Prerequisites

Before you begin, make sure you have both SingleCellExperiment and {anndataR} installed. You can install them using the following code:

if (!requireNamespace("pak", quietly = TRUE)) {
    install.packages("pak")
}
pak::pak(c("SingleCellExperiment", "SummarizedExperiment"))
pak::pak("scverse/anndataR")

Converting an AnnData Object to a SingleCellExperiment Object

Using an example .h5ad file included in the package, we will demonstrate how to read an .h5ad file and convert it to a SingleCellExperiment object.

library(anndataR)
library(SingleCellExperiment)
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#> 
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#>     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#>     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#>     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#>     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#>     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#>     colWeightedMeans, colWeightedMedians, colWeightedSds,
#>     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#>     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#>     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#>     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#>     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#>     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#>     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#>     rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
#>     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
#>     get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
#>     match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#>     Position, rank, rbind, Reduce, rownames, sapply, saveRDS, setdiff,
#>     table, tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#> 
#>     findMatches
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> 
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#> 
#>     rowMedians
#> The following objects are masked from 'package:matrixStats':
#> 
#>     anyMissing, rowMedians

h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")

Read the .h5ad file:

adata <- read_h5ad(h5ad_file)
adata
#> AnnData object with n_obs × n_vars = 50 × 100
#>     obs: 'Float', 'FloatNA', 'Int', 'IntNA', 'Bool', 'BoolNA', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'leiden'
#>     var: 'String', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
#>     uns: 'Bool', 'BoolNA', 'Category', 'DataFrameEmpty', 'Int', 'IntNA', 'IntScalar', 'Sparse1D', 'String', 'String2D', 'StringScalar', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
#>     obsm: 'X_pca', 'X_umap'
#>     varm: 'PCs'
#>     layers: 'counts', 'csc_counts', 'dense_X', 'dense_counts'
#>     obsp: 'connectivities', 'distances'
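
Before converting, it can help to look at what the AnnData object actually holds. The following is a minimal sketch, assuming the field accessors (`$obs`, `$var`, `$layers`, `$obsm`) and the `$n_obs()` / `$n_vars()` methods of the anndataR version used in this vignette; the column names are taken from the printed summary above.

```r
library(anndataR)

# Re-read the bundled example file so this sketch is self-contained
h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_file)

# Number of observations (cells) and variables (genes)
adata$n_obs()
adata$n_vars()

# Per-cell and per-gene annotations are data frames
head(adata$obs[, c("total_counts", "leiden")])
head(adata$var[, c("mean_counts", "highly_variable")])

# Names of the additional matrices and embeddings
names(adata$layers)
names(adata$obsm)
```

Knowing which layers, embeddings and annotation columns are present makes it easier to check afterwards that they arrived in the converted object.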

Convert to a SingleCellExperiment object:

sce_obj <- adata$to_SingleCellExperiment()
sce_obj

Note that the AnnData and SingleCellExperiment data structures do not map one-to-one, so some information may be lost during conversion. Inspect the converted object carefully to confirm that everything you need has been transferred.
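
One way to carry out this inspection is to list the contents of each AnnData slot next to the SingleCellExperiment accessor it would be expected to correspond to. This is only a sketch: the pairing of slots with accessors (layers with assays, obs with colData, var with rowData, obsm with reducedDims, uns with metadata) is an assumption about the conversion, and the exact names produced may differ between anndataR versions.

```r
library(anndataR)
library(SingleCellExperiment)

h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_file)
sce_obj <- adata$to_SingleCellExperiment()

# Layers (and X) versus assays
names(adata$layers)
assayNames(sce_obj)

# obs / var columns that did not arrive in colData / rowData
setdiff(colnames(adata$obs), colnames(colData(sce_obj)))
setdiff(colnames(adata$var), colnames(rowData(sce_obj)))

# Embeddings versus reduced dimensions
names(adata$obsm)
reducedDimNames(sce_obj)

# Unstructured metadata
names(adata$uns)
names(metadata(sce_obj))
```

Any name that appears on the AnnData side but has no counterpart on the SingleCellExperiment side is a candidate for information that was dropped or renamed.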

See ?to_SingleCellExperiment for more details on how to customize the conversion process. For instance:

adata$to_SingleCellExperiment(
  # ...
  # TO DO: add this when scverse/anndataR#212 is merged
)

Converting a SingleCellExperiment Object to an AnnData Object

The following example creates a SingleCellExperiment object from scratch and converts it to an AnnData object, which can then be saved as an .h5ad file.

counts <- matrix(rbinom(20000, 1000, .001), nrow = 100)
sce_obj <- SingleCellExperiment(list(counts = counts))
sce_obj
#> class: SingleCellExperiment 
#> dim: 100 200 
#> metadata(0):
#> assays(1): counts
#> rownames: NULL
#> rowData names(0):
#> colnames: NULL
#> colData names(0):
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

You can convert the SingleCellExperiment object to an AnnData object using the from_SingleCellExperiment function:

adata <- from_SingleCellExperiment(sce_obj)
adata
#> AnnData object with n_obs × n_vars = 200 × 100
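
The two printed summaries report the dimensions in opposite order: SingleCellExperiment stores features in rows and cells in columns, whereas AnnData stores observations (cells) first and variables (features) second. A small check, assuming the `$n_obs()` and `$n_vars()` methods of the anndataR version used here, makes this explicit:

```r
library(anndataR)
library(SingleCellExperiment)

counts <- matrix(rbinom(20000, 1000, .001), nrow = 100)
sce_obj <- SingleCellExperiment(list(counts = counts))
adata <- from_SingleCellExperiment(sce_obj)

# SingleCellExperiment: features x cells
dim(sce_obj)

# AnnData: observations (cells) x variables (features)
c(adata$n_obs(), adata$n_vars())

# The conversion swaps the two dimensions
stopifnot(
  adata$n_obs() == ncol(sce_obj),
  adata$n_vars() == nrow(sce_obj)
)
```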

Again, note that the AnnData and SingleCellExperiment data structures do not map one-to-one, so some information may be lost during conversion. Inspect the converted object carefully to confirm that everything you need has been transferred.

See ?from_SingleCellExperiment for more details on how to customize the conversion process. Example:

from_SingleCellExperiment(
  sce_obj,
  # ...
  # TO DO: add this when scverse/anndataR#212 is merged
)
#> AnnData object with n_obs × n_vars = 200 × 100
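
To save the converted object as an .h5ad file, the AnnData object can be passed to `write_h5ad()`. The sketch below is not part of the original example: it adds row and column names to the toy matrix (hypothetical `gene_*` / `cell_*` labels) so that observations and variables are labelled in the file, and it writes to a temporary path that you would replace with your own.

```r
library(anndataR)
library(SingleCellExperiment)

# Same toy data as above, with illustrative row and column names
counts <- matrix(
  rbinom(20000, 1000, .001),
  nrow = 100,
  dimnames = list(paste0("gene_", 1:100), paste0("cell_", 1:200))
)
sce_obj <- SingleCellExperiment(list(counts = counts))
adata <- from_SingleCellExperiment(sce_obj)

# Write to a temporary file; replace with your own output path
out_file <- tempfile(fileext = ".h5ad")
write_h5ad(adata, out_file)

# Read the file back to confirm it can be loaded again
adata_roundtrip <- read_h5ad(out_file)
adata_roundtrip
```

Reading the file back is a quick sanity check that the object was written in a form that can be loaded again, either in R or from Python.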

Session info

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] SingleCellExperiment_1.28.1 SummarizedExperiment_1.36.0
#>  [3] Biobase_2.66.0              GenomicRanges_1.58.0       
#>  [5] GenomeInfoDb_1.42.1         IRanges_2.40.1             
#>  [7] S4Vectors_0.44.0            BiocGenerics_0.52.0        
#>  [9] MatrixGenerics_1.18.1       matrixStats_1.5.0          
#> [11] anndataR_0.99.0             BiocStyle_2.34.0           
#> 
#> loaded via a namespace (and not attached):
#>  [1] bit_4.5.0.1             Matrix_1.7-1            jsonlite_1.8.9         
#>  [4] crayon_1.5.3            compiler_4.4.2          BiocManager_1.30.25    
#>  [7] jquerylib_0.1.4         systemfonts_1.1.0       textshaping_0.4.1      
#> [10] yaml_2.3.10             fastmap_1.2.0           lattice_0.22-6         
#> [13] XVector_0.46.0          R6_2.5.1                S4Arrays_1.6.0         
#> [16] knitr_1.49              htmlwidgets_1.6.4       DelayedArray_0.32.0    
#> [19] bookdown_0.42           desc_1.4.3              GenomeInfoDbData_1.2.13
#> [22] bslib_0.8.0             rlang_1.1.4             cachem_1.1.0           
#> [25] hdf5r_1.3.11            xfun_0.50               fs_1.6.5               
#> [28] sass_0.4.9              bit64_4.5.2             SparseArray_1.6.0      
#> [31] cli_3.6.3               pkgdown_2.1.1           magrittr_2.0.3         
#> [34] zlibbioc_1.52.0         digest_0.6.37           grid_4.4.2             
#> [37] lifecycle_1.0.4         vctrs_0.6.5             evaluate_1.0.3         
#> [40] ragg_1.3.3              abind_1.4-8             rmarkdown_2.29         
#> [43] purrr_1.0.2             httr_1.4.7              tools_4.4.2            
#> [46] htmltools_0.5.8.1       UCSC.utils_1.2.0