Read/write SingleCellExperiment objects using anndataR
Source: vignettes/usage_singlecellexperiment.Rmd
This vignette demonstrates how to read and write SingleCellExperiment objects using the {anndataR} package, leveraging the interoperability between SingleCellExperiment and the AnnData format.
Check out ?anndataR for a full list of the functions provided by this package.
Introduction
SingleCellExperiment is a widely used class for storing single-cell data in R, especially within the Bioconductor ecosystem. {anndataR} enables conversion between SingleCellExperiment objects and AnnData objects, allowing you to leverage the strengths of both the scverse and Bioconductor ecosystems.
Prerequisites
Before you begin, make sure you have both SingleCellExperiment and {anndataR} installed. You can install them using the following code:
if (!requireNamespace("pak", quietly = TRUE)) {
  install.packages("pak")
}
pak::pak(c("SingleCellExperiment", "SummarizedExperiment"))
pak::pak("scverse/anndataR")
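After installation, a quick check can confirm that the packages load. The sketch below is one convenient way to do this and is not part of {anndataR}. It also checks for hdf5r, which appears in the session info at the end of this vignette and is assumed here to be the backend used for .h5ad files.

```r
# Packages this vignette relies on; "hdf5r" is an assumption based on the
# session info at the end of the vignette.
pkgs <- c("SingleCellExperiment", "SummarizedExperiment", "anndataR", "hdf5r")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report anything that still needs to be installed.
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```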
Converting an AnnData Object to a SingleCellExperiment Object
Using an example .h5ad file included in the package, we will demonstrate how to read an .h5ad file and convert it to a SingleCellExperiment object.
library(anndataR)
library(SingleCellExperiment)
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, aperm, append, as.data.frame, basename, cbind,
#> colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
#> get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
#> match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> Position, rank, rbind, Reduce, rownames, sapply, saveRDS, setdiff,
#> table, tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#>
#> rowMedians
#> The following objects are masked from 'package:matrixStats':
#>
#> anyMissing, rowMedians
h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
Read the .h5ad file:
adata <- read_h5ad(h5ad_file)
adata
#> AnnData object with n_obs × n_vars = 50 × 100
#> obs: 'Float', 'FloatNA', 'Int', 'IntNA', 'Bool', 'BoolNA', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'leiden'
#> var: 'String', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
#> uns: 'Bool', 'BoolNA', 'Category', 'DataFrameEmpty', 'Int', 'IntNA', 'IntScalar', 'Sparse1D', 'String', 'String2D', 'StringScalar', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
#> obsm: 'X_pca', 'X_umap'
#> varm: 'PCs'
#> layers: 'counts', 'csc_counts', 'dense_X', 'dense_counts'
#> obsp: 'connectivities', 'distances'
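Before converting, it can help to look at what the AnnData object holds. The following is a minimal sketch that re-reads the same example file and accesses a few slots. The column names are taken from the printed summary above, and the choice of columns is only illustrative.

```r
library(anndataR)

# Re-read the example file shipped with the package.
h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_file)

# Main matrix: observations (cells) in rows, variables (genes) in columns.
dim(adata$X)

# Per-cell and per-gene annotations are ordinary data frames.
head(adata$obs[, c("total_counts", "leiden")])
head(adata$var[, c("mean_counts", "highly_variable")])

# Layers and embeddings are named lists of matrices.
names(adata$layers)
dim(adata$obsm[["X_umap"]])
```

Note the orientation: AnnData stores cells in rows, whereas SingleCellExperiment stores them in columns.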
Convert to a SingleCellExperiment object:
sce_obj <- adata$to_SingleCellExperiment()
sce_obj
Note that the AnnData and SingleCellExperiment data structures have no one-to-one mapping, so some information might be lost during conversion. Inspect the converted object carefully to confirm that everything you need has been transferred.
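One way to carry out such an inspection is to list the names stored in each slot of the converted object. The helper below, sce_slot_names(), is hypothetical and not part of {anndataR}. It uses only standard SingleCellExperiment accessors and is shown on a small stand-in object.

```r
library(SingleCellExperiment)

# Hypothetical helper (not part of {anndataR}): list the names stored in
# each major slot of a SingleCellExperiment.
sce_slot_names <- function(sce) {
  list(
    assays = assayNames(sce),
    colData = colnames(colData(sce)),
    rowData = colnames(rowData(sce)),
    reducedDims = reducedDimNames(sce),
    metadata = names(metadata(sce))
  )
}

# Small stand-in object so the helper can be tried without an .h5ad file.
toy_sce <- SingleCellExperiment(
  assays = list(counts = matrix(0L, nrow = 3, ncol = 2)),
  colData = DataFrame(cluster = c("a", "b"))
)
str(sce_slot_names(toy_sce))
```

Applied to the converted object, the result can be compared by eye against names(adata$layers), colnames(adata$obs), colnames(adata$var), names(adata$obsm) and names(adata$uns) to spot anything that did not carry over.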
See ?to_SingleCellExperiment for more details on how to customize the conversion process. For instance:
adata$to_SingleCellExperiment(
  # ...
  # TO DO: add this when scverse/anndataR#212 is merged
)
Convert a SingleCellExperiment Object to an AnnData Object
Here’s an example demonstrating how to create a SingleCellExperiment object from scratch, then convert it to AnnData and save it as .h5ad:
counts <- matrix(rbinom(20000, 1000, .001), nrow = 100)
sce_obj <- SingleCellExperiment(list(counts = counts))
sce_obj
#> class: SingleCellExperiment
#> dim: 100 200
#> metadata(0):
#> assays(1): counts
#> rownames: NULL
#> rowData names(0):
#> colnames: NULL
#> colData names(0):
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
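The object above has no row or column names and no metadata, so there is little for the conversion to carry over. A slightly richer object can be built as sketched below. The annotation names cluster, gene_symbol and embedding are made up for illustration, as are the simulated values.

```r
library(SingleCellExperiment)

set.seed(1) # make the simulated values reproducible

n_genes <- 100
n_cells <- 200
counts <- matrix(
  rbinom(n_genes * n_cells, 1000, .001),
  nrow = n_genes,
  dimnames = list(
    paste0("gene", seq_len(n_genes)),
    paste0("cell", seq_len(n_cells))
  )
)

sce_annotated <- SingleCellExperiment(
  assays = list(counts = counts),
  # Hypothetical per-cell and per-gene annotations
  colData = DataFrame(cluster = sample(c("A", "B"), n_cells, replace = TRUE)),
  rowData = DataFrame(gene_symbol = rownames(counts)),
  # Hypothetical 2D embedding with one row per cell
  reducedDims = list(embedding = matrix(rnorm(n_cells * 2), ncol = 2))
)
sce_annotated
```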
You can convert the SingleCellExperiment object to an AnnData object using the from_SingleCellExperiment function:
adata <- from_SingleCellExperiment(sce_obj)
adata
#> AnnData object with n_obs × n_vars = 200 × 100
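To save the result as .h5ad, the AnnData object can be written to disk. The following is a minimal sketch, assuming the function names of the {anndataR} version used in this vignette. It rebuilds a small object with explicit gene and cell names so that it runs on its own, writes it to a temporary path, and reads it back.

```r
library(anndataR)
library(SingleCellExperiment)

# Small named object so the example runs on its own.
counts <- matrix(
  rbinom(20000, 1000, .001),
  nrow = 100,
  dimnames = list(paste0("gene", 1:100), paste0("cell", 1:200))
)
sce_obj <- SingleCellExperiment(list(counts = counts))
adata <- from_SingleCellExperiment(sce_obj)

# Write to a temporary file (the path is arbitrary).
h5ad_path <- tempfile(fileext = ".h5ad")
write_h5ad(adata, h5ad_path)

# Read the file back to confirm the round trip.
adata_roundtrip <- read_h5ad(h5ad_path)
adata_roundtrip
```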
Again, note that the AnnData and SingleCellExperiment data structures have no one-to-one mapping, so some information might be lost during conversion. Inspect the converted object carefully to confirm that everything you need has been transferred.
See ?from_SingleCellExperiment for more details on how to customize the conversion process. Example:
from_SingleCellExperiment(
  sce_obj,
  # ...
  # TO DO: add this when scverse/anndataR#212 is merged
)
#> AnnData object with n_obs × n_vars = 200 × 100
Session info
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] SingleCellExperiment_1.28.1 SummarizedExperiment_1.36.0
#> [3] Biobase_2.66.0 GenomicRanges_1.58.0
#> [5] GenomeInfoDb_1.42.1 IRanges_2.40.1
#> [7] S4Vectors_0.44.0 BiocGenerics_0.52.0
#> [9] MatrixGenerics_1.18.1 matrixStats_1.5.0
#> [11] anndataR_0.99.0 BiocStyle_2.34.0
#>
#> loaded via a namespace (and not attached):
#> [1] bit_4.5.0.1 Matrix_1.7-1 jsonlite_1.8.9
#> [4] crayon_1.5.3 compiler_4.4.2 BiocManager_1.30.25
#> [7] jquerylib_0.1.4 systemfonts_1.1.0 textshaping_0.4.1
#> [10] yaml_2.3.10 fastmap_1.2.0 lattice_0.22-6
#> [13] XVector_0.46.0 R6_2.5.1 S4Arrays_1.6.0
#> [16] knitr_1.49 htmlwidgets_1.6.4 DelayedArray_0.32.0
#> [19] bookdown_0.42 desc_1.4.3 GenomeInfoDbData_1.2.13
#> [22] bslib_0.8.0 rlang_1.1.4 cachem_1.1.0
#> [25] hdf5r_1.3.11 xfun_0.50 fs_1.6.5
#> [28] sass_0.4.9 bit64_4.5.2 SparseArray_1.6.0
#> [31] cli_3.6.3 pkgdown_2.1.1 magrittr_2.0.3
#> [34] zlibbioc_1.52.0 digest_0.6.37 grid_4.4.2
#> [37] lifecycle_1.0.4 vctrs_0.6.5 evaluate_1.0.3
#> [40] ragg_1.3.3 abind_1.4-8 rmarkdown_2.29
#> [43] purrr_1.0.2 httr_1.4.7 tools_4.4.2
#> [46] htmltools_0.5.8.1 UCSC.utils_1.2.0