
anndataR is designed to offer the combined functionality of the following packages:

  • theislab/zellkonverter: Convert AnnData files to/from SingleCellExperiment objects.
  • mtmorgan/h5ad: Read/write *.h5ad files natively using rhdf5.
  • dynverse/anndata: An R implementation of the AnnData data structures that uses reticulate to read/write *.h5ad files.

Ideally, this package will completely replace all three of these packages and become the go-to package for working with AnnData files in R.

Desired feature list

  • Provide an R6 class to work with AnnData objects in R (either in-memory or on-disk).
  • Read/write *.h5ad files natively.
  • Convert to/from SingleCellExperiment objects.
  • Convert to/from Seurat objects.

Class diagram

Here is a diagram of the main R6 classes provided by the package:

Notation:

  • X: Matrix - variable X is of type Matrix
  • *X: Matrix - variable X is abstract
  • to_SingleCellExperiment(): SingleCellExperiment - function to_SingleCellExperiment returns object of type SingleCellExperiment
  • *to_SingleCellExperiment() - function to_SingleCellExperiment is abstract
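
As a rough illustration of this notation, an abstract field and an abstract function could be expressed with R6 along the following lines. This is only a sketch and assumes the R6 package is installed. The class names AbstractBox and InMemoryBox and the describe() function are hypothetical and are not part of anndataR.

```r
library(R6)

# Hypothetical abstract base class (illustrative names only).
AbstractBox <- R6Class(
  "AbstractBox",
  active = list(
    # Abstract field (*X): a subclass must override this binding.
    X = function(value) {
      stop("X is abstract and must be implemented by a subclass")
    }
  ),
  public = list(
    # Abstract function (*describe()): a subclass must override it.
    describe = function() {
      stop("describe() is abstract and must be implemented by a subclass")
    }
  )
)

# Hypothetical concrete subclass that stores X in memory.
InMemoryBox <- R6Class(
  "InMemoryBox",
  inherit = AbstractBox,
  private = list(.X = NULL),
  active = list(
    # Concrete field (X: Matrix): acts as getter and setter.
    X = function(value) {
      if (missing(value)) {
        private$.X
      } else {
        private$.X <- value
      }
    }
  ),
  public = list(
    initialize = function(X) {
      private$.X <- X
    },
    # Concrete function returning a character description.
    describe = function() {
      paste0("matrix with ", nrow(private$.X), " rows")
    }
  )
)

box <- InMemoryBox$new(matrix(1:6, nrow = 2))
box$describe()
```

Accessing X or calling describe() on the base class raises an error, which is how the sketch models "abstract".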

Feature tracking

The following tables show the status of the implementation of each feature in the package:

AnnData

Slot Getter Getter test Setter Setter test
layers NA NA NA
obsm NA NA NA
varm NA NA NA
X NA NA NA

HDF5AnnData

Slot Getter Getter test Setter Setter test
layers
obs
obs_names
obsm
obsp
uns
var
var_names
varm
varp
X

InMemoryAnnData

Slot Getter Getter test Setter Setter test
layers
obs
obs_names
obsm
obsp
uns
var
var_names
varm
varp
X

Seurat

Slot Getter Getter test Setter Setter test
layers
obs
obs_names
obsm ~ ~
obsp ~ ~
uns ~
var
var_names
varm ~ ~
varp ~
X

SingleCellExperiment

Slot Getter Getter test Setter Setter test
layers
obs
obs_names
obsm
obsp
uns
var
var_names
varm
varp
X

Known issues

Issue: Integers are being converted to floats.

  • Affected backend: HDF5AnnData
  • Affected slot(s): X, layers, obsp, varp, obsm, varm
  • Affected dtype(s): integer_csparse, integer_rsparse, integer_matrix
  • Probable cause: read
  • To investigate: TRUE
  • To fix: TRUE

Error message

Failure (test-roundtrip-obspvarp.R:111:5): Writing an AnnData with obsp and varp 'integer_csparse' works
a$dtype (`actual`) not equal to b$dtype (`expected`).

`class(actual)`:   "numpy.dtypes.Float64DType" "numpy.dtype" "python.builtin.object"
`class(expected)`: "numpy.dtypes.Int64DType"   "numpy.dtype" "python.builtin.object"

Proposed solution

Debug and fix
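
While debugging, it can help to check the storage type on the R side, because R distinguishes integer from double storage even when the values are equal. The sketch below uses base R only. The helper is_integer_valued() is hypothetical and not part of anndataR, and restoring integer storage this way is an assumption about one possible remedy, not the package's fix. Note also that R integers are 32-bit, so not every Int64 value can be represented.

```r
# Hypothetical helper (not part of anndataR): TRUE when x holds only
# whole numbers that fit into R's 32-bit integer type.
is_integer_valued <- function(x) {
  is.numeric(x) &&
    all(is.finite(x)) &&
    all(x == round(x)) &&
    all(abs(x) <= .Machine$integer.max)
}

m_int <- matrix(1:6, nrow = 2)  # integer storage
m_dbl <- m_int * 1              # same values, double storage

typeof(m_int)  # "integer"
typeof(m_dbl)  # "double"

# The values are equal, but the objects are not identical
# because the storage types differ.
all(m_int == m_dbl)
identical(m_int, m_dbl)

# Restore integer storage when that is lossless; dim is preserved.
m_fixed <- m_dbl
if (is_integer_valued(m_fixed)) {
  storage.mode(m_fixed) <- "integer"
}
identical(m_int, m_fixed)
```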

Issue: Python ndarrays have a dimension while R vectors do not.

  • Affected backend: HDF5AnnData
  • Affected slot(s): obsm, varm
  • Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$varm[[name]] (`actual`) not equal to py_to_r(py_get_item(adata_py$varm, name)) (`expected`).

`dim(actual)` is absent
`dim(expected)` is an integer vector (20)

Proposed solution

Debug and fix

Issue: R vectors don’t have a dimension.

  • Affected backend: HDF5AnnData
  • Affected slot(s): obsm, varm
  • Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
  • Probable cause: write
  • To investigate: TRUE
  • To fix: TRUE

Error message

Error in `if (found_dim != expected_dim) {
    stop("dim(", label, ")[", i, "] should have shape: ", expected_dim, 
        ", found: ", found_dim, ".")
}`: argument is of length zero

Proposed solution

The input checking function for obsm and varm should allow the object to be a vector of the correct length instead of only a matrix or a data frame.
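
A minimal sketch of such a relaxed check, using base R only, is shown here. The function check_first_dim() and its arguments are hypothetical. It is loosely modelled on the error message above and is not the package's actual implementation.

```r
# Hypothetical check (not the anndataR implementation): accept a
# matrix, a data frame, or a plain vector of the expected length.
check_first_dim <- function(x, expected_dim, label) {
  # dim() is NULL for a plain vector, so fall back to length().
  found_dim <- if (is.null(dim(x))) length(x) else dim(x)[1]
  if (found_dim != expected_dim) {
    stop("dim(", label, ")[1] should have shape: ", expected_dim,
         ", found: ", found_dim, ".")
  }
  invisible(x)
}

# Why the original comparison fails for a vector: indexing NULL
# gives NULL, and `NULL != 10` is a zero-length logical, which
# `if` cannot branch on ("argument is of length zero").
found_dim <- dim(1:10)[1]
is.null(found_dim)
length(found_dim != 10)

# Vectors, matrices and data frames of the right size all pass.
check_first_dim(1:10, 10, "obsm$vec")
check_first_dim(matrix(0, nrow = 10, ncol = 2), 10, "obsm$mat")
check_first_dim(data.frame(a = 1:10), 10, "obsm$df")
```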

Issue: None values are being dropped from uns.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): empty, none
  • Probable cause: read
  • To investigate: TRUE
  • To fix: TRUE

Error message

Error: names(adata_r$uns) (`actual`) not equal to reticulate::py_to_r(adata_py$uns) (`expected`).

`actual` is NULL
`expected` is a list

Proposed solution

Debug and fix
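
One R behaviour worth ruling out while debugging is that assigning NULL to a list element with `[[<-` removes (or never creates) that element, whereas assigning `list(NULL)` with `[<-` keeps it. Whether this is what happens in the reader has not been confirmed here. The base R sketch below only illustrates the language behaviour, and the variable names are illustrative.

```r
# Assigning NULL with [[<- does not create the element.
uns_dropped <- list(a = 1)
uns_dropped[["none"]] <- NULL
names(uns_dropped)  # "a"

# Assigning list(NULL) with [<- keeps a NULL-valued element.
uns_kept <- list(a = 1)
uns_kept["none"] <- list(NULL)
names(uns_kept)  # "a" "none"
```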

Issue: The Python object has a dimension while the R object does not.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): boolean_array, dense_array, integer_array, string_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$uns[[name]] (`actual`) not equal to reticulate::py_to_r(adata_py$uns[[name]]) (`expected`).

`dim(actual)` is absent
`dim(expected)` is an integer vector (10)

Proposed solution

Think about whether this is a problem or not. If it isn’t, fix the unit test.
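
If the difference turns out to be harmless, the test could normalise one-dimensional arrays before comparing. The sketch below uses base R only. drop_1d_dim() is a hypothetical helper, and the array built with array() merely stands in for the converted Python value, which the error message above shows as carrying a dim attribute of length one.

```r
# Hypothetical helper: remove the dim attribute from 1-D arrays
# only; matrices and higher-dimensional arrays are left untouched.
drop_1d_dim <- function(x) {
  if (is.array(x) && length(dim(x)) == 1L) {
    dim(x) <- NULL
  }
  x
}

r_vec   <- 1:10                    # plain R vector, no dim
py_like <- array(1:10, dim = 10)   # stand-in for the converted Python value

is.null(dim(r_vec))
dim(py_like)

# Same values, but the dim attribute makes the objects differ.
identical(r_vec, py_like)

# After normalising, the comparison succeeds.
identical(r_vec, drop_1d_dim(py_like))
```

A test could then compare `drop_1d_dim(actual)` with `drop_1d_dim(expected)` instead of the raw values.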

Issue: Python object is not being converted correctly.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

<python.builtin.AttributeError/python.builtin.Exception/python.builtin.BaseException/python.builtin.object/error/condition>
Error in `py_get_attr(x, name)`: AttributeError: 'Categorical' object has no attribute 'get_values'. Did you mean: 'sort_values'?
Run `reticulate::py_last_error()` for details.
Backtrace:
    ▆
  1. ├─testthat::expect_equal(adata_r$uns[[name]], reticulate::py_to_r(adata_py$uns[[name]])) at test-roundtrip-uns.R:80:5
  2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
  3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
  4. ├─reticulate::py_to_r(adata_py$uns[[name]])
  5. └─reticulate:::py_to_r.pandas.core.arrays.categorical.Categorical(adata_py$uns[[name]])
  6.   ├─reticulate::py_to_r(x$get_values())
  7.   │ ├─reticulate::is_py_object(x <- py_to_r_cpp(x))
  8.   │ └─reticulate:::py_to_r_cpp(x)
  9.   ├─x$get_values
10.   └─reticulate:::`$.python.builtin.object`(x, "get_values")
11.     └─reticulate:::py_get_attr_or_item(x, name, TRUE)
12.       └─reticulate::py_get_attr(x, name)

Proposed solution

Debug and fix

Issue: Python object is not being converted correctly.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): nullable_boolean_array, nullable_integer_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$uns[[name]] (`actual`) not equal to reticulate::py_to_r(adata_py$uns[[name]]) (`expected`).

`actual` is a logical vector (NA, FALSE, TRUE, FALSE, TRUE, ...)
`expected` is an S3 object of class <pandas.core.arrays.boolean.BooleanArray/pandas.core.arrays.masked.BaseMaskedArray/pandas.core.arraylike.OpsMixin/pandas.core.arrays.base.ExtensionArray/python.builtin.object>, an environment

Proposed solution

Debug and fix

Issue: The data type is different after the roundtrip test.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): boolean, char, float, integer, nan, string
  • Probable cause: write
  • To investigate: TRUE
  • To fix: TRUE

Error message

bi$type(a) (`actual`) not equal to bi$type(b) (`expected`).

`attr(actual, 'py_object')$pyobj` is <pointer: 0x7f4af9694d00>
`attr(expected, 'py_object')$pyobj` is <pointer: 0x7f4af9f5eca0>
Backtrace:
    ▆
1. └─anndataR:::expect_equal_py(...) at test-roundtrip-uns.R:109:5
2.   └─testthat::expect_equal(bi$type(a), bi$type(b)) at tests/testthat/helper-expect_equal_py.R:7:3

Proposed solution

Debug and fix