anndataR is designed to offer the combined functionality of the following packages:
- theislab/zellkonverter: Convert AnnData files to/from SingleCellExperiment objects.
- mtmorgan/h5ad: Read/write *.h5ad files natively using rhdf5.
- dynverse/anndata: An R implementation of the AnnData data structures that uses reticulate to read/write *.h5ad files.
Ideally, this package will be a complete replacement for all of these packages, and will be the go-to package for working with AnnData files in R.
Desired feature list
- Provide an R6 class to work with AnnData objects in R (either in-memory or on-disk).
- Read/write *.h5ad files natively.
- Convert to/from SingleCellExperiment objects.
- Convert to/from Seurat objects.
Class diagram
Here is a diagram of the main R6 classes provided by the package:
Notation:
- X: Matrix - variable X is of type Matrix
- *X: Matrix - variable X is abstract
- to_SingleCellExperiment(): SingleCellExperiment - function to_SingleCellExperiment returns an object of type SingleCellExperiment
- *to_SingleCellExperiment() - function to_SingleCellExperiment is abstract
Feature tracking
The following tables show the status of the implementation of each feature in the package:
AnnData
Slot | Getter | Getter test | Setter | Setter test |
---|---|---|---|---|
layers | NA | NA | NA | ✓ |
obsm | NA | NA | NA | ✓ |
varm | NA | NA | NA | ✓ |
X | NA | NA | NA | ✓ |
HDF5AnnData
Slot | Getter | Getter test | Setter | Setter test |
---|---|---|---|---|
layers | ✓ | ✓ | ✓ | ✓ |
obs | ✓ | ✓ | ✓ | ✓ |
obs_names | ✓ | ✓ | ✓ | ✓ |
obsm | ✓ | ✓ | | |
obsp | ✓ | ✓ | | |
uns | ✓ | ✓ | | |
var | ✓ | ✓ | ✓ | ✓ |
var_names | ✓ | ✓ | ✓ | ✓ |
varm | ✓ | ✓ | | |
varp | ✓ | ✓ | | |
X | ✓ | ✓ | ✓ | ✓ |
InMemoryAnnData
Slot | Getter | Getter test | Setter | Setter test |
---|---|---|---|---|
layers | ✓ | ✓ | ✓ | ✓ |
obs | ✓ | ✓ | ✓ | ✓ |
obs_names | ✓ | ✓ | ✓ | ✓ |
obsm | ✓ | ✓ | | |
obsp | ✓ | ✓ | | |
uns | ✓ | ✓ | | |
var | ✓ | ✓ | ✓ | ✓ |
var_names | ✓ | ✓ | ✓ | ✓ |
varm | ✓ | ✓ | | |
varp | ✓ | ✓ | | |
X | ✓ | ✓ | ✓ | ✓ |
Known issues
Issue: Integers are being converted to floats.
- Affected backend: HDF5AnnData
- Affected slot(s): X, layers, obsp, varp, obsm, varm
- Affected dtype(s): integer_csparse, integer_rsparse, integer_matrix
- Probable cause: read
- To investigate: TRUE
- To fix: TRUE
Error message
Failure (test-roundtrip-obspvarp.R:111:5): Writing an AnnData with obsp and varp 'integer_csparse' works
a$dtype (`actual`) not equal to b$dtype (`expected`).
`class(actual)`: "numpy.dtypes.Float64DType" "numpy.dtype" "python.builtin.object"
`class(expected)`: "numpy.dtypes.Int64DType" "numpy.dtype" "python.builtin.object"
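The failure above reports a Float64 dtype where Int64 was expected. As a rough illustration of how integer data can silently become floating point on the R side, here is a base R sketch. The helper `restore_integer()` is hypothetical and not part of anndataR; it only shows one way whole-number doubles could be converted back, and it is not claimed to be the fix for this issue.

```r
# Hypothetical helper (not part of anndataR): convert a double vector or
# matrix back to integer storage when every value is a whole number that
# fits into R's 32-bit integer range. Attributes such as dim are kept.
restore_integer <- function(x) {
  fits <- is.na(x) |
    (is.finite(x) & x == round(x) & abs(x) <= .Machine$integer.max)
  if (is.double(x) && all(fits)) {
    storage.mode(x) <- "integer"
  }
  x
}

m_int <- matrix(1:6, nrow = 2)  # integer storage
m_dbl <- m_int * 1              # arithmetic with a double promotes to double

typeof(m_int)  # "integer"
typeof(m_dbl)  # "double"

m_back <- restore_integer(m_dbl)
typeof(m_back)  # "integer"
```

Note that a blanket conversion like this would also turn genuinely floating-point data that happens to hold whole numbers into integers, so a real fix would more likely rely on the dtype stored in the file.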
Issue: Python ndarrays have a dimension while R vectors do not.
- Affected backend: HDF5AnnData
- Affected slot(s): obsm, varm
- Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
- Probable cause: reticulate
- To investigate: TRUE
- To fix: TRUE
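The mismatch described in this issue can be reproduced in base R without any Python involved: a plain vector has no dim attribute, whereas a one-dimensional array does, so the two are not identical even when their values agree. The helper `as_1d_array()` below is hypothetical and only sketches how a vector could be given a dimension before comparison.

```r
v <- c(1L, 2L, 3L)  # plain integer vector
a <- array(v)       # one-dimensional array with the same values

dim(v)           # NULL
dim(a)           # 3
identical(v, a)  # FALSE: the dim attribute differs

# Hypothetical helper (not part of anndataR): attach a one-dimensional
# dim attribute to objects that do not have one yet.
as_1d_array <- function(x) {
  if (is.null(dim(x))) {
    dim(x) <- length(x)
  }
  x
}

identical(as_1d_array(v), a)  # TRUE
```

Whether the comparison should add a dimension on the R side or drop it on the converted Python side is a design choice this sketch does not settle.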
Issue: R vectors don’t have a dimension.
- Affected backend: HDF5AnnData
- Affected slot(s): obsm, varm
- Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
- Probable cause: write
- To investigate: TRUE
- To fix: TRUE
Issue: None values are being dropped from uns.
- Affected backend: HDF5AnnData
- Affected slot(s): uns, uns_nested
- Affected dtype(s): empty, none
- Probable cause: read
- To investigate: TRUE
- To fix: TRUE
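One R behaviour that can produce this kind of symptom, though it is not confirmed as the cause here, is that assigning NULL to a list element with `$<-` or `[[<-` removes (or never creates) the element. The base R sketch below shows the difference; the list name `uns` is used purely for illustration.

```r
uns <- list(a = 1)

# Assigning NULL with `$<-` does not create the element.
uns$b <- NULL
names(uns)  # "a"

# Assigning a list containing NULL with `[<-` keeps the entry.
uns["b"] <- list(NULL)
names(uns)   # "a" "b"
length(uns)  # 2

# Assigning NULL with `[[<-` deletes an existing element.
uns[["a"]] <- NULL
names(uns)  # "b"
```

Code that builds a list element by element while reading would therefore need the `uns["b"] <- list(NULL)` form to preserve entries that map to None.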
Issue: The Python object has a dimension while the R object does not.
- Affected backend: HDF5AnnData
- Affected slot(s): uns, uns_nested
- Affected dtype(s): boolean_array, dense_array, integer_array, string_array
- Probable cause: reticulate
- To investigate: TRUE
- To fix: TRUE
Issue: Python object is not being converted correctly.
- Affected backend: HDF5AnnData
- Affected slot(s): uns, uns_nested
- Affected dtype(s): categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values
- Probable cause: reticulate
- To investigate: TRUE
- To fix: TRUE
Error message
<python.builtin.AttributeError/python.builtin.Exception/python.builtin.BaseException/python.builtin.object/error/condition>
Error in `py_get_attr(x, name)`: AttributeError: 'Categorical' object has no attribute 'get_values'. Did you mean: 'sort_values'?
Run `reticulate::py_last_error()` for details.
Backtrace:
▆
1. ├─testthat::expect_equal(adata_r$uns[[name]], reticulate::py_to_r(adata_py$uns[[name]])) at test-roundtrip-uns.R:80:5
2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
3. │ └─rlang::eval_bare(expr, quo_get_env(quo))
4. ├─reticulate::py_to_r(adata_py$uns[[name]])
5. └─reticulate:::py_to_r.pandas.core.arrays.categorical.Categorical(adata_py$uns[[name]])
6. ├─reticulate::py_to_r(x$get_values())
7. │ ├─reticulate::is_py_object(x <- py_to_r_cpp(x))
8. │ └─reticulate:::py_to_r_cpp(x)
9. ├─x$get_values
10. └─reticulate:::`$.python.builtin.object`(x, "get_values")
11. └─reticulate:::py_get_attr_or_item(x, name, TRUE)
12. └─reticulate::py_get_attr(x, name)
Issue: Python object is not being converted correctly.
- Affected backend: HDF5AnnData
- Affected slot(s): uns, uns_nested
- Affected dtype(s): nullable_boolean_array, nullable_integer_array
- Probable cause: reticulate
- To investigate: TRUE
- To fix: TRUE
Error message
adata_r$uns[[name]] (`actual`) not equal to reticulate::py_to_r(adata_py$uns[[name]]) (`expected`).
`actual` is a logical vector (NA, FALSE, TRUE, FALSE, TRUE, ...)
`expected` is an S3 object of class <pandas.core.arrays.boolean.BooleanArray/pandas.core.arrays.masked.BaseMaskedArray/pandas.core.arraylike.OpsMixin/pandas.core.arrays.base.ExtensionArray/python.builtin.object>, an environment
Issue: The data type is different after the roundtrip test.
- Affected backend: HDF5AnnData
- Affected slot(s): uns, uns_nested
- Affected dtype(s): boolean, char, float, integer, nan, string
- Probable cause: write
- To investigate: TRUE
- To fix: TRUE
Error message
bi$type(a) (`actual`) not equal to bi$type(b) (`expected`).
`attr(actual, 'py_object')$pyobj` is <pointer: 0x7f4af9694d00>
`attr(expected, 'py_object')$pyobj` is <pointer: 0x7f4af9f5eca0>
Backtrace:
▆
1. └─anndataR:::expect_equal_py(...) at test-roundtrip-uns.R:109:5
2. └─testthat::expect_equal(bi$type(a), bi$type(b)) at tests/testthat/helper-expect_equal_py.R:7:3