Skip to contents

This document lists known issues with the package and suggests possible solutions.

Issue: Integers are being converted to floats.

  • Affected backend: HDF5AnnData
  • Affected slot(s): X, layers, obsp, varp, obsm, varm
  • Affected dtype(s): integer_csparse, integer_rsparse, integer_matrix
  • Probable cause: read
  • To investigate: TRUE
  • To fix: TRUE

Error message

Failure (test-roundtrip-obspvarp.R:111:5): Writing an AnnData with obsp and varp 'integer_csparse' works
a$dtype (`actual`) not equal to b$dtype (`expected`).

`class(actual)`:   "numpy.dtypes.Float64DType" "numpy.dtype" "python.builtin.object"
`class(expected)`: "numpy.dtypes.Int64DType"   "numpy.dtype" "python.builtin.object"

Proposed solution

Debug and fix

Issue: Python nd.arrays have a dimension while R vectors do not.

  • Affected backend: HDF5AnnData
  • Affected slot(s): obsm, varm
  • Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$varm[[name]] (`actual`) not equal to py_to_r(py_get_item(adata_py$varm, name)) (`expected`).

`dim(actual)` is absent
`dim(expected)` is an integer vector (20)

Proposed solution

Debug and fix

Issue: R vectors don’t have a dimension.

  • Affected backend: HDF5AnnData
  • Affected slot(s): obsm, varm
  • Affected dtype(s): boolean_array, categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values, dense_array, integer_array, nullable_boolean_array, nullable_integer_array, string_array
  • Probable cause: write
  • To investigate: TRUE
  • To fix: TRUE

Error message

Error in `if (found_dim != expected_dim) {
    stop("dim(", label, ")[", i, "] should have shape: ", expected_dim, 
        ", found: ", found_dim, ".")
}`: argument is of length zero

Proposed solution

The input checking function for obsm and varm should allow the object to be a vector of the correct length instead of only a matrix or a data frame.

Issue: None’s are being dropped from uns

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): empty, none
  • Probable cause: read
  • To investigate: TRUE
  • To fix: TRUE

Error message

Error: names(adata_r$uns) (`actual`) not equal to reticulate::py_to_r(adata_py$uns) (`expected`).

`actual` is NULL
`expected` is a list

Proposed solution

Debug and fix

Issue: The python object has a dimension while the R object does not.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): boolean_array, dense_array, integer_array, string_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$uns[[name]] (`actual`) not equal to reticulate::py_to_r(adata_py$uns[[name]]) (`expected`).

`dim(actual)` is absent
`dim(expected)` is an integer vector (10)

Proposed solution

Think about whether this is a problem or not. If it isn’t, fix the unit test.

Issue: Python object is not being converted correctly.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): categorical, categorical_missing_values, categorical_ordered, categorical_ordered_missing_values
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

<python.builtin.AttributeError/python.builtin.Exception/python.builtin.BaseException/python.builtin.object/error/condition>
Error in `py_get_attr(x, name)`: AttributeError: 'Categorical' object has no attribute 'get_values'. Did you mean: 'sort_values'?
Run `reticulate::py_last_error()` for details.
Backtrace:
    ▆
  1. ├─testthat::expect_equal(adata_r$uns[[name]], reticulate::py_to_r(adata_py$uns[[name]])) at test-roundtrip-uns.R:80:5
  2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
  3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
  4. ├─reticulate::py_to_r(adata_py$uns[[name]])
  5. └─reticulate:::py_to_r.pandas.core.arrays.categorical.Categorical(adata_py$uns[[name]])
  6.   ├─reticulate::py_to_r(x$get_values())
  7.   │ ├─reticulate::is_py_object(x <- py_to_r_cpp(x))
  8.   │ └─reticulate:::py_to_r_cpp(x)
  9.   ├─x$get_values
10.   └─reticulate:::`$.python.builtin.object`(x, "get_values")
11.     └─reticulate:::py_get_attr_or_item(x, name, TRUE)
12.       └─reticulate::py_get_attr(x, name)

Proposed solution

Debug and fix

Issue: Python object is not being converted correctly.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): nullable_boolean_array, nullable_integer_array
  • Probable cause: reticulate
  • To investigate: TRUE
  • To fix: TRUE

Error message

adata_r$uns[[name]] (`actual`) not equal to reticulate::py_to_r(adata_py$uns[[name]]) (`expected`).

`actual` is a logical vector (NA, FALSE, TRUE, FALSE, TRUE, ...)
`expected` is an S3 object of class <pandas.core.arrays.boolean.BooleanArray/pandas.core.arrays.masked.BaseMaskedArray/pandas.core.arraylike.OpsMixin/pandas.core.arrays.base.ExtensionArray/python.builtin.object>, an environment

Proposed solution

Debug and fix

Issue: The data type is different after the roundtrip test.

  • Affected backend: HDF5AnnData
  • Affected slot(s): uns, uns_nested
  • Affected dtype(s): boolean, char, float, integer, nan, string
  • Probable cause: write
  • To investigate: TRUE
  • To fix: TRUE

Error message

bi$type(a) (`actual`) not equal to bi$type(b) (`expected`).

`attr(actual, 'py_object')$pyobj` is <pointer: 0x7f4af9694d00>
`attr(expected, 'py_object')$pyobj` is <pointer: 0x7f4af9f5eca0>
Backtrace:
    ▆
1. └─anndataR:::expect_equal_py(...) at test-roundtrip-uns.R:109:5
2.   └─testthat::expect_equal(bi$type(a), bi$type(b)) at tests/testthat/helper-expect_equal_py.R:7:3

Proposed solution

Debug and fix

Issue: hdf5py writes the attribute as a H5T_STD_I64LE, hdf5r writes it as H5T_STD_I32LE.

  • Affected backend: HDF5AnnData
  • Affected slot(s): X
  • Affected dtype(s): float_csparse, float_csparse_nas
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: TRUE

Error message

Warning: different storage datatype
<shape> has file datatype H5T_STD_I64LE
<shape> has file datatype H5T_STD_I32LE
attribute: <shape of </X>> and <shape of </X>>  

Proposed solution

We should investigate if we can specify the type with which an attribute should be written.

Issue: hdf5py has max dimensions as 2^64 - 1, the max val for an unsigned int. hdf5r has it as the actual value

  • Affected backend: HDF5AnnData
  • Affected slot(s): X, obsm, varm, obsp, varp
  • Affected dtype(s): float_csparse, float_csparse_nas, float_rsparse, float_rsparse_nas
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: FALSE

Error message

dataset: </X/data> and </X/data>
Not comparable: </X/data> has rank 1, dimensions [200], max dimensions [18446744073709551615]
and </X/data> has rank 1, dimensions [108], max dimensions [108]
0 differences found
dataset: </X/indices> and </X/indices>
Not comparable: </X/indices> has rank 1, dimensions [200], max dimensions [18446744073709551615]
and </X/indices> has rank 1, dimensions [108], max dimensions [108]
0 differences found
dataset: </X/indptr> and </X/indptr>
Warning: different maximum dimensions
</X/indptr> has max dimensions [18446744073709551615]
</X/indptr> has max dimensions [21]

Proposed solution

We should investigate if something goes wrong with h5py, but I think hdf5 provides the expected behaviour.

Issue: hdf5py writes a nullable integer array with type H5T_SGN_2, hdf5r writes with type H5T_SGN_NONE

  • Affected backend: HDF5AnnData
  • Affected slot(s): obs, var
  • Affected dtype(s): integer_with_nas
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: TRUE

Error message

dataset: </var/nullable_integer_array/mask> and </var/integer_with_nas/mask>
Warning: different storage datatype
Not comparable: </var/nullable_integer_array/mask> has sign H5T_SGN_2 and </var/integer_with_nas/mask> has sign H5T_SGN_NONE
0 differences found

Proposed solution

We should investigate if we can specify the type with which an attribute should be written.

Issue: hdf5py writes a nullable integer array with type H5T_STD_I64LE, hdf5r writes with type H5T_STD_I32LE

  • Affected backend: HDF5AnnData
  • Affected slot(s): obs, var
  • Affected dtype(s): nullable_integer_array
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: TRUE

Error message

dataset: </var/nullable_integer_array/values> and </var/integer_with_nas/values>
Warning: different storage datatype
</var/nullable_integer_array/values> has file datatype H5T_STD_I64LE
</var/integer_with_nas/values> has file datatype H5T_STD_I32LE
size:           [20]           [20]
position        values          values          difference          
------------------------------------------------------------
[ 0 ]          0               1               1              
1 differences found

Proposed solution

We should investigate if we can specify the type with which an attribute should be written.

Issue: On position 0, hdf5py writes a 0 in the values array, hdf5r writes a 1.

  • Affected backend: HDF5AnnData
  • Affected slot(s): obs, var
  • Affected dtype(s): nullable_integer_array
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: TRUE

Error message

dataset: </var/nullable_integer_array/values> and </var/integer_with_nas/values>
Warning: different storage datatype
</var/nullable_integer_array/values> has file datatype H5T_STD_I64LE
</var/integer_with_nas/values> has file datatype H5T_STD_I32LE
size:           [20]           [20]
position        values          values          difference          
------------------------------------------------------------
[ 0 ]          0               1               1              
1 differences found

Proposed solution

We should investigate why this difference happens.

  • Affected backend: HDF5AnnData
  • Affected slot(s): X, obsm, varm, layers, obsp, varp
  • Affected dtype(s): numeric_dense, numeric_dense_with_nas, integer_dense
  • Probable cause: h5diff
  • To investigate: TRUE
  • To fix: TRUE

Error message

Error in `H5File.open(filename, mode, file_create_pl, file_access_pl)`: HDF5-API Errors:
error #000: ../../../src/H5F.c in H5Fcreate(): line 349: unable to create file
    class: HDF5
    major: File accessibility
    minor: Unable to open file
error #001: ../../../src/H5Fint.c in H5F_open(): line 1725: unable to open file
    class: HDF5
    major: File accessibility
    minor: Unable to open file
error #002: ../../../src/H5FD.c in H5FD_open(): line 722: open failed
    class: HDF5
    major: Virtual File Layer
    minor: Unable to initialize object
error #003: ../../../src/H5FDsec2.c in H5FD__sec2_open(): line 351: unable to open file: name = '/tmp/RtmpN29Fmn/anndata_r2_integer_matrixbe0976b43e39b.h5ad', errno = 17, error message = 'File exists', flags = 15, o_flags = c2
    class: HDF5
    major: File accessibility
    minor: Unable to open file

Proposed solution

Fix writing dgeMatrix objects.