-
Notifications
You must be signed in to change notification settings - Fork 94
Description
Hi,
while figuring out how to get the dimensions of a dataset I am trying to access STD_REF_OBJ
's in attributes (using #65 with changes from @balintbalazs). A simple example of references are generated with this script:
from h5py import File
import numpy as np
# http://docs.h5py.org/en/stable/high/dims.html
f = File('dims_1d.h5', 'w')
f['x1'] = [1, 2]
f['x1'].make_scale('x1 name')
f['data'] = np.ones((2,), 'f')
f['data'].dims[0].attach_scale(f['x1'])
which gives the following structure (using h5dump -A dims_1d.h5
):
HDF5 "dims_1d.h5" {
GROUP "/" {
DATASET "data" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 2 ) / ( 2 ) }
ATTRIBUTE "DIMENSION_LIST" {
DATATYPE H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
DATA {
(0): (DATASET 800 /x1 )
}
}
}
DATASET "x1" {
DATATYPE H5T_STD_I64LE
DATASPACE SIMPLE { ( 2 ) / ( 2 ) }
ATTRIBUTE "CLASS" {
DATATYPE H5T_STRING {
STRSIZE 16;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SCALAR
DATA {
(0): "DIMENSION_SCALE"
}
}
ATTRIBUTE "NAME" {
DATATYPE H5T_STRING {
STRSIZE 8;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SCALAR
DATA {
(0): "x1 name"
}
}
ATTRIBUTE "REFERENCE_LIST" {
DATATYPE H5T_COMPOUND {
H5T_REFERENCE { H5T_STD_REF_OBJECT } "dataset";
H5T_STD_I32LE "dimension";
}
DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
DATA {
(0): {
DATASET 1400 /data ,
0
}
}
}
}
}
}
examples of how to de-reference the reference are here: https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5-examples/browse/1_10/C/H5T/h5ex_t_objrefatt.c with hdf5-reference: https://portal.hdfgroup.org/display/HDF5/H5R_DEREFERENCE. As far as I can see this requires some changes in hdf5-rust
in order to access since I do not have direct access to the hid_t
of my attribute (in this case). I could try to implement this, but it seems a bit tricky, either way it probably requires some dereferencing to get the actual object. In my case getting the name of the object would be enough at first.
Regards, Gaute
Activity
gauteh commentedon Jun 11, 2020
Seems I can use
.id()
on objects, never noticed before!JamesMc86 commentedon Apr 2, 2024
I'm trying to understand if the description above offers a workaround. I have references in a dataset I need to access, but I think this ID getting will work on an attribute. Is that right? Is there any way to fake a reference data set right now?
If not I'm up for taking a swing at integrating this.
mulimoen commentedon Apr 3, 2024
I've taken a first shot at implementing this in a branch here [0]. Some notes follows
ObjectReference
to an object (seeStdReference
for new ref.type to object)VarLenArray
, but this accepts onlyCopy
typesFeel free to play around using this @JamesMc86
[0] https://github.com/mulimoen/hdf5-rust/tree/feature/refs2
JamesMc86 commentedon Apr 7, 2024
This looks great and will meet all my requirements in its current format. Just sent you a PR to that branch with some integration tests I used during testing.
There are a couple of notes but none of this block what I need right now:
I'm happy to work on some aspects that I can
For my use case these are all just refinements though and this looks great to me.
JamesMc86 commentedon Apr 7, 2024
Ah, I just tried the clone implementation, but it actually requires a
try_clone
method, as it can fail.JamesMc86 commentedon Apr 7, 2024
So I've been getting my head around more of this at the HDF layer. Is it right to say the ObjectReference is the pre 1.12 type and then since 1.12 we have the standard reference type which is an opaque type over object, region or attribute references.
Therefore, for backward compatibility, we could have an ObjectReference as we do now, which could then wrap the standard reference for these in the newer versions.
mulimoen commentedon Apr 7, 2024
I think the different reference types are the old type as you suggest with separate Object and Data region references, which is replaced by a more general reference type H5R_ref_t (which we call
StdReference
but that is a very poor name). I think we need to support both types since the original issue has the old reference type (H5T_STD_REF_OBJECT
).JamesMc86 commentedon Apr 8, 2024
That sounds great. Yesterday, I did a bit more on the refactoring at https://github.com/WiresmithTech/hdf5-rust/tree/feature/refs2. I hope you are happy with me plodding away on it. I've separated the two types into their own modules so we can easily gate the new types on the HDF version number.
With the references I'm thinking:
So you end up with something like:
< v1.12
What do you feel about this? The alternatives I've thought about:
mulimoen commentedon Apr 8, 2024
Great that you are working on this. There is a big feature space to explore. Some of the incompatibility can be resolved by using traits for
dereference
etc.The reference types are not compatible on disk unfortunately, and we should keep both and make them both available for backwards compatibility with e.g. NetCDF files. Even though 1.10 (and 1.12) has no new planned releases, the old API is very unlikely to be removed.
JamesMc86 commentedon Apr 8, 2024
Perfect - I'll assume we need both then and work accordingly.