ENH: Add static and dynamic dtype aliases to NIfTI images #1096

effigies · 2022-03-19T21:40:22Z

This PR adds dtype aliases that can be passed in three ways:

Nifti1Image(data, affine[, header], dtype=alias)
img.set_data_dtype(alias)
img.to_filename(fname, dtype=alias) (and likewise img.to_bytes(dtype=alias))

Aliases

`'mask'`

This is a static alias for uint8.

`'smallest'`

This requires an array to be integer data. It then selects the smallest dtype among the Analyze-compatible uint8, int16, int32 set.
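As a rough illustration of the selection rule described for `'smallest'`, here is a minimal plain-Python sketch. The helper name `smallest_int_dtype` and the range table are made up for this example and are not nibabel's implementation; dtypes are represented by their names as strings.

```python
# Inclusive value ranges of the Analyze-compatible integer types named above,
# ordered from smallest to largest storage size.
ANALYZE_INT_RANGES = [
    ("uint8", 0, 2**8 - 1),
    ("int16", -(2**15), 2**15 - 1),
    ("int32", -(2**31), 2**31 - 1),
]


def smallest_int_dtype(values):
    """Return the name of the first listed dtype that can hold all values.

    Hypothetical helper for illustration only.
    """
    values = list(values)
    if not values:
        raise ValueError("cannot choose a dtype for empty data")
    if not all(isinstance(v, int) for v in values):
        raise ValueError("'smallest' requires integer data")
    lo, hi = min(values), max(values)
    for name, type_min, type_max in ANALYZE_INT_RANGES:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError(f"values in [{lo}, {hi}] do not fit in int32")


print(smallest_int_dtype([0, 1, 1, 0]))  # uint8
print(smallest_int_dtype([-5, 300]))     # int16
print(smallest_int_dtype([0, 70000]))    # int32
```

Because the result depends on the minimum and maximum of the data, the same alias can resolve to different dtypes for different arrays.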

`'compat'`

This is currently 'smallest' for integer data and float32 for floating point data (assuming no values are out of range). I'm not entirely pleased with this. Here's another option that I'd like opinions on before going ahead and implementing:

No change for uint8, int16, int32 or float32
int8 -> uint8 if min >= 0 else int16
uint16 -> int16 if max < 32767 else int32
All other ints become int32, raising an error on out-of-range values
All floats become float32, raising an error on out-of-range values

This ~~would have~~ has the advantage that arrays with Analyze-compatible types won't need their values inspected. The arguable disadvantage is that masks that are int64s will become int32 instead of uint8, but this mode is not intended to be smart.
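The proposed rules can be sketched as a lookup on the input dtype, with value inspection only where the rules call for it. This is an illustrative plain-Python sketch, not the code in this PR: `compat_dtype`, the dtype-name sets, and the choice of `ValueError` are assumptions, and `vmin`/`vmax` are assumed to be the finite extremes of the data.

```python
FLOAT32_MAX = (2 - 2**-23) * 2.0**127  # largest finite float32 value
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

UNCHANGED = {"uint8", "int16", "int32", "float32"}
OTHER_INTS = {"uint32", "int64", "uint64"}
OTHER_FLOATS = {"float16", "float64"}


def compat_dtype(dtype, vmin=None, vmax=None):
    """Map a dtype name to an Analyze-compatible one (hypothetical helper).

    vmin/vmax are only consulted for dtypes that need value inspection.
    """
    if dtype in UNCHANGED:
        return dtype  # already compatible: values are never inspected
    if dtype == "int8":
        return "uint8" if vmin >= 0 else "int16"
    if dtype == "uint16":
        # Threshold copied as written in the proposal above.
        return "int16" if vmax < 32767 else "int32"
    if dtype in OTHER_INTS:
        if vmin < INT32_MIN or vmax > INT32_MAX:
            raise ValueError(f"{dtype} values out of range for int32")
        return "int32"
    if dtype in OTHER_FLOATS:
        if vmin < -FLOAT32_MAX or vmax > FLOAT32_MAX:
            raise ValueError(f"{dtype} values out of range for float32")
        return "float32"
    raise ValueError(f"unsupported dtype: {dtype}")


print(compat_dtype("int16"))             # int16
print(compat_dtype("int8", -3, 100))     # int16
print(compat_dtype("uint16", 0, 40000))  # int32
print(compat_dtype("int64", 0, 1))       # int32
```

The last call shows the disadvantage mentioned above: an int64 mask holding only 0 and 1 becomes int32 rather than uint8.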

Semantics

>>> img.set_data_dtype('mask')
>>> img.header.set_data_dtype('mask')
Exception ...

Because the final dtype depends on the values in the data array, this is implemented via img.get/set_data_dtype(), and not img.header.get/set_data_dtype(). This may surprise people who have used them interchangeably, but should not affect existing code that does not use aliases. If this seems problematic, we could set a flag to warn if the header methods are being used on a header associated with an image that uses aliases. Alternatively, we could monkey-patch the header to use the image methods, restoring an expectation that these are equivalent.
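To make the image/header split concrete, here is a toy model of the behavior described above. `ToyHeader` and `ToyImage` are hypothetical stand-ins, not nibabel classes, and the alias handling is a simplified assumption: the header only stores concrete dtype names, while the image remembers an alias and resolves it against its data on demand.

```python
ALIASES = ("mask", "smallest", "compat")
INT_RANGES = [
    ("uint8", 0, 255),
    ("int16", -32768, 32767),
    ("int32", -(2**31), 2**31 - 1),
]


class ToyHeader:
    """Stores a concrete dtype name; has no access to image data."""

    def __init__(self, dtype="float32"):
        self._dtype = dtype

    def set_data_dtype(self, dtype):
        if dtype in ALIASES:
            raise ValueError(f"alias {dtype!r} can only be set on an image")
        self._dtype = dtype

    def get_data_dtype(self):
        return self._dtype


class ToyImage:
    """Holds data and a header; resolves aliases from the data."""

    def __init__(self, data, header=None):
        self.data = list(data)
        self.header = header if header is not None else ToyHeader()
        self._alias = None

    def set_data_dtype(self, dtype):
        if dtype in ALIASES:
            self._alias = dtype  # resolved lazily in get_data_dtype()
        else:
            self._alias = None
            self.header.set_data_dtype(dtype)

    def get_data_dtype(self):
        if self._alias is None:
            return self.header.get_data_dtype()
        if self._alias == "mask":
            return "uint8"
        if self._alias == "smallest":
            lo, hi = min(self.data), max(self.data)
            for name, type_min, type_max in INT_RANGES:
                if type_min <= lo and hi <= type_max:
                    return name
            raise ValueError("data do not fit in int32")
        raise NotImplementedError(self._alias)  # 'compat' omitted from this toy


img = ToyImage([0, 1, 300])
img.set_data_dtype("smallest")
print(img.get_data_dtype())  # int16

try:
    img.header.set_data_dtype("smallest")
except ValueError as err:
    print("header rejected alias:", err)
```

In this toy, code that only ever passes concrete dtypes sees the image and header methods agree, which matches the expectation that existing code is unaffected.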

Pinging @neurolabusc @jbteves @jeromedockes for feedback.

This PR builds on #1082. If you intend to review code, I would suggest doing it in a per-commit fashion.

codecov · 2022-03-19T23:03:25Z

Codecov Report

Merging #1096 (58d37a2) into master (d0532ec) will increase coverage by 0.01%.
The diff coverage is 96.49%.

@@            Coverage Diff             @@
##           master    #1096      +/-   ##
==========================================
+ Coverage   92.26%   92.28%   +0.01%     
==========================================
  Files         100      100              
  Lines       12269    12326      +57     
  Branches     2399     2416      +17     
==========================================
+ Hits        11320    11375      +55     
- Misses        624      625       +1     
- Partials      325      326       +1

Impacted Files	Coverage Δ
nibabel/nifti1.py	`92.57% <96.49%> (+0.33%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d0532ec...58d37a2.

jeromedockes · 2022-03-28T19:57:16Z

Thanks! I think that makes sense. About the names: it is not obvious that "smallest" implies integers. Does "compat" stand for compatibility with other tools? And why is float64 not allowed?

effigies · 2022-03-28T20:13:31Z

"compat" is "Analyze-compatible", which ensures maximum compatibility with other tools (following this table). If someone wants their data to be float64, they don't need to use "compat" or any other dynamic dtype at all, so I'm not sure the overlap of "give me a compatible type" and "I have float64 data and don't mind keeping it that way" is that large. Were you thinking of a specific use-case?

Agreed that "smallest" isn't obviously int. I'm okay changing it to something else that would be clearer.

jeromedockes · 2022-03-28T20:30:37Z

> "compat" is "Analyze-compatible", which ensures maximum compatibility with other tools (following [this table](#1046 (comment))). If someone wants their data to be float64, they don't need to use "compat" or any other dynamic dtype at all, so I'm not sure the overlap of "give me a compatible type" and "I have float64 data and don't mind keeping it that way" is that large. Were you thinking of a specific use-case?

No I wasn't thinking of a specific use-case, just trying to make sure I get the meaning of these aliases! the linked table helps, thanks

> Agreed that "smallest" isn't obviously `int`. I'm okay changing it to something else that would be clearer.

I have no strong opinion about that :)

effigies · 2022-06-03T14:30:38Z

@neurolabusc @jbteves Could I bug you for review of the API here, if not the code? I don't want to add unintuitive aliases and then go through a process of encouraging people to move to better ones later. If we can't settle on something satisfactory now, I'd rather push this off to another release.

neurolabusc · 2022-06-03T15:22:42Z

@effigies I think this is great. I think this will help interchange of NIfTI data between tools.

One feature that I am an advocate of, but where others likely have different opinions, is uint16 -> int16 if max < 32767 else int32. Note that AFNI will promote uint16 to its native float32 while preserving int16 (since it is a native format). AFNI will fail if one analyzes two (or more) fMRI series where one series is saved as int16 and the other is uint16. Perhaps @mrneont, @afni-dglen, @afni-rickr have thoughts on this (and perhaps AFNI has been upgraded to avoid this issue since 2019).

I have in general tried to make the output of dcm2niix consistent across versions, but the default behavior of dcm2niix was updated to attempt to help AFNI users. Current versions of dcm2niix use -l o by default, while prior versions defaulted to -l n:

-l : losslessly scale 16-bit integers to use dynamic range (y/n/o [yes=scale, no=no, but uint16->int16, o=original], default o)

To reiterate, I support your proposed setting, I just think we should get a consensus from the AFNI team.

jbteves · 2022-06-03T15:50:47Z

nibabel/nifti1.py

+        >>> img.get_data_dtype() == np.dtype('int64')
+        True
+        """
+        # Numpy dtype comparison can fail in odd ways, check for aliases only if str


Might be worth a reference to somewhere where these odd failures are described since these comparisons may change over time. Certainly nothing worth blocking this PR over, though.

effigies · Jun 3, 2022

Updated the comment to be more explicit. Thanks for the feedback!

jbteves · Jun 3, 2022

That's perfect, thank you!

jbteves · 2022-06-03T16:01:25Z

I think this looks good (and sorry for not replying to your initial request for comment on this PR). I have the same concern as Chris so I'd like to see AFNI team members weigh in as well.

pep8speaks · 2022-06-03T17:23:54Z

Hello @effigies, Thank you for updating!

Cheers! There are no style issues detected in this Pull Request. 🍻 To test for issues locally, pip install flake8 and then run flake8 nibabel.

Comment last updated at 2022-06-03 18:19:31 UTC

effigies · 2022-06-03T19:08:46Z

Thanks for the reviews @neurolabusc and @jbteves!

> One feature that I am an advocate of, but where others likely have different opinions is uint16 -> int16 if max < 32767 else int32. Note that AFNI will promote uint16 to its native float32 while preserving int16 (since it is a native format).

So I definitely don't want to expand uint16 into float32 as opposed to int32, as the integer nature of the data would be lost. I'm content to be inconsistent with AFNI here.

> AFNI will fail if one analyzes two (or more) fMRI series where one series is saved as int16 and the other is uint16.

As you say, that seems to be fixed in recent AFNI. I think the case where we might induce inconsistent types would be something like this:

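import nibabel as nb

# Split a 4D uint16 series into one file per volume; "compat" is resolved per volume
img = nb.load("uint16.nii.gz")
for i in range(img.shape[3]):
    new_img = nb.Nifti1Image(img.dataobj[..., i], img.affine, img.header, dtype="compat")
    new_img.to_filename(f"vol{i:03d}.nii.gz")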

Here we could get some volumes that become int16 and some that become int32, which might annoy a downstream tool that wants to re-concatenate. I think the "fix" for that would probably be an alternative compat mode where we go purely by safe casting, so int8 becomes int16, uint16 becomes int32. But I think we can push that off for now. This puts in the infrastructure and adding new modes will be a simple task if the situation calls for it.
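The difference between the two approaches can be sketched without any image data. The per-volume maxima below are made-up numbers, `value_based` encodes the uint16 rule quoted in this thread, and `SAFE_CAST` is a hypothetical table for the alternative mode, which does not exist in this PR.

```python
def value_based(vmax):
    """uint16 rule from the thread: the result depends on the data maximum."""
    return "int16" if vmax < 32767 else "int32"


# Hypothetical value-independent mapping for the alternative mode:
# each input dtype always maps to the same output dtype.
SAFE_CAST = {"int8": "int16", "uint16": "int32"}

# Made-up maxima for three volumes of one uint16 series.
volume_maxima = [1200, 40000, 900]

per_volume = [value_based(vmax) for vmax in volume_maxima]
safe = [SAFE_CAST["uint16"] for _ in volume_maxima]

print(per_volume)  # ['int16', 'int32', 'int16']
print(safe)        # ['int32', 'int32', 'int32']
```

The value-based rule yields mixed dtypes across volumes, while the safe-cast table is uniform at the cost of always using the wider type.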

Unless there are any objections, I will merge this by EOD.

effigies force-pushed the enh/dtype_aliases branch from b823f8d to 423ee28 Compare March 19, 2022 22:55

effigies force-pushed the enh/dtype_aliases branch from 423ee28 to 88480c6 Compare March 20, 2022 01:37

effigies force-pushed the enh/dtype_aliases branch from 6c64ad1 to caea693 Compare March 29, 2022 14:45

This was referenced May 8, 2022

RF: Cleanup Makefile to bring parity with CI, remove tox + nose references #1105

Open

Pre-release job is failing against master #1108

Closed

effigies force-pushed the enh/dtype_aliases branch from caea693 to 33dcfd0 Compare May 26, 2022 12:03

jbteves reviewed Jun 3, 2022

View reviewed changes

effigies added 2 commits June 3, 2022 12:19

ENH: Add aliases for set/get_data_dtype on NIfTI images

c9b2d29

TEST: Test dynamic and static dtype aliases

d541fdd

effigies force-pushed the enh/dtype_aliases branch from 33dcfd0 to 0978eed Compare June 3, 2022 17:23

effigies force-pushed the enh/dtype_aliases branch from b59aaf9 to 3dc9604 Compare June 3, 2022 17:29

effigies added 3 commits June 3, 2022 13:52

RF: Add simpler Analyze-compatible helper

c7031c3

ENH: Improve comment explaining alias check for strings

192f77b

STY: Add line breaks to docstrings

94e3d1d

effigies force-pushed the enh/dtype_aliases branch 2 times, most recently from 9827468 to 8edc25e Compare June 3, 2022 17:53

effigies added 2 commits June 3, 2022 13:54

DOC: Update doctests with correct outputs

5698d0d

TEST: Update tests to match new behavior

a8b6ff9

effigies force-pushed the enh/dtype_aliases branch from 8edc25e to a8b6ff9 Compare June 3, 2022 17:54

FIX: Python 3.7 compatiblity

58d37a2

effigies merged commit 1312493 into nipy:master Jun 3, 2022

effigies deleted the enh/dtype_aliases branch June 3, 2022 20:08

effigies mentioned this pull request Jan 11, 2023

Intent to deprecate: 64-bit integer NIfTI images #1089

Closed

effigies mentioned this pull request Feb 20, 2023

Helper function for saving compact integer datatypes #1046

Closed
