RF+ENH: nib-diff - allow to specify absolute and/or relative maximal diff to tolerate #661
…differences to tolerate
So now it should be possible to get an idea of how much the data in the given files differs:
$> nib-diff --ma 0.000001 --mr .001 ./tests-run/output/./sub-1_T1w_5mm_noise_corrected.nii.gz /tmp/sub-1_T1w_5mm_noise_corrected.nii.gz
These files are different.
Field 1:sub-1_T1w_5mm_noise_corrected.nii.gz 2:sub-1_T1w_5mm_noise_corrected.nii.gz
DATA(md5) 65df09c06b236342eaf7e2fe57aabf55 3c6e9069e6e054e714f2894419848df0
DATA(diff 1:) - abs: 7.6293945e-06, rel: 0.002224694
Some style stuff to get out of the way: nibabel/cmdline/diff.py:164:29: E226 missing whitespace around arithmetic operator
For the diff function itself:
In response to your TODO concerns above:
Codecov Report
@@            Coverage Diff             @@
##           master     #661      +/-   ##
==========================================
+ Coverage   88.86%   88.91%   +0.05%
==========================================
  Files          93       93
  Lines       11378    11478     +100
  Branches     1869     1899      +30
==========================================
+ Hits        10111    10206      +95
- Misses        930      933       +3
- Partials      337      339       +2
…put, added tests for coverage
Coverage is handled and Travis passed. The only reason AppVeyor isn't cruising is that it's glitched out with some NiBabel stuff unrelated to this pull request.
Awesome. Thanks Chris!
Left minor comments on fixups
nibabel/cmdline/diff.py
Outdated
Returns
-------
TODO
Would you be so kind as to address this one as well?
@@ -11,7 +11,7 @@
import nibabel as nib
import numpy as np
from nibabel.cmdline.utils import *
from nibabel.cmdline.diff import get_headers_diff, display_diff, main, get_data_diff
One of us has managed to make this file executable, please undo:
If you like a challenge - undo by rewriting that original commit. Workflow:
- fix, commit
- git rebase -i BADCOMMIT^ where you reposition the fixing commit after the one to fix, and give it the squash status to squash them into one
- git push -f since you have now rewritten a commit
@@ -72,11 +72,14 @@ def check_nib_diff_examples():
fnames = [pjoin(DATA_PATH, f)
The same here about permissions
Please clarify?
Jk I got it
good ;)
nibabel/tests/test_scripts.py
Outdated
for item in checked_fields:
    if item not in stdout:
        print(item)
        print(stdout)
Some Gods dislike such printouts (although I am not sure whether that one was left by me here :-)). This print will get lost: when you run all tests at once, error details are reported at the end, whereas the print happened long before. Add a msg to your assert below providing what you want us to see when it fails.
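A minimal sketch of that suggestion, assuming a hypothetical helper `check_fields` and made-up output and field names (none of these come from the PR). The only point is that the diagnostic travels with the assertion instead of being printed:

```python
def check_fields(stdout, checked_fields):
    """Assert that every expected field name appears in the captured output."""
    for item in checked_fields:
        # The message is reported together with the failure, unlike a print()
        # that may scroll by long before the test summary.
        assert item in stdout, (
            "%r not found in nib-diff output:\n%s" % (item, stdout))


# Hypothetical output and field names, for illustration only
check_fields("Field 1:a.nii 2:b.nii\nDATA(md5) 0a1b 2c3d", ["Field", "DATA(md5)"])
```

When a field is missing, the AssertionError text then names the field and shows the full captured output.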
No feedback will be considered to be positive feedback, so we would leave naming and logic as is. Used this prototype already a few times, came in handy.
Wow, AppVeyor really went nuts there. Hopefully some other PR will fix it up. My favorite there:
…ing script in test_scripts
AppVeyor's fail here is fake news, it's tripping up again
Looks reasonable. Some suggestions for clarity.
nibabel/cmdline/diff.py
Outdated
@@ -101,8 +116,8 @@ def get_headers_diff(file_headers, names=None):
    return difference


def get_data_diff(files):
    """Get difference between md5 values
def get_data_md5_diff(files):
Just future-proofing: How about naming it get_data_hash_diff? (Understood that the hash will be MD5 for the foreseeable future.)
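To illustrate why the generic name might age better, here is a rough sketch rather than the PR's code. It assumes the function receives raw bytes instead of file names, and the `algorithm` parameter is hypothetical:

```python
import hashlib


def get_data_hash_diff(blobs, algorithm="md5"):
    """Return the hex digest of each blob if they are not all identical.

    Hypothetical sketch: takes raw bytes rather than file names, and the
    ``algorithm`` parameter is an assumption, not part of the PR.
    """
    digests = [hashlib.new(algorithm, blob).hexdigest() for blob in blobs]
    # An empty list signals "no difference between the inputs"
    if len(set(digests)) <= 1:
        return []
    return digests
```

With the hash name as a parameter, switching away from MD5 later would not force another rename.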
Parameters
----------
files: list of (str or ndarray)
If list of strings is provided -- they must be existing file names
ndarray, I assume, means a data block equivalent to one loaded with nib.load().get_fdata() or similar?
nibabel/cmdline/diff.py
Outdated
str: absolute and relative differences of each file, given as float
"""
# we are doomed to keep them in RAM now
data = [f if isinstance(f, np.ndarray) else nib.load(f).get_data() for f in files]
Given that get_data() is on its way out, I would suggest using either dataobj.get_unscaled() or get_fdata(dtype=np.float32) here, depending on whether you're interested in on-disk values or (more likely) scaled values.
nibabel/cmdline/diff.py
Outdated
Returns
-------
OrderedDict
str: absolute and relative differences of each file, given as float
I don't really understand the shape of the output, given this. Are the values 2-tuples, or a list of N-1 2-tuples? And what is the absolute diff for a file? Presumably the (max/mean/median) of the voxelwise absolute diffs. It would help to be explicit here.
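For concreteness, one possible definition the docstring could spell out is sketched below. It is an assumption for discussion, not what the PR implements: for a pair of inputs it returns the maximum of the voxelwise absolute differences and the maximum of the voxelwise relative differences, where "relative" means the absolute difference divided by the mean magnitude of the two values. The name `max_abs_rel_diff` is hypothetical, and flat lists stand in for image arrays:

```python
def max_abs_rel_diff(a, b):
    """Return (max absolute diff, max relative diff) for two equal-length sequences.

    Assumption for illustration: relative diff is |x - y| divided by the mean
    of |x| and |y|; pairs where both values are zero are skipped.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for x, y in zip(a, b):
        abs_diff = abs(x - y)
        scale = (abs(x) + abs(y)) / 2.0
        max_abs = max(max_abs, abs_diff)
        # Avoid dividing by zero when both values are exactly zero
        if scale > 0:
            max_rel = max(max_rel, abs_diff / scale)
    return max_abs, max_rel


# Flat lists stand in for voxel data here
print(max_abs_rel_diff([1.0, 2.0, 0.0], [1.0, 2.5, 0.0]))
```

Under such a definition, each value in the returned mapping would be one 2-tuple per compared file, which is the kind of statement the docstring could make explicitly.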
nibabel/cmdline/diff.py
Outdated
type=float,
default=0.0,
help="Maximal relative difference in data between files to tolerate."
" If also --data-max-abs-diff specified, only the data points "
If --data-max-abs-diff is also specified
nibabel/cmdline/diff.py
Outdated
"""

# we are doomed to keep them in RAM now
data = [f if isinstance(f, np.ndarray) else nib.load(f).get_fdata() for f in files]
Note that get_fdata() returns float64 arrays by default. If you're hoping to keep things moderately compact, you could use get_fdata(dtype=np.float32). The precision loss should not affect equivalent files, and should be in the noise for any plausible MRI data.
nibabel/cmdline/diff.py
Outdated
"""

# we are doomed to keep them in RAM now
data = [f if isinstance(f, np.ndarray) else nib.load(f).get_fdata(dtype=np.float32) for f in files]
This line is too long for the style checks.
@chrispycheng @yarikoptic I just went ahead and fixed the style issue. I'm happy to merge if you're all set on this one.
In local conversation with @chrispycheng I questioned the change in 034c276 "hardcoding" the data type to float32. I think no type conversion should be done, so that if files are of different data types, and thus possibly hold different values (e.g. 1/3 would differ between float32 and float64), we could see that. So maybe that change should be reverted?
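The 1/3 example can be checked with the standard library alone. This is only an illustration of the precision point; the helper `as_float32` is hypothetical and nothing here touches nibabel:

```python
import struct


def as_float32(value):
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


third64 = 1.0 / 3.0
third32 = as_float32(third64)

# The two representations of 1/3 differ once compared at float64 precision
print(third64 == third32)
print(abs(third64 - third32))
```

Casting both files to float32 before comparing would hide exactly this kind of discrepancy between a float32 file and a float64 file.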
Do you want to compare on-disk values (…
Okay. Just making sure that's what you wanted. If diffing BOLD series, that could get expensive quickly.
indeed... but if we decide to provide help for those, we should just add another parameter (…
Hi @effigies, need your guidance here with appveyor.
It is a known issue to the hypothesis people (HypothesisWorks/hypothesis#1091), which they decided just to ignore altogether. The funny part is that there is a whl for hypothesis for python3, and sometimes it gets installed (just fine) instead of trying to get it installed from the source tarball. That is when appveyor doesn't fail.
My best guess from reading this is we need to update setuptools in appveyor. I'll look around a little.
Watch this build: https://ci.appveyor.com/project/nipy/nibabel/build/1.0.501
Feel free to cherry-pick 716b1c6 (on…
Pushed the Appveyor update directly to this branch, as it at least causes no harm. Please let me know what the status is on this PR.
@effigies seems like still no dice with appveyor
Those AppVeyor bugs are hitting every PR and master. Nothing to do with hypothesis that I can see.
"those" is a spectrum here ;-) Some were due to hypothesis, some not.
AVSD - AppVeyor Spectrum Disorder
eh, this one is a base now for 2 other PRs (#672 and #678), besides that commit to update setuptools which didn't help. @effigies - what do you prefer?
Let's close this in favor of #678. I'll update the name over there.
So now it should be possible to get an idea on how much data in the given files differs:
and could be applied to >2 files as well
TODOs
data_max_abs_diff) - they are kinda a mouthful, but I wanted to be specific in case we add some later on dedicated to the header etc.
attn @chrispycheng