pandas.load_hdf() fails if you pass "format" kwarg #513

Closed
nickponvert opened this issue Apr 10, 2019 · 1 comment
Comments

@nickponvert
Contributor

During the new hire software orientation, we attempted to run some VBA notebooks. I encountered errors in functions that call pandas.read_hdf() and pass format='fixed' as a kwarg.

This is with h5py==2.9.0, pandas==0.24.2

Here is the traceback I am getting:

ValueError                                Traceback (most recent call last)
<ipython-input-20-3164b7df868e> in <module>()
----> 1 dataset=visual_behavior_ophys_dataset.VisualBehaviorOphysDataset(experiment_id, cache_dir=cache_dir)

/home/nick.ponvert/src/visual_behavior_analysis/visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py in __init__(self, experiment_id, cache_dir, **kwargs)
     57         self.cache_dir = cache_dir
     58         self.cache_dir = self.get_cache_dir()
---> 59         self.roi_metrics = self.get_roi_metrics()
     60         if self.roi_metrics.cell_specimen_id.values[0] is None:
     61             self.cell_matching = False

/home/nick.ponvert/src/visual_behavior_analysis/visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py in get_roi_metrics(self)
    256 
    257     def get_roi_metrics(self):
--> 258         self._roi_metrics = pd.read_hdf(os.path.join(self.analysis_dir, 'roi_metrics.h5'), key='df', format='fixed')
    259         return self._roi_metrics
    260 

/home/nick.ponvert/.conda/envs/python2/lib/python2.7/site-packages/pandas/io/pytables.pyc in read_hdf(path_or_buf, key, mode, **kwargs)
    366                 'File {path} does not exist'.format(path=path_or_buf))
    367 
--> 368         store = HDFStore(path_or_buf, mode=mode, **kwargs)
    369         # can't auto open/close if we are using an iterator
    370         # so delegate to the iterator

/home/nick.ponvert/.conda/envs/python2/lib/python2.7/site-packages/pandas/io/pytables.pyc in __init__(self, path, mode, complevel, complib, fletcher32, **kwargs)
    461 
    462         if 'format' in kwargs:
--> 463             raise ValueError('format is not a defined argument for HDFStore')
    464 
    465         try:

ValueError: format is not a defined argument for HDFStore

After doing some digging, it looks like a recent change to pandas introduced this particular exception. See: pandas-dev/pandas#13291

It looks like pandas was allowing the 'format' kwarg to be passed to HDFStore.open, which ignored it because it isn't a defined parameter for that method. So, the recent change causes pandas to complain if you try to pass 'format' as a kwarg in read_hdf().
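
The forwarding path visible in the traceback can be sketched without pandas. The names `FakeStore` and `fake_read_hdf` are hypothetical stand-ins, not the pandas implementation; they only mirror the two frames shown above, where `read_hdf` passes `**kwargs` through to the store constructor and the constructor rejects `format`.

```python
class FakeStore(object):
    """Hypothetical stand-in for HDFStore; NOT the pandas implementation."""

    def __init__(self, path, mode="r", **kwargs):
        # Mirrors the check in the traceback: 'format' is rejected
        # instead of being silently ignored.
        if "format" in kwargs:
            raise ValueError("format is not a defined argument for HDFStore")
        self.path = path
        self.mode = mode


def fake_read_hdf(path, key=None, mode="r", **kwargs):
    # Extra keyword arguments are forwarded untouched to the constructor,
    # which is how format='fixed' reaches the check above.
    return FakeStore(path, mode=mode, **kwargs)


try:
    fake_read_hdf("roi_metrics.h5", key="df", format="fixed")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```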

Proposed fix: We should change all the lines that pass format as a kwarg to pandas.read_hdf() to be compatible with the latest version of pandas. This shouldn't change what is actually loaded, since the format kwarg was being ignored before.
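
A minimal sketch of what the change could look like at a call site. The helpers `strip_format_kwarg` and `read_hdf_compat` are hypothetical and are not part of pandas or VBA; the simplest fix is just deleting `format='fixed'` from each call, and a wrapper like this is only one option if callers pass kwargs through.

```python
def strip_format_kwarg(kwargs):
    """Return a copy of kwargs without the 'format' key (hypothetical helper)."""
    return {k: v for k, v in kwargs.items() if k != "format"}


def read_hdf_compat(path, key, **kwargs):
    """Call pandas.read_hdf without forwarding 'format' (hypothetical wrapper)."""
    # Imported here so strip_format_kwarg can be used without pandas installed.
    import pandas as pd
    return pd.read_hdf(path, key=key, **strip_format_kwarg(kwargs))


# Before (raises ValueError with pandas==0.24.2):
#   pd.read_hdf(path, key='df', format='fixed')
# After:
#   pd.read_hdf(path, key='df')
```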

nickponvert added a commit that referenced this issue Apr 11, 2019
This addresses issue #513 and makes PR #514 unnecessary. For future reference, here's the explanation for the #513 fix: 

This PR addresses issue #513 by changing all of the calls to pandas.read_hdf() to no longer pass 'format' as a kwarg. We used to do this, and the passed kwarg was ignored by HDFStore.open because it wasn't a defined keyword arg for that method, but no alarms were raised.

See: pandas-dev/pandas#13291

In newer versions of pandas there is an exception raised if you try to pass 'format' to the read_hdf() method. Removing this kwarg:

- Won't change the output of read_hdf, as the kwarg was being ignored before, and
- Allows me to run Marina's old notebooks with VBA analysis code and pandas==0.24.2
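
One way to confirm that no call still passes the kwarg is to scan the source with the standard-library `ast` module. This is only a sketch: `find_read_hdf_format_calls` is a hypothetical helper, and it matches on the function name `read_hdf` alone, so it would also flag an unrelated function with the same name.

```python
import ast


def find_read_hdf_format_calls(source):
    """Return line numbers of read_hdf(...) calls that pass a 'format' keyword."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both `pd.read_hdf(...)` and a bare `read_hdf(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name == "read_hdf" and any(kw.arg == "format" for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)


# The example source is only parsed, never executed, so pandas is not needed.
example = (
    "import pandas as pd\n"
    "a = pd.read_hdf('roi_metrics.h5', key='df', format='fixed')\n"
    "b = pd.read_hdf('roi_metrics.h5', key='df')\n"
)
print(find_read_hdf_format_calls(example))
```
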
@nickponvert
Contributor Author

Fixed by #518
