Skip to content
This repository was archived by the owner on Sep 11, 2023. It is now read-only.

speed up loading, and speed up tests #245

Merged
merged 2 commits into from
Oct 20, 2021
Merged

Conversation

peterdudfield
Copy link
Contributor

@peterdudfield peterdudfield commented Oct 19, 2021

Pull Request

Description

Save and Load netcdf in parallel

Fixes #244

How Has This Been Tested?

wiht unittest, not on full netcdf files

  • No
  • Yes

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

Sorry, something went wrong.

Copy link
Member

@jacobbieker jacobbieker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@peterdudfield peterdudfield merged commit 37b5b9c into main Oct 20, 2021
@peterdudfield peterdudfield deleted the issue/244-mulit-process branch October 20, 2021 10:23
Copy link
Member

@JackKelly JackKelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

futures.ProcessPoolExecutor might be faster than ThreadPoolExecutor (because the data compression is probably CPU-bound)?

One small comment...

# Submit tasks to the executor.
for data_source in self.data_sources:
if data_source is not None:
_ = executor.submit(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong but I think it's a good idea to keep these Future objects in a list, and then loop through the list of Future objects and wait on future.result() (just like you do in load_netcdf(). Mostly because I think that Exceptions from save_netcdf might get silently swallowed until you call future.result()? Or maybe Exceptions only get silently swallowed if using ProcessPoolExecutor, I'm not sure?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that I'm not sure, have to look into that

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi process when saving to netcdf
3 participants