[DataPipe] DataPipe Deprecation Tracker #163

NivekT opened this issue Jan 11, 2022 · 10 comments
@NivekT
Contributor

NivekT commented Jan 11, 2022

We have a number of DataPipes that are being deprecated. Our general policy is that we first mark the DataPipe as deprecated with a warning, and wait at least one release cycle (~3 months) before removing it. Note that some DataPipes will be removed from the PyTorch Core library but will remain in TorchData, and some others are renamed.

Status Types:

  • Deprecated - marked as deprecated with a warning
  • Removed - removed from repository
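The "Deprecated" status means the old name still works but warns when used. A minimal plain-Python sketch of that pattern follows. The helper `deprecated_alias`, the function `open_files`, and the version strings are hypothetical placeholders, and this is not the actual helper used in PyTorch or TorchData.

```python
import functools
import warnings


def deprecated_alias(old_name, new_name, since, removal):
    """Wrap a function so that calling it under its old name emits a FutureWarning.

    All names and version strings passed in are illustrative placeholders.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            msg = (
                f"`{old_name}` is deprecated since {since} and will be removed "
                f"in {removal}. Please use `{new_name}` instead."
            )
            # stacklevel=2 attributes the warning to the caller, not this wrapper.
            warnings.warn(msg, FutureWarning, stacklevel=2)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


def open_files(paths):
    """Hypothetical replacement API: pair each path with a placeholder handle."""
    return [(p, f"<handle for {p}>") for p in paths]


# The old name keeps working for at least one release cycle, but warns.
open_file = deprecated_alias("open_file", "open_files", "0.4", "0.6")(open_files)
```

Calling `open_file(["a.txt"])` returns the same result as `open_files(["a.txt"])` and emits one `FutureWarning` naming the replacement.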

DataLoader2 Tracker

| Name | Deprecation Version | Status | Earliest Removal Version |
| ------------- | ------------- | ------------- | ------------- |
| PrototypeMultiProcessingReadingService -> MultiProcessingReadingService | 0.6 | Deprecated | 0.8 |

IterDataPipe Tracker

| Name | Functional API | Module | Deprecation Date / Version | Status | Earliest Removal Version |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| BucketBatcher | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
| HTTPReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
| LineReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
| TarArchiveReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
| ZipArchiveReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
| FileLoader | NA | Core | Jan 5th, 2022 | Removed (use FileOpener) | 1.13 (Sept 2022) |
| FileLoader | NA | Data | Jan 5th, 2022 | Removed (use FileOpener) | |
| IoPathFileLoader | load_file_by_iopath | Data | Jan 5th, 2022 | Removed (use IoPathFileOpener) | |
| RoutedDecoder | routed_decode | Core | Jan 10th, 2022 | Deprecated | 1.13 (Sept 2022) |
| TarArchiveReader | read_from_tar | Data | Feb 22nd, 2022 | Removed (use TarArchiveLoader) | 0.5 (Sept 2022) |
| XzFileReader | read_from_xz | Data | Feb 22nd, 2022 | Removed (use XzFileLoader) | 0.5 (Sept 2022) |
| ZipArchiveReader | read_from_zip | Data | Feb 22nd, 2022 | Removed (use ZipArchiveLoader) | 0.5 (Sept 2022) |
| Filter | filter | Core | 1.12 | Removed argument (drop_empty_batches) | 2.0 (Nov 2022) |
| FSSpecFileOpener | open_files_by_fsspec | Data | 0.4 | open_file_by_fsspec is Removed | 0.6 (Nov 2022) |
| IoPathFileOpener | open_files_by_iopath | Data | 0.4 | open_file_by_iopath is Removed | 0.6 (Nov 2022) |

MapDataPipe Tracker

Nothing for now

cc: @ejguan @VitalyFedyunin @NivekT

NivekT added a commit to pytorch/pytorch that referenced this issue Jan 11, 2022
We labeled these DataPipes as deprecated on Sep 30th, 2021 (#65827). Users should import these DataPipes from [TorchData](https://github.com/pytorch/data) to continue using them. We will be checking for any downstream library usage before landing this PR.

All deprecations related to DataPipes are tracked in pytorch/data#163

DataPipes impacted by this PR:

| Name | Functional API  | Deprecation Date | What Happens Now 
| ------------- | ------------- | ------------- | ------------- |
| BucketBatcher | NA | Sep 30th, 2021 | Remove (moved to TorchData)
| HTTPReader | NA | Sep 30th, 2021 | Remove (moved to TorchData)
| LineReader | NA  | Sep 30th, 2021 | Remove (moved to TorchData)
| TarArchiveReader | NA | Sep 30th, 2021 | Remove (moved to TorchData)
| ZipArchiveReader | NA | Sep 30th, 2021 | Remove (moved to TorchData)

Lastly, tests related to those DataPipes will also be removed or migrated to TorchData (if they do not already exist there).


Differential Revision: [D33532272](https://our.internmc.facebook.com/intern/diff/D33532272)

cc ezyang gchanan @VitalyFedyunin ejguan @NivekT

@ejguan
Contributor

ejguan commented Mar 1, 2022

For TarArchiveReader, should we add a deprecation warning on the main branch now that the 0.3.0 branch cut is finished?

facebook-github-bot pushed a commit that referenced this issue Mar 2, 2022
Summary:
Pull Request resolved: #272

Tracked by #163

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34562086

Pulled By: NivekT

fbshipit-source-id: c930c190da57a82c491da265da5848f94aadd85e
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Apr 22, 2022
…taPipe arguments

Last patch to align DataPipe API with TorchArrow DataFrame

For deprecation warning of DataPipe argument:
```
The argument `drop_empty_batches` of `FilterIterDataPipe()` is deprecated since 1.12 and will be removed in 1.14.
See pytorch/data#163 for details.
```
Pull Request resolved: #76060
Approved by: https://github.com/NivekT
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this issue Apr 26, 2022
…taPipe arguments (#76060)

Summary:
Last patch to align DataPipe API with TorchArrow DataFrame

For deprecation warning of DataPipe argument:
```
The argument `drop_empty_batches` of `FilterIterDataPipe()` is deprecated since 1.12 and will be removed in 1.14.
See pytorch/data#163 for details.
```

Pull Request resolved: #76060
Approved by: https://github.com/NivekT

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/ec591087fb95525b430f2bbcdd023567e2fb1cc9

Reviewed By: seemethere, osalpekar

Differential Revision: D35874488

Pulled By: ejguan

fbshipit-source-id: 83f1cfc6430e6294b332195e9f056af4d2405dce
NivekT added a commit to pytorch/pytorch that referenced this issue Jun 6, 2022
…Pipe names"


Fixes pytorch/data#480.

For the list of deprecated DataPipes (or functional names), see pytorch/data#163

Testing:
```python
from torchdata.datapipes.iter import IterableWrapper
IterableWrapper(range(10)).open_file_by_iopath()
```
Returns:
```
/Users/.../pytorch/torch/utils/data/datapipes/utils/common.py:171: FutureWarning: `IoPathFileOpener()`'s functional API `.open_file_by_iopath()` is deprecated since 1.12 and will be removed in 1.14.
See pytorch/data#163 for details.
Please use `.open_files_by_iopath()` instead.
  warnings.warn(msg, FutureWarning)
```

[ghstack-poisoned]
@ejguan
Contributor

ejguan commented Sep 19, 2022

Another Misc tracker:

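| Name | Module | Deprecation Version | Status | Earliest Removal Version |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| torch.utils.data.graph.traverse | Core | 1.13 | Deprecating | 1.15 / 2.1 |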

facebook-github-bot pushed a commit that referenced this issue Nov 9, 2022
Summary:
- Continue to deprecate stuff based on #163
- Fix a typo

Pull Request resolved: #890

Reviewed By: NivekT

Differential Revision: D41130080

Pulled By: ejguan

fbshipit-source-id: d8c1a6a4732211fcf75a08d5cb5e090a587b555b
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Nov 10, 2022
…ional APIs (#88693)

- Deprecating based on pytorch/data#163

Corresponding PRs from TorchData: pytorch/data#890
Pull Request resolved: #88693
Approved by: https://github.com/NivekT
@BlueskyFR

I see RoutedDecoder has been marked as deprecated: what is it going to be replaced by?

@ejguan
Contributor

ejguan commented Nov 23, 2022

> I see RoutedDecoder has been marked as deprecated: what is it going to be replaced by?

@BlueskyFR
IIRC, we plan to remove this DataPipe in the future. The main reason is that the same result can be achieved easily: use a demux to split the stream by file type, decode each resulting DataPipe with the matching decoder, then mux them back together. We'd be glad to hear about your use case.
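The split, decode, and merge steps described above can be sketched in plain Python. This is only an analogue of the pattern: `split_by_type`, `interleave`, and the decoders are hypothetical helpers standing in for the `demux`, `map`, and `mux` DataPipe operations, and the real DataPipes are lazy and may differ in buffering and in when interleaving stops.

```python
import json
from itertools import zip_longest


def classify(item):
    """Return a branch index from the file extension (0 = JSON, 1 = text)."""
    path, _ = item
    return 0 if path.endswith(".json") else 1


def split_by_type(items, num_branches, classifier_fn):
    """Eagerly route each item into one of `num_branches` lists (the demux step)."""
    branches = [[] for _ in range(num_branches)]
    for item in items:
        branches[classifier_fn(item)].append(item)
    return branches


def decode_json(item):
    path, raw = item
    return path, json.loads(raw)


def decode_text(item):
    path, raw = item
    return path, raw.decode("utf-8")


def interleave(*branches):
    """Yield one item per branch in turn until all are exhausted (the mux step)."""
    missing = object()
    for group in zip_longest(*branches, fillvalue=missing):
        for item in group:
            if item is not missing:
                yield item


files = [
    ("a.json", b'{"x": 1}'),
    ("b.txt", b"hello"),
    ("c.json", b'{"y": 2}'),
]
json_items, text_items = split_by_type(files, 2, classify)
decoded = list(interleave(map(decode_json, json_items), map(decode_text, text_items)))
```

Each branch only needs the decoder for its own file type, which is the point of the composition: no single component has to know every format.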

@BlueskyFR

> > I see RoutedDecoder has been marked as deprecated: what is it going to be replaced by?
>
> @BlueskyFR
> IIRC, we plan to remove this DataPipe in the future. The main reason is that the same result can be achieved easily: use a demux to split the stream by file type, decode each resulting DataPipe with the matching decoder, then mux them back together. We'd be glad to hear about your use case.

I don't understand: how should I proceed to decode a PNG image in the current state then?

@ejguan
Contributor

ejguan commented Nov 23, 2022

You can use a map function, such as `datapipe.map(decode_fn)`, to decode the PNG image.
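As a rough illustration of what such a `decode_fn` could look like, here is a standard-library sketch. It assumes items are `(path, bytes)` pairs, and it is only a stand-in: it checks the PNG signature and reads the width and height from the IHDR chunk, whereas a real pipeline would call an image library. `make_png_header` is a hypothetical helper that builds sample data, and the built-in `map()` stands in for `datapipe.map`.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_fn(item):
    """Hypothetical decoder for (path, bytes) pairs.

    Stand-in only: validates the PNG signature and reads width and height
    from the IHDR chunk instead of decoding pixel data.
    """
    path, data = item
    if data[:8] != PNG_SIGNATURE:
        raise ValueError(f"{path} is not a PNG file")
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return path, {"width": width, "height": height}


def make_png_header(width, height):
    """Build only the signature and IHDR chunk of a PNG, for demonstration."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    crc = zlib.crc32(b"IHDR" + ihdr)
    return (
        PNG_SIGNATURE
        + struct.pack(">I", len(ihdr))
        + b"IHDR"
        + ihdr
        + struct.pack(">I", crc)
    )


samples = [("tiny.png", make_png_header(4, 3)), ("wide.png", make_png_header(640, 2))]
# The built-in map() plays the role of datapipe.map(decode_fn) here.
decoded = list(map(decode_fn, samples))
```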

@BlueskyFR

> You can use a map function, such as `datapipe.map(decode_fn)`, to decode the PNG image.

Okay, but why was support for decoding dropped then?

@ejguan
Contributor

ejguan commented Nov 23, 2022

> Okay, but why was support for decoding dropped then?

routed_decode did nothing more than a map function does, except that we provided a few decoding functions for convenience. To support routed_decode properly, we would need to add many decoding functions to cover general file decoding, which is not sustainable for us to maintain and would make routed_decode more complicated and redundant. In your use case (decoding PNG), for example, routed_decode would also pull decoding handlers for json, pickle, etc. into the DataPipe.

Because TorchData provides a composable way to construct pipelines, users should be able to build a pipeline that handles their specific decoding mechanism.

@BlueskyFR

> > Okay, but why was support for decoding dropped then?
>
> routed_decode did nothing more than a map function does, except that we provided a few decoding functions for convenience. To support routed_decode properly, we would need to add many decoding functions to cover general file decoding, which is not sustainable for us to maintain and would make routed_decode more complicated and redundant. In your use case (decoding PNG), for example, routed_decode would also pull decoding handlers for json, pickle, etc. into the DataPipe.
>
> Because TorchData provides a composable way to construct pipelines, users should be able to build a pipeline that handles their specific decoding mechanism.

Okay. What is the preferred mechanism to decode images? Ideally I think it should be done in batches if performance is needed

@ejguan
Contributor

ejguan commented Nov 23, 2022

> Okay. What is the preferred mechanism to decode images? Ideally I think it should be done in batches if performance is needed

It depends on whether your decode_fn supports high-performance batched decoding (e.g., via multithreading). Otherwise, I think it will perform about the same as decoding one image at a time.
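One way to picture batched, multithreaded decoding is the standard-library sketch below. `batched`, `decode_one`, and `decode_batch` are hypothetical names, and `decode_one` is a trivial placeholder for a real image decoder. Threads only help when the per-item decoder releases the GIL (for example, I/O or a C library); otherwise the result is comparable to decoding item by item.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, batch_size):
    """Group an iterable into lists of at most `batch_size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def decode_one(raw):
    """Hypothetical per-item decoder; here it just decodes UTF-8 bytes."""
    return raw.decode("utf-8")


def decode_batch(batch, pool):
    """Decode every item of a batch concurrently, preserving input order."""
    return list(pool.map(decode_one, batch))


raw_items = [f"image-{i}".encode("utf-8") for i in range(5)]
with ThreadPoolExecutor(max_workers=4) as pool:
    decoded_batches = [decode_batch(batch, pool) for batch in batched(raw_items, 2)]
```

In a DataPipe pipeline, the analogous shape would be batching first and then mapping a batch-level decode function over the batches.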

kulinseth pushed a commit to kulinseth/pytorch that referenced this issue Dec 10, 2022
…ional APIs (pytorch#88693)

- Deprecating based on pytorch/data#163

Corresponding PRs from TorchData: pytorch/data#890
Pull Request resolved: pytorch#88693
Approved by: https://github.com/NivekT