Skip to content

x/image/tiff: package doesn't support CCITT Fax 4 Compression #19443

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sanbornm opened this issue Mar 7, 2017 · 19 comments
Closed

x/image/tiff: package doesn't support CCITT Fax 4 Compression #19443

sanbornm opened this issue Mar 7, 2017 · 19 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@sanbornm
Copy link

sanbornm commented Mar 7, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.8 darwin/amd64

What operating system and processor architecture are you using (go env)?

Darwin/amd64

What did you do?

Take a tiff image with CCITT Fax 4 Compression and attempt to decode it with the golang.org/x/image/tiff package.

What did you expect to see?

I expected a level 4 compressed tiff to decode properly.

What did you see instead?

tiff: unsupported feature: compression value 4

@ALTree ALTree changed the title image/tiff package doesn't support certain tiff files x/image/tiff: package doesn't support certain tiff files Mar 7, 2017
@ALTree ALTree added this to the Unreleased milestone Mar 7, 2017
@odeke-em
Copy link
Member

odeke-em commented Mar 7, 2017

@sanbornm would it be possible for you to provide a sample offending image and a working repro program? It makes it easier for folks to investigate.

Thanks.

@sanbornm
Copy link
Author

sanbornm commented Mar 7, 2017

new.tiff.zip

package main

import (
	"bytes"
	"image"
	"image/png"
	"io/ioutil"
	"log"

	_ "golang.org/x/image/tiff"
)

func main() {
	b, err := ioutil.ReadFile("new.tiff")

	reader := bytes.NewReader(b)
	img, _, err := image.Decode(reader)
	if err != nil {
		log.Fatal(err)
	}

	w := bytes.NewBuffer([]byte{})
	err = png.Encode(w, img)
	if err != nil {
		log.Fatal(err)
	}
}

Output:

2017/03/07 09:03:46 tiff: unsupported feature: compression value 4
exit status 1

@ajaxray
Copy link

ajaxray commented Mar 19, 2018

I am having similar issue -

[DEBUG]  image.go:278 Error decoding file: tiff: unsupported feature: compression value 4
[ERROR]  image.go:80 Error loading image: tiff: unsupported feature: compression value 4
Error: tiff: unsupported feature: compression value 4

Here is a sample image
(scanned using Kodak i2420 scanner).

@andybons andybons changed the title x/image/tiff: package doesn't support certain tiff files x/image/tiff: package doesn't support CCITT Fax 4 Compression Mar 19, 2018
@andybons andybons added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Mar 19, 2018
@hhrutter
Copy link

I can contribute an enhanced version of x/image/tiff that features:

Right now it's all located at pdfcpu.
Please let me know if this is worth a PR.
Thank you!

@bsiegert
Copy link
Contributor

CCITT Fax compression seemed unlikely to be useful when this was written.

Sure, please send CLs to me :)

@hhrutter
Copy link

Kind of unreal to get feedback after 6 months.
@bsiegert are you one of the maintainers of this package?

@bsiegert
Copy link
Contributor

bsiegert commented Apr 13, 2019 via email

@hhrutter
Copy link

Excellent!

Please have a look at here and here and let me know how to proceed.

There are dependencies on compress/lzw which I improved and started a golang proposal for and
also we are going to need to find the right place for a ccitt decoder which hopefully soon gets an encoder as well.

Meanwhile I am busy resolving this pdfcpu issue.

@marcellanz
Copy link

Ah.... good to see progress here, I started some work for a t.4 and t.6 decoder (just for fun) as the issue seemed to be well asleep.

@bsiegert
Copy link
Contributor

@hhrutter Maybe let's start with CCITT Fax G4 compression. You say you have code for that?

Have you ever contributed to Go? If not, see https://golang.org/doc/contribute.html for how to get started. Send a CL for golang/image (golang.org/x/image) and start the commit message with "tiff: ". You can set @nigeltao and me as the reviewers.

Alternatively, you can start a pull request on github, which will create a Gerrit CL. This is a bit more clunky though.

@hhrutter
Copy link

hhrutter commented Apr 25, 2019

I have a rudimentary decoder for G3 and G4 that I can contribute.
It's basic because it is hard to find samples for all possible variations.
G4 is much more complete than G3 for the same reason.
I picked up all tiff files with embedded CCITT encoded images from the issues in this repo
and use them in testcases to prove correct decoding.

I did not write the encoder yet because these days CCITT G3/G4 encoding does not seem to have that much priority for users like CCITT decoding.

Please let me know if you accept a CCITT reader without the corresponding writer.
The main issue with this may be testing.

I am aware having a Reader/Writer pair enables writing nice tests.
Lacking a ccitt/writer for testcases I resorted to a workaround comparing the G3/G4 compressed binary byte streams as extracted from TIFF with reference PNG files that represent the result of a successful decoding of the very byte streams in the first place.

Another sub issue would be finding the right location for a ccitt package.
There is tiff/lzw but that is just a tmp hack in order to postpone revising compress/lzw if I understood correctly. There were already long discussions about this and I have prepared a solution in pdfcpu but here is not the right place to address this.

There is image/bmp. I believe it make sense to have image/ccitt.
I think CCITT compression is an important enough topic on its own not to hide it inside of image/tiff.

Other than that I think I am setup for contributing. I signed up with Gerrit a year ago although no contribution yet.

@nigeltao
Copy link
Contributor

Please let me know if you accept a CCITT reader without the corresponding writer.

Yes, that's fine. For example, golang.org/x/image/webp has a reader without a writer, and people still find it useful.

The main issue with this may be testing.

Yeah, that could be tricky. We'd probably want to check in some test files into the repo, but be aware of any copyright concerns with those files. Ideally we'd find some files with e.g. a Creative Commons or equally liberal license, or create your own test files (either manually, or transforming your own images through some other program such as ImageMagick).

I think CCITT compression is an important enough topic on its own not to hide it inside of image/tiff.

I'm not so familiar with CCITT. golang.org/x/image/ccitt might still be the right location, but do people ship 'CCITT files' or 'other files that embed CCITT somehow` that aren't TIFF files?

@marcellanz
Copy link

The main issue with this may be testing.

Yeah, that could be tricky. We'd probably want to check in some test files into the repo, but be aware of any copyright concerns with those files. Ideally we'd find some files with e.g. a Creative Commons or equally liberal license, or create your own test files (either manually, or transforming your own images through some other program such as ImageMagick).

I saw that LOC (Library of Congress) has some interest in TIFF preservation:
https://www.loc.gov/preservation/digital/formats/fdd/fdd000022.shtml

There are images available with different compression embedded:
ftp://ftp.loc.gov/pub/preservation/guidelines/images
but don't know about its copyright.

I tried to find also "reference files" from ITU-T specifically for T.4 and T.6 but didn't find any.

@hhrutter
Copy link

hhrutter commented Apr 26, 2019

Regarding the ccitt package location.
I would not put it inside of golang.org/x/image/tiff for the reason I already stated and also
it looks like this package needs a lot of work anyways. TIFF is such a flexible format with lots of options some of them still not implemented, better keep the package small as can be for now and leave room to grow and for logical extension. On the other hand I think it's a valid statement to say you are only ever going to need ccitt decoding/encoding within the tiff space but there are always exceptions. So this is my take - but I am not in a position to make the call.

Thanks for the pointer to the library of congress ftp server.
I found plenty of samples to stress test the decoder.
I also have access to a scanner that can produce PDFs with embedded ccitt group4 compressed images but from a testing point of view this would only be just one of many different ccitt encoders out there.

@bsiegert
Copy link
Contributor

@hhrutter No one said that it should be part of golang.org/x/image/tiff. golang.org/x/image/ccitt would be a good option if there are files in CCITT encoding that are not tiff (which I doubt). Otherwise, it can be golang.org/x/image/tiff/ccitt, for instance.

I don't understand these two bits:

and also it looks like this package needs a lot of work anyways

Is that a reason for not working on it?

better keep the package small as can be for now and leave room to grow and for logical extension

Are you suggesting that the tiff package should not be extended?

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/174139 mentions this issue: tiff: Add support for CCITT group 3/4 compression

@hhrutter
Copy link

To the contrary.
Of course it definitely should be extended.
There is just tons of work ahead and one needs to understand where I am coming from
and what has been discussed in this proposal.

Let me give you some background:
I created the ccitt decoder out of an urgent need for pdfcpu - a PDF processor.
PDF defines stream filters that allow object streams to be encoded in a certain way and the CCITTFaxDecode filter is one of them. PDF processors like pdfcpu implement these filters as this is an assumption for PDF reader/writers. If you take a look at here then you get a better picture.

The same applies to lzw. There is the PDF LZWDecode filter.
Since neither compress/lzw nor image/tiff/lzw works as laid out in the PDF spec
I created a seperate extension and put it under pdfcpu/lzw analog to pdfcpu/ccitt.
pdfcpu/lzw works for the standard case and both pdf and tiff.
Therefore pdfcpu has been providing an extension of image/tiff under pdfcpu/image/tiff
A version of tiff that does not need to have its own lzw reader and rather uses a central one that's
supposed to be an updated compress/lzw and in addition is also processes cmyk and it writes lzw.

When I say there is a lot of work needed on the tiff package one
example would be to get rid of tiff/lzw and rather have tiff use a consolidated compress/lzw.
The discussion is laid out in this proposal. but because of the Go 1 backwards compatibility promise extending compress/lzw was put on hold.

In order to make progress it really helps to get familiar with the structure laid out in pdfcpu. It goes like:

github.com/hhrutter/pdfcpu
ccitt
lzw
pkg
...
filter
...
tiff

My wishlist is to to get rid of:
github.com/hhrutter/pdfcpu/ccitt and instead import compress/ccitt
github.com/hhrutter/pdfcpu/lzw and instead import compress/lzw
github.com/hhrutter/pdfcpu/tiff and instead import x/image/tiff

also for the sake of a better project package structure.

I am not saying it has to be that way but this is how I envision it.
I just think it is important to understand these issues
and even if this github issue is only about supporting CCITT G3 and G4 compression for tiff images
one has to keep in mind the bigger picture.

For now I can see 2 issues that need to be addressed simultaneously:

  1. We need a separate location for ccitt compression (for PDF Readers/Writers)
  2. We need to support tiff with ccitt compression

Later we need to
3) Enhance compress/lzw for GIF, PDF and TIFF support.
4) Have image/tiff use this consolidated version of compress/lzw and get rid of image/tiff/lzw
5) Support lzw compression in addition to existing decompression

and on top of that I can contribute
5) CMYK support for image/tiff

And we definitely need to have the CCITT decoding/encoding in a seperate package from tiff.
This is because CCITT compression is a member of the filter set defined in the PDF spec for authors of PDF readers and writers.

I already submitted what I am proposing.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/174979 mentions this issue: ccitt: new package for CCITT images

gopherbot pushed a commit to golang/image that referenced this issue May 7, 2019
This initial commit only provides a decoder that generates modes and run
lengths from a bit stream. Future commits will add more features.

Updates golang/go#19443

Change-Id: I7bba1226a69a83e636e407e4d72ffd5630562e6b
Reviewed-on: https://go-review.googlesource.com/c/image/+/174979
Reviewed-by: Benny Siegert <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/182219 mentions this issue: ccitt: implement NewReader and DecodeIntoGray

gopherbot pushed a commit to golang/image that referenced this issue Jun 18, 2019
The testdata/bw-gopher.png image was derived from
testdata/bw-uncompressed.tiff

The testdata/*ccitt* data was generated from a custom CCITT encoder:
https://go-review.googlesource.com/c/image/+/174139/10/ccitt/writer.go

Updates golang/go#19443

Change-Id: I53b7acd807aa9a627517ede5bc3316d6f1fe4724
Reviewed-on: https://go-review.googlesource.com/c/image/+/182219
Reviewed-by: Benny Siegert <[email protected]>
@golang golang locked and limited conversation to collaborators Jun 21, 2020
GalaxyForcew added a commit to GalaxyForcew/A1bisshy that referenced this issue Jun 6, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
yi-ge3 added a commit to yi-ge3/wislie that referenced this issue Jun 6, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
balloontmz6 added a commit to balloontmz6/Likewise42l that referenced this issue Jun 6, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
snapbakkhfbav added a commit to snapbakkhfbav/SayedBaladohr that referenced this issue Oct 6, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
MiderWong5ddop added a commit to MiderWong5ddop/sidie88f that referenced this issue Oct 7, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
rorypeckwnt4v added a commit to rorypeckwnt4v/LearnByBhanuPrataph that referenced this issue Oct 7, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
egorovcharenko9 added a commit to egorovcharenko9/RiceBIOC470z that referenced this issue Oct 7, 2022
The algorithm is described at https://www.itu.int/rec/T-REC-T.6/en

Fixes golang/go#19443

Change-Id: Ib8a078ab43c78d1f58d2ac849ed455b05dc209e9
Reviewed-on: https://go-review.googlesource.com/c/image/+/174139
Reviewed-by: Benny Siegert <[email protected]>
Reviewed-by: Nigel Tao <[email protected]>
Run-TryBot: Benny Siegert <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

10 participants