Skip to content

Conversation

ctrueden
Copy link
Member

This is useful to pass images to APIs that work with collections, such as Guava's com.google.common.math.Quantiles.

See also this post on the Image.sc Forum.

@ctrueden
Copy link
Member Author

ctrueden commented Mar 1, 2019

A couple of notes.

  1. I just noticed @acardona wrote a similar thing on the collection branch. But it doesn't look like a PR was ever filed. Also, that work always maps IterableInterval<T> and RandomAccessibleInterval<T> to the same type Collection<T>, without a way to bake in a conversion function. But if this PR merges, and @acardona is happy with it, then we should probably also delete the collection branch, since it would not be necessary anymore.

  2. There is a subtle (to me 😉) bug in the current implementation, relating to images with non-flat iteration order. You can see the problem by adding the following code to the beginning of the testDoubleList() method:

    final Img< DoubleType > imgDouble = new CellImgFactory<>( new DoubleType() ).create( 30, 40, 50 );
    double v = 0; for ( final DoubleType t : imgDouble ) t.set( v++ );

    The issue is that this code uses IntervalIndexer to map between 1D and nD coordinates, but that function uses flat iteration order. So for things like CellImg, the original image iteration order (not flat) does not match the 1D rasterization order of the wrapped collection (flat). Is there an efficient way to introspect a CellImg to discern its 1D-to-nD ordering? If not, I am not sure how to solve this problem. Do we feel that the natural iteration order of the Img necessarily needs to match the natural iteration order of its wrapped List?

Copy link
Contributor

@maarzt maarzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this commit. Enabling the conversions:

  1. InterableInterval -> Collection
  2. RandomAccessibleInterval -> List

is nice. But I don't like that we need to add so many methods. We would need to add even more methods for each new type, and collection class supported. I would definitely drop the support for boxed Byte and Short.
One could avoid having so many methods, be using some builder pattern. Or providing a type converter as parameter just like Converters.convert(image, converter, type).

Another thougth: This doesn't seem to be so much imglib2-core to me. The functionality is comparable to ImageJFunctions.wrap(...) in imglib2-ij. I personally would consider adding it to algorithms.

public final class FlatCollections
{

public static < T, E > Collection< E > collection( final IterableInterval< T > ii, final Function< T, E > converter )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think it would be nice to use image as a parameter name rather than ii.
  2. The method would benefit from a javadoc.

return collection( ii, t -> t.getBigInteger() );
}

public static < T, E > List< E > list( final RandomAccessibleInterval< T > ii, final Function< T, E > converter )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. First parameter must not be named ii. Use image instead.
  2. Javadoc would be nice

}
}

private static int size32( final long size )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Name size32 doesn't make much sense. sizeAsInt would be better.

return collection( ii, t -> t.get() );
}

public static Collection< Byte > byteCollection( final IterableInterval< ByteType > ii )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is not necessary. It's performance and memory consumption is probably the same as integerCollection. And ByteType is also not very widely used.

return collection( ii, t -> t.getIntegerLong() );
}

public static Collection< Short > shortCollection( final IterableInterval< ShortType > ii )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed integerCollection could be used instead

return list( ii, t -> t.get() );
}

public static List< Byte > byteList( final RandomAccessibleInterval< ByteType > ii )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove method. integerList can be used instead.

return list( ii, t -> t.getIntegerLong() );
}

public static List< Short > shortList( final RandomAccessibleInterval< ShortType > ii )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove method. integerList can be used instead.


private final Img< ShortType > imgShort = ArrayImgs.shorts( new short[] { 32767, 32766, 32765, 32764, 32763, 10000, 10001, 10002, 10003, -32768, -32767, -32766 }, 2, 3, 2 );

private final Img< UnsignedLongType > imgBig = ArrayImgs.unsignedLongs( new long[] { 2 * Long.MAX_VALUE, 3 * Long.MAX_VALUE, 4 * Long.MAX_VALUE, Long.MIN_VALUE, Long.MIN_VALUE / 2, Long.MIN_VALUE / 3 }, 3, 2 );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not test that many values. Even testing one value per type should be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no performance issue with these tests. Testing more edge cases is good. I see no benefit in cutting down these tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also see no performance issue. My points are these:

  1. The tests could be simpler. A simple test, makes maintenance easier. When it fails, finding the bug is easier.
  2. The test implement the same algorithm as the tested code. If there's a bug with for example the usage of IntervalIndexer, the test want show that, because it would compare the wrong result against an equally wrong expected result. So having a test, which is simple enough to not use for example IntervalIndexer is actually a benefit.
  3. There are better places to put test getRealDouble and similar methods. See PR Simple Conversion between RealTypes #249 fixes a bug in Unsigned128BitInteger.getRealDouble and also adds tests.

Copy link
Contributor

@maarzt maarzt Mar 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other side. The tests can stay as the are they are good enough.

@Test
public void testCollection()
{
assertImageEqualsCollection( imgString, FlatCollections.collection( imgString, s -> s ), Objects::equals );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be instead:

List<String> pixels = Array.asList("A", "B","C","D");
Img<String> img = new ListImg(pixels, 2, 2);
List<String> result = FlatCollections.collection(img, s -> s);
assertCollectionsEquals(pixels, result);

@Test
public void testIntegerCollection()
{
assertImageEqualsCollection( imgByte, FlatCollections.integerCollection( imgByte ), ( e, a ) -> e.getInteger() == a );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be simpler, if we test only for one value:

assertCollectionEquals( Arrays.asList(42), FloatCollections.integerCollection(ArrayImgs.bytes(byte[]{42},1)) );

This is useful to pass images to APIs that work with collections,
such as Guava's com.google.common.math.Quantiles.

See also:
https://forum.image.sc/t/imglib2-development-a-question-about-img-t/23186/8
@ctrueden ctrueden force-pushed the collection-wrappers branch from 6c9196c to 2f58e5a Compare March 5, 2019 20:31
@ctrueden
Copy link
Member Author

ctrueden commented Mar 5, 2019

Thanks for the review, @maarzt!

I force-pushed an update with the following improvements:

  1. I added javadoc to all public methods.
  2. I renamed the ii parameters to image.

I did not (yet) change either of the following:

  • Request to remove the byte and short signatures.
  • Request to make the testing more minimal.

I think it is a mistake to cut down the testing. Fewer edge cases would be covered. The tests as written are simple to understand and not burdensome to maintain.

Regarding removal of the byte and short signatures: generally speaking, I agree with you that less code is better. Every line of code is a liability, as they say. However, in this case, there are 8 primitive types, and each of them (except char) corresponds well to a particular subset of the ImgLib2 type hierarchy, as written here. In my opinion, removing byte and short here would make only make this code more confusing to newcomers, who would see that most but not all of the primitive types have corresponding signatures, and wonder why not. These methods are a convenience, only one line each, so near zero burden to maintain. But replicating them via the general method is potentially error-prone, because each one-line conversion embeds the correct method call (e.g. t.get() vs. .getInteger() depending on the type) to access the underlying data most directly.

Finally, I disagree that this utility class would be better served in imglib2-algorithm, for the following reasons:

  • Other conversion methods such as Converters and Views exist in imglib2 core, not imglib2-algorithm.
  • The imglib2-algorithm library brings in many dependencies, none of which are necessary for FlatCollections.

If it is decided that this utility class is too "bloaty" for ImgLib2 core, I'd favor just putting it into ImageJ Ops instead. But I think it would be a shame to reduce its visibility that way.

@maarzt
Copy link
Contributor

maarzt commented Mar 6, 2019

After thinking about it a little bit more. I agree that this PR is mostly fine as it is. Tests are ok, having all the methods is ok, having it as part of imglib2 is fine for me too. But I still have some worries:

  • About Byte and Short. I see how supporting byte and short is more inviting to newcomers. At the same time I'm worried, because we only support the rarely used signed ByteType and not the much more common UnsignedByteType. So supporting ByteType but not the UnsignedByteType seems to be even more confusing than only having IntegerType???
  • More javadoc is needed: The javadoc needs to state that these methods have low performance, because of there bad memory footprint related to the boxed primitives. And that the collection don't guaranty any particular iteration order.

ctrueden added 2 commits March 6, 2019 10:38
They only work for ByteType and ShortType respectively,
which are not very frequently used. And it is rare in practice to need
specifically a Collection or List of these primitive types. They can
still be done by calling the general methods with a custom function.
For Collection, the order will always match the IterableInterval.
For List, the order will always be flat.
@ctrueden
Copy link
Member Author

ctrueden commented Mar 6, 2019

I understand what you are saying about the tests. The way they are written, they are regression tests. They aim to be a little more thorough (exercising multiple methods of Collection and List), at the expense of layers of indirection with the helper methods and reuse of images. If you want to rewrite them to your liking, go for it. I won't be offended. But I don't have time personally.

I pushed two more commits: one removing the byte and short wrappers, and one adding a note to the javadoc about the iteration order.

Regarding performance of the wrapped collections, there should be virtually no memory overhead. Yes, there is boxing. However, it's not like ArrayList<Double> where there will be size() Double objects in existence simultaneously. If you really want me to benchmark it I will, but I do not expect the memory use to be too bad. Only time performance potentially impacted compared to primitives, as transiently creating all those boxed objects takes CPU time too.

@ctrueden ctrueden requested a review from maarzt March 6, 2019 17:10
maarzt added 3 commits March 11, 2019 13:43
The typing can be relaxed a bit. Now it's possible to use
methods in generic code, without a type variable:
 Img< ? extends RealType< ? > > image = ...;
 List< Double > list = FlatCollections.doubleList( image );

Previously the method call required a type variable <T extends RealType<T>>:
 Img< T > image = ...;
 List< Double > list = FlatCollections.doubleList( image );
Copy link
Member

@tpietzsch tpietzsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ready to merge

Don't use IntervalIndexer it makes the test unnecessary complicated.
Call different methods to test the list.
Avoid number overflow when specifying values for image of UnsignedLongType.
@maarzt maarzt merged commit 73852cd into master Mar 14, 2019
@tpietzsch tpietzsch deleted the collection-wrappers branch March 14, 2019 14:37
ctrueden referenced this pull request in ctrueden/sandbox Mar 14, 2019
This provides utility classes converting from II<RealType> and
RAI<RealType> to Collection<Double> and List<Double>, respectively.
@ctrueden
Copy link
Member Author

Thanks @maarzt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants