Re-using the same DataView with in-memory Bitmaps breaks when fitting different models or running cross-validation on it #4126

### System information

- **OS version/distro**: Windows 10
- **.NET Version (e.g., dotnet --info)**: .NET Core 2.2

### Issue

- **What did you do?**
I had a working pipeline for training image classification with cross-validation on the previous ML.NET version, using file paths as input. Now that Bitmaps can be loaded directly, I am trying to set up a similar pipeline that trains and predicts from in-memory Bitmaps.
- **What happened?**
Training works if I just call `Fit` on the data:
`ITransformer mlModel = pipeline.Fit(trainData);`
but it fails if I use `CrossValidate` instead:
`var cvResults = _mlContext.MulticlassClassification.CrossValidate(trainData, pipeline, numberOfFolds);`
- **What did you expect?**
I expected a pipeline that works with `Fit` to also work with `CrossValidate`, but it seems that the multiple internal passes over the data do something to the Bitmaps (they appear to lose their data).

### Source code / logs
My current pipeline, based on [this sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/DeepLearning_ImageClassification_TensorFlow), is:
```C#
var pipeline = _mlContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(_mlContext.Transforms.ResizeImages(
        outputColumnName: TensorFlowModelSettings.inputTensorName,
        imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight,
        inputColumnName: nameof(ImageInputData.Image)))
    .Append(_mlContext.Transforms.ExtractPixels(
        outputColumnName: TensorFlowModelSettings.inputTensorName,
        interleavePixelColors: ImageSettings.channelsLast,
        offsetImage: ImageSettings.mean /*, inputColumnName: nameof(ImageInputData.Image)*/))
    .Append(_mlContext.Model.LoadTensorFlowModel(tensorFlowModelFilePath)
        .ScoreTensorFlowModel(
            outputColumnNames: new[] { TensorFlowModelSettings.outputTensorName },
            inputColumnNames: new[] { TensorFlowModelSettings.inputTensorName },
            addBatchDimensionInput: false))
    .Append(_mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(
        labelColumnName: "Label", featureColumnName: TensorFlowModelSettings.outputTensorName))
    .Append(_mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"))
    .AppendCacheCheckpoint(_mlContext);
```

The error log includes the following exceptions:

```
System.ArgumentException: Parameter is not valid.
   at System.Drawing.Image.get_Height()
   at Microsoft.ML.Transforms.Image.ImageResizingTransformer.Mapper.<>c__DisplayClass3_0.<MakeGetter>b__1(Bitmap& dst)
   at Microsoft.ML.Transforms.Image.ImagePixelExtractingTransformer.Mapper.<>c__DisplayClass5_0`1.<GetGetterCore>b__1(VBuffer`1& dst)
   at Microsoft.ML.Data.DataViewUtils.Splitter.InPipe.Impl`1.Fill()
   at Microsoft.ML.Data.DataViewUtils.Splitter.<>c__DisplayClass9_0.<SplitCore>b__1()
```

```
System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.ArgumentException: Parameter is not valid.
   at System.Drawing.Image.get_Height()
   at Microsoft.ML.Transforms.Image.ImageResizingTransformer.Mapper.<>c__DisplayClass3_0.<MakeGetter>b__1(Bitmap& dst)
   at Microsoft.ML.Transforms.Image.ImagePixelExtractingTransformer.Mapper.<>c__DisplayClass5_0`1.<GetGetterCore>b__1(VBuffer`1& dst)
   at Microsoft.ML.Data.DataViewUtils.Splitter.InPipe.Impl`1.Fill()
   at Microsoft.ML.Data.DataViewUtils.Splitter.<>c__DisplayClass9_0.<SplitCore>b__1()
   --- End of inner exception stack trace ---
   at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes)
   at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore()
   at Microsoft.ML.Data.RootCursorBase.MoveNext()
   at Microsoft.ML.Data.DataViewUtils.Splitter.<>c__DisplayClass5_1.<ConsolidateCore>b__2()
```

This is my first issue here, and I apologize if I overlooked something. I found no posts about this error anywhere.
