Image Classification Benchmark #2379

Open
@justinormont

Description

@Anipik: We could make a good benchmark for the image processing pipeline.
I'd recommend using the Dog Breeds vs. Fruits dataset, which we used in NimbusML for its image examples. We currently host this dataset on our CDN for NimbusML.

In Python, the dataset / image loader looks like:

import pandas as pd

# Load the image summary table (TSV) from the CDN
url = "https://express-tlcresources.azureedge.net/datasets/DogBreedsVsFruits/DogFruitWiki.SHUF.117KB.735-rows.tsv"
df_train = pd.read_csv(url, sep="\t", nrows=100)
# Build the full URL of each image from its relative path
df_train['ImagePath_full'] = "https://express-tlcresources.azureedge.net/datasets/DogBreedsVsFruits/" + \
                         df_train['ImagePath']
# ... load images
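
For a benchmark harness that should not depend on pandas, the same path-joining step could be sketched with the standard library alone. This is only an illustration under assumptions: the helper names `build_image_urls` and `fetch_bytes` are hypothetical, the TSV is assumed to have a header row containing an `ImagePath` column (as the snippet above implies), and the `Label` column and the two sample rows are made up for the offline demonstration.

```python
import csv
import io
import urllib.request

# Base URL taken from the snippet above.
BASE_URL = "https://express-tlcresources.azureedge.net/datasets/DogBreedsVsFruits/"


def build_image_urls(tsv_text, base_url=BASE_URL, limit=None):
    """Return full image URLs from TSV text that has an 'ImagePath' column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    urls = []
    for row in reader:
        # Stop early once the requested number of rows has been read.
        if limit is not None and len(urls) >= limit:
            break
        urls.append(base_url + row["ImagePath"])
    return urls


def fetch_bytes(url, timeout=30):
    """Download one resource and return its raw bytes (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Offline demonstration with a made-up two-row table (not real dataset rows).
sample_tsv = "Label\tImagePath\ndog\timages/a.jpg\nfruit\timages/b.jpg\n"
sample_urls = build_image_urls(sample_tsv, limit=1)
```

With network access, `build_image_urls(fetch_bytes(url).decode("utf-8"), limit=100)` would play the role of the `nrows = 100` read above, assuming the file is UTF-8 encoded.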

The dataset is intended for example code and includes ~775 images of dogs and fruit.

(copied from PR -- #2372 (review))

Metadata

Assignees: No one assigned

Labels: P2 (Priority of the issue for triage purpose: Needs to be fixed at some point.)
