Add multimodal example #6313

@mscheong01

Feature Description

Add example for multimodal capabilities

Motivation

#5882 removed the multimodal features from the server. Since this is a highly requested feature, the plan is to reintroduce it at some point (#6168). How about we set up a solid multimodal example elsewhere first and then port it to the server example later on?

Possible Implementation

The implementation could be based on the code removed in https://github.com/ggerganov/llama.cpp/pull/5882/files, which had already implemented this feature in the server.cpp example, hopefully with some performance optimizations.
For the example, the image file could be provided via a command-line option.
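
As a rough illustration of the command-line idea, here is a minimal sketch of how an example program might accept an image path and load the raw file bytes before handing them to an image encoder. The flag name `--image` and the helper names `parse_image_arg` and `read_file_bytes` are assumptions made for this sketch, not part of any existing llama.cpp API, and the multimodal encoding step itself is left out.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: looks for "--image <path>" in the argument list.
// Returns true and stores the path in image_path if the flag is present
// and followed by a value; returns false otherwise.
bool parse_image_arg(int argc, const char * const * argv, std::string & image_path) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::string(argv[i]) == "--image") {
            image_path = argv[i + 1];
            return true;
        }
    }
    return false;
}

// Hypothetical helper: reads the whole file into memory as raw bytes.
// Returns false if the file cannot be opened.
bool read_file_bytes(const std::string & path, std::vector<std::uint8_t> & out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    return true;
}
```

An example's `main` could call `parse_image_arg(argc, argv, path)` first, print a usage message if it returns false, and then pass the bytes from `read_file_bytes` to whatever image-embedding routine the example ends up using.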

Labels: enhancement (New feature or request), llava (LLaVa and multimodal)
