This is my implementation of the CLIP (Contrastive Language-Image Pretraining) model.
- `clip_deployment/`: Contains a Gradio app that lets you upload an image and returns the top 5 matching captions from a pre-stored caption list.
- `CLIP_training.ipynb`: A Jupyter notebook that implements the entire training process of the CLIP model.
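
The notebook follows the usual CLIP recipe: encode image-caption batches into a shared embedding space and train with a symmetric contrastive loss. As a rough illustration only (the stand-in linear "encoders", dimensions, and temperature value below are assumptions, not the notebook's actual code), one training step looks roughly like this:

```python
# Minimal sketch of a CLIP-style contrastive training step.
# The linear layers are stand-ins for the real image/text encoders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCLIP(nn.Module):
    def __init__(self, image_dim=2048, text_dim=768, embed_dim=256):
        super().__init__()
        self.image_proj = nn.Linear(image_dim, embed_dim)      # stand-in image encoder
        self.text_proj = nn.Linear(text_dim, embed_dim)        # stand-in text encoder
        self.logit_scale = nn.Parameter(torch.tensor(2.659))   # learnable log-temperature

    def forward(self, image_feats, text_feats):
        # Project both modalities into the shared space and L2-normalize.
        img = F.normalize(self.image_proj(image_feats), dim=-1)
        txt = F.normalize(self.text_proj(text_feats), dim=-1)
        # Pairwise cosine similarities, scaled by the temperature.
        return self.logit_scale.exp() * img @ txt.t()

def clip_loss(logits):
    # Matching image/caption pairs sit on the diagonal, so the target
    # for row i (and column i) is class i; the loss is symmetric.
    targets = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# One training step on a random batch of pre-extracted features.
model = ToyCLIP()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
image_feats = torch.randn(8, 2048)   # e.g. vision-backbone features for 8 images
text_feats = torch.randn(8, 768)     # e.g. text-encoder features for the 8 captions
loss = clip_loss(model(image_feats, text_feats))
loss.backward()
optimizer.step()
```
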
- Training: Open and run the `CLIP_training.ipynb` notebook to train the CLIP model on your dataset.
- Deployment: Inside the `clip_deployment` folder, launch the Gradio app to interact with the trained model by uploading images and getting the most relevant captions.
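
At a high level, the app embeds the uploaded image with the trained model, compares it against pre-computed embeddings of the stored captions, and returns the five most similar ones. A hedged sketch of that flow (the `get_image_embedding` helper, the caption store, and the random embeddings below are placeholders, not the repo's actual code) could look like:

```python
# Hypothetical sketch of the caption-retrieval Gradio app; the embedding
# helper and caption store are placeholders for the repo's own code.
import gradio as gr
import torch
import torch.nn.functional as F

captions = ["a dog running on the beach", "a plate of pasta", "a city skyline at night"]
# In the real app these would come from the trained CLIP text encoder.
caption_embeddings = F.normalize(torch.randn(len(captions), 256), dim=-1)

def get_image_embedding(image):
    # Placeholder: the real app would preprocess `image`, run the trained
    # image encoder, and L2-normalize the result.
    return F.normalize(torch.randn(1, 256), dim=-1)

def top_captions(image):
    img_emb = get_image_embedding(image)
    sims = (img_emb @ caption_embeddings.t()).squeeze(0)   # cosine similarities
    k = min(5, len(captions))
    best = sims.topk(k).indices.tolist()
    return "\n".join(captions[i] for i in best)

demo = gr.Interface(fn=top_captions, inputs=gr.Image(type="pil"), outputs="text")

if __name__ == "__main__":
    demo.launch()
```

Launching the app starts a local web UI where you can drop in an image and read back the top matching captions.
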
Feel free to explore and experiment with this project!