[FEAT] Add MobileViT v1 & v2 #6404
Comments
Looks great @yassineAlouini. It would be great to get this implementation. Please have a read of #5319, where we document some best practices for model authoring. Also, to avoid licensing problems, let's do a from-scratch implementation.
Perfect, I will work on this today, but mostly next week and the week after. Will let you know how my progress goes. 👌
I have started the implementation. Seems like a big chunk but excited to do it. 👌 I have found this huggingface implementation, which could be useful as another inspiration: https://huggingface.co/docs/transformers/main/model_doc/mobilevit. [EDIT] It looks like this is a wrapper around the https://github.com/apple/ml-cvnets implementation. 👌
Hi @yassineAlouini. Just wanted to touch base on the implementation. Any blockers, or do you need help?
Hello @datumbox, thanks for checking. So far, so good. It is taking a bit longer since I have only had one day to work on it. It is paused for now, but I might work on it during weekends and nights. Do you have a target date for finishing? 🤔
Hey @yassineAlouini, sounds good. Thanks for the work. There are absolutely no deadlines on our side; just checking that everything goes smoothly and that you don't have a blocker. Let me know if you need anything :)
An update @datumbox: I will have some free time over the next few days and should make some progress. Will let you know how it goes. 👌
By the way, what are the PyTorch and TorchVision policies on the usage of einops? 🤔
@yassineAlouini So far we don't have a model using this. Is there a specific use-case in MobileViT that can't be done otherwise?
I don't think it is irreplaceable, just wanted to check what the best practice is in torchvision. 👌
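On the einops question: the patch unfolding and folding that a MobileViT-style block needs can be expressed with plain `reshape` and `permute`. The sketch below is illustrative only. The function names `unfold_patches` and `fold_patches` and the `(batch, pixels_per_patch, num_patches, channels)` layout are assumptions made for this example, not code from the PR or from torchvision.

```python
import torch


def unfold_patches(x: torch.Tensor, patch_h: int, patch_w: int) -> torch.Tensor:
    """Split (B, C, H, W) into non-overlapping patches.

    Returns (B, patch_h * patch_w, num_patches, C). Hypothetical helper,
    roughly the einops pattern 'b c (nh ph) (nw pw) -> b (ph pw) (nh nw) c'.
    """
    b, c, h, w = x.shape
    if h % patch_h != 0 or w % patch_w != 0:
        raise ValueError("H and W must be divisible by the patch size")
    nh, nw = h // patch_h, w // patch_w
    # (B, C, nh, ph, nw, pw)
    x = x.reshape(b, c, nh, patch_h, nw, patch_w)
    # (B, ph, pw, nh, nw, C)
    x = x.permute(0, 3, 5, 2, 4, 1)
    return x.reshape(b, patch_h * patch_w, nh * nw, c)


def fold_patches(x: torch.Tensor, patch_h: int, patch_w: int, h: int, w: int) -> torch.Tensor:
    """Inverse of unfold_patches: back to (B, C, H, W)."""
    b, _, _, c = x.shape
    nh, nw = h // patch_h, w // patch_w
    # (B, ph, pw, nh, nw, C)
    x = x.reshape(b, patch_h, patch_w, nh, nw, c)
    # (B, C, nh, ph, nw, pw)
    x = x.permute(0, 5, 3, 1, 4, 2)
    return x.reshape(b, c, h, w)


# Round trip on a small dummy tensor.
x = torch.arange(2 * 3 * 4 * 6, dtype=torch.float32).reshape(2, 3, 4, 6)
patches = unfold_patches(x, patch_h=2, patch_w=2)
restored = fold_patches(patches, patch_h=2, patch_w=2, h=4, w=6)
```

Using only built-in tensor ops avoids an extra dependency; `reshape` copies when the permuted tensor is not contiguous, so no explicit `.contiguous()` call is needed here.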
One additional question regarding the |
@yassineAlouini Makes sense. Let's start by copy-pasting and modifying and see what changes are needed. Then we can decide whether sharing components is worth it. :)
Some more progress @datumbox: I finally got V1 working (I think). I am cleaning up the code a bit and will then push it for a first round of reviews (to make sure I am on the right track).
Alright, I have tried running: |
@yassineAlouini thanks! I've responded on the PR, let's continue the discussion there. :)
Thanks @datumbox (et al.) for the code review, I am checking now. 👌
@yassineAlouini Given the license of ImageNet, there is no way for us to redistribute it. So I think we might have to wait for them to respond. :(
Thanks for the feedback @pmeier. I thought 10 days was a long time. 😄
@yassineAlouini We are trying to work something out with @pmeier. He will ping you by email. I also pinged two of the people involved with ImageNet on Twitter to see if they can help. We'll work something out. 🤞
There is a copy on Kaggle. https://www.kaggle.com/c/imagenet-object-localization-challenge/
Thanks for the link @gau-nernst, but it is a smaller dataset if I am not mistaken.
@yassineAlouini I believe it is the ImageNet-1k split that most people commonly refer to as "the ImageNet dataset" (used in ILSVRC). It should be the correct one. Otherwise, HuggingFace is also hosting ImageNet-1k here: https://huggingface.co/datasets/imagenet-1k
Thanks for the feedback and the link @gau-nernst. Isn't the "real" dataset the 22k one? Anyway, I will give it a try with the smaller one once I have time.
🚀 The feature
As described in the RFC "Batteries Included, phase 3", I am working on adding MobileViT v1 and v2 inspired by the following code repos/snippets:
The original paper can be found here.
Motivation, pitch
This has been decided in the RFC.
Alternatives
No response
Additional context
No response
cc @datumbox