Support for Rotary Embeddings for Llama #1885

Closed
bradleat opened this issue Mar 22, 2023 · 3 comments

@bradleat

It looks like Llama uses an unsupported embedding scheme:

https://nn.labml.ai/transformers/rope/index.html

I'm opening this thread so we can have a conversation about how to support these embeddings within langchain. I'm happy to help, but my knowledge is limited.
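
(Added for context, not part of the original issue.) A rotary embedding is not a separate output vector; it rotates pairs of dimensions of the query/key vectors by an angle proportional to the token position, so the position is encoded in the relative phase of queries and keys. A minimal numpy sketch, assuming the "split-halves" dimension pairing (implementations differ in how they pair dimensions):

import numpy as np

def rope(x, pos, base=10000.0):
    # Rotate dimension pairs (i, i + d/2) of vector x by pos * theta_i,
    # where theta_i = base ** (-2*i / d).
    d = x.shape[-1]
    half = d // 2
    theta = base ** (-np.arange(half) / half)
    cos, sin = np.cos(pos * theta), np.sin(pos * theta)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

# Toy example: rotate an 8-dim query vector that sits at position 3.
q = np.arange(8, dtype=float)
print(rope(q, pos=3))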

@pachacamac

pachacamac commented Mar 26, 2023

I got the recently merged embeddings PR of llama.cpp working; it outputs a 5121-element vector. What would be the best way to plug this output into LangChain? Should I take one of the existing embedding classes and base it off that, or am I missing something? I mean, this works, but it's obviously terrible:

from typing import List
import subprocess

from langchain.embeddings.base import Embeddings
from pydantic import BaseModel


class ShittyLlamaEmbeddings(Embeddings, BaseModel):
    def _get_embedding(self, text: str) -> List[float]:
        # Shell out to llama.cpp's ./embedding example; since no shell is
        # involved, the prompt does not need extra quoting.
        result = subprocess.run(
            ["./embedding", "-m", "./models/13B/ggml-model-q4_0.bin", "-t", "8", "-p", text],
            cwd="/wd/llama.cpp",
            capture_output=True,
            text=True,
            check=True,
        )
        # The binary prints the embedding as whitespace-separated floats.
        return [float(n) for n in result.stdout.split()]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._get_embedding(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._get_embedding(text)


if __name__ == "__main__":
    sle = ShittyLlamaEmbeddings()
    print(sle.embed_query("house"))
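
For what it's worth, once a class implements embed_documents and embed_query it can be passed to any LangChain vector store. A minimal sketch (assuming the class above works and faiss-cpu is installed):

from langchain.vectorstores import FAISS

# Assumes the ShittyLlamaEmbeddings class above and a local faiss-cpu install.
emb = ShittyLlamaEmbeddings()
db = FAISS.from_texts(["a red house", "a blue car"], embedding=emb)
print(db.similarity_search("house", k=1))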

@A-ML-ER

A-ML-ER commented Mar 29, 2023

How do you convert the Llama structure into the FasterTransformer structure?
It seems to have 32 layers with LlamaRotaryEmbedding?

@dosubot

dosubot bot commented Sep 10, 2023

Hi, @bradleat! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is about adding support for Rotary Embeddings in Llama. You mentioned that you wanted to discuss how to incorporate these embeddings into langchain and that you are willing to assist, although your knowledge is limited. In the comments, user "pachacamac" mentioned that they got the recently merged embeddings working and asked for guidance on how to plug the output into langchain. User "A-ML-ER" also asked about converting Llama structure into Faster transformer structure.

Based on the comments, it seems that progress has been made towards resolving the issue. User "pachacamac" mentioned that they got the recently merged embeddings working.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

dosubot added the stale label on Sep 10, 2023
dosubot closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 18, 2023
dosubot removed the stale label on Sep 18, 2023