Support for Rotary Embeddings for Llama #1885

Closed
bradleat opened this issue Mar 22, 2023 · 3 comments

@bradleat

It looks like Llama uses an unsupported embedding scheme:

https://nn.labml.ai/transformers/rope/index.html

I'm opening this thread so we can have a conversation about how to support these embeddings within langchain. I'm happy to help, but my knowledge is limited.
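
(Added for context, not part of the original issue.) A rotary embedding is not a separate output vector; it rotates pairs of dimensions of the query/key vectors by an angle proportional to the token position, so the position is encoded in the relative phase of queries and keys. A minimal numpy sketch, assuming the "split-halves" dimension pairing (implementations differ in how they pair dimensions):

import numpy as np

def rope(x, pos, base=10000.0):
    # Rotate dimension pairs (i, i + d/2) of vector x by pos * theta_i,
    # where theta_i = base ** (-2*i / d).
    d = x.shape[-1]
    half = d // 2
    theta = base ** (-np.arange(half) / half)
    cos, sin = np.cos(pos * theta), np.sin(pos * theta)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

# Toy example: rotate an 8-dim query vector that sits at position 3.
q = np.arange(8, dtype=float)
print(rope(q, pos=3))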

@pachacamac

pachacamac commented Mar 26, 2023

I got the recently merged embeddings PR of llama.cpp working; it outputs a 5121-element vector. What would be the best way to plug this output into LangChain? Should I take one of the existing embedding classes and base it off that, or am I missing something? I mean, this works, but it's obviously terrible:

from typing import List
import subprocess

from langchain.embeddings.base import Embeddings
from pydantic import BaseModel


class ShittyLlamaEmbeddings(Embeddings, BaseModel):
    def _get_embedding(self, text: str) -> List[float]:
        # Shell out to llama.cpp's ./embedding example; since no shell is
        # involved, the prompt does not need extra quoting.
        result = subprocess.run(
            ["./embedding", "-m", "./models/13B/ggml-model-q4_0.bin", "-t", "8", "-p", text],
            cwd="/wd/llama.cpp",
            capture_output=True,
            text=True,
            check=True,
        )
        # The binary prints the embedding as whitespace-separated floats.
        return [float(n) for n in result.stdout.split()]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._get_embedding(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._get_embedding(text)


if __name__ == "__main__":
    sle = ShittyLlamaEmbeddings()
    print(sle.embed_query("house"))
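
For what it's worth, once a class implements embed_documents and embed_query it can be passed to any LangChain vector store. A minimal sketch (assuming the class above works and faiss-cpu is installed):

from langchain.vectorstores import FAISS

# Assumes the ShittyLlamaEmbeddings class above and a local faiss-cpu install.
emb = ShittyLlamaEmbeddings()
db = FAISS.from_texts(["a red house", "a blue car"], embedding=emb)
print(db.similarity_search("house", k=1))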

@A-ML-ER

A-ML-ER commented Mar 29, 2023

How do you convert the Llama structure into the FasterTransformer structure?
It seems to have 32 layers with LlamaRotaryEmbedding?

@dosubot

dosubot bot commented Sep 10, 2023

Hi, @bradleat! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is about adding support for Rotary Embeddings in Llama. You mentioned that you wanted to discuss how to incorporate these embeddings into langchain and that you are willing to assist, although your knowledge is limited. In the comments, user "pachacamac" mentioned that they got the recently merged embeddings working and asked for guidance on how to plug the output into langchain. User "A-ML-ER" also asked about converting Llama structure into Faster transformer structure.

Based on the comments, it seems that progress has been made towards resolving the issue. User "pachacamac" mentioned that they got the recently merged embeddings working.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

dosubot added the stale label on Sep 10, 2023
dosubot closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 18, 2023
dosubot removed the stale label on Sep 18, 2023