Eval Process

Models

- Test-set generation model (default: gpt-4-turbo)
- Vector store embedding model (default: text-embedding-3-large)
- Model to be tested
- Model evaluating the answers, i.e. the judge (default: gpt-4-turbo)
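
The four roles above can be kept together in one small configuration object. The following is only a sketch: the class name `EvalModels` and its field names are hypothetical and not taken from the repository; only the default model names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvalModels:
    """Hypothetical container for the four model roles (names are assumptions)."""

    testset_generator: str = "gpt-4-turbo"
    embedding_model: str = "text-embedding-3-large"
    judge: str = "gpt-4-turbo"
    # The model to be tested has no default in the list above.
    model_under_test: Optional[str] = None

    def missing_roles(self) -> list[str]:
        """Return the names of roles that have not been assigned a model yet."""
        return [name for name, value in vars(self).items() if value is None]


config = EvalModels(model_under_test="ollama/deepseek-r1:7b")
print(config.missing_roles())
```

Keeping the model under test without a default makes it harder to start an evaluation run against the wrong model by accident.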

Questions

- How many parameters do we want to use?
- Do we want to use multiple judges?
- What kinds of vector stores do we want to use?
- Ragas persona generator reference: https://docs.ragas.io/en/stable/howtos/customizations/testgenerator/_persona_generator/
- Do we want to detect hallucinations?

Documents

- Created from the input
- Saved as a pickle (.pkl) file
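
Saving and reloading the documents as a pickle file could look roughly like this. It is a minimal sketch using only the standard library; the function names and the shape of a document (a plain dict here) are assumptions, not the repository's actual code.

```python
import pickle
from pathlib import Path


def save_documents(documents, path):
    """Write the document list to a .pkl file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(documents, fh)


def load_documents(path):
    """Read the document list back. Only unpickle files you created yourself."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

Unpickling can execute arbitrary code, so the .pkl files should only ever come from a trusted source.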

Vector store

DocArrayInMemorySearch
- created from docs
ChromaDBFactory
- needs to be created with name
- or loaded by name

# Create a document from URL sources
curl -X POST "http://localhost:9876/documents/" \
-H "Content-Type: application/json" \
-d '{
    "name": "web-scraping-006",
    "sources": [
        {
            "name": "products",
            "type": "url",
            "url": "https://web-scraping.dev/products"
        },
        {
            "name": "reviews",
            "type": "url",
            "url": "https://web-scraping.dev/reviews"
        }
    ],
    "embedding_model": "openai/text-embedding-3-large"
}'

# Create a document with uploaded files (using JSON body)
curl -X POST "http://localhost:9876/documents/" \
-H "Content-Type: application/json" \
-d '{
    "name": "document-with-files",
    "embedding_model": "ollama/nomic-embed-text:latest",
    "file_ids": [1, 2]
}'
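
The same document requests can be built from Python with the standard library. This is a sketch under assumptions: the helper names are made up for illustration, and treating `sources` and `file_ids` as optional fields is inferred only from the two curl examples above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9876"  # host and port taken from the curl examples


def build_post(path, payload):
    """Build (but do not send) a JSON POST request for the given API path."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def document_payload(name, embedding_model, sources=None, file_ids=None):
    """Assemble the JSON body; optional fields are only included when given."""
    payload = {"name": name, "embedding_model": embedding_model}
    if sources is not None:
        payload["sources"] = sources
    if file_ids is not None:
        payload["file_ids"] = file_ids
    return payload


request = build_post(
    "/documents/",
    document_payload(
        "document-with-files",
        "ollama/nomic-embed-text:latest",
        file_ids=[1, 2],
    ),
)

# To actually send it (requires the service to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```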

Example model identifiers (provider/model): openai/gpt-4-turbo, ollama/deepseek-r1:7b
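
The identifiers in the requests follow a provider/model pattern. A minimal sketch of splitting such an identifier is shown here; `split_model_id` is a hypothetical helper, and how the service itself parses these strings is not documented in this README.

```python
def split_model_id(model_id):
    """Split 'provider/model' at the first slash and reject incomplete identifiers."""
    provider, separator, model = model_id.partition("/")
    if not separator or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


print(split_model_id("ollama/deepseek-r1:7b"))
```

Splitting only at the first slash keeps tags such as `:7b` or `:latest` attached to the model name.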

# Create a test set with OpenAI models
curl -X POST "http://localhost:9876/testsets/" \
-H "Content-Type: application/json" \
-d '{
    "model_type": "openai/gpt-4-turbo",
    "embedding_model": "openai/text-embedding-3-large",
    "document": 17,
    "name": "story-001-001",
    "num_questions": 10,
    "agent_description": "A chatbot answering questions about a story"
}'

# Create a test set with Ollama models
curl -X POST "http://localhost:9876/testsets/" \
-H "Content-Type: application/json" \
-d '{
    "model_type": "ollama/deepseek-r1:32b",
    "embedding_model": "ollama/nomic-embed-text:latest",
    "document": 135,
    "name": "d4f-100doc-odeep32bnomic-001-001",
    "num_questions": 50,
    "agent_description": "A chatbot that answers questions for the startup Develop 4 Future."
}'

# Run the evaluation with the same Ollama model as evaluated LLM and judge
curl -X POST "http://localhost:9876/process/" \
-H "Content-Type: application/json" \
-d '{
    "llm_to_be_evaluated_type": "ollama/deepseek-r1:32b",
    "judge_llm_type": "ollama/deepseek-r1:32b",
    "testset": 36
}'

# Run the evaluation with the same OpenAI model as evaluated LLM and judge
curl -X POST "http://localhost:9876/process/" \
-H "Content-Type: application/json" \
-d '{
    "llm_to_be_evaluated_type": "openai/gpt-4-turbo",
    "judge_llm_type": "openai/gpt-4-turbo",
    "testset": 36
}'
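
To compare several models on the same test set, the request bodies for /process/ can be generated in a loop. This is a sketch: `process_payloads` is a hypothetical helper, and using one fixed judge for all candidates is a design choice of this example (both curl examples above use the evaluated model as its own judge).

```python
def process_payloads(testset_id, candidate_models, judge_model):
    """Build one /process/ request body per candidate model."""
    return [
        {
            "llm_to_be_evaluated_type": model,
            "judge_llm_type": judge_model,
            "testset": testset_id,
        }
        for model in candidate_models
    ]


payloads = process_payloads(
    testset_id=36,
    candidate_models=["ollama/deepseek-r1:32b", "openai/gpt-4-turbo"],
    judge_model="openai/gpt-4-turbo",
)
for body in payloads:
    print(body)
```

A shared judge keeps the scores of different candidates comparable, at the cost of any bias that judge has.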

Models/embeddings needed

Documents

- Vectorstore

Testset

- KnowledgeBase for question generation

Eval

- Vectorstore
- KnowledgeBase
- Model to be evaluated
- Evaluator

Embedding dimensions must match, so test-set generation and evaluation use the same embedding model as the document they are based on.

- Vectorstore => embeddings
- KnowledgeBase => LLM and embeddings
- Evaluator => LLM
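
The mapping above, together with the rule that embeddings must match, can be written down as a small check. This is an illustrative sketch only: the dictionary, the function names, and the idea of comparing model identifiers as strings are assumptions, not the repository's implementation.

```python
# Which kinds of model each component needs, as listed above.
REQUIRED_MODELS = {
    "Vectorstore": {"embeddings"},
    "KnowledgeBase": {"llm", "embeddings"},
    "Evaluator": {"llm"},
}


def missing_models(component, provided):
    """Return the model kinds a component still needs, sorted for stable output."""
    return sorted(REQUIRED_MODELS[component] - set(provided))


def check_same_embedding(document_embedding, other_embedding):
    """Raise if a test set or evaluation uses a different embedding model than the document."""
    if document_embedding != other_embedding:
        raise ValueError(
            f"embedding mismatch: document uses {document_embedding!r}, "
            f"got {other_embedding!r}"
        )


print(missing_models("KnowledgeBase", ["llm"]))
check_same_embedding("openai/text-embedding-3-large", "openai/text-embedding-3-large")
```

Comparing identifiers is stricter than comparing dimensions, since two different models can share a dimension while producing incompatible vectors.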
