Information Retrieval System

A text retrieval system implementing various information retrieval models with a Streamlit-based user interface.

Features

- Multiple retrieval models:
  - Boolean Retrieval Model
  - Vector Space Model (VSM)
  - Latent Semantic Analysis (LSA)
  - Combined Model (Boolean + Vector)
- Interactive search interface
- Document statistics and visualizations
- Model evaluation metrics
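
To illustrate what a Vector Space Model does, here is a minimal sketch of TF-IDF weighting with cosine-similarity ranking. It is not the code in src/models/vsm_model.py: the names tokenize, build_tfidf, cosine, and rank, the whitespace tokenizer, and the smoothed IDF formula (log(N/df) + 1) are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_tfidf(docs):
    """Return (per-document TF-IDF vectors, IDF table) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumed IDF variant: log(N / df) + 1, so no term gets zero weight.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    vectors, idf = build_tfidf(docs)
    # Query terms that never occur in the collection are ignored.
    q_vec = {
        t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf
    }
    scores = [(i, cosine(q_vec, v)) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "information retrieval system",
    "vector space model",
    "boolean retrieval model",
]
ranking = rank("vector model", docs)
```

Here ranking lists the second document first, because it shares both query terms, while the first document shares none and scores 0.0.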

Installation

Clone this repository:

git clone https://github.com/dangvonguyen/IR-CS419.P21.git
cd IR-CS419.P21

Set up the environment and install dependencies with either pip or uv:

# Using pip
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .

# Using uv (faster installation)
uv sync
source .venv/bin/activate

Usage

Run the application using Streamlit:

streamlit run app.py

Using the Application

1. Load Data: Use the sidebar to select a data source and its parameters
2. Select Model: Choose between the Boolean, VSM, LSA, and Combined retrieval models
3. Search: Enter queries in the Search tab to retrieve relevant documents
4. Analyze: View document statistics and model performance in the Statistics tab
5. Browse: View the loaded documents in the Documents tab
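
As an illustration of the kind of model performance figures an evaluation step can report, here is a minimal sketch of precision, recall, and average precision. Which metrics src/evaluate.py actually computes is not stated here, so the function names and the choice of metrics are assumptions. The sketch also assumes that each document ID appears at most once in a ranked list.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    relevant_set = set(relevant)
    hits, total = 0, 0.0
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant_set:
            hits += 1
            total += hits / position
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant_set) if relevant_set else 0.0


ranked_ids = [1, 2, 3, 4]    # hypothetical system output, best match first
relevant_ids = [2, 4, 5]     # hypothetical ground-truth judgments
p, r = precision_recall(ranked_ids, relevant_ids)
ap = average_precision(ranked_ids, relevant_ids)
```

With these inputs, two of the four retrieved documents are relevant, so precision is 0.5 and recall is 2/3. Averaging the average precision over many queries gives mean average precision (MAP).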

Project Structure

- app.py: Main Streamlit application
- src/models/: Implementation of retrieval models
  - boolean_model.py: Boolean retrieval with inverted index
  - vsm_model.py: Vector Space Model with TF-IDF
  - lsa_model.py: Latent Semantic Analysis model
  - combined_model.py: Combined Boolean and Vector model
- src/utils.py: Utility functions for text processing
- src/evaluate.py: Evaluation metrics for retrieval models
- ui/: User interface components
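
For readers unfamiliar with the idea behind Boolean retrieval with an inverted index, here is a minimal sketch. It does not reproduce boolean_model.py: the function names, the whitespace tokenizer, and the use of list positions as document IDs are assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Assumed tokenization: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    """Documents that contain every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def boolean_or(index, terms):
    """Documents that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


def boolean_not(index, term, n_docs):
    """Documents that do not contain the term."""
    return set(range(n_docs)) - index.get(term.lower(), set())


docs = [
    "information retrieval system",
    "vector space model",
    "boolean retrieval model",
]
index = build_inverted_index(docs)
both = boolean_and(index, ["retrieval", "model"])
either = boolean_or(index, ["vector", "boolean"])
without = boolean_not(index, "retrieval", len(docs))
```

Because each posting list is a set, AND, OR, and NOT reduce to set intersection, union, and difference. A combined model could, for example, use such a Boolean filter to narrow the candidates before ranking them by vector similarity, though whether combined_model.py works this way is not stated here.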
