I'm passionate about building AI systems that actually solve real problems. Currently working on OCR pipelines and RAG architectures as a Google Summer of Code 2025 contributor at Extralit. When I'm not training models, you'll find me annoying my cat or playing my guitar.
class Priyankesh:
    def __init__(self):
        self.role = "AI/ML Engineer"
        self.current_work = [
            "Google Summer of Code 2025 - Extralit OCR Pipeline",
            "Deep Learning & Generative AI Research",
            "Production-grade RAG Systems"
        ]
        self.tech_stack = {
            "languages": ["Python", "SQL", "R", "JavaScript"],
            "ai_ml": ["PyTorch", "TensorFlow", "Scikit-learn", "Transformers"],
            "data": ["Pandas", "NumPy", "Power BI", "Elasticsearch"],
            "deployment": ["Streamlit", "Hugging Face", "Docker", "Redis"]
        }

    def current_projects(self):
        return [
            "Deep Research AI Agent with 90%+ citation accuracy",
            "Production OCR pipeline with Table-Transformer models",
            "AI-powered web scraper processing 1000+ URLs concurrently"
        ]
Core Stack: Python • PyTorch • TensorFlow • Scikit-learn
Data & ML Ops: Pandas • Redis • Elasticsearch • Streamlit
Deployment: Docker • Hugging Face • Power BI • Git
Some Cool Wins
- Google Summer of Code 2025 - Extralit OCR Pipeline
- Amazon ML Summer School - Top 3.5% (10K+ applicants)
- UST D3code Hackathon - National Winner (8K+ teams)
- Industrial Ideathon 2025 - 1st Runner Up (Awarded by Delhi CM)
Multi-stage AI system that processes 50+ web sources simultaneously and generates research reports in minutes instead of hours. Built with custom agents and MCP integration, it achieves 90%+ citation accuracy.
Concurrent data pipeline processing 1000+ URLs with 95% extraction accuracy. Integrates multiple LLM APIs (GPT-4, Gemini, Llama) with smart Pydantic schemas.
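Here's a minimal sketch of the bounded-concurrency idea behind a pipeline like this, not the actual project code. It assumes a stubbed `fetch_stub` in place of real HTTP requests, a naive `parse_title` in place of an LLM extraction call, and a standard-library dataclass `PageRecord` standing in for a Pydantic schema; all of these names are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PageRecord:
    """Hypothetical stand-in for a Pydantic schema describing one page."""
    url: str
    title: str


async def fetch_stub(url: str) -> str:
    """Placeholder for a real HTTP request; returns fake HTML."""
    await asyncio.sleep(0)  # yield control, as real network I/O would
    return f"<title>Page for {url}</title>"


def parse_title(html: str) -> str:
    """Naive title extraction, standing in for an LLM extraction call."""
    start = html.find("<title>") + len("<title>")
    end = html.find("</title>")
    return html[start:end]


async def scrape_one(url: str, limit: asyncio.Semaphore) -> PageRecord:
    async with limit:  # cap the number of in-flight requests
        html = await fetch_stub(url)
    return PageRecord(url=url, title=parse_title(html))


async def scrape_all(urls: list[str], max_concurrency: int = 50) -> list[PageRecord]:
    limit = asyncio.Semaphore(max_concurrency)
    # gather preserves input order, so results line up with urls
    return await asyncio.gather(*(scrape_one(u, limit) for u in urls))


def run_demo(n: int = 1000) -> list[PageRecord]:
    urls = [f"https://example.com/{i}" for i in range(n)]
    return asyncio.run(scrape_all(urls))
```

The semaphore is what keeps 1000+ URLs from turning into 1000+ simultaneous connections: every task is scheduled up front, but only `max_concurrency` of them fetch at any moment.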
Fine-tuned DialoGPT-medium (345M parameters) with 7-message context windowing. Deployed on Hugging Face with optimized inference and conversation memory.
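The 7-message context windowing can be sketched roughly like this. It's an illustration under assumptions, not the deployed code: the `ConversationWindow` class and its method names are hypothetical, and it only builds the prompt string, with turns joined by the `<|endoftext|>` token that DialoGPT uses as its turn separator.

```python
from collections import deque


class ConversationWindow:
    """Keeps only the most recent messages as model context."""

    def __init__(self, max_messages: int = 7, eos_token: str = "<|endoftext|>"):
        # A bounded deque silently drops the oldest message once full.
        self.messages = deque(maxlen=max_messages)
        self.eos_token = eos_token

    def add(self, text: str) -> None:
        """Record a user or bot message, oldest first."""
        self.messages.append(text)

    def build_prompt(self) -> str:
        """Join the retained messages, ending each with the separator token."""
        return "".join(message + self.eos_token for message in self.messages)
```

Capping the window keeps the prompt length predictable, at the cost of the model forgetting anything said more than seven messages ago.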
Always up for discussing AI, open source, or just geeking out about the latest ML papers. Hit me up!