All Projects
A comprehensive archive of my machine learning architectures, multimodal AI pipelines, and dataset engineering tools.
DhritiOCR
Document-Level OCR System for Malayalam
DhritiOCR is a text-extraction web UI and ML pipeline for Malayalam documents. Its custom recognition model, fine-tuned for Malayalam, achieves state-of-the-art performance in the 80M-parameter class and overcomes the inherent structural challenges of low-resource scripts.
ADAPT
Audio Data Annotation & Preprocessing
A high-performance CLI pipeline for generating TTS and ASR datasets. It ingests raw audio or YouTube streams and outputs clean, diarized, transcribed data, using CUDA or ROCm for hardware acceleration.
Clara
Cybersecurity Anomaly & Risk Assessor
An intelligent security partner designed to accelerate vulnerability detection through advanced code comprehension. Post-trained on a refined PrimeVul dataset to identify complex security weaknesses that traditional tools overlook.
M-Synth
High-Fidelity Synthetic OCR Dataset Generator for Indic Scripts
A high-quality OCR dataset generation toolkit designed to accelerate vision-model training. It automatically curates character-level balanced datasets with highly adjustable visual augmentations, using a custom rendering pipeline that solves complex font-breakage issues across multiple scripts.
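As a rough illustration of what character-level balancing can mean, the sketch below greedily picks text samples whose characters are currently the least represented. It is not M-Synth's actual implementation; the function name `pick_balanced` and the scoring rule are assumptions made for this example.

```python
from collections import Counter


def pick_balanced(candidates, n):
    """Greedily choose up to n strings so that character counts stay even.

    Hypothetical helper for illustration only, not part of M-Synth.
    """
    counts = Counter()
    chosen = []
    pool = [s for s in candidates if s]  # skip empty strings
    for _ in range(min(n, len(pool))):
        # Score each candidate by the average count of its characters so far;
        # a lower score means it contributes rarer characters.
        best = min(pool, key=lambda s: sum(counts[c] for c in s) / len(s))
        pool.remove(best)
        chosen.append(best)
        counts.update(best)
    return chosen, counts


chosen, counts = pick_balanced(["aa", "ab", "bb", "cc"], 3)
print(chosen)
print(dict(counts))
```

Note that Python iterates strings by code point. For Indic scripts, where conjuncts and vowel signs span several code points, a real balancer would likely count grapheme clusters instead.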
bspwm Dotfiles
Custom bspwm Configuration & Scripts
A vanilla-compatible, laptop-optimized rebuild of gh0stzk's bspwm-dotfiles. Focuses on stability, debloating, and custom shell integrations for dynamic power-profile and refresh-rate switching.
rEFInd-Synthwave
Custom rEFInd UEFI Boot Manager Theme
A premium, retro-futuristic theme for the rEFInd UEFI boot manager. Features custom hand-crafted icons, vibrant neon gradients, and a robust shell backup utility to transform your system's boot experience.