Data Scientist | AI/ML Engineer | Generative AI Specialist
Here is the work I'm most proud of...
Abstract: Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in the T2I research, detailing their collection methods, compositions, semantic and syntactic diversity of prompts...
D. Potdar, T. Bavalatti et al. “A Systematic Review of Open Datasets used in Text-to-Image (T2I) Gen AI Model Safety,” IEEE Access, Jan. 2025
It was an absolute honor to share the stage with distinguished leaders, including the President of Duke University, the President of Durham Tech, and the Superintendent of Durham Public Schools. Our panel, 'Perspectives on Collaborative Projects,' explored the challenges and successes of working across disciplines and sectors to tackle pressing regional issues.
With 57 schools and 32,000 enrolled students, Durham Public Schools (DPS) faced a significant challenge: how can we pinpoint which students need the most financial support to succeed in school?
I’m a Data Scientist with 5 years of experience delivering end-to-end machine learning applications and a passion for building scalable AI solutions where safety is a priority, not an afterthought. I’ve taken multi-million dollar machine learning projects from prototyping to production, blending technical expertise with a knack for translating complex AI into real business impact. For me, AI isn’t just about algorithms; it’s about solving real problems, driving decisions, and making technology work for people.
A selection of my top projects in AI & Data Science
Duke ProfMatch is an AI-powered academic discovery tool that leverages Graph Retrieval-Augmented Generation (RAG) to help Duke students find professors aligned with their research interests. It integrates entity extraction, knowledge graph construction, and vector search using Neo4j to deliver intelligent faculty recommendations. The platform features an interactive graph-based UI, enhancing research exploration and engagement.
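To illustrate the vector-search part of that pipeline, here is a minimal sketch of ranking faculty by cosine similarity between a query embedding and stored profile embeddings. It is not the ProfMatch code: the project uses Neo4j for this step, while the sketch below is plain Python, and the names `cosine_similarity`, `rank_professors`, and the toy three-dimensional embeddings are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_professors(query_vec, professor_vecs, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in professor_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; a real system would get these from an embedding model.
professors = {
    "Prof. A": [0.9, 0.1, 0.0],
    "Prof. B": [0.1, 0.8, 0.1],
    "Prof. C": [0.7, 0.3, 0.0],
}
query = [1.0, 0.0, 0.0]  # stands in for an embedded student research interest
top_matches = rank_professors(query, professors)
```

In a Graph RAG setup, the top matches would typically be expanded through the knowledge graph (co-authors, departments, topics) before being passed to the language model as context.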
Interpretable X-Ray Classification leverages the Neural Prototype Tree (ProtoTree) to enhance model transparency in chest X-ray diagnostics. Unlike traditional CNN-based models that act as black boxes, ProtoTree integrates decision tree-based interpretability into its deep learning pipeline, enabling clinicians to understand why a model reaches a diagnosis.
When extracting information from documents, traditional object detection and NLP approaches fail to develop a semantic understanding of the document. This project provides code to convert structured documents to graphs using Optical Character Recognition and a GCN implementation in TensorFlow. Read more on Towards Data Science (a Medium publication).
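As a rough illustration of the idea, the sketch below turns OCR bounding boxes into a graph and applies one graph-convolution step. It rests on several assumptions: it is plain Python rather than the project's TensorFlow code, the edge rule (connect boxes whose centers lie within `max_dist`) is a simplification chosen for brevity, and the helper names `build_adjacency` and `gcn_layer` are hypothetical.

```python
import math


def build_adjacency(boxes, max_dist):
    """Connect boxes whose centers are within max_dist; every node gets a self-loop.

    Each box is (x0, y0, x1, y1), as an OCR engine might report it.
    """
    n = len(boxes)
    centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop
        for j in range(i + 1, n):
            if math.dist(centers[i], centers[j]) <= max_dist:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(D^-1/2 A D^-1/2 H W), where adj already has self-loops."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbor features, then apply the linear map and ReLU.
    agg = [
        [sum(norm[i][k] * features[k][f] for k in range(n)) for f in range(n_in)]
        for i in range(n)
    ]
    return [
        [max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_in))) for o in range(n_out)]
        for i in range(n)
    ]


# Hypothetical OCR output: two adjacent words and one distant word.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (100, 100, 110, 110)]
adjacency = build_adjacency(boxes, max_dist=20)
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy per-word features
identity_weights = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights
hidden = gcn_layer(adjacency, node_features, identity_weights)
```

After this step the two neighboring words share averaged features while the isolated word keeps its own, which is the mechanism that lets a GCN use layout context when labeling document fields.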