- Contributed to an end-to-end RAG pipeline including document ingestion, semantic chunking, embeddings, and retrieval logic.
- Built evaluation scripts using RAGAS to measure faithfulness, context precision, and context recall.
- Helped structure metadata, citation extraction, and document versioning for clearer traceability.
- Assisted with debugging retrieval issues across embeddings, vector search, and chunk quality.
- Collaborated in a multi-developer team using Git, Jira, and code reviews.
Data Science • Machine Learning • Cloud Analytics
Final-semester Master of Data Science student at RMIT. I enjoy working on machine learning, data analytics, and cloud-based projects. Currently exploring retrieval systems, forecasting, and customer analytics. Open to graduate opportunities from Dec 2025.
About Me
I'm a Master of Data Science student at RMIT graduating in December 2025.
My interests are in applying machine learning, analytics, and cloud technologies to solve practical problems. I've worked on projects involving retrieval systems, customer analysis, forecasting, and full-stack development.
I enjoy learning new tools, solving problems, understanding data, and building things that are useful. I'm looking for opportunities where I can grow, contribute, and apply what I've learned in real-world projects.
Professional Experience
Delivered data-driven insights and automated reporting solutions to improve business decision-making and operational efficiency.
- Enhanced stakeholder engagement: Developed and delivered 5+ interactive dashboards using Tableau and Excel, resulting in a ~10% increase in client engagement metrics.
- Optimized data infrastructure: Refactored SQL queries and ETL processes, reducing report generation latency by ~25% and enabling faster executive decision-making.
- Automated manual workflows: Built Python-based web scraping and reporting automation using BeautifulSoup, eliminating ~30% of manual data collection effort and improving data freshness.
Featured Projects
Solara Healthcare RAG System
Built components of a RAG pipeline including ingestion, semantic chunking, embeddings, retrieval logic, and evaluation using RAGAS. Worked on improving context precision and recall while ensuring clean metadata and traceable citations.
Customer Segmentation & RFM Analysis
Conducted comprehensive customer segmentation analysis for an Australian retail dataset, identifying a high-value segment representing ~15% of total revenue. Implemented data quality improvements and RFM analysis to enable targeted marketing strategies.
- Applied K-Means clustering, choosing the number of clusters with the elbow method
- Performed data cleaning and validation to correct invalid order IDs
- Built interactive Power BI dashboards for stakeholder visualization
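As an illustration of the RFM step, the sketch below derives recency, frequency, and monetary values per customer from a list of orders. It is a minimal example under stated assumptions: the order records, the `rfm_table` helper, and the reference date are hypothetical and not taken from the project dataset.

```python
from datetime import date

# Hypothetical order records: (customer_id, order_date, amount).
ORDERS = [
    ("C1", date(2024, 1, 5), 120.0),
    ("C1", date(2024, 3, 1), 80.0),
    ("C2", date(2023, 11, 20), 40.0),
    ("C3", date(2024, 2, 14), 300.0),
    ("C3", date(2024, 3, 10), 150.0),
    ("C3", date(2024, 3, 25), 50.0),
]


def rfm_table(orders, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    table = {}
    for customer_id, order_date, amount in orders:
        days_ago = (as_of - order_date).days
        # First order seen for a customer starts from (days_ago, 0, 0.0).
        recency, frequency, monetary = table.get(customer_id, (days_ago, 0, 0.0))
        table[customer_id] = (
            min(recency, days_ago),  # days since the most recent order
            frequency + 1,           # number of orders
            monetary + amount,       # total spend
        )
    return table


rfm = rfm_table(ORDERS, as_of=date(2024, 3, 31))
for customer_id, values in sorted(rfm.items()):
    print(customer_id, values)
```

Features like these would typically be scaled before being passed to a clustering algorithm such as K-Means, since recency, frequency, and spend sit on very different ranges.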
Climate Policy Forecasting Dashboard
Developed predictive models for climate policy analysis using SARIMA time-series forecasting, achieving a mean absolute error of ~1.2°C. Deployed an interactive Flask dashboard with geospatial mapping to support policy decision-making.
- Implemented SARIMA models with seasonal decomposition and stationarity testing
- Built interactive Flask web application with Folium geospatial visualizations
- Designed intuitive UI for non-technical policy stakeholders
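For context on the error metric, mean absolute error is the average absolute gap between observed and forecast values, expressed in the same unit as the data. The sketch below shows the calculation on hypothetical numbers; the sample temperatures and the `mean_absolute_error` helper are illustrative assumptions, not the project's data or code.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute difference between observed and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical held-out temperatures (°C) and model forecasts.
observed = [14.0, 15.5, 17.0, 16.0]
predicted = [15.0, 15.0, 18.5, 17.0]

mae = mean_absolute_error(observed, predicted)
print(f"MAE: {mae:.2f} °C")
```

Because the metric stays in degrees Celsius, it is usually easy to explain to non-technical readers.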
Sales Performance Analytics Platform
Automated reporting infrastructure for four years of sales data, enabling real-time insights. Regional performance and trend analysis identified opportunities for a ~7% revenue uplift.
- Designed and implemented ETL pipelines for automated data processing
- Created comprehensive Power BI dashboards with drill-down capabilities
- Performed regional analysis to identify growth opportunities
Cloud Music Subscription Platform
Built a scalable cloud-based music streaming platform using Java servlets, AWS DynamoDB, and Apache2. Implemented user authentication, playlist management, and streaming capabilities with cloud-native architecture.
Real Estate Data Quality Pipeline
Developed an automated data-cleaning pipeline for real estate datasets, implementing address parsing, duplicate detection, and schema validation to improve data quality and reliability in MySQL-based systems.
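One common way to approach duplicate detection on addresses is to normalize each address and group records that share the same normalized form. The sketch below illustrates that idea only; the abbreviation map, helper names, and sample listings are hypothetical and do not reflect the pipeline's actual rules.

```python
import re

# Hypothetical abbreviation map; a real pipeline would need a fuller list.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def normalize_address(raw):
    """Lowercase, strip punctuation, and expand common street abbreviations."""
    # Keep word characters, whitespace, and "/" (used in unit numbers like 4/7).
    tokens = re.sub(r"[^\w\s/]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def find_duplicates(addresses):
    """Group raw addresses that share the same normalized form."""
    groups = {}
    for raw in addresses:
        groups.setdefault(normalize_address(raw), []).append(raw)
    return [group for group in groups.values() if len(group) > 1]


listings = [
    "12 Smith St, Carlton",
    "12 smith street carlton",
    "4/7 High Rd., Kew",
    "9 Park Ave, Richmond",
]
duplicates = find_duplicates(listings)
print(duplicates)
```

Exact matching on a normalized key is deliberately simple; it will miss typos and misread ambiguous abbreviations, so fuzzy matching may be needed on messier data.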
Technical Expertise
Technical Insights & Learnings
A practical guide to implementing end-to-end RAG systems, with a focus on ingestion hygiene, comprehensive metadata management, and consistent evaluation approaches.
Insights on designing end-to-end ETL pipelines and effectively communicating data insights to non-technical stakeholders through intuitive visualizations and actionable metrics.
Best practices for communicating model uncertainty, assumptions, and limitations when presenting time-series forecasts to policy makers and decision-makers in high-stakes environments.
Let's Connect
I'm actively seeking full-time and contract opportunities in Data Science, Machine Learning Engineering, and Analytics starting December 2025. I'd love to discuss how I can contribute to your team and continue learning in a professional environment.