- Build and maintain automated pipelines to collect and structure publicly available data (APIs + website/RSS sources) into analysis-ready datasets.
- Apply data quality controls (deduplication, normalisation, timestamp validation, and QA flags) and document rules/assumptions to keep outputs reliable.
- Develop text-processing workflows to convert unstructured content into consistent fields for trend and theme analysis over time.
- Produce stakeholder-friendly summaries of “what changed / why it matters / what to do next”, and iterate based on feedback to improve signal-to-noise.
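As a flavour of the quality controls above (deduplication, timestamp validation, QA flags), here is a minimal sketch in plain Python; the field names (`title`, `url`, `published`) and flag labels are hypothetical, not taken from a real pipeline.

```python
import hashlib
from datetime import datetime

def clean_records(records):
    """Deduplicate, normalise, and QA-flag a batch of raw records.

    Illustrative sketch only: field names and flag labels are hypothetical.
    """
    seen = set()
    cleaned = []
    for rec in records:
        title = (rec.get("title") or "").strip()
        url = (rec.get("url") or "").strip()
        # Content hash for exact-duplicate detection.
        key = hashlib.sha256(f"{title}|{url}".encode()).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        flags = []
        if not title:
            flags.append("missing_title")
        # Timestamp validation: keep only parseable ISO-8601 dates.
        ts = rec.get("published")
        try:
            parsed = datetime.fromisoformat(ts) if ts else None
        except (TypeError, ValueError):
            parsed = None
            flags.append("bad_timestamp")
        if parsed is None and "bad_timestamp" not in flags:
            flags.append("missing_timestamp")
        cleaned.append({"title": title, "url": url,
                        "published": parsed, "qa_flags": flags})
    return cleaned
```

Keeping the flags on each record (rather than dropping bad rows) is what makes the QA rules auditable downstream.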
Data Analytics & Data Science — Reliable Data, Clear Insights
Master of Data Science graduate (RMIT, Dec 2025) with internship experience across data ingestion, cleaning/validation, and stakeholder reporting. I build analysis-ready datasets and dashboards, and apply ML when it improves decisions — with a strong focus on data quality, reproducibility, and practical delivery.
About Me
I’m a Master of Data Science graduate from RMIT (Dec 2025), based in Melbourne.
I work across the full analytics lifecycle: collect data → clean/validate → structure it into report-ready tables → analyse trends/patterns → communicate insights clearly to stakeholders. I care a lot about accuracy, traceability, and repeatable outputs.
Recent work includes healthcare document ingestion (PDF/HTML), metadata governance and deduplication, database design in PostgreSQL, and building QA/audit checks to improve trust in reporting.
I’m comfortable collaborating with both technical and non-technical teams, clarifying requirements, and documenting assumptions so the data stays usable over time.
Professional Experience
- Built ingestion pipelines for a healthcare content library (PDF/HTML), including metadata governance and SHA-256 deduplication to improve dataset reliability.
- Developed automated download and parsing workflows with retries and content-type handling to improve consistency for downstream analysis and retrieval.
- Designed PostgreSQL data structures for content (metadata + embeddings) with indexing patterns, and exposed curated datasets via a FastAPI service with basic automated tests (Pytest).
- Implemented tenant-aware access controls using PostgreSQL Row-Level Security (RLS) to enforce segregation across multi-hospital deployments.
- Built QA and audit tooling (flag logging, CSV exports, citation audit harness, labelled evaluation dataset) and tracked quality trends using reproducible metrics.
- Delivered 5+ stakeholder dashboards (Tableau, Excel) and performed source-to-dashboard cross-checks to improve metric consistency.
- Optimised SQL query performance, reducing data retrieval time by ~25% for executive reporting.
- Automated reporting workflows with Python (BeautifulSoup), cutting manual effort by ~30%.
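The download workflow with retries and content-type handling mentioned above can be sketched roughly as follows; `fetch` here is a stand-in for any HTTP client call returning a content type and body, and the routing logic is illustrative, not the production implementation.

```python
import time

def fetch_with_retries(fetch, url, retries=3, backoff=0.5):
    """Call fetch(url) with exponential-backoff retries, then route by type.

    Sketch only: `fetch` is any callable returning (content_type, body);
    the real pipeline's client and error handling will differ.
    """
    last_err = None
    for attempt in range(retries):
        try:
            content_type, body = fetch(url)
            # Route by content type so PDFs and HTML land in the right parser.
            kind = "pdf" if "pdf" in content_type else "html"
            return kind, body
        except Exception as err:
            last_err = err
            # Exponential backoff: 0.5s, 1s, 2s, ...
            time.sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"giving up on {url}") from last_err
```

Separating the retry policy from the parser keeps transient network failures from polluting the downstream dataset.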
Featured Projects
Selected work across Business Intelligence, analytics delivery, and applied data systems. Each project links directly to the proof (repo / live demo).
Sales & Customer Performance Dashboards (Tableau | Superstore)
Built an end-to-end Tableau BI dashboard focused on executive KPIs and performance diagnostics. The Sales dashboard includes KPI tiles for Sales/Profit/Quantity with monthly sparklines and min/max markers, sub-category bar-in-bar comparisons (current year vs previous year), and weekly trend views with average reference lines. Interactivity includes a Select Year parameter, a show/hide filter panel, and dashboard navigation.
Healthcare RAG Evaluation (Sanitized)
A sanitized case study documenting the design decisions and evaluation artifacts from the healthcare RAG work, without sharing any proprietary code.
Australian Retail Customer Segmentation (Python + Power BI)
Segmented Australian retail customers using RFM-style features and clustering, then delivered insights through a Power BI dashboard built on a customer-level dataset.
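The RFM feature step can be sketched as below; the field names (`customer_id`, `date`, `amount`) are hypothetical, and in the actual project these features feed a clustering step before landing in Power BI.

```python
from datetime import date

def rfm_features(transactions, as_of):
    """Compute per-customer Recency/Frequency/Monetary features.

    Sketch with hypothetical field names; a clustering step (e.g. k-means)
    would run on these features afterwards.
    """
    feats = {}
    for t in transactions:
        f = feats.setdefault(t["customer_id"],
                             {"last": date.min, "freq": 0, "monetary": 0.0})
        f["last"] = max(f["last"], t["date"])   # most recent purchase
        f["freq"] += 1                          # transaction count
        f["monetary"] += t["amount"]            # total spend
    return {
        cid: {"recency_days": (as_of - f["last"]).days,
              "frequency": f["freq"],
              "monetary": round(f["monetary"], 2)}
        for cid, f in feats.items()
    }
```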
Sales Analytics Dashboard (Power BI + MySQL)
Power BI sales dashboard packaged with a MySQL database dump so the dataset can be restored locally and the model reproduced.
Australian Climate Forecast Dashboard (Flask + SARIMAX + Folium)
Flask dashboard that generates on-demand weekly SARIMAX forecasts for Australian locations and renders an interactive map.
Real Estate Data Cleaning (MySQL)
Data cleaning workflow for the Nashville housing dataset using MySQL statements (date standardization, address parsing, duplicate removal).
Technical Expertise
Python (pandas, BeautifulSoup), SQL (PostgreSQL, MySQL), FastAPI, Pytest, Flask, Tableau, Power BI, Excel, time-series forecasting (SARIMAX), and clustering/segmentation for ML-driven analysis.
Let’s Connect
Open to full-time and contract roles in Data Analytics / Data Science, with a focus on reliable data systems, data quality, and decision-ready reporting.