Hello, I'm

Vaishnavi Bhamare

Master's student in Advanced Data Analytics

I love using data to solve real-world problems and turning messy datasets into clear, actionable insights. Skilled in Artificial Intelligence, Machine Learning, Python, SQL, and Power BI, I enjoy building ETL pipelines, dashboards, and predictive models that help stakeholders make confident decisions.

education

University of North Texas, Denton, Texas

Master of Science – Advanced Data Analytics · GPA 4.0/4.0

Expected Graduation: May 2026

Veermata Jijabai Technological Institute (VJTI), Mumbai

Bachelor of Technology – Electrical Engineering · GPA 3.8/4.0

Graduated: May 2024

experience

Teaching Assistant · University of North Texas

Oct 2024 – Present · Texas, United States

Supporting a class of 60+ students by assisting with instruction, classroom coordination, and weekly course operations.
Grading assignments, quizzes, projects, and exams using standardized rubrics to ensure accuracy and fairness.
Guiding students through assignments, troubleshooting issues, and improving hands-on learning.

Research Assistant · University of North Texas

Jan 2025 – Present · Texas, United States

Developing SQL reports and BI dashboards to support research and operational decisions.
Designing data collection and transformation pipelines to improve reporting accuracy and efficiency.
Orchestrating Git-based analytics workflows for reproducible models and experiments.

Data Analytics Intern · TATA Power, Dharavi Receiving Station

Feb 2024 – Jul 2024 · Mumbai, India

Automated Python/SQL data pipelines, improving processing speed by 60% and enabling near real-time transformer monitoring for 50K+ devices.
Performed anomaly detection and trend analysis on transformer performance and reliability.
Proposed automation-focused improvements to reduce manual reporting and error risk.

Analytics Intern · COE Lab, VJTI

May 2023 – Jan 2024 · Mumbai, India

Implemented SQL/Python ETL pipelines to ingest and transform 100K+ grid simulation records, improving data readiness for modeling.
Enhanced integration workflows for large-scale power system simulations.
Built analytical reports summarizing performance trends across operating scenarios.

projects

YouTube Trending Video Analytics Pipeline

SQL · Python (Pandas, SQLite) · Streamlit · Matplotlib

Engineered a robust ETL pipeline to ingest, clean, and process 37,000+ semi-structured video records from YouTube trending datasets.
Integrated SQL with Python scripts to automate data extraction and transformation across multiple categories and time periods.
Developed a real-time analytics dashboard using Streamlit to monitor spikes in views, likes, and category trends, reducing manual reporting time by 80%.
Optimized SQL queries with indexed tables and efficient joins, improving performance by ~40%.

Flight Pricing Forecast

Python · Random Forest · ANOVA · Seaborn

Cleaned and prepared 10K+ airline pricing records to build a regression pipeline for fare prediction.
Applied Random Forest Regressor (R² ≈ 0.82) and ANOVA to explain pricing variance across airlines, holidays, and baggage rules.
Suggested pricing segmentation strategies based on customer personas, transferable to domains like cloud or hospital resource allocation.

Car Selling Price Prediction

Python · Scikit-learn

Built a scalable preprocessing pipeline (missing value imputation, one-hot encoding, feature scaling) for 6K+ car listings.
Trained Random Forest Regressor with optimized hyperparameters using GridSearchCV.
Visualized top predictive features (age, km driven, fuel type), simulating telemetry insights for IoT systems.

Car Sales Performance Dashboard

Power BI · Excel

Built an interactive dashboard analyzing 1,000+ car sales records by model, segment, and region.
Added filters for time, geography, and discounts to help decision-makers explore profitability.
Integrated Excel preprocessing with Power BI for a repeatable, refreshable reporting workflow.

Retail Sales Forecasting

Python · Scikit-learn

Built an end-to-end ML pipeline using 120K+ retail transactions to predict daily sales.
Performed cleaning, missing value imputation, outlier checks, and created multiple EDA visualizations.
Engineered pricing gaps, inventory indicators, and profitability features.
Developed and compared Linear Regression, Random Forest, and Gradient Boosting models; Random Forest achieved R² ≈ 0.998 after fixing data leakage.
Showed that units ordered and price were the strongest revenue drivers.

ATS Resume Keyword Analyzer

Python · Streamlit

Built an interactive Streamlit web app to evaluate resume–JD alignment using keyword scoring.
Implemented regex-based parsing and a curated keyword map across ML, BI, Analytics, and Cloud categories.
Generated automated improvement suggestions to help users tailor resumes for ATS systems.
Designed a clean UI with dynamic match scores, progress bars, and CSV downloads.

Blinkit Sales Analysis – Power BI Dashboard

Power BI · Excel

Built a Power BI dashboard analyzing 8.5K Blinkit grocery items across 1.5K outlets.
Created DAX-based KPIs (Total Sales, Avg Sales, Avg Rating) for data-driven insights.
Visualized sales by outlet type, size, and fat content with dynamic filters and trend charts.
Found that Tier 3 outlets and low-fat products generated the highest revenue.

Advanced Sonar Signal Classification System

Python · Scikit-learn · PCA · SHAP · XGBoost · Streamlit

Performed EDA on 2,080 sonar signal instances and used PCA for dimensionality reduction before classifying signals as underwater mines or rocks.
Evaluated 5+ classifiers and selected XGBoost (≈95% accuracy) with SHAP explainability.
Deployed via Streamlit for real-time inference with <300 ms latency in simulations.

Super Store Sales – Power BI Dashboard

Power BI · Excel

Built a Power BI dashboard analyzing 10K+ Super Store transactions across U.S. regions.
Created DAX KPIs and visuals for segment, category, and payment trends.
Found that the Consumer segment and the Office Supplies category dominate total sales, with peaks in Nov–Dec.

Diabetes Risk Prediction System

Python · Scikit-learn · Stratified K-Fold

Analyzed 768 patient records and identified key predictors such as glucose and BMI.
Built and compared Random Forest, Logistic Regression, and AdaBoost models, achieving a ROC-AUC of 0.89.
Created explainability visualizations suitable for bias-sensitive healthcare environments.

skills

Programming & Databases

Python · R · SQL · MySQL · PostgreSQL · MongoDB · TensorFlow · PyTorch · NumPy · Pandas · Matplotlib · Scikit-learn

Data Engineering & Analytics

ETL Pipelines · Data Wrangling · Random Forest · XGBoost · Logistic Regression · Gradient Boosting · A/B Testing · Hypothesis Testing · SHAP · Time Series

BI, Cloud & Tools

Power BI · Tableau · Looker · Excel (Advanced) · Streamlit · Git & GitHub · GCP · AWS · VS Code

Certifications

Google Data Analytics Professional Certificate · IBM Data Analytics Professional Certificate · Microsoft Power BI Certificate

Interests

Large-Scale Experimentation · Product & Growth Analytics · Recommendation Systems · Search & Ranking · LLM-powered Assistants · Causal Inference · Data Platform Design · Real-time Dashboards

blogs & articles

WHAT A STRATEGY CARD GAME TAUGHT ME ABOUT AI DECISION-MAKING IN 2025

Most people learn AI by training models and tuning metrics. But those exercises don’t show how AI actually thinks. This semester, during an academic research project, I worked with an AI simulation framework designed to play a strategy card game.

MEDIUM ARTICLE

contact

Email: vaishnavibhamare24@gmail.com

LinkedIn: linkedin.com/in/vaishnavibhamare

GitHub: github.com/vaishnavibhamare

Handshake: handshake.com/vaishnavibhamare