Hello, I'm

Vaishnavi Bhamare

Master's student in Advanced Data Analytics

I love using data to solve real-world problems and turning messy datasets into clear, actionable insights. Skilled in Artificial Intelligence, Machine Learning, Python, SQL, and Power BI, I enjoy building ETL pipelines, dashboards, and predictive models that help stakeholders make confident decisions.

Vaishnavi portrait

education

University of North Texas, Denton, Texas
Master of Science – Advanced Data Analytics · GPA 4.0/4.0
Expected Graduation: May 2026
Veermata Jijabai Technological Institute (VJTI), Mumbai
Bachelor of Technology – Electrical Engineering · GPA 3.8/4.0
Graduated: May 2024

experience

Teaching Assistant · University of North Texas
Oct 2024 – Present · Texas, United States
  • Supporting a class of 60+ students by assisting with instruction, classroom coordination, and weekly course operations.
  • Grading assignments, quizzes, projects, and exams using standardized rubrics to ensure accuracy and fairness.
  • Guiding students on assignments, troubleshooting issues, and improving hands-on learning.
Research Assistant · University of North Texas
Jan 2025 – Present · Texas, United States
  • Developing SQL reports and BI dashboards to support research and operational decisions.
  • Designing data collection and transformation pipelines to improve reporting accuracy and efficiency.
  • Orchestrating Git-based analytics workflows for reproducible models and experiments.
Data Analytics Intern · TATA Power, Dharavi Receiving Station
Feb 2024 – Jul 2024 · Mumbai, India
  • Automated Python/SQL data pipelines, improving processing speed by 60% and enabling near real-time transformer monitoring for 50K+ devices.
  • Performed anomaly detection and trend analysis on transformer performance and reliability.
  • Proposed automation-focused improvements to reduce manual reporting and error risk.
Analytics Intern · COE Lab, VJTI
May 2023 – Jan 2024 · Mumbai, India
  • Implemented SQL/Python ETL pipelines to ingest and transform 100K+ grid simulation records, improving data readiness for modeling.
  • Enhanced integration workflows for large-scale power system simulations.
  • Built analytical reports summarizing performance trends across operating scenarios.

projects

YouTube Trending Video Analytics Pipeline
SQL · Python (Pandas, SQLite) · Streamlit · Matplotlib
  • Engineered a robust ETL pipeline to ingest, clean, and process 37,000+ semi-structured video records from YouTube trending datasets.
  • Integrated SQL with Python scripts to automate data extraction and transformation across multiple categories and time periods.
  • Developed a real-time analytics dashboard using Streamlit to monitor spikes in views, likes, and category trends, reducing manual reporting time by 80%.
  • Optimized SQL queries with indexed tables and efficient joins, improving performance by ~40%.
VIEW MY WORK HERE →
Flight Pricing Forecast
Python · Random Forest · ANOVA · Seaborn
  • Cleaned and prepared 10K+ airline pricing records to build a regression pipeline for fare prediction.
  • Applied Random Forest Regressor (R² ≈ 0.82) and ANOVA to explain pricing variance across airlines, holidays, and baggage rules.
  • Suggested pricing segmentation strategies based on customer personas, transferable to domains like cloud or hospital resource allocation.
VIEW MY WORK HERE →
Car Selling Price Prediction
Python · Scikit-learn
  • Built a scalable preprocessing pipeline (missing value imputation, one-hot encoding, feature scaling) for 6K+ car listings.
  • Trained Random Forest Regressor with optimized hyperparameters using GridSearchCV.
  • Visualized top predictive features (age, km driven, fuel type), simulating telemetry insights for IoT systems.
VIEW MY WORK HERE →
Car Sales Performance Dashboard
Power BI · Excel
  • Built an interactive dashboard analyzing 1,000+ car sales records by model, segment, and region.
  • Added filters for time, geography, and discounts to help decision-makers explore profitability.
  • Integrated Excel preprocessing with Power BI for a repeatable, refreshable reporting workflow.
VIEW MY WORK HERE →
Retail Sales Forecasting
Python · Scikit-learn
  • Built an end-to-end ML pipeline using 120K+ retail transactions to predict daily sales.
  • Performed cleaning, missing value imputation, outlier checks, and created multiple EDA visualizations.
  • Engineered pricing gaps, inventory indicators, and profitability features.
  • Developed and compared Linear Regression, Random Forest, and Gradient Boosting models; Random Forest achieved R² ≈ 0.998 after fixing data leakage.
  • Showed that units ordered and price were the strongest revenue drivers.
VIEW MY WORK HERE →
ATS Resume Keyword Analyzer
Python · Streamlit
  • Built an interactive Streamlit web app to evaluate resume–JD alignment using keyword scoring.
  • Implemented regex-based parsing and a curated keyword map across ML, BI, Analytics, and Cloud categories.
  • Generated automated improvement suggestions to help users tailor resumes for ATS systems.
  • Designed a clean UI with dynamic match scores, progress bars, and CSV downloads.
VIEW MY WORK HERE →
Blinkit Sales Analysis – Power BI Dashboard
Power BI · Excel
  • Built a Power BI dashboard analyzing 8.5K Blinkit grocery items across 1.5K outlets.
  • Created DAX-based KPIs (Total Sales, Avg Sales, Avg Rating) for data-driven insights.
  • Visualized sales by outlet type, size, and fat content with dynamic filters and trend charts.
  • Found that Tier 3 outlets and low-fat products generated the highest revenue.
VIEW MY WORK HERE →
Advanced Sonar Signal Classification System
Python · Scikit-learn · PCA · SHAP · XGBoost · Streamlit
  • Performed EDA on 2,080 sonar signal instances and applied PCA for dimensionality reduction before classifying underwater mines vs. rocks.
  • Evaluated 5+ classifiers and selected XGBoost (≈95% accuracy) with SHAP explainability.
  • Deployed via Streamlit for real-time inference with <300ms latency in simulations.
VIEW MY WORK HERE →
Super Store Sales – Power BI Dashboard
Power BI · Excel
  • Built a Power BI dashboard analyzing 10K+ Super Store transactions across U.S. regions.
  • Created DAX KPIs and visuals for segment, category, and payment trends.
  • Found that the Consumer segment and the Office Supplies category dominate total sales, with peaks in Nov–Dec.
VIEW MY WORK HERE →
Diabetes Risk Prediction System
Python · Scikit-learn · Stratified K-Fold
  • Analyzed 768 patient records and identified key predictors such as glucose and BMI.
  • Built and compared Random Forest, Logistic Regression, and AdaBoost, achieving ROC-AUC of 0.89.
  • Created explainability visualizations suitable for bias-sensitive healthcare environments.
VIEW MY WORK HERE →

skills

Programming & Databases
Python · R · SQL · MySQL · PostgreSQL · MongoDB · TensorFlow · PyTorch · NumPy · Pandas · Matplotlib · Scikit-learn
Data Engineering & Analytics
ETL Pipelines · Data Wrangling · Random Forest · XGBoost · Logistic Regression · Gradient Boosting · A/B Testing · Hypothesis Testing · SHAP · Time Series
BI, Cloud & Tools
Power BI · Tableau · Looker · Excel (Advanced) · Streamlit · Git & GitHub · GCP · AWS · VS Code
Certifications
Google Data Analytics Professional Certificate · IBM Data Analytics Professional Certificate · Microsoft Power BI Certificate
Interests
Large-Scale Experimentation · Product & Growth Analytics · Recommendation Systems · Search & Ranking · LLM-powered Assistants · Causal Inference · Data Platform Design · Real-time Dashboards

blogs & articles

WHAT A STRATEGY CARD GAME TAUGHT ME ABOUT AI DECISION-MAKING IN 2025
Most people learn AI by training models and tuning metrics. But those exercises don’t show how AI actually thinks. This semester, during an academic research project, I worked with an AI simulation framework designed to play a strategy card game.
READ MORE →