Career Work
All Projects
25 production ML systems built across 4+ years — every project here shipped to production and was owned end-to-end, from research through maintenance.
Revenue Forecasting Platform
Production ML system predicting pipeline and booking outcomes at daily frequency across current and future quarters.
Marketing Mix Modeling Platform
Full-stack marketing attribution and budget optimization platform, from raw channel spend to response curves and scenario planning.
Multi-Touch Attribution System
Probabilistic multi-touch attribution using Markov chain removal-effect modeling, crediting each channel by how much the overall conversion probability drops when that channel is removed from the customer journey.
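A minimal sketch of the removal-effect idea, assuming a first-order Markov chain and journeys given as (channel list, converted flag) pairs. The function names and the toy data format are illustrative assumptions, not the production system's interface.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(journeys):
    """First-order transition probabilities from (channels, converted) journeys."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in journeys:
        path = [START, *channels, CONV if converted else NULL]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def conversion_prob(probs, removed=None, iters=1000, tol=1e-12):
    """Probability of reaching CONV from START.

    Any transition into `removed` is treated as a lost journey.
    Solved by fixed-point iteration on the absorption equations.
    """
    value = {state: 0.0 for state in probs}
    value[CONV], value[NULL] = 1.0, 0.0
    for _ in range(iters):
        delta = 0.0
        for src, dsts in probs.items():
            if src == removed:
                continue
            new = sum(
                p * (0.0 if dst == removed else value[dst])
                for dst, p in dsts.items()
            )
            delta = max(delta, abs(new - value[src]))
            value[src] = new
        if delta < tol:
            break
    return value[START]


def removal_effect_shares(journeys):
    """Normalized removal effects: each channel's share of attribution credit."""
    probs = transition_probs(journeys)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    if base == 0.0:
        return {c: 0.0 for c in channels}
    effects = {
        c: 1.0 - conversion_prob(probs, removed=c) / base for c in channels
    }
    total = sum(effects.values())
    return {c: (e / total if total else 0.0) for c, e in effects.items()}


# Toy journeys, invented for illustration only.
journeys = [
    (["search", "email"], True),
    (["search"], False),
    (["email"], True),
    (["social"], False),
]
shares = removal_effect_shares(journeys)
```

In this toy data every conversion passes through email, so removing it drops the conversion probability to zero and it receives the largest share; social never leads to a conversion and receives none.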
Full Pipeline Projection Engine
Central projection system integrating 8+ ML model families into a single daily revenue forecast across four quarter horizons.
Demand Generation Potential Model
Estimates pipeline contribution from opportunities not yet visible in the CRM — the 'unseen' quarter contribution.
Walk-In Pipeline Projection
Regression-based estimate of within-quarter pipeline creation from sources not visible at quarter start.
Account Propensity Model
Dual propensity scoring system assigning Base Fit (long-term ICP fit) and 3-Month (short-term conversion likelihood) scores to every account.
Lead Propensity Model
Scores each lead's conversion likelihood across four quarter horizons (CQ/NQ/NQ+1/NQ+2) using engagement patterns and journey sequences.
Opportunity Propensity Model
Predicts each opportunity's close likelihood per quarter, replacing static CRM probability fields with dynamically scored multi-quarter estimates.
Account Propensity X Month
Monthly-granularity refactor of the Account Propensity model, enabling month-level account prioritization alongside quarterly scores.
Time-Decay Forecast Adjustment
Dynamic adjustment layer that re-weights prior EOQ history by recency, capturing momentum and velocity signals closer to quarter end.
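One way recency re-weighting of this kind could look, sketched under the assumption of exponential decay with a half-life measured in quarters. The decay form, the `half_life` parameter, and the function names are illustrative assumptions; the description does not specify the actual weighting scheme.

```python
def time_decay_weights(n_quarters, half_life=2.0):
    """Normalized weights; index 0 is the most recent quarter.

    A quarter that is `half_life` quarters older gets half the raw weight.
    """
    raw = [0.5 ** (age / half_life) for age in range(n_quarters)]
    total = sum(raw)
    return [w / total for w in raw]


def decayed_average(history, half_life=2.0):
    """Recency-weighted mean of prior EOQ values, ordered oldest to newest."""
    weights = time_decay_weights(len(history), half_life)
    # reversed() aligns the newest value with the largest weight
    return sum(w * x for w, x in zip(weights, reversed(history)))
```

With `half_life=1.0` and a history of `[0.0, 2.0]`, the newest quarter carries two thirds of the weight, so the result is 4/3 rather than the plain mean of 1.0.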
Average-Index Forecast Adjustment
Statistical fallback layer using day-of-quarter index averages when ML model signals are insufficient.
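A rough sketch of a day-of-quarter index fallback, assuming each prior quarter is stored as a list of cumulative values indexed by day of quarter, with the last entry being the end-of-quarter total. The data layout and function names are assumptions made for illustration.

```python
def average_index(prior_quarters, day):
    """Mean share of the EOQ total that was reached by `day` in prior quarters."""
    ratios = [quarter[day] / quarter[-1] for quarter in prior_quarters]
    return sum(ratios) / len(ratios)


def project_eoq(current_to_date, prior_quarters, day):
    """Scale the current cumulative value up by the historical average index."""
    return current_to_date / average_index(prior_quarters, day)
```

For example, if prior quarters had on average reached half of their final total by a given day, a current cumulative value of 30 on that day projects to an end-of-quarter value of 60.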
Forecast Explainability & SHAP Logging
Dual-algorithm explainability infrastructure — SHAP for tree models, coefficient contribution for linear models — with BigQuery logging.
Configurable Multi-Model ML Framework
30+ parameter framework for runtime algorithm selection, feature group configuration, and model stacking across the forecasting platform.
Campaign Performance Prediction Engine
Two-stage system predicting campaign pipeline potential before launch and monitoring active campaigns daily.
Revenue Metrics Pipeline Migration
Migrated the revenue metrics pipeline from single-machine pandas to distributed PySpark, resolving critical implementation bugs to unblock daily ML scoring.
Campaign Entity Resolution
Fuzzy-matching and BERT-embedding system that deduplicates and resolves campaign entities across CRM and marketing-platform data sources.
Smart Filter Parser & SQL Generation
Metadata-driven SQL generation system translating user-defined filter configurations into executable BigQuery queries.
Generic Regressor Framework
Reusable ML regression framework standardizing feature engineering, training, validation, and scoring — adopted as the baseline for all subsequent platform models.
Model Metric Dashboard
Internal ML governance dashboard tracking train/test MAPE, MAE, RMSE, and classification metrics across all deployed platform models.
Product Prediction Model
Predicts which products are likely associated with a lead conversion, improving downstream opportunity size estimation.
Opportunity Size Prediction
Predicts booking dollar value for leads and early-stage opportunities, enabling per-record expected value computation.
Customer Segmentation Clustering
Unsupervised clustering model grouping accounts by firmographic and behavioral attributes for GTM targeting.
Booking Prediction Model
Record-level ML system predicting booking likelihood and value, contributing to pipeline and booking projection.
Record Level Likelihood & Win Rate
ML-derived win rate methodology using per-record conversion likelihoods instead of binary counts for smoother, more accurate pipeline reporting.
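The contrast between binary counting and likelihood-based win rate can be sketched as follows, assuming closed records contribute their known outcome and open records contribute their model score. The record format and both function names are hypothetical.

```python
def binary_win_rate(records):
    """Classic win rate: won / (won + lost), ignoring open records."""
    closed = [status for status, _ in records if status in ("won", "lost")]
    if not closed:
        return 0.0
    return sum(status == "won" for status in closed) / len(closed)


def record_level_win_rate(records):
    """Expected win rate over all records.

    Closed records count as 1.0 (won) or 0.0 (lost); open records count
    as their predicted conversion likelihood.
    """
    total, n = 0.0, 0
    for status, likelihood in records:
        if status == "won":
            total += 1.0
        elif status == "open":
            total += likelihood
        # lost records add 0.0
        n += 1
    return total / n if n else 0.0


# Invented example records: (status, model likelihood)
records = [("won", 1.0), ("lost", 0.0), ("open", 0.4), ("open", 0.6)]
```

Because open records move the metric by fractional amounts as their scores change, the likelihood-based figure shifts gradually rather than jumping each time a single record closes.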