Leon Chen

NYC metropolitan

I am a data scientist and I love building products.


01 · Fun Stuff

Projects

Optimization

Inventory Rebalancing MIP

Mixed-integer program (Python + Gurobi) that rebalances inventory across a multi-sided supply network. Productionized with monitoring and scenario overrides.

Savings
$20M/yr
Scope
Multi-network
  • Python
  • Gurobi
  • MIP
  • Snowflake
Forecasting

E-commerce Demand Forecast

Migrated UK/US retail forecasting from univariate ARIMA/ETS models to a hybrid approach: LightGBM plus an ensemble that averages the top-3 models per SKU.

WMAPE
−20%
SKUs
500+
  • LightGBM
  • ARIMA
  • ETS
  • Python
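
A minimal sketch of the top-3 averaging idea, not the production code. It assumes each candidate model is scored per SKU by WMAPE on a validation window and that the three best are averaged with equal weights; the function names, model names, and numbers are illustrative assumptions.

```python
from statistics import fmean


def wmape(actual, forecast):
    """Weighted MAPE: total absolute error divided by total absolute actuals."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


def top_k_ensemble(actual, validation_forecasts, future_forecasts, k=3):
    """Average the future forecasts of the k models with the lowest validation WMAPE."""
    ranked = sorted(
        validation_forecasts,
        key=lambda name: wmape(actual, validation_forecasts[name]),
    )
    chosen = ranked[:k]
    horizon = len(future_forecasts[chosen[0]])
    blended = [fmean(future_forecasts[m][t] for m in chosen) for t in range(horizon)]
    return chosen, blended


# Made-up numbers for a single SKU (hypothetical, for illustration only).
actual = [100, 120, 80]
validation = {
    "lightgbm": [98, 118, 83],
    "arima": [110, 110, 90],
    "ets": [95, 125, 85],
    "naive": [60, 160, 40],
}
future = {
    "lightgbm": [105, 90],
    "arima": [108, 99],
    "ets": [102, 96],
    "naive": [80, 80],
}
chosen, blended = top_k_ensemble(actual, validation, future)
print(chosen, blended)
```

Selecting per SKU lets a slow-moving item lean on classical models while a promotion-heavy item leans on the tree model, without hand-tuning either.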
AI · Agents

Copilot Agentic Framework

Org-wide agentic framework on Microsoft Copilot for data querying, analysis, and reporting. Deployed across 30+ internal projects.

Projects
30+
Surface
Org-wide
  • Copilot
  • LLM
  • Agents
  • Python
Causal Inference

Impact Measurement Framework

Difference-in-differences and synthetic-control methods to measure the incremental impact of optimization recommendations on service levels.

Methods
DiD · SCM
Audience
Leadership
  • Causal
  • Python
  • R
  • Statsmodels
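
For readers new to difference-in-differences, here is a bare-bones two-group, two-period sketch. It assumes parallel trends between treated and control groups, and the service-level figures are invented for illustration; the real framework is not shown here and would typically use a regression with covariates and standard errors.

```python
from statistics import fmean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two DiD: change in the treated group minus change in the control group."""
    treated_change = fmean(treated_post) - fmean(treated_pre)
    control_change = fmean(control_post) - fmean(control_pre)
    return treated_change - control_change


# Hypothetical service levels (share of orders fulfilled on time).
effect = did_estimate(
    treated_pre=[0.90, 0.92],
    treated_post=[0.95, 0.97],
    control_pre=[0.89, 0.91],
    control_post=[0.90, 0.92],
)
print(f"Estimated incremental impact: {effect:+.3f}")
```

The control group's change stands in for what would have happened to the treated group without the recommendation, so only the extra movement is attributed to it.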
AI · Full Stack

Coffeed News — AI News Aggregator

Topic-driven personalized news digest. Scrapes RSS, Reddit, X, and podcasts, ranks items with a four-pillar INRF formula (pillars include insight, relevance, and niche) over pgvector embeddings, then synthesizes emails with a DeepSeek → OpenAI → Gemini → Grok LLM cascade. Built with Next.js, FastAPI, and Celery on Railway.

Stack
Next.js · FastAPI · Celery
Pipeline
4 stages, 4×/day
  • Next.js
  • FastAPI
  • Supabase
  • pgvector
  • Celery
  • LLM
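
One way an LLM cascade like this can work is as an ordered fallback: try the first provider and move to the next on an error or an empty reply. The sketch below assumes that behavior. The provider callables are hypothetical stubs, not real SDK calls, and `synthesize_with_cascade` is an illustrative name.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the cascade errors out or returns nothing."""


def synthesize_with_cascade(prompt, providers):
    """Try (name, callable) pairs in order; return the first non-empty response."""
    errors = []
    for name, call in providers:
        try:
            text = call(prompt)
        except Exception as exc:  # any provider failure moves us down the cascade
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty response")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def flaky_provider(prompt):
    raise TimeoutError("simulated timeout")


def working_provider(prompt):
    return f"Digest for: {prompt}"


provider_used, email_body = synthesize_with_cascade(
    "AI chips",
    [("deepseek", flaky_provider), ("openai", working_provider)],
)
print(provider_used, "->", email_body)
```

Collecting the per-provider errors makes it easier to see why a digest fell through to a later model.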
AI · Trading

AutoWallStreet — Bot Trading Arena

AI bot trading arena where autonomous bots compete with $100K in paper money on real US stocks (15-min delayed). Three FastAPI microservices on Railway share a Supabase Postgres database, a polling order engine handles fills / margin / dividends / splits, and a React SPA on Vercel streams live prices over WebSocket.

Services
3 microservices
Seasons
2-month leagues
  • FastAPI
  • React
  • Supabase
  • Railway
  • WebSocket
  • X OAuth
iOS · Speech

Molly Talk — Live Caption & Translate

Native iOS app (Swift + SwiftUI, MVVM) for real-time speech transcription, translation, and language-learning coaching. Uses the native Speech framework for continuous recognition and Google Gemini 2.0 Flash for on-the-fly translation and vocabulary extraction. Firebase Auth + Firestore history.

Languages
10 supported
Latency
Live captions
  • Swift
  • SwiftUI
  • Speech
  • Gemini
  • Firebase
Generative · Vision

Virtual Try-On Pipeline

Two-stage virtual try-on: Google Vertex AI's try-on model generates the composite, then Gemini restores face, body, and garment detail before a 2K upscale. React + Vite frontend on Vercel, FastAPI backend on Railway, Supabase for auth, storage, and batch history.

Stages
Gen → Restore → 2K
Mode
N × M batch
  • Vertex AI
  • Gemini
  • FastAPI
  • React
  • Supabase
Forecasting

SKU-Level Demand Forecast

Led the design and deployment of SKU-level demand forecasting models using XGBoost and Random Forest, improving forecast accuracy by 30%.

Accuracy
+30%
Grain
SKU-level
  • XGBoost
  • Random Forest
  • Python
NLP

Product Embeddings for Pricing

Product embeddings (Amazon Titan, CLIP, Doc2Vec) on a large-scale product catalog to drive pricing and assortment strategy at Coach.

Signal
Visual + text
Use
Pricing · Assortment
  • CLIP
  • Titan
  • Doc2Vec
  • PyTorch
MLOps

Automated ML Pipelines

End-to-end retraining pipelines with Airflow + Kubernetes + Docker powering forecasting and pricing models at Coach.

Cadence
Auto-retrain
Infra
K8s + Airflow
  • Airflow
  • Kubernetes
  • Docker
  • AWS
Forecasting

ARIMAX Demand Forecasting

Built ARIMAX time-series models in R and Python to forecast demand across Chinatex's textile accounts, incorporating exogenous drivers (promotions, seasonality, price).

Accuracy
+25%
Stack
R · Python
  • ARIMAX
  • R
  • Python
  • Time Series
Supply Chain

Dynamic Safety-Stock Framework

Inventory management framework with dynamic safety-stock targets that react to forecast error and lead-time variance. Held the in-stock rate at 95% across the account portfolio.

In-stock
95%
Scope
Account portfolio
  • Safety Stock
  • Inventory
  • Python
  • SQL
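
A sketch of how a safety-stock target can react to forecast error and lead-time variance, using the common textbook formula rather than the framework's actual logic. It assumes normally distributed errors, treats 95% as a cycle service level, and uses made-up inputs; the parameter names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(service_level, forecast_error_std, mean_lead_time,
                 mean_demand, lead_time_std):
    """Textbook safety stock: z * sqrt(L * sigma_d^2 + d^2 * sigma_L^2).

    forecast_error_std and mean_demand are per period; lead times are in periods.
    """
    z = NormalDist().inv_cdf(service_level)  # z-score for the target service level
    variance = (mean_lead_time * forecast_error_std ** 2
                + mean_demand ** 2 * lead_time_std ** 2)
    return z * sqrt(variance)


# Hypothetical account: weekly demand of 100 units, 4-week lead time.
target = safety_stock(
    service_level=0.95,
    forecast_error_std=20.0,
    mean_lead_time=4.0,
    mean_demand=100.0,
    lead_time_std=0.5,
)
print(f"Safety stock target: {target:.1f} units")
```

Recomputing the target as the rolling forecast-error and lead-time estimates change is what makes it dynamic: buffers grow when either becomes noisier and shrink when they settle.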
Data Eng · BI

Automated ETL + Tableau Reporting

Designed and automated ETL pipelines and Tableau dashboards for daily performance reporting, replacing manual Excel workflows and giving the commercial team same-day visibility.

Cadence
Daily
Audience
Commercial team
  • ETL
  • SQL
  • Tableau
  • Python
Full Stack

Personalized National Park Itinerary

Flask web app on Heroku that builds a tailored route across 63 US national parks. A Python TSP solver returns itineraries in under 60 seconds, backed by an AWS RDS / SQLAlchemy schema and an ETL pipeline fed by the Google Maps and NPS APIs.

Satisfaction
9 / 10
Runtime
< 60s
  • Python
  • Flask
  • TSP
  • AWS RDS
  • JavaScript
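
The solver's method is not described above, so the sketch below swaps in a common heuristic: a nearest-neighbor tour refined with 2-opt over great-circle distances. The park coordinates are approximate and for illustration only; the app itself draws on the Google Maps and NPS APIs.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def tour_length(tour, dist):
    """Length of the closed tour, including the leg back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def nearest_neighbor(dist, start=0):
    """Greedy construction: always visit the closest unvisited stop next."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[tour[-1]][j])
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(tour, dist):
    """Reverse segments while doing so shortens the tour; the start stays fixed."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, dist) < tour_length(tour, dist) - 1e-9:
                    tour, improved = candidate, True
    return tour


# Approximate coordinates, illustrative only.
parks = {
    "Yosemite": (37.87, -119.54),
    "Yellowstone": (44.43, -110.59),
    "Grand Canyon": (36.11, -112.11),
    "Arches": (38.73, -109.59),
    "Zion": (37.30, -113.03),
}
names = list(parks)
dist = [[haversine_km(parks[a], parks[b]) for b in names] for a in names]
greedy = nearest_neighbor(dist)
route = two_opt(greedy, dist)
print(" -> ".join(names[i] for i in route))
```

A heuristic like this does not guarantee the optimal tour, but it scales to dozens of stops quickly, which is consistent with a sub-minute response budget.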
Multimodal

Hateful Memes Classification

Multimodal classifier on the Hateful Memes benchmark using CLIP encoders fused via self-attention, cross-fusion, and MLP heads in PyTorch. Tuned learning rate, dropout, weight decay, and batch size against the challenge baselines.

Accuracy
+15%
AUROC
+14%
  • CLIP
  • PyTorch
  • Attention
  • Multimodal

02 · Elsewhere

Connect