Sourabh Sharma

AI | ML | SW Engineer

Machine Learning Engineer with 4 years of experience delivering end-to-end AI solutions with a strong engineering-first approach. An early adopter of self-supervised learning, LLMs, GenAI, and, more recently, agentic systems, consistently exploring the frontier of the AI space. Practices first-principles problem solving, driven by a passion for owning impactful, performant systems that drive growth.

Areas of Interest

LLMs
Agentic
GenAI
Computer Vision
Full Stack

Key Skills

Machine Learning & AI

PyTorch
TensorFlow
SageMaker
Vertex AI
Kubeflow
Weights & Biases
MLflow
DVC
ONNX
Optuna
Hydra
PyTorch Lightning
scikit-learn
DeepSpeed
XGBoost

Cloud & Infrastructure

Docker
AWS
Google Cloud
Cloudflare
Kubernetes
Terraform
Airflow
Prefect
OpenTelemetry

Data Engineering & Analytics

Dask
Spark
Apache Kafka
dbt
PostgreSQL
Redis
Elasticsearch
Apache Superset
BigQuery
Google Analytics
Ray
Apache Beam

Web Development & Design

React
Next.js
Tailwind CSS
FastAPI
Flask
Hono
Playwright
Selenium
Zustand
Zod
Drizzle ORM
Deno
tRPC

Programming & Development

Python
TypeScript
Bash
Go

Work Experience

Aidaptive by JarvisML

Cupertino, USA

AI-driven personalization platform focusing on hospitality and e-commerce.

ML Engineer

Sep '22 – Present
Hospitality Image Attribute Extraction
  • Led end-to-end implementation, serving 44 Shopify and 55 hospitality VRM clients and powering personalized search, recommendations, and metadata generation.

  • Crawled Booking.com and Airbnb to compile 11M+ image-text pairs, identifying ~114 granular hierarchical attributes (e.g., pool, hot_tub, fireplace, beach_view) in listing images and achieving ~0.87 multi-label F1.

PyTorch
NLP
Computer Vision
Web Crawling
Natural Language to SQL - Ask A Metric
  • Developed an on-premise prototype for both internal and client use, allowing non-technical stakeholders to execute SQL using natural language.

  • Experimented with various models (SQL-Coder, LLaMA, GPT-3.5) backed by a RAG vector store for schemas. Generated Postgres-compatible SQL for typical queries, including multi-level joins, with partial support for BigQuery.

LLaMA
GPT-3.5
SQL-Coder
RAG
PostgreSQL
BigQuery
GenAI for Consistent Character Generation
  • Explored GenAI models for on-brand visual content in collaboration with a production house for the then-upcoming action movie "Crackk," featuring Vidyut Jammwal, Nora Fatehi, and Arjun Rampal.

  • Trained LoRAs on SDXL2 for each actor and employed a post-processing workflow involving AI upscaling, inpainting, face-swapping, and face restoration to generate high-quality likenesses.

SDXL2
LoRA
AI Upscaling
Face Restoration
Inpainting
  • Built visual similarity recommendation pipelines for hospitality and e-commerce clients, boosting RPI by 14% on average.

Aisle3

London, UK

Early-stage e-commerce aggregator for multi-merchant product discovery.

Computer Vision Researcher

May '21 – Sep '22
Product Matching
  • Developed the core business IP: a multi-stage pipeline fusing multi-modal inputs to perform exact matching of products across merchants.

  • Leveraged self-supervised methods (DINO, SimCLR, MoCo) with Swin Transformers for robust feature extraction, achieving P@1 = 0.91 and Recall = 0.96.

  • Employed Faiss + Elasticsearch for scalable dense vector retrieval, orchestrated via Airflow, handling 60GB+ throughput.

PyTorch
Swin Transformers
DINO
SimCLR
MoCo
Faiss
Elasticsearch
Airflow
Product Image Attribute Extraction
  • Designed a multi-headed classifier to extract fine-grained attributes (pattern, material, ankle height) for footwear products, reaching F1 = 0.93 on a curated evaluation set.

  • Developed color detection using ROI pixel analysis in CIELAB space, achieving F1 = 0.91.

  • Trained utility models: product categorizer, pose detection, and out-of-domain detection.

PyTorch
OpenCV
Multi-head Classification
Computer Vision

Founding ML Engineer

Nov '20 – May '21
  • Led infrastructure setup, establishing backend systems, data ingestion, and ETL pipelines.

  • Built serverless end-to-end Product Matching workflows on AWS Step Functions.

  • Developed a human-in-the-loop annotation tool using FastAPI and Faiss, used to create ~120K pairwise annotations.

AWS Step Functions
FastAPI
Faiss
Docker
Python

Early Work (May '19 – Aug '20)

Computer Vision

@ FaceX
internship
5 mos
Bangalore, India

Startup providing face recognition and liveness detection solutions.

  • Developed periocular recognition for identifying masked faces during the pandemic.

  • Built a face liveness detection SDK (tf.js on WebGL, OpenCV on WASM) for real-time mobile Aadhaar KYC identity verification (Govt. ID OCR + Face ID and liveness).

  • Collaborated with Scamlytics on sensitive data to identify scam accounts.

Data Scientist

@ Jumper.ai
contract
3 mos
Singapore

Platform enabling social commerce through chatbots and automation.

  • Integrated Dialogflow for chatbots and built a menu-digitization app.

Data Scientist

@ 73 Strings
part-time
3 mos
Bangalore, India

Financial technology firm providing AI-driven valuation tools.

  • Led web scraping and data curation efforts for financial analytics and for a universal "About Us" page scraper.

Data Science

@ Gumption Labs
internship
2 mos
Bangalore, India

Startup focused on automated trading solutions for retail investors.

  • Explored genetic programming for trading-rule discovery.

  • Automated Financial Insights posting using Selenium, driving increased user engagement.