Two weeks. Use-case triage, feature-availability audit, model roadmap, build-vs-buy decisions, kill list. Fifteen pages, not a hundred.
Machine learning consulting that ends with a model in production.
Most machine learning consulting ends with a Jupyter notebook nobody can deploy. Ours ends with a model serving real traffic, a feature store (a versioned store of the data inputs the model relies on) the team trusts, an eval (automated test suite that scores model output) harness that catches regressions, and a retraining runbook. Everything the team needs to run it without us.
Most ML consulting ends in a notebook. Ours ends in a serving layer.
Strategy sprint. PoC. Full build. Retainer.
ML strategy sprint
Two weeks. For teams with twelve possible model bets and budget for three. Use-case triage against a fixed rubric, feature-availability audit, build-vs-buy decisions, six-month roadmap, kill list. Fifteen pages, not a hundred.
Production proof-of-concept
Fourteen days to live. Pick one model, one offline metric, one online metric. Train on your data, serve in your environment, hit real users behind a feature flag. Eval harness, dashboard, runbook. Refundable if it doesn't ship.
Full ML platform build
Six to twelve weeks. Feature stores, model registries, training pipelines, serving layers, monitoring, drift detection, scheduled retraining. Integration into the system that consumes the prediction. This is where ML consulting becomes development.
Embedded team augmentation
Monthly retainer. Senior engineers embedded with your team. Architecture decisions, eval design, code review, on-call escalation, hiring help. We're in your Slack. We don't displace the team - we compress the team's first six months into three.
One model in production. Monitoring included.
Not a Jupyter notebook. Not a research paper. A model training on your data, serving in your stack, hitting real users, with metrics your team owns. Dashboard, runbook, eval suite, retraining schedule - everything your team needs to run it without us.
Book a fit call →
From $25k strategy sprint to $150k+ full build - four flat-fee shapes.
The range you fall into is set by integration surface, data sensitivity, and whether the model has to serve under tight latency. We publish it because we hated being on the other side of the call where the price quote turns into three weeks of email-tag.
Fourteen days to one model live in your stack. Training pipeline, eval harness, monitoring dashboard, runbook included. Refundable if it doesn't ship.
Six to twelve weeks. Feature store, model registry, serving layer, drift detection, retraining cadence, integration into the consuming system.
Senior engineers embedded with your ML team. Architecture review, eval design, code review, on-call escalation, hiring help.
We run our own ML product in production. You get that methodology.
Sentinel is JAAX's live Shopify analytics product. Real merchants. Drift alerts we have answered at two in the morning. Every consulting engagement is shaped by what we have learned shipping it. The proof is that you can buy the product the methodology built.
See Sentinel
For the team that has tried once and knows the difference between a demo and something deployed.
The buyers we do our best work for share three traits:
- A number they want moved - deflection rate, recovery rate, time-to-quote, cost-per-ticket
- At least one AI initiative already attempted - they know the difference between a working model and a working demo
- A window, usually a quarter, to show something running
We work with Series A startups whose CTO is the buyer.
We work with mid-market companies whose head of data inherited an ML portfolio they didn't staff for.
We work with Fortune 1000 divisions that have given up on the central data-science org and want one model shipped well in their own P&L.
If you need a hundred-page maturity assessment or a Kaggle competition, call a Big Four firm. We're not better at that than they are, and we'll tell you so on the fit call.
Questions we get on every fit call.
Machine learning consulting is the engineering practice - model selection, training pipelines, feature engineering, MLOps (the operations layer that keeps the model running reliably in production), monitoring. AI consulting is the strategy practice - which projects to fund, which to kill, how to sequence them. We do both, but a CTO shopping for ML consulting wants the implementers, not the strategists. This page is for the implementers. The strategy practice lives at /services/ai-consulting/.
Python and TypeScript at the edges, depending on the surface. PyTorch and scikit-learn for modeling, with Hugging Face Transformers when the task is NLP. Feast or Tecton for feature stores when the team has one; we will build the simplest possible feature layer in Postgres when they don't. MLflow or Weights & Biases for the model registry. Evidently AI or a hand-rolled drift harness for monitoring. Modal, Replicate, Bedrock, or AWS SageMaker for serving - we pick by cost and latency, not by what is on the conference circuit.
An ML strategy sprint is two weeks, no extensions. A production proof-of-concept is fourteen days from kickoff to a model serving real traffic. A full implementation - feature store, registry, monitoring, retraining - is six to twelve weeks depending on data hygiene and integration surface. We refuse engagements that don't fit a 14-day sprint at the unit level. If the work cannot be sliced into 14-day deliverables, we have not finished scoping it.
Strategy sprints run $25–45k. Production proofs-of-concept run $50–150k. Full implementations start at $150k and scale with integration depth and data complexity. Embedded team augmentation is a monthly retainer. Pricing is the same regardless of industry - we charge by engagement shape, not by domain.
Both, and we are unromantic about which one wins. The honest answer for most production problems is gradient-boosted trees on tabular data, a transformer when the input is text or sequence, and a pretrained vision model fine-tuned on a few hundred labeled examples when the input is an image. We have shipped XGBoost in production more often than we have shipped a custom-trained transformer. The right answer is whichever model the eval picks.
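To make "whichever model the eval picks" concrete, here is a minimal sketch of the idea: score every candidate on the same held-out set with one metric and keep the winner. Everything in it is an assumption for illustration - the `pick_by_eval` and `accuracy` names, the toy candidates, and the choice of accuracy as the metric are hypothetical, and a real harness would use the metric agreed for the engagement.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]      # (features, label)
Model = Callable[[Sequence[float]], int]   # returns a predicted label


def accuracy(model: Model, holdout: List[Example]) -> float:
    """Share of held-out examples the model labels correctly."""
    correct = sum(1 for features, label in holdout if model(features) == label)
    return correct / len(holdout)


def pick_by_eval(
    candidates: Dict[str, Model], holdout: List[Example]
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same holdout and return the best name plus all scores."""
    scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data and toy candidates (hypothetical, for illustration only).
holdout = [([0.2], 0), ([0.4], 0), ([0.6], 1), ([0.9], 1)]
candidates = {
    "always_one": lambda x: 1,
    "threshold": lambda x: int(x[0] > 0.5),
}

best, scores = pick_by_eval(candidates, holdout)
print(best, scores)
```

The candidates are plain callables so that a gradient-boosted model and a transformer can sit behind the same interface and be judged by the same number.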
Both. The work is the same; the procurement is different. We have shipped models for Series A startups and for divisions inside Fortune 1000s. The constraint is not company size - it is whether the buyer can name a number they want moved and a person who owns it. ML projects without a named owner on the business side go badly regardless of how good the model is.
Every model we ship leaves with a drift detector and a retraining cadence written into the runbook before the model is live. We track data drift (population stability index), model drift (prediction distribution), and serving health (fallback-rate alerts). Retraining is scheduled - weekly, monthly, or event-triggered - not vibes-based. The point of MLOps is that models age; we plan for it on day one.
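For readers who want the data-drift check spelled out, here is a minimal sketch of a population stability index calculation for one numeric feature. It is an illustration under stated assumptions, not a description of any production harness: the function name, the ten equal-width bins, the epsilon floor for empty buckets, and the 0.25 alert threshold (a commonly cited rule of thumb) are all choices made for the example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline sample and a live sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline has no spread; PSI is undefined")
    width = (hi - lo) / bins  # equal-width bins taken from the baseline range

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so values outside the baseline range land in the edge buckets.
            counts[min(max(idx, 0), bins - 1)] += 1
        total = len(values)
        # The eps floor keeps empty buckets from producing log(0) or division by zero.
        return [max(c / total, eps) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Toy samples (hypothetical): the live data is the baseline shifted by 0.5.
baseline = [i / 100 for i in range(100)]
same = population_stability_index(baseline, baseline)
shifted = population_stability_index(baseline, [x + 0.5 for x in baseline])

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff, not a value from this page
print(same, shifted > ALERT_THRESHOLD)
```

An identical sample scores zero, and the shifted sample lands well above the assumed threshold, which is the signal that would trigger the retraining step in the runbook.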
Yes, mutual NDA before any technical conversation. We do not work for clients with conflicting active engagements in the same competitive set during a quarter - a rule we enforce on ourselves more strictly than most clients ask us to.
If you need a hundred-page maturity assessment, hire a Big Four firm. If you need a model in production by the end of the month, hire us. The two-person constraint is the feature. There is no junior layer running notebooks you'll never see. The people writing the training loop are the people on your kickoff call.
Send a paragraph. We'll come back the same day.
Tell us what model you want shipped and the metric you want moved. We'll come back with a yes, a no, or a sharper question. No discovery deck, no pitch meeting marathon.
Book a 30-min fit call