How do you adapt the development process for ML/AI features where data and model performance introduce uncertainty?


4 Answers

Product Development Process

Puja Hait

Google Group Product Manager • 7mo

In GenAI product development, your role as a PM essentially shifts from writing only a PRD to writing Evals in addition to the PRD. The Evals are your reality check on how well the model is actually performing for your intended use cases and user queries. The Eval metrics also matter a lot. I would treat them as input metrics and engagement (e.g., DAU) as output metrics, because if your output quality is great, users will come back. So focus on making your solutions really shine and matter. Of course you st ...

2,240 Views
Kara Gillis

Cortex VP of Product | Formerly Splunk, Deloitte • 8mo

Based on my experience at Cortex, here's how I'd approach adapting the development process for ML/AI features. All AI-driven features go through a Research Preview designation before GA. This serves two critical purposes:

Sales enablement: Clearly flags "AI" features so prospects and AI-skeptic customers can opt in, avoiding lengthy security reviews and contract addendums.

Risk management: Keeps features behind feature flags until they meet quality thresholds.

This acknowledges upfront that AI fea ...
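The Research Preview gating described in this answer could be sketched roughly as follows. This is only an illustration, not Cortex's implementation: the `AIFeature` type, the `GA_PASS_RATE` value of 0.95, and the rule that GA features are on for everyone are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AIFeature:
    name: str
    eval_pass_rate: float     # fraction of eval cases passed, 0.0 to 1.0
    customer_opted_in: bool   # customer explicitly enabled preview AI features


# Hypothetical quality threshold; the answer does not state a number.
GA_PASS_RATE = 0.95


def release_stage(feature: AIFeature) -> str:
    """Return 'ga' once the quality bar is met, else 'research_preview'."""
    return "ga" if feature.eval_pass_rate >= GA_PASS_RATE else "research_preview"


def is_enabled(feature: AIFeature) -> bool:
    """Assumed policy: GA is on for everyone; preview requires an opt-in."""
    if release_stage(feature) == "ga":
        return True
    return feature.customer_opted_in
```

With this shape, a feature at a 0.90 pass rate stays in `research_preview` and is visible only to customers who opted in.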

434 Views
Manjeet Singh

Salesforce Senior Director of Product Management • 8mo

AI and ML features add uncertainty because model behavior and data quality change faster than traditional software. For example, companies like OpenAI/DeepSeek are refreshing their models every week. Sometimes these changes break your pipeline or impact the quality of your solution. Here are a few best practices to add to your AI development and release cycle:

Benchmark continuously: set up internal golden datasets and auto-run evaluations after every model or data change.

Test for vari ...
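As a minimal sketch of the "benchmark continuously" practice, the snippet below scores a model against a small golden dataset and flags a regression relative to a baseline. The dataset contents, the substring-match scoring, and the `tolerance` default are all hypothetical choices for illustration; a real setup would likely use richer metrics and run in CI after each model or data change.

```python
from typing import Callable, List, Tuple

# Hypothetical golden dataset: (prompt, expected substring) pairs.
GOLDEN_SET: List[Tuple[str, str]] = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
    ("Which planet is known as the Red Planet?", "Mars"),
]


def pass_rate(model: Callable[[str], str],
              dataset: List[Tuple[str, str]] = GOLDEN_SET) -> float:
    """Fraction of golden examples whose output contains the expected answer."""
    passed = sum(
        1 for prompt, expected in dataset
        if expected.lower() in model(prompt).lower()
    )
    return passed / len(dataset)


def has_regressed(baseline: float, candidate: float,
                  tolerance: float = 0.02) -> bool:
    """True if the candidate scores worse than baseline by more than tolerance."""
    return (baseline - candidate) > tolerance
```

Here `model` is any callable that maps a prompt to text, so the same check can wrap the old and new model versions and block a rollout when `has_regressed` returns `True`.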

675 Views
Farheen Noorie

Superhuman Head of Product, Enterprise • 8mo

There are a few well-known techniques that can help adapt product development in the world of ML/AI features:

Evals: Evals are essentially datasets that represent your key use cases. You can run your features against these datasets to make sure your model is working as expected. Do it repeatedly to understand variation over time.

LLM Judges: Use another LLM to evaluate the output of your primary LLM. This is useful when human evaluation is either too slow or too expensive. An LLM judge can be p ...
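The two techniques named here, repeated eval runs and an LLM judge, could be combined along these lines. This is a sketch under assumptions: `primary` and `judge` are hypothetical callables standing in for whatever model clients you use, and the 1 to 5 rubric in `JUDGE_PROMPT` is invented for the example.

```python
import statistics
from typing import Callable, Tuple

# Hypothetical judging rubric; adapt the wording and scale to your use case.
JUDGE_PROMPT = (
    "Rate the answer to the question on a scale of 1 to 5.\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Reply with the number only."
)


def judge_score(judge: Callable[[str], str], question: str, answer: str) -> int:
    """Ask the judge model for a 1-5 rating; unparseable replies score 1."""
    reply = judge(JUDGE_PROMPT.format(question=question, answer=answer)).strip()
    first = reply[:1]
    return int(first) if first and first in "12345" else 1


def score_spread(primary: Callable[[str], str], judge: Callable[[str], str],
                 question: str, runs: int = 5) -> Tuple[float, float]:
    """Run the primary model several times; return mean and std dev of scores."""
    scores = [judge_score(judge, question, primary(question)) for _ in range(runs)]
    return statistics.mean(scores), statistics.pstdev(scores)
```

A high standard deviation from `score_spread` suggests the feature's output quality varies noticeably between runs on the same input, which is the kind of variation the answer recommends tracking over time.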

458 Views