In GenAI product development, your role as a PM essentially shifts from writing a PRD to writing evals in addition to the PRD. Evals are your reality check on how well the model is actually performing for your intended use cases and user queries. The eval metrics also matter a lot: I would treat them as input metrics and engagement (e.g., DAU) as output metrics, because if your output quality is great, users will come back. So focus on making your solutions really shine and matter. Of course you st ...
How do you adapt the development process for ML/AI features where data and model performance introduce uncertainty?
2,234 Views
Cortex VP of Product | Formerly Splunk, Deloitte • 7mo
Based on my experience at Cortex, here's how I'd approach adapting the development process for ML/AI features:
All AI-driven features go through a Research Preview designation before GA. This serves two critical purposes:
Sales enablement: Clearly flags "AI" features so prospects and AI-skeptic customers can opt in, avoiding lengthy security reviews and contract addendums.
Risk management: Keeps features behind feature flags until they meet quality thresholds.
This acknowledges upfront that AI fea ...
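As a rough illustration of the Research Preview idea above, here is a minimal Python sketch of a feature gate that stays opt-in until an eval pass rate clears a quality threshold. The class name `FeatureGate`, the `min_pass_rate` field, the stage labels, and the customer IDs are all hypothetical; the answer does not describe how Cortex actually implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureGate:
    """Hypothetical gate for an AI feature (illustrative only)."""
    name: str
    min_pass_rate: float  # assumed quality threshold required for GA
    opted_in_customers: frozenset = frozenset()  # customers who opted in to the preview

    def stage(self, eval_pass_rate: float) -> str:
        # The feature is GA only once its eval pass rate meets the threshold.
        if eval_pass_rate >= self.min_pass_rate:
            return "ga"
        return "research_preview"

    def is_enabled(self, customer_id: str, eval_pass_rate: float) -> bool:
        # GA features are on for everyone; preview features are opt-in only.
        if self.stage(eval_pass_rate) == "ga":
            return True
        return customer_id in self.opted_in_customers


gate = FeatureGate("ai_summary", min_pass_rate=0.90,
                   opted_in_customers=frozenset({"acme"}))
print(gate.stage(0.80))                 # below threshold: still in preview
print(gate.is_enabled("acme", 0.80))    # opted-in customer sees the preview
print(gate.is_enabled("globex", 0.80))  # everyone else does not
```

Keeping the threshold next to the flag makes the promotion rule explicit: the feature leaves preview because a measured number moved, not because a date arrived.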
434 Views
Salesforce Senior Director of Product Management • 7mo
AI and ML features add uncertainty because model behavior and data quality change faster than traditional software. For example, companies like OpenAI and DeepSeek are refreshing their models every week. Sometimes these changes break your pipeline or impact the quality of your solution. Here are a few best practices to add to your AI development and release cycle:
Benchmark continuously: set up internal golden datasets and auto-run evaluations after every model or data change.
Test for vari ...
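The "benchmark continuously" practice can be sketched in a few lines of Python. This is a toy example under stated assumptions: the golden set, the `keyword_model` stand-in, and the 0.02 tolerance are invented for illustration, and a real setup would call your actual model and run in CI after each model or data change.

```python
from typing import Callable

# Hypothetical golden dataset: inputs paired with the expected label.
GOLDEN_SET = [
    {"query": "reset my password", "expected": "account"},
    {"query": "refund my order", "expected": "billing"},
    {"query": "app crashes on login", "expected": "technical"},
]


def run_benchmark(model: Callable[[str], str], golden_set: list) -> float:
    """Return the fraction of golden cases the model answers correctly."""
    passed = sum(1 for case in golden_set
                 if model(case["query"]) == case["expected"])
    return passed / len(golden_set)


def check_regression(new_score: float, baseline_score: float,
                     tolerance: float = 0.02) -> bool:
    """True if the new score has not dropped more than `tolerance` below baseline."""
    return new_score >= baseline_score - tolerance


def keyword_model(query: str) -> str:
    """Stand-in for a real model call (assumption for this sketch)."""
    q = query.lower()
    if "password" in q:
        return "account"
    if "refund" in q:
        return "billing"
    return "technical"


baseline = run_benchmark(keyword_model, GOLDEN_SET)
# Simulate a model update that degrades quality.
candidate = run_benchmark(lambda q: "technical", GOLDEN_SET)
print(baseline, candidate, check_regression(candidate, baseline))
```

Wiring `check_regression` into the release pipeline turns a silent quality drop after a vendor model refresh into a failed check that someone has to look at.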
669 Views
Superhuman Head of Product, Enterprise • 7mo
There are a few well-known techniques that can help adapt product development in the world of ML/AI features:
Evals: Evals are essentially data sets that represent your key use cases. You can run your features against these key data sets to make sure your model is working as expected. Do it repeatedly to understand variation over time.
LLM judges: Use another LLM to evaluate the output of your primary LLM. This is useful when human evaluation is either too slow or too expensive. An LLM judge can be p ...
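To make the LLM-judge idea and the "run it repeatedly" advice concrete, here is a small Python sketch. The judge is passed in as a plain callable so no particular vendor API is assumed; `JUDGE_PROMPT`, `fake_judge`, and the 1 to 5 scale are hypothetical choices for this example, not something the answer prescribes.

```python
import statistics
from typing import Callable

# Hypothetical grading prompt; a real one would carry a fuller rubric.
JUDGE_PROMPT = (
    "Rate the answer for correctness on a scale of one to five.\n"
    "Question: {question}\nAnswer: {answer}\nReply with the score only."
)


def judge_score(judge: Callable[[str], str], question: str, answer: str) -> int:
    """Ask the judge model for a score and parse the first digit in its reply."""
    reply = judge(JUDGE_PROMPT.format(question=question, answer=answer))
    digits = [ch for ch in reply if ch.isdigit()]
    if not digits:
        raise ValueError(f"judge reply had no score: {reply!r}")
    return min(5, max(1, int(digits[0])))  # clamp to the 1..5 scale


def summarize_runs(scores: list) -> dict:
    """Mean and population standard deviation across repeated eval runs."""
    return {"mean": statistics.mean(scores), "stdev": statistics.pstdev(scores)}


def fake_judge(prompt: str) -> str:
    """Stand-in for a second LLM (assumption): rewards answers mentioning Paris."""
    return "Score: 4" if "Paris" in prompt else "Score: 1"


question = "What is the capital of France?"
# Three outputs from the primary model for the same question.
answers = ["Paris.", "It is Paris.", "Lyon."]
scores = [judge_score(fake_judge, question, a) for a in answers]
summary = summarize_runs(scores)
print(scores, summary)
```

Tracking the spread as well as the mean matters here: a feature whose average score is acceptable but whose scores swing widely between runs will still feel unreliable to users.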
451 Views