Navigating AI in Product Management: The ONE CARD Framework

The rush to embed AI into every product roadmap has created a paradox. Product managers have more tools than ever to build intelligent features, yet fewer guardrails to determine when AI actually serves the customer versus when it becomes expensive theater. I have been asked multiple times, in various discussions, how AI can do the job of a Product Manager. The correct question, in my mind, is how a seasoned Product Manager can employ AI agents to get their tasks done faster. Using AI in product management is not a replacement for the Product Manager; it helps the Product Manager get through tasks like research, data synthesis, competitor analysis, and much more by deploying AI tools and agents. This brings the time taken for mundane, routine, and research tasks down from days to hours and from hours to minutes. While you will burn the new currency, i.e. TOKENs, at an alarming pace depending on how fast you want to go, there is an opportunity to get outsized ROI through precision prompting, targeted output definition, and continuous training!

I framed my thinking about Product Management with AI into a framework called the ONE CARD rubric, which aims to cut through the current AI hyperbole and noise with precision questions (one for each dimension) that force rigor before a single model is trained or an API is called.

Clarity: Could a Customer Act on This?

AI outputs are worthless if they sit idle. Before greenlighting any AI-driven feature, ask whether the insight, prediction, or automation produced can be directly operationalized by the person receiving it.

• A churn prediction model that flags at-risk accounts but offers no recommended intervention path fails this test
• A SQL query generator that returns syntactically valid but semantically wrong code actively harms the user
• However, a demand forecast that feeds automatically into procurement workflows passes with distinction

The clarity standard demands that the AI’s output terminates in a concrete action or decision, not merely in information for its own sake. If your customer must perform additional translation, interpretation, or guesswork to benefit, your product has not delivered clarity—it has delegated cognitive burden.
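To illustrate the difference, the sketch below contrasts an information-only churn output with one that terminates in a concrete, owned, time-bound action. All names and the routing policy are hypothetical, not a real system:

```python
from dataclasses import dataclass

@dataclass
class ChurnAlert:
    """Information only: fails the clarity test."""
    account_id: str
    churn_probability: float

@dataclass
class ChurnIntervention:
    """Terminates in a concrete action with an owner and a deadline."""
    account_id: str
    churn_probability: float
    recommended_action: str
    owner: str
    due_in_days: int

def to_actionable(alert: ChurnAlert) -> ChurnIntervention:
    # Hypothetical routing policy: high risk gets a retention offer,
    # moderate risk gets a human touchpoint.
    action = ("offer_renewal_discount" if alert.churn_probability > 0.8
              else "schedule_checkin_call")
    return ChurnIntervention(alert.account_id, alert.churn_probability,
                             action, owner="account_manager", due_in_days=7)
```

The point is the shape of the output, not the policy itself: the receiving person gets a next step, not a score to interpret.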

Ask: What Specific Outcome Are You Improving?

Vague aspirations like “enhance user experience” or “leverage AI capabilities” doom products to measurement ambiguity. This is what I refer to as the “generic AI slop” that any AI chatbot will happily generate for you. The Ask question forces precision (and potentially saves costly token-burning chat sessions :D): what metric moves, by how much, and for whom?

• Baseline the current non-AI state with empirical rigor. It is extremely hard to measure progress without it.
• Define the target outcome in business terms (revenue per user, support ticket resolution time, forecast error reduction)
• Specify the population segment where improvement is expected

Without this anchor, teams optimize model accuracy while the business bleeds. A recommendation engine achieving 94% precision means nothing if it increases cart abandonment because the latency of inference degrades checkout flow. The outcome question keeps product and model objectives aligned.

Risk: What Breaks When the AI Is Wrong?

Every AI system fails. The Risk question compares failure modes between AI and non-AI approaches with eyes wide open.

False positive costs: Automated fraud flags blocking legitimate transactions; medical triage models sending healthy patients to emergency care

False negative costs: Missed defects in manufacturing inspection; undetected security anomalies in log analysis

Systemic risks: Training data drift degrading performance silently; adversarial manipulation of input features; regulatory non-compliance in automated decision-making

The non-AI baseline matters critically. Human-operated processes have failure rates too—often higher, but differently distributed. The product manager must weigh whether AI errors are more corrigible, more frequent, or more catastrophic than the status quo, and whether the product design includes appropriate human-in-the-loop or human-on-the-loop safeguards.
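One way to make that weighing concrete is to compare the expected failure cost under each regime. The rates and dollar costs below are purely illustrative assumptions, not benchmarks:

```python
def expected_failure_cost(fp_rate, fp_cost, fn_rate, fn_cost, volume):
    """Expected error cost over a decision volume.
    *_rate: probability of that error per decision; *_cost: cost per error."""
    return volume * (fp_rate * fp_cost + fn_rate * fn_cost)

# Illustrative assumptions only: the AI errs more often on false positives,
# but its errors are cheaper and more corrigible than the human baseline's.
human = expected_failure_cost(fp_rate=0.01, fp_cost=500,
                              fn_rate=0.05, fn_cost=2000, volume=10_000)
ai = expected_failure_cost(fp_rate=0.03, fp_cost=100,
                           fn_rate=0.02, fn_cost=2000, volume=10_000)
print(f"human baseline: ${human:,.0f}   AI: ${ai:,.0f}")
```

Even this toy model forces the right conversation: which regime's errors are more frequent, which are more catastrophic, and where a human-in-the-loop changes the numbers.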

Data: How Will You Prove ROI Was Justified?

AI investments consume compute, talent, and customer trust. The Data question mandates the evidentiary framework before deployment, not after. Trust BUT verify takes on a different meaning in the AI agentic world, and doing it at machine speed in a consistent and repeatable way is still an UNSOLVED problem!

Counterfactual infrastructure: Can you run controlled experiments isolating AI impact from confounding variables?

Longitudinal tracking: Are you measuring sustained improvement or novelty effects that decay?

Total cost accounting: Does your ROI calculation include model retraining, monitoring infrastructure, incident response, and compliance auditing?

The data requirement is particularly acute for product managers serving technical audiences. Your users will scrutinize whether “AI-powered” represents genuine capability augmentation or marketing veneer. Specify the telemetry, the experimental design, and the decision criteria for continuation or sunsetting.

Putting ONE CARD Into Practice

The framework’s power lies in its sequential discipline. Clarity prevents solutionism. Ask prevents metric confusion. Risk prevents blind optimism. Data prevents sunk-cost entrenchment. Each question builds a decision record that product managers can defend to engineering skeptics, finance controllers, and customers alike.

For data platform products specifically—query optimization, schema recommendation, anomaly detection—the rubric offers particular value. These domains suffer from plausible-sounding AI applications that collapse under operational scrutiny. A query planner using learned cardinality estimates must pass the clarity test (can the DBA override and trust the plan?), the ask test (what’s the p95 latency improvement versus the legacy optimizer?), the risk test (when estimates fail, does the system degrade gracefully or catastrophically?), and the data test (how do we attribute performance changes to the model versus hardware upgrades or data skew shifts?).

The Hard Truth

Not every product deserves AI. The ONE CARD rubric makes this determination explicit rather than politically fraught. A feature that fails multiple dimensions may indicate that traditional deterministic approaches, improved data quality, or simpler heuristic methods deliver superior customer value at lower risk.

The product manager who internalizes this framework becomes the credible voice for when not to use AI—a rarer and more valuable stance than cheerleading every neural network trend. Your roadmap gains integrity. Your engineering partnerships deepen. And your customers receive products that solve problems rather than showcase technology.


ONE CARD in Action: NLP-to-SQL with a Frontier Model

Theoretical frameworks collapse without concrete application. Below is a worked example of the ONE CARD rubric applied to a real product decision: whether to embed a frontier LLM (GPT-5.3, Claude 4 Sonnet, or equivalent) into your data platform to let business users write natural language questions and receive executable SQL queries.

The Product Context

Your organization runs a cloud data warehouse (Snowflake, Databricks, BigQuery). Analysts and business users currently file tickets with a centralized BI team to get answers. Average turnaround is 2-3 days. The product proposal: integrate a frontier model via API to generate SQL from natural language, cutting latency to minutes and democratizing data access.

Clarity: Can the Customer Act Correctly?

The generated SQL must be executable and semantically faithful to the user’s intent. This is where text-to-SQL systems most commonly fail.

Syntax-only success is insufficient. A query that runs but joins orders to members on the wrong key produces a plausible-looking result set that misleads business decisions.

Schema hallucinations remain unsolved. Frontier models occasionally reference tables or columns that do not exist, especially on large enterprise schemas with 200+ columns per table. This unfortunately increases review times for coding outputs, PRDs, and technical outputs like SQL queries.

Business semantics require translation. The model must know that “active user” at your company means last_login > 7_days_ago and account_status != 'disabled'—not infer it from column names alone. This is where having ONE semantic source of truth is highly important, and where data foundation maturity gets underscored again in building a trusted data foundation for our AI workloads.
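As a sketch of what that translation can look like, the snippet below maps governed business terms to vetted SQL predicates and injects them into the prompt ahead of generation. The terms, predicates, and function names are all hypothetical, not a specific product's API:

```python
# Minimal semantic-layer sketch: business terms resolve to governed SQL
# predicates instead of letting the model infer them from column names.
SEMANTIC_LAYER = {
    "active user": "last_login > CURRENT_DATE - INTERVAL '7 days' "
                   "AND account_status != 'disabled'",
    "churned user": "last_login <= CURRENT_DATE - INTERVAL '90 days'",
}

def expand_business_terms(question: str) -> str:
    """Prepend governed definitions to the prompt so SQL generation is
    grounded in the semantic layer rather than guessed from the schema."""
    hints = [f"-- '{term}' means: {predicate}"
             for term, predicate in SEMANTIC_LAYER.items()
             if term in question.lower()]
    return "\n".join(hints + [question])

print(expand_business_terms("How many active users did we have last week?"))
```

A real implementation would resolve terms against a maintained semantic model (dbt metrics, a metrics store, or similar), but the principle is the same: the definition lives in one governed place, not in the model's guess.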

As an example, Uber’s QueryGPT addressed this by adding a “Table Agent” that surfaces proposed tables to the user for acknowledgment before generation, plus a “Column Prune Agent” that strips irrelevant schema metadata to reduce hallucination surface area. Even with these safeguards, 22% of users reported they still needed to modify generated queries before execution.

Clarity verdict: Passes only with human-in-the-loop confirmation and explicit semantic layer integration. Raw frontier-model-to-SQL fails this test for non-technical users.

Ask: What Specific Outcome Improves?

Vague goals like “democratize data access” obscure measurement. The Ask question forces specificity.

Baseline Metrics vs Targets:
• Average query authoring time (technical users): 10 min → 3 min
• Business user ticket-to-insight time: 2.5 days → 15 min
• Self-service query volume (non-technical users): 5% of total queries → 25%
• Analyst time reclaimed from ad-hoc requests: 40% of week → 15%

The outcome must be tied to a cohort. If the product targets Operations managers at Uber, the metric is their monthly interactive query volume and the productivity gain per query. If targeting finance analysts, the metric might be forecast cycle time reduction.

Critical discipline: Define what does not improve. Text-to-SQL will not reduce data governance overhead. It will not eliminate the need for semantic modeling. It will not make complex multi-table joins reliable without investment in context infrastructure.
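The baselines and targets above can be captured as explicit records so the team cannot drift back into vague goals. A minimal sketch, with names and numbers mirroring the hypothetical targets above:

```python
from dataclasses import dataclass

@dataclass
class OutcomeMetric:
    name: str
    segment: str      # the cohort the improvement targets
    baseline: float
    target: float
    unit: str

# Anchors mirroring the hypothetical targets above
metrics = [
    OutcomeMetric("query_authoring_time", "technical users", 10, 3, "minutes"),
    OutcomeMetric("ticket_to_insight_time", "business users", 3600, 15, "minutes"),  # 2.5 days -> 15 min
]
for m in metrics:
    print(f"{m.name} [{m.segment}]: {m.baseline} -> {m.target} {m.unit}")
```

Checking these records into the repo alongside the evaluation set makes the quarterly review a diff, not a debate.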

Risk: What Breaks When the AI Is Wrong?

Text-to-SQL failure modes differ materially from the non-AI baseline (human analyst writes query).

Wrong aggregation (SUM vs COUNT):
• AI: Silent—query runs, returns wrong number (Higher severity due to silent failure)
• Non-AI: Caught in code review or QA

Incorrect join path:
• AI: Returns inflated/deflated result set (Higher on undocumented schemas)
• Non-AI: Analyst knows schema relationships

Schema hallucination:
• AI: Query fails with “column not found” (Obvious, recoverable)
• Non-AI: Human does not hallucinate schema

Semantic drift:
• AI: Model uses stale training context (Higher without live context refresh)
• Non-AI: Analyst reads updated documentation

Data exposure:
• AI: Model includes sensitive columns if not filtered (Depends on prompt engineering rigor)
• Non-AI: Role-based access controls enforce limits

Uber explicitly tracks “Run Has Output” as a safety metric—queries that execute successfully but return zero rows often indicate hallucinated filter values (e.g., WHERE status = 'Finished' instead of WHERE status = 'Completed').
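A check in that spirit is easy to sketch. The snippet below flags queries that run but return zero rows; an in-memory SQLite table stands in for the warehouse, and this is an illustration, not Uber's implementation:

```python
import sqlite3

def run_has_output(conn: sqlite3.Connection, sql: str) -> bool:
    """Flag queries that execute but return zero rows; with text-to-SQL,
    that pattern often means a hallucinated filter value."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return False  # outright execution failure is tracked separately
    return len(rows) > 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (id INTEGER, status TEXT)")
conn.execute("INSERT INTO trips VALUES (1, 'Completed')")
# A hallucinated enum value ('Finished') runs fine but returns nothing
print(run_has_output(conn, "SELECT * FROM trips WHERE status = 'Finished'"))   # False
print(run_has_output(conn, "SELECT * FROM trips WHERE status = 'Completed'"))  # True
```

Zero rows is not always an error (the true answer may be empty), so in practice this signal routes the query to review rather than blocking it outright.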

The dbt Labs 2026 benchmark reveals a structural insight: for queries covered by a well-modeled Semantic Layer, accuracy approaches 100% because the LLM cannot produce subtly wrong joins or aggregations—the deterministic engine handles SQL generation. The risk gap between raw text-to-SQL and semantic-layer-mediated approaches is enormous.

Risk mitigation design:
• Intent Agent to classify user questions into bounded business domains (workspaces)
• Validation Agent that executes generated SQL against test data and checks result plausibility
• Mandatory human approval for queries targeting financial or customer PII tables

Data: How Will You Prove ROI?

AI infrastructure costs are non-trivial. Frontier model API calls for complex schemas can consume 40-60K tokens per request. Without pre-defined measurement, the project becomes a faith-based initiative.

Cost Categories (Annual Estimates):
• Frontier model API (per 1K queries/month): $15K-$45K
• Semantic layer construction: 3-4 engineer-months
• Evaluation set curation: 2 analyst-months
• Monitoring infrastructure: $8K-$12K
• Incident response (bad query reaches production): Unquantified liability
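Token spend is the easiest of these to instrument up front. A back-of-the-envelope sketch, using an assumed blended rate (actual model pricing varies by vendor and tier):

```python
def monthly_api_cost(queries_per_month, tokens_per_query, usd_per_million_tokens):
    """Back-of-the-envelope token spend; all rates are assumptions."""
    return queries_per_month * tokens_per_query / 1_000_000 * usd_per_million_tokens

# 1K queries/month at ~50K tokens each, assumed blended $25 per 1M tokens
cost = monthly_api_cost(1_000, 50_000, 25)
print(f"${cost:,.0f}/month")  # $1,250/month, i.e. $15K/year at this volume
```

Wiring this arithmetic to live token telemetry, rather than recomputing it in a spreadsheet, is what keeps the ROI review honest as query volume grows.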

Benefit Measurement:
• Analyst time reclaimed: Pre/post time allocation surveys; sprint velocity change
• Business decision velocity: Time from question to board-ready metric
• Query volume shift: Ratio of self-service to ticketed queries
• Error rate reduction: Comparison of production incident root causes (AI vs human)

Uber’s evaluation framework is instructive: they run standardized question sets through “Vanilla” (full AI) and “Decoupled” (human-in-the-loop) product flows, tracking intent accuracy, table overlap score, execution success, and qualitative SQL similarity via LLM-as-judge. This enables component-level debugging—knowing whether failure originates in intent classification, schema retrieval, or SQL generation.
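One component of such a harness, the golden-SQL execution check, can be sketched as follows. The table and queries are illustrative, and SQLite stands in for the warehouse:

```python
import sqlite3

def execution_match(conn, generated_sql, golden_sql) -> bool:
    """Score semantic accuracy by whether the generated query returns the
    same rows as a curated golden query (order-insensitive)."""
    try:
        got = sorted(conn.execute(generated_sql).fetchall())
    except sqlite3.Error:
        return False  # generation produced invalid SQL
    want = sorted(conn.execute(golden_sql).fetchall())
    return got == want

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
print(execution_match(conn,
                      "SELECT SUM(amount) FROM orders",
                      "SELECT SUM(amount) FROM orders WHERE amount > 0"))  # True
```

Execution match is deliberately stricter than string similarity: two syntactically different queries that return the same rows pass, while a plausible-looking query with a wrong join fails.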

The Decision Record

Applying ONE CARD to this NLP-to-SQL proposal yields a conditional go using the example rubric below:

Clarity: Go with Table Agent confirmation and semantic layer pre-modeling
Ask: Go with Operations analyst cohort, 10→3 minute target, quarterly evaluation
Risk: Go with workspace isolation, validation agent, and PII table restrictions
Data: Go with token cost telemetry, golden SQL evaluation set, and analyst time-tracking integration

Without these four commitments, the product manager would be shipping a prototype disguised as production infrastructure. The frontier model is capable—but capability without guardrails is liability.


Summary Table: ONE CARD Applied to NLP-to-SQL

Clarity: Could a customer act or leverage the insight provided by AI correctly?
Application: Can the analyst execute the generated SQL and trust the result without manual rewrite?

Ask: What is the specific outcome that you are looking to improve or achieve?
Application: Reduce query authoring time from 10 min to 3 min; increase self-service analytics adoption

Risk: What is the risk in using AI for this approach vs non-AI way of achieving the output?
Application: Wrong joins/aggregations return plausible but incorrect results; schema hallucinations on large databases

Data: What data will you need to collect to verify that the ROI was justified?
Application: Query execution success rate, semantic accuracy vs golden SQL, time-to-insight, support ticket reduction

Note: AI was used to research and author parts of this post.

Introducing Perplexity Comet: The AI Browser That Works For You

I have been working on AI technologies both in my personal time and in my day job, and I wanted to share more about the new AI browser, Comet. Perplexity Comet is an innovative AI-powered browser designed to transform how you browse the web and get things done. With Comet, you can automate tasks that usually take hours, such as finding the lowest possible airfare for a multi-city trip. Just describe your travel plan in plain English—Comet handles the rest, searching across different airlines, comparing options, and surfacing the best deals without any manual work. Now, with OpenAI launching its own AI browser and Gemini (from Google) making progress in this area, one can easily foresee a big browser war on the horizon!

Give Perplexity Comet a try. Its key benefits include:

  • AI Assistant Integration: Comet provides contextual answers, page summaries, follow-up queries, and actionable insights within every tab through an embedded AI assistant.
  • Agentic Actions: The assistant can perform tasks such as shopping, filling carts, scheduling meetings, managing emails, engaging with calendar events, and unsubscribing from newsletters—often operating directly on behalf of the user.​
  • Workflow Organization: Users can organize tabs, automate research, and streamline distracting tasks to stay focused and productive.
  • Privacy and Personalization: Comet is built to learn user preferences, collaborate in research, and maintain privacy as a core principle, helping keep browsing sessions efficient and tailored.

I really like the user experience it offers and the in-browser automation for common tasks like looking up the best prices for a product, finding the lowest airfares, writing a post, etc. Comet rolled out globally for free in October 2025, aiming to democratize access to AI-powered browsing and productivity tools. Earlier this year, Perplexity and Venmo/PayPal offered early access to Perplexity’s new Comet browser with a free Perplexity Pro subscription.

Disclaimer valid as of the date of publication:
Artificial Intelligence was employed to aid in the composition of this post.
I may receive a referral bonus from the referral links included in this post.

Addressing SQL Server and TDE with AKV errors

I recently wrote an Azure Data Studio Notebook on how to set up TDE for SQL Server 2019 Standard Edition (yes, SQL Server 2019 Standard Edition has TDE) using Azure Key Vault. I ran into a few issues that I had to debug, which I am outlining below. Make sure that you are following the prerequisites when you are setting up TDE with Azure Key Vault.

The first one was a 404 error. When I looked at the application event log, I saw the following error:

Operation: getKeyByName
Key Name: ContosoRSAKey0
Message: [error:112, info:404, state:0] The server responded 404, because the key name was not found. Please make sure the key name exists in your vault.

The simple reason for the above error is that I was using an incorrect key name, or the key didn’t exist in my Azure Key Vault. The remediation is to check whether the key exists in your Azure Key Vault, and if not, to create the key.

Another error I ran into was a 401 error. The following information was included with the event:

Operation: acquireToken
Key Name:
Message: [error:108, info:401, state:0] Server responded 401 for the request. Make sure the client Id and secret are correct, and the credential string is a concatenation of AAD client Id and secret without hyphens.

The CREATE CREDENTIAL command has the following syntax:

CREATE CREDENTIAL Azure_EKM_TDE_cred
    WITH IDENTITY = 'SQLStandardKeyVault',                                 -- for global Azure
    -- WITH IDENTITY = 'ContosoDevKeyVault.vault.usgovcloudapi.net',      -- for Azure Government
    -- WITH IDENTITY = 'ContosoDevKeyVault.vault.azure.cn',               -- for Azure China 21Vianet
    -- WITH IDENTITY = 'ContosoDevKeyVault.vault.microsoftazure.de',      -- for Azure Germany
    SECRET = '<combination of AAD Client ID without hyphens and AAD Client Secret>'
    FOR CRYPTOGRAPHIC PROVIDER AzureKeyVault_EKM_Prov;

The IDENTITY here is the name of your Azure key vault.
The SECRET here is your AAD Client ID (with the hyphens removed) and your AAD Client Secret concatenated together. You will need to create a “New Client Secret” for your Azure AD app registration. See steps here.

Your AAD Client ID will be a GUID, and your Client Secret will be a random alphanumeric string. If you don’t have the client secret, then create a new one and use that.
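To make the concatenation rule concrete, here is a small sketch (with fake, illustrative values) of how the SECRET string is assembled before it goes into the CREATE CREDENTIAL statement:

```python
def build_akv_credential_secret(aad_client_id: str, aad_client_secret: str) -> str:
    """SECRET for CREATE CREDENTIAL: the AAD client ID with hyphens
    removed, concatenated with the AAD client secret."""
    return aad_client_id.replace("-", "") + aad_client_secret

# Fake, illustrative values only; never hard-code real secrets
print(build_akv_credential_secret("12345678-90ab-cdef-1234-567890abcdef",
                                  "EXAMPLEclientSECRET"))
```

Forgetting to strip the hyphens from the GUID is exactly what produces the 401 acquireToken error above.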

Upcoming sessions at Microsoft Ignite and PASS Summit

The first week of November is going to be an action-packed week for me, with two back-to-back conferences: Microsoft Ignite 2019 in Orlando, Florida, and PASS Summit 2019 in Seattle, Washington.

Below are the sessions that I will be delivering at Microsoft Ignite.

Mission critical performance with SQL Server 2019 – In this session, my colleague Kevin Farlee and I will talk about the various performance and scale improvements that SQL Server 2019 delivers at a great price-performance, letting you run SQL Server 2019 with the best TCO for your Tier-1 workloads.

Azure SQL Database Edge – Overview – In this session, my colleague, Sourabh Agarwal, and I will talk about the new innovations we are bringing to the edge for ARM64 and x64 with Azure SQL Database Edge. We will also talk about some of the scenarios where Azure SQL Database Edge helped make our customers successful in their IoT applications.

Azure Arc: Bring Azure Data Services to On-Premises, Multi-Cloud and Edge – In this session, James Rowland Jones and I will walk you through the Azure Arc announcements and show you deployments of our data services on Azure Arc.

If you are going to Ignite and are interested in Data, then we hope to see you at our sessions.

At PASS Summit, I will deliver another session on Azure SQL Database Edge, which will cover more about how you can “Develop once, deploy anywhere” with our edge database offering.

Looking forward to seeing the #SQLFamily at PASS Summit and Microsoft Ignite.

SQL PASS Summit 2017

It is that time of the year when I get to meet the SQL Family. It is always wonderful to put a face to a Twitter handle that I exchanged #sqlchats with or connected with on LinkedIn. SQL PASS Summit is probably one of the largest gatherings of data professionals under a single roof.

This year, I will be presenting a session on “Building One Million Predictions Per Second Using SQL-R”.

Date: Nov 3rd, 2017
Time: 11am
Room: Tahoma 5 (TCC Level 3)
Abstract:
Using the power of OLTP and data transformation in SQL Server 2016 and advanced analytics in Microsoft R Server, various industries are really pushing the boundary of processing a higher number of transactions per second (tps) for different use cases. In this talk, we will walk through the use case of predicting loan charge-off (loan default) rates, the architecture configuration that enables this use case, and the rich visual dashboard that allows customers to do what-if analysis. Attend this session to find out how SQL + R allows you to build an “intelligent data warehouse”.

There will be a number of sessions delivered at PASS from the Tiger team this year and you will find a lot of the folks at the SQL Server Clinic.

Do you have a technical question or a troubleshooting challenge, want to give product feedback, or want to find out about best practices for running your SQL Server? Then SQL Clinic is the place you want to go. SQL Clinic is the hub of technical experts from the SQL Product Group (including the Tiger team), SQL CAT, SQL Customer Support Services (CSS), SQL Premier Field Engineering (PFE), and others.

SQL PASS Summit gives me a unique opportunity to meet the #SQLFamily during an annual event, gather feedback from customers, and see some old friends from across the globe!

Hope to see you at the event!