ShipSet is for existing Product Managers (any seniority) who want to upskill into AI Product Manager roles. The curriculum assumes PM fundamentals (PRDs, working with engineers, stakeholders, sprints) and teaches the AI-specific layer on top. For people new to product management, ShipSet includes a separate PM Primer (7 lessons, always free, not part of the paid program) that covers PM fundamentals before the main 90-day program.

How is ShipSet different from Udemy, Lenny, Maven, or Reforge?

ShipSet is builder-first. By Day 28 you ship a working AI prototype with a live URL. By Day 90 you have a live AI feature with eval evidence, a validated cost model, a 90-second demo Loom, and a 10-piece portfolio. Other AI PM courses are lecture-first with optional projects. ShipSet inverts that ratio: real builds first, concepts taught as they become relevant.

Do I need to code to become an AI Product Manager through ShipSet?

No. Light coding helps but is not required. The build lessons use AI-assisted builders (v0, Lovable, Cursor, Claude Code) to ship working prototypes. Learners read prompts and modify them; deep coding is not part of the curriculum.

How much does ShipSet cost?

Founding members (first 50) pay ₹2,499 ($79) one-time for lifetime access. Annual is ₹3,499 ($99) per year. Monthly is ₹699 ($14.99) per month. INR via Razorpay, USD via card (coming after LLP). All plans include the same content. The first 5 main lessons are free for everyone, no card required. The separate PM Primer (7 lessons) is always free regardless of plan. Every paid plan has a 7-day money-back guarantee.

How long does ShipSet take to complete?

The program is 90 lessons of about 15 minutes each, for a total of roughly 22 hours over 90 days. There is no calendar pressure: lessons unlock as completed, so learners move at their own pace. Most working PMs complete in 90 to 120 days at 3 hours per week.

Can ShipSet be completed on a phone only?

Partly. About 75 percent of lessons (reading, journaling, critique, reflection) work well on a phone. The other 25 percent are build sessions (writing specs, building prototypes, validating costs with real API calls, recording Loom walkthroughs) and need a laptop. Plan for roughly one laptop session per week.

What does the ShipSet portfolio include at Day 90?

The Day 90 portfolio includes ten artifacts: a live working AI feature with a public URL, a 90-second Loom walkthrough, an AI-native PRD, a 20-row eval suite with measurable scores, a validated cost model, an AI UX flow with placement decisions, an AI product metrics framework, a business case / ROI document, a launch plan with monitoring strategy, and a compiled portfolio document. Plus a public ShipSet certificate with verification URL.

AI Cost Modeling for PMs: The 7-Variable Workbook + 3 Real Models

The launch reviewer asks one question every time: "what does this cost to run per user?" If you do not have a number, the feature does not ship. If you have a wrong number, you ship and then explain to finance why the AWS bill tripled. The cost model is the second-most-important pre-launch artifact after the eval suite, and most PMs treat it as something engineering will figure out. They will not. It is your job.

This post is the cost-modeling format we teach in ShipSet Lesson 47 ("Token economics and unit cost") and Lesson 49 ("Build cost validation"). Seven variables, a workbook you can put in a spreadsheet tonight, three real cost models from shipped features, and the variable PMs forget that turns a "looks cheap" feature into a margin-killer.

Why most AI cost models are wrong

The default PM cost model looks like this:

"We pay $3 per million tokens. The average request is 1,000 tokens. So each request is $0.003. We expect 10,000 requests per day, so that is $30 per day, $900 per month. Cheap."

This is wrong in three ways at once, and the errors compound. By the time you see the actual bill, it is 4-10x your model. Here is what gets missed:

1. Input vs output token pricing. Most providers charge 2-5x more for output tokens than input tokens. If your feature reads 100 tokens of input and writes 900 tokens of output, the output dominates the bill. The PM model that uses one "blended" price misses this entirely.

2. System prompts are tokens too. Your system prompt is 800 tokens. Every single request pays for those 800 tokens. The "1,000 token average" the PM quoted is what the user typed plus the response; the actual request is 1,000 + 800 system prompt + maybe 600 tokens of injected context = 2,400 tokens. The model is already off by 2.4x on token count alone.

3. Caching is an asterisk, not a free lunch. Anthropic and OpenAI both offer prompt caching that cuts the input cost of the cached portion by up to 10x, depending on provider and model. PMs see "10x cheaper" and build models around the cached price. In practice your cache hit rate is 60-80%, not 100%, and only the repeated prefix (typically the system prompt) caches, not the user input. Real savings: 30-50%, not 90%.

The cost model that survives launch accounts for all three.
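To see how the first two gaps compound, here is a minimal sketch that reprices the quoted example. The token counts come from the text above (100 user tokens in, 900 tokens out, an 800-token system prompt, 600 tokens of injected context). The $3 input and $15 output prices are illustrative assumptions, not a quote from any provider, and caching is left out.

```python
# Naive model: one blended price applied to the tokens the PM can see.
BLENDED_PRICE_PER_MILLION = 3.00   # from the quoted PM model
VISIBLE_TOKENS = 1_000             # what the user typed plus the response

naive_cost = VISIBLE_TOKENS * BLENDED_PRICE_PER_MILLION / 1_000_000

# Corrected model: split input from output and count the hidden input.
INPUT_PRICE_PER_MILLION = 3.00     # assumption for illustration
OUTPUT_PRICE_PER_MILLION = 15.00   # assumption: 5x the input price

user_input_tokens = 100
system_prompt_tokens = 800
injected_context_tokens = 600
output_tokens = 900

input_tokens = user_input_tokens + system_prompt_tokens + injected_context_tokens
corrected_cost = (
    input_tokens * INPUT_PRICE_PER_MILLION / 1_000_000
    + output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
)

print(f"naive:     ${naive_cost:.4f} per request")
print(f"corrected: ${corrected_cost:.4f} per request")
print(f"ratio:     {corrected_cost / naive_cost:.1f}x")
```

Under these assumptions the corrected figure is $0.018 per request against $0.003, a 6x gap, which sits inside the 4-10x range.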

The 7-variable workbook

Put this in a spreadsheet. One model per AI feature. Every variable is a cell you can change to do sensitivity analysis.

Variable	What it is	Where the number comes from
`input_tokens_per_request`	Including system prompt + user input + injected context	Count tokens on 10 sample requests
`output_tokens_per_request`	Average length of model response	Same 10 samples
`input_price_per_million`	Provider's per-million-input-token rate	Provider pricing page
`output_price_per_million`	Per-million-output-token rate	Same
`requests_per_user_per_month`	Activity rate of your typical user	Production logs OR estimated DAU × sessions × requests/session
`cache_hit_rate`	What fraction of input tokens hit the prompt cache	Provider analytics OR estimate at 0.5 if you don't know
`target_users`	How many users at the price point you are modeling	Your launch plan

The cost per user per month is then:

non_cached_input = input_tokens_per_request * (1 - cache_hit_rate)
cached_input = input_tokens_per_request * cache_hit_rate
cost_per_request = (
  non_cached_input * input_price_per_million / 1_000_000 +
  cached_input * (input_price_per_million * 0.1) / 1_000_000 +
  output_tokens_per_request * output_price_per_million / 1_000_000
)
cost_per_user_per_month = cost_per_request * requests_per_user_per_month
total_cost_per_month = cost_per_user_per_month * target_users

That formula plugs into your business case. The number cost_per_user_per_month is what you compare to your pricing tier to compute margin. The number total_cost_per_month is what you compare to your runway to compute risk.
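If you prefer a script to a spreadsheet, the same formula can be sketched in Python. The class name CostModel and the CACHED_INPUT_DISCOUNT constant are names chosen for this example; the 0.1 multiplier is the one used in the formula above, and the values in the loop are made-up placeholders.

```python
from dataclasses import dataclass

# Cached input billed at 10% of the input price, as in the formula above.
CACHED_INPUT_DISCOUNT = 0.1


@dataclass
class CostModel:
    input_tokens_per_request: float
    output_tokens_per_request: float
    input_price_per_million: float
    output_price_per_million: float
    requests_per_user_per_month: float
    cache_hit_rate: float
    target_users: int

    def cost_per_request(self) -> float:
        non_cached = self.input_tokens_per_request * (1 - self.cache_hit_rate)
        cached = self.input_tokens_per_request * self.cache_hit_rate
        return (
            non_cached * self.input_price_per_million
            + cached * self.input_price_per_million * CACHED_INPUT_DISCOUNT
            + self.output_tokens_per_request * self.output_price_per_million
        ) / 1_000_000

    def cost_per_user_per_month(self) -> float:
        return self.cost_per_request() * self.requests_per_user_per_month

    def total_cost_per_month(self) -> float:
        return self.cost_per_user_per_month() * self.target_users


# Sensitivity analysis: vary one cell, hold the rest fixed.
for hit_rate in (0.0, 0.5, 0.8):
    model = CostModel(2_000, 500, 3.00, 15.00, 20, hit_rate, 1_000)
    print(f"cache_hit_rate={hit_rate:.1f} -> ${model.total_cost_per_month():,.2f}/month")
```

Changing one field at a time is the script equivalent of editing a single spreadsheet cell.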

How to count tokens (without running a tokenizer)

You do not need a Python script. Two rules of thumb that get you close enough:

English text: ~4 characters per token. Take your average user message length in characters and divide by 4. A 200-character question is about 50 tokens.

JSON output: ~3 characters per token. JSON has more punctuation than prose. A 600-character JSON response is about 200 tokens.

For a production-quality estimate, use the official tokenizer (Anthropic's count_tokens API or OpenAI's tiktoken library). For a PM cost model that gets you to launch, the character-divided-by-4 estimate is within 15% of true. That is good enough.
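The two rules of thumb reduce to one small helper. This is a rough sketch: estimate_tokens is a name invented for this example, the sample strings are made up, and rounding up is a conservative choice rather than anything a tokenizer does.

```python
import json
import math

CHARS_PER_TOKEN_ENGLISH = 4  # rule of thumb from the text
CHARS_PER_TOKEN_JSON = 3     # rule of thumb from the text


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN_ENGLISH) -> int:
    """Rough token count: characters divided by a per-format ratio, rounded up."""
    return math.ceil(len(text) / chars_per_token)


# Made-up samples: a user question and a small JSON response.
question = "How do I reset my password if I no longer have access to my email?"
payload = json.dumps({"category": "account_access", "confidence": 0.93})

print(estimate_tokens(question))
print(estimate_tokens(payload, CHARS_PER_TOKEN_JSON))
```

Run it over your 10 sample requests and average the results to fill in the two token cells of the workbook.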

Three real cost models

1. Support ticket auto-router (classification, low cost)

The feature: classifies incoming support tickets into one of 12 team categories. Reads ticket body, returns a category and a confidence score.

Variable	Value	Notes
`input_tokens_per_request`	600	400 system prompt (caches) + 200 user ticket
`output_tokens_per_request`	30	Just the category + confidence number
`input_price_per_million`	$1.00	Claude Haiku 4.5
`output_price_per_million`	$5.00	Same
`requests_per_user_per_month`	80	Average ticket volume per support agent
`cache_hit_rate`	0.67	400 of 600 input tokens are cacheable system prompt
`target_users`	500	Internal team

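Cost per request: $0.0007 with no cache hits (about $0.0004 once the 0.67 cache hit rate is applied through the formula above). Cost per user per month: $0.06 uncached (about $0.03 cached). Total monthly: $30 uncached (about $16 cached). Bill comes in: $42, attributed to a cache hit rate of 0.55 rather than 0.67 (the system prompt missed the cache during cold periods). Variance: 40% over the $30 figure. Still trivial in absolute terms.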

Takeaway: classification with a short output is the cheapest AI feature shape there is. Models for these can be casual; the bill will not surprise you.

2. PRD critique assistant (generation, medium cost)

The feature: reads a draft PRD (2,500 tokens), returns 3-5 specific critiques (800 tokens).

Variable	Value	Notes
`input_tokens_per_request`	3,200	600 system prompt + 100 instruction + 2,500 PRD
`output_tokens_per_request`	800	The critiques
`input_price_per_million`	$3.00	Claude Sonnet 4.6
`output_price_per_million`	$15.00	Same
`requests_per_user_per_month`	15	One PRD a week, two critiques each
`cache_hit_rate`	0.22	Only system prompt caches; the PRD is unique each time
`target_users`	5,000	Premium tier subscribers

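Cost per request: $0.022 with no cache hits (about $0.020 once the 0.22 cache hit rate is applied through the formula above). Cost per user per month: $0.33 uncached (about $0.30 cached). Total monthly: $1,650 uncached (about $1,480 cached). The uncached portion (the PRD itself, 2,500 tokens per request) dominates the input cost, and the 800 output tokens at $15 per million are the largest single line on the bill. Caching helped, but not the way a naive model assumes.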

Takeaway: when the input is unique per request, caching saves you roughly 20% of the input cost, not 80%. Plan for the uncached number and treat caching as a margin buffer.

3. AI coding agent (generation + multi-turn, high cost)

The feature: a Claude-powered coding agent that reads a repo, takes a natural-language task, makes code changes across multiple turns. Average task: 6 turns, 8,000 input + 1,200 output tokens per turn.

Variable	Value	Notes
`input_tokens_per_request`	8,000	Repo context (5,500) + conversation history (1,500) + user message (1,000)
`output_tokens_per_request`	1,200	Code edits + reasoning
`input_price_per_million`	$3.00	Claude Sonnet 4.6
`output_price_per_million`	$15.00	Same
`requests_per_user_per_month`	240	40 tasks × 6 turns average
`cache_hit_rate`	0.42	Repo context caches across the task; conversation history caches partially
`target_users`	2,000	Premium tier

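Cost per request: $0.041, which is close to the zero-cache figure of $0.042; applying the 0.42 cache hit rate through the formula above gives about $0.033. Cost per user per month: $9.84 (about $7.90 with caching applied). Total monthly: $19,680 (about $15,800 with caching applied). This is where AI feature unit economics start mattering: planning on the conservative $9.84, if you price at $19/month you have $9.16 of gross margin per user before infrastructure, support, and refunds. Tight but viable. If you price at $14.99/month you have $5.15 of margin and you cannot afford a viral spike.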

Takeaway: multi-turn generation is where cost models become business models. The PM who shipped this without validating the cache hit rate against production traffic for two weeks before launch lost their job.
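As a cross-check, the sketch below runs the three tables through the workbook formula twice: once with the stated cache_hit_rate and once with it set to 0. The inputs are copied from the tables; the helper name cost_per_request and the short model labels are chosen for this example. Under these inputs, the per-request figures quoted in the three write-ups sit at or near the zero-cache case, so the cached runs come out lower.

```python
def cost_per_request(input_tokens, output_tokens, input_price, output_price, cache_hit_rate):
    """Workbook formula: cached input is billed at 10% of the input price."""
    non_cached = input_tokens * (1 - cache_hit_rate)
    cached = input_tokens * cache_hit_rate
    return (
        non_cached * input_price
        + cached * input_price * 0.1
        + output_tokens * output_price
    ) / 1_000_000


# (name, input_tokens, output_tokens, input_price, output_price,
#  requests_per_user_per_month, cache_hit_rate, target_users)
MODELS = [
    ("ticket router", 600, 30, 1.00, 5.00, 80, 0.67, 500),
    ("PRD critique", 3_200, 800, 3.00, 15.00, 15, 0.22, 5_000),
    ("coding agent", 8_000, 1_200, 3.00, 15.00, 240, 0.42, 2_000),
]

for name, inp, out, p_in, p_out, reqs, hit, users in MODELS:
    for label, rate in (("stated cache rate", hit), ("no cache hits", 0.0)):
        per_request = cost_per_request(inp, out, p_in, p_out, rate)
        per_user = per_request * reqs
        total = per_user * users
        print(f"{name:14s} {label:18s} "
              f"${per_request:.5f}/req  ${per_user:.2f}/user  ${total:,.0f}/month")
```

With the stated cache rates the per-request costs work out to about $0.00039, $0.0197, and $0.0329; with no cache hits they are $0.00075, $0.0216, and $0.0420. Planning on the uncached number is the conservative choice, and the gap between the two runs is the margin buffer caching buys you.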

The variable everyone forgets: variance

The model above gives you a single number. Production gives you a distribution. The PM who ships with a single-number model gets blindsided when:

A power user sends 8x the average requests (it happens — long-tail is real).
A jailbreak attempt inflates input tokens by including a 10,000-token "now ignore all above" preamble.
A prompt-injection user tries to get the model to write a novel (10x output tokens).
A model upgrade silently doubles input prices (it has happened).

You cannot eliminate variance. You can model it.

The cheap fix: add three rows to your spreadsheet. p50_cost_per_user, p95_cost_per_user, p99_cost_per_user. Use 2x p50 for p95 and 4x p50 for p99 if you have no production data. Use real production data the moment you have it.
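The three rows can be sketched as one function that uses the 2x and 4x fallbacks when there is no data and switches to measured percentiles once there is. The function name percentile_rows and the sample data are hypothetical; statistics.quantiles is from the Python standard library and needs at least two data points.

```python
import statistics


def percentile_rows(p50_cost, monthly_costs=None):
    """Return p50/p95/p99 cost per user.

    With no production data, fall back to the 2x / 4x multipliers from the text.
    With data (one monthly cost per user), read the percentiles off the sample.
    """
    if not monthly_costs:
        return {"p50": p50_cost, "p95": 2 * p50_cost, "p99": 4 * p50_cost}
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(monthly_costs, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# No data yet: multipliers only.
print(percentile_rows(0.30))

# Hypothetical per-user monthly costs with a long tail.
sample = [0.20] * 60 + [0.40] * 30 + [1.50] * 8 + [6.00] * 2
print(percentile_rows(0.30, sample))
```

The p50 argument is ignored once real data is passed in, which matches the advice to switch to production numbers as soon as you have them.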

The expensive fix: per-user rate limits and hard cost caps. The PM who does not ship a hard limit on requests_per_user_per_day for an LLM feature is gambling. ShipSet's Captain has a 30-message-per-day cap not because users complain about it but because it bounds the maximum monthly cost per user to a known number.
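The bound a daily cap gives you is simple arithmetic, sketched below. The 30-per-day cap is from the text; the token limits and prices are hypothetical, and the sketch assumes no cache hits so the result is a ceiling. Note that a request cap only bounds cost if the tokens per request are bounded too, which is why the worst case is computed from explicit input and output limits.

```python
def worst_case_cost_per_request(max_input_tokens, max_output_tokens,
                                input_price_per_million, output_price_per_million):
    """Ceiling on one request, assuming every input token misses the cache."""
    return (
        max_input_tokens * input_price_per_million
        + max_output_tokens * output_price_per_million
    ) / 1_000_000


def max_monthly_cost_per_user(daily_request_cap, cost_per_request, days_per_month=31):
    """Ceiling on one user's monthly cost under a hard daily request cap."""
    return daily_request_cap * days_per_month * cost_per_request


# Hypothetical limits: 10,000 input tokens, 2,000 output tokens, $3 / $15 per million.
ceiling_per_request = worst_case_cost_per_request(10_000, 2_000, 3.00, 15.00)
ceiling_per_user = max_monthly_cost_per_user(30, ceiling_per_request)

print(f"worst case per request: ${ceiling_per_request:.2f}")
print(f"worst case per user per month: ${ceiling_per_user:.2f}")
```

With these placeholder limits the ceiling is $0.06 per request and $55.80 per user per month, a number you can put next to the price tier.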

The "what to do at p95" decision

Once you have p50 and p95 numbers, you have a decision:

Scenario	What to do
p95 < 1.5x your gross margin	Ship without caps. You can absorb the variance.
p95 between 1.5x and 3x margin	Add a soft cap (warn user at 80% of monthly limit).
p95 between 3x and 5x margin	Hard cap with degraded experience past the cap (cheaper model, smaller context).
p95 > 5x margin	Re-price OR re-design the feature. Variance will eat you.

Most AI PMs ship in the second or third row and discover they should have been in the third or fourth row when the bill arrives. Build the p95 number, take it seriously.
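The table translates directly into a lookup. In this sketch the function name cap_decision is invented for the example, and the handling of the exact boundary values (1.5x, 3x, 5x) is an assumption, since the table does not say which row they belong to.

```python
def cap_decision(p95_cost_per_user, gross_margin_per_user):
    """Map the ratio of p95 cost to gross margin onto the four rows of the table."""
    if gross_margin_per_user <= 0:
        raise ValueError("gross margin must be positive to compute the ratio")
    ratio = p95_cost_per_user / gross_margin_per_user
    if ratio < 1.5:
        return "Ship without caps"
    if ratio <= 3:   # assumption: exactly 3x falls in the soft-cap row
        return "Add a soft cap"
    if ratio <= 5:   # assumption: exactly 5x falls in the hard-cap row
        return "Hard cap with degraded experience"
    return "Re-price or re-design"


# Hypothetical inputs: p95 cost of $12 per user against $5 of gross margin.
print(cap_decision(12.00, 5.00))
```

A ratio of 2.4x lands in the soft-cap row; rerun it with the p99 number to see how close you are to the next row.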

How to validate before launch

The cost model is theory until you validate it. Three steps to turn it into evidence:

1. Replay production traffic on the new feature, dry-run. Take 1,000 representative requests from a similar existing feature (or generate 1,000 synthetic ones). Run them through the new feature's prompt with a real model call. Measure actual cost. Compare to your model.

2. Run a closed beta with 30-50 real users for 2 weeks. Get the actual requests_per_user_per_month number. PM models almost always underestimate this by 2-3x because users do unexpected things.

3. Project from the beta to the launch population. Apply the beta usage rate to your target users number. This is your launch-day cost. Compare to your business case. If margin is tight, fix it now, not after launch.
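The three steps can be sketched as three small functions. All names and numbers here are hypothetical. The replay function assumes each record is an (input_tokens, output_tokens) pair taken from the usage data your provider returns, and it prices every input token at the full rate, so it ignores caching.

```python
def replay_cost_per_request(usage_records, input_price_per_million, output_price_per_million):
    """Step 1: average measured cost per request across replayed calls."""
    total = sum(
        inp * input_price_per_million + out * output_price_per_million
        for inp, out in usage_records
    ) / 1_000_000
    return total / len(usage_records)


def model_error(modeled_cost, measured_cost):
    """Relative gap between the spreadsheet and the replay (0.5 means 50% over)."""
    return (measured_cost - modeled_cost) / modeled_cost


def project_launch_cost(beta_requests, beta_users, beta_days,
                        cost_per_request, target_users, days_per_month=30):
    """Steps 2 and 3: scale beta usage to a month, then to the launch population."""
    requests_per_user_per_month = beta_requests / beta_users / beta_days * days_per_month
    return requests_per_user_per_month * cost_per_request * target_users


# Hypothetical replay of two requests and a 40-user, 14-day beta.
measured = replay_cost_per_request([(1_000, 100), (3_000, 300)], 3.00, 15.00)
print(f"measured: ${measured:.4f}/request, error vs model: {model_error(0.006, measured):+.0%}")
print(f"projected launch cost: ${project_launch_cost(8_400, 40, 14, measured, 1_000):,.0f}/month")
```

Scaling a two-week beta linearly to a month is itself an assumption; novelty effects can push early usage up or down.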

What to put in the PRD

The cost model section of an AI PRD has four bullets:

Cost model

Per-request: $X (assumption table linked)

Per-user-per-month at p50: $Y / at p95: $Z

Implied margin at the proposed price tier: A%

Per-user rate limit: N requests/day (hard cap)

Four lines. Reviewers can interrogate any of them. The cost model itself lives in a linked spreadsheet so anyone can run sensitivity analysis on the assumptions.
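If the numbers live in a script, the four lines can be generated from it so the PRD never drifts from the model. This is a sketch: prd_cost_section is a made-up helper, the margin is computed at the p50 cost (an assumption, since the template does not say which percentile to use), and the 20-per-day cap is a placeholder. The other inputs reuse the coding agent example.

```python
def prd_cost_section(per_request, p50, p95, price, daily_cap):
    """Render the four-line cost section; margin is taken at the p50 cost."""
    margin_pct = (price - p50) / price * 100
    return "\n".join([
        "Cost model",
        f"Per-request: ${per_request:.3f} (assumption table linked)",
        f"Per-user-per-month at p50: ${p50:.2f} / at p95: ${p95:.2f}",
        f"Implied margin at the proposed price tier: {margin_pct:.0f}%",
        f"Per-user rate limit: {daily_cap} requests/day (hard cap)",
    ])


# Coding agent figures from above, with p95 set to 2x p50 for lack of data.
print(prd_cost_section(per_request=0.041, p50=9.84, p95=19.68, price=19.00, daily_cap=20))
```

At a $19 price and a $9.84 p50 cost, the implied margin line reads 48%.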

The cheat code: ship the cost model alongside the eval suite

Reviewers ask two questions: "how do you know it works" and "what does it cost." If you arrive with both — the eval suite scoring artifact and the cost model with p95 numbers — the launch conversation gets short. The PM who built both is the PM who ships.

This is the difference between an AI PM at the offer stage and one who gets sent to do more interviews. The interviewer asks "tell me about the AI feature you shipped." The candidate with a cost-model habit talks about the unit economics. The interviewer is sold.

TL;DR

The PM cost model that uses a single blended price is wrong in 3 ways: input ≠ output pricing, system prompts are tokens, caching saves 30-50% not 90%.
The 7-variable workbook: input_tokens, output_tokens, input_price, output_price, requests_per_user, cache_hit_rate, target_users.
Count tokens by characters / 4 (English) or characters / 3 (JSON). Being within 15% of the true count is fine for a launch model.
Compute p50 cost, then p95 = 2x p50, p99 = 4x p50 if you have no data. Take p95 seriously.
Cost model decision tree: at p95 > 1.5x margin, you need a soft cap. At p95 > 3x, you need a hard cap. At p95 > 5x, you re-price or re-design.
Validate the model before launch with replay traffic + a closed beta of 30-50 users for 2 weeks. PM estimates of usage are almost always 2-3x low.
The PRD section is 4 lines: per-request cost, per-user p50 / p95, margin %, hard rate limit. Spreadsheet linked.

In ShipSet Lesson 49 ("Build cost validation") you build the cost model for the feature you are shipping and validate it against replay traffic. By Day 90, the cost model is one of the 10 portfolio artifacts. It is the artifact engineering managers ask about most in interviews.

If you are reading this and have not built one: open a spreadsheet, paste the seven variables in column A, and fill in column B for the feature on your desk. The decisions get easier from there.