Evidence & Evaluation Definitions
Canonical definitions used across all ProductBooks evaluators
- Problem–Solution Fit (Stage 1)
- Product–Market Fit (Stage 2)
- Business Model Fit (Stage 3)
Purpose of This Document
This document exists to prevent evaluator drift, protect scoring integrity, and ensure consistent application of evidence standards across all ProductBooks assessments. It establishes what qualifies as evidence and what does not.
- Prevents inconsistent interpretation of evidence across evaluations
- Ensures founders understand what will and will not be scored
- Provides a shared reference for disputes or clarifications
- Does not provide guidance on how to gather evidence or build products
If information does not meet the definitions below, it scores zero.
How to Use This Document
- Read before starting any ProductBooks evaluation
- Refer to when unsure whether something qualifies
- Treat as a classification system, not an instruction manual
- Definitions describe what is, not what should be done
Core Evidence Concepts (All Stages)
4.1 Evidence
Information originating from users that is externally verifiable and demonstrates observable behaviour, stated preference, or measurable outcome.
Qualifies:
- Recorded user interviews with timestamps and participant consent
- Payment receipts or transaction logs
- Product usage metrics tied to individual user sessions
Does not qualify:
- Paraphrased summaries without source material
- Beliefs or assumptions from the team
- Projected future behaviour without historical data
Rule: If it cannot be shown to an evaluator, it is not evidence.
4.2 Direct vs Indirect Evidence
Direct evidence is obtained through first-party interaction with users. Indirect evidence is obtained through third-party sources or secondary research.
Qualifies as direct:
- Interviews conducted by the team
- Product analytics from your own users
- Prototype testing sessions you ran
Qualifies as indirect:
- Published industry reports with cited methodology
- Competitor user reviews from verified platforms
- Government datasets or census information
Rule: Direct evidence is weighted higher than indirect evidence.
4.3 Validation vs Confirmation
Validation is testing a belief against reality. Confirmation is seeking information that supports a pre-existing belief.
Qualifies as validation:
- Open-ended questions that allow users to contradict your hypothesis
- A/B tests showing which option users prefer
- Recording disconfirming evidence alongside confirming evidence
Does not qualify (confirmation bias):
- Leading questions that prime users toward a desired answer
- Selectively sharing only positive feedback
- Discounting negative signals without explanation
Rule: Evidence must be gathered with the intent to falsify, not confirm.
4.4 User
An individual who matches the defined target user profile and has experienced the problem or used the product.
Qualifies:
- Someone who meets the demographic, behavioural, or contextual criteria defined for the target segment
- An individual who has experienced the problem first-hand
Does not qualify:
- Team members, advisors, or investors
- Friends or family unless they independently match the target profile
- Individuals who do not match the defined user criteria
Rule: If the individual does not match the user definition, their input scores zero.
4.5 Signal
Observable user behaviour or measurable outcome that indicates preference, intent, or commitment.
Qualifies:
- User completes a purchase
- User returns to the product without prompting
- User refers another user unprompted
Does not qualify:
- User says they like the product
- User expresses intent to purchase
- User mentions they might refer someone
Rule: Behaviour is signal. Opinion is not.
4.6 Metric vs Signal
A metric is a quantitative measure. A signal is meaningful user behaviour. Metrics without context are not signals.
Qualifies as signal:
- 40% of users return within 7 days without prompting
- 15% of trial users convert to paid
Does not qualify (metric without meaning):
- 500 website visits
- 200 email signups (without context on conversion or engagement)
Rule: A metric becomes a signal when it demonstrates user choice or commitment.
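As an illustration only, the sketch below turns raw session events into the kind of contextualised figure the rubric treats as a signal; the field names, data shape, and 7-day window are assumptions for the example, not rubric requirements.

```python
from datetime import timedelta

# Hypothetical event records: (user_id, timestamp, was_prompted).
# The raw count of events is a metric; it says nothing about user choice.
def unprompted_return_rate(events, window_days=7):
    """Share of users who returned within `window_days` of their first
    session without an email reminder or promotion."""
    first_seen = {}
    returned = set()
    for user_id, ts, was_prompted in sorted(events, key=lambda e: e[1]):
        if user_id not in first_seen:
            first_seen[user_id] = ts
        elif (not was_prompted
              and ts - first_seen[user_id] <= timedelta(days=window_days)):
            returned.add(user_id)
    return len(returned) / len(first_seen) if first_seen else 0.0
```

A figure such as "40% of users return within 7 days without prompting" produced this way carries the context the rubric requires; the underlying visit count alone does not.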
4.7 Estimate
A calculation based on evidence and stated assumptions. Estimates must show their methodology.
Qualifies:
- TAM calculation with cited sources and documented assumptions
- Revenue projection based on observed conversion rates
Does not qualify:
- Projections without stated methodology
- Estimates based on aspirational assumptions
Rule: Estimates are scored based on the quality of underlying evidence, not optimism.
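A minimal sketch of what a shown methodology looks like: every input is a labelled assumption and every figure is invented for illustration, so an evaluator can trace the output back to its sources rather than to optimism.

```python
# Illustrative only: all figures are stated assumptions, not data from this document.
assumptions = {
    "addressable_accounts": 40_000,   # source: cited industry report
    "reachable_share": 0.05,          # assumption: year-one sales capacity
    "trial_to_paid_rate": 0.15,       # observed conversion from own trials
    "annual_price": 1_200,            # current list price
}

reachable = assumptions["addressable_accounts"] * assumptions["reachable_share"]
paying = reachable * assumptions["trial_to_paid_rate"]
projected_revenue = paying * assumptions["annual_price"]

print(f"Projected year-one revenue: {projected_revenue:,.0f} "
      f"from {paying:.0f} paying accounts")
```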
Evidence Weighting (All Stages)
Evidence is weighted in descending order of strength:
1. Actual purchases, payments, or irreversible commitments
2. Observed actions such as return visits, feature usage, or referrals
3. Choices made between real alternatives under constraint
4. Verbal feedback, survey responses, or stated preferences without action
5. Beliefs, projections, or hypotheses without supporting evidence
Weighting applies only after evidence qualifies.
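One way to read the two-step logic, qualification first and weighting second, is the sketch below; the weight values are illustrative placeholders, not ProductBooks constants.

```python
# Illustrative only: weights are placeholders, not rubric constants.
WEIGHTS = {
    "purchase": 1.0,             # payments or irreversible commitments
    "behaviour": 0.8,            # return visits, feature usage, referrals
    "constrained_choice": 0.6,   # choices between real alternatives
    "stated_preference": 0.3,    # verbal feedback or surveys without action
    "belief": 0.0,               # projections or hypotheses without evidence
}

def weighted_score(item):
    """Weighting applies only after an item qualifies as evidence."""
    if not item.get("qualifies"):
        return 0.0               # unqualified information scores zero
    return WEIGHTS.get(item["kind"], 0.0)
```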
Evidence by Stage
Problem–Solution Fit
User interview
A structured conversation with an individual who matches the target user profile, focused on understanding their experience of the problem.
Qualifies:
- Recorded conversation with consent and timestamped transcript
- Interview conducted with someone who independently matches the user definition
Does not qualify:
- Paraphrased summaries without source material
- Conversations with team members, investors, or advisors
Top-3 pain
A user explicitly ranks the problem among their three highest-priority problems in the relevant context.
Qualifies:
- User ranks the problem first, second, or third when asked to prioritise competing problems
Does not qualify:
- User acknowledges the problem exists but does not rank it
- User ranks it fourth or lower
Workaround
Evidence that a user has built or adopted a non-standard solution to address the problem.
Qualifies:
- User demonstrates a custom spreadsheet, script, or manual process they created
- User combines multiple tools in a non-standard way to solve the problem
Does not qualify:
- User says they wish they had a workaround
- User uses a standard tool as intended
Secondary research
Published third-party information that provides context on the problem space or user behaviour.
Qualifies:
- Industry reports with cited methodology
- Government datasets or census information
Does not qualify:
- Blog posts or opinion articles
- Uncited market size claims
Solution exposure
A user has been shown the solution and provided feedback.
Qualifies:
- User has seen a prototype, concept, or mockup and responded
Does not qualify:
- Team members reviewing the solution
- User has not yet been shown the solution
Prototype
A testable representation of the solution that users can interact with.
Qualifies:
- Clickable mockup or functional MVP
- Physical prototype users can handle
Does not qualify:
- Static images or descriptions
- Concept art without interactivity
Concept test
A structured test where users evaluate the solution concept against alternatives or criteria.
Qualifies:
- Controlled comparison where users choose between options
- Scored evaluation against defined criteria
Does not qualify:
- Open-ended feedback without structure
Preference signal
User chooses one option over another under constraint.
Qualifies:
- User selects this solution over an alternative when both are available
Does not qualify:
- User says they prefer it without making a choice
Willingness to pay
User accepts or rejects a specific paid offer.
Qualifies:
- User is presented with a real price and accepts or declines
- Pre-order with payment commitment
Does not qualify:
- User says they would pay without seeing a price
- Email signup or waitlist without payment
Product–Market Fit
Paying user
An individual or organisation that has completed a transaction for the product.
Qualifies:
- User has paid money for the product
- Payment was processed and received
Does not qualify:
- Free trial users
- Users who committed but have not yet paid
Retention
Users return to the product without prompting or incentive.
Qualifies:
- User logs in or uses the product multiple times over a defined period
- Return visits occur without email reminders or promotions
Does not qualify:
- Users return only after being emailed or prompted
- Single-use or one-time access
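A minimal sketch of the retention check, assuming a hypothetical per-user record of sessions tagged with whether the visit was prompted; neither the data shape nor the threshold is part of the rubric.

```python
# Hypothetical per-user session records: list of (timestamp, was_prompted).
def is_retained(sessions, min_uses=2):
    """Retention per the definition above: the product is used multiple
    times over the period, and repeat use is not driven by email
    reminders or promotions."""
    unprompted = [ts for ts, was_prompted in sessions if not was_prompted]
    # Single-use access or prompted-only returns do not qualify.
    return len(unprompted) >= min_uses
```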
Advocacy
Users recommend the product to others without prompting or reward.
Qualifies:
- User refers another user unprompted
- Public testimonial or review posted voluntarily
Does not qualify:
- Referrals incentivised by discounts or rewards
- User says they would recommend but has not done so
Business Model Fit
Revenue
Money received from users in exchange for the product.
Qualifies:
- Payment received and cleared
- Verifiable transaction records
Does not qualify:
- Committed but unpaid invoices
- Grant funding or investment
Founder-assisted revenue
Revenue that required manual intervention or custom work by the founding team.
Qualifies:
- Revenue from users who required hands-on setup or support
Does not qualify:
- Fully self-service revenue with no founder involvement
Repeatability
Revenue generated without founder involvement or custom work.
Qualifies:
- Users sign up and pay through a standardised flow
- Product delivers value without manual intervention
Does not qualify:
- Each user requires custom configuration or support
Capital optionality
The business can sustain or grow without requiring additional external funding.
Qualifies:
- Revenue covers operating costs
- Growth is fundable through retained earnings or debt
Does not qualify:
- Business requires additional equity funding to continue
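As a back-of-envelope check, capital optionality reduces to two questions: does revenue cover operating costs, and is planned growth fundable from retained earnings or debt. The figures below are placeholders for illustration.

```python
# Illustrative check; all figures are placeholders.
monthly_revenue = 32_000
monthly_operating_costs = 27_000
planned_monthly_growth_spend = 4_000
available_debt_facility = 0

retained = monthly_revenue - monthly_operating_costs
covers_costs = retained >= 0
growth_fundable = planned_monthly_growth_spend <= retained + available_debt_facility

capital_optionality = covers_costs and growth_fundable
print(capital_optionality)  # True: no additional equity funding required
```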
Adoption vs Usage (All Stages)
Usage
A user interacts with the product or consumes its output.
Adoption
A user integrates the product into their workflow or behaviour pattern.
ProductBooks weights adoption more heavily.
Common Misinterpretations (Global)
The following do not qualify as evidence and are not scored:
- Effort
- Passion
- Intelligence
- Vision
- Speed of execution
- Quality of storytelling
- "We are early"
- "Everyone liked it"
Scoring Implications (Global)
- Missing evidence = zero
- Partial evidence = partial score
- Strong claims without evidence are penalised
- Disputes do not change the rubric
- The evaluator does not negotiate
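Read as a decision rule, the implications above amount to the sketch below; the penalty factor and the 0-to-1 scale are illustrative assumptions, not the actual rubric.

```python
def score_claim(claim_strength, evidence_coverage):
    """Illustrative scoring rule only.
    claim_strength and evidence_coverage both range from 0.0 to 1.0."""
    if evidence_coverage == 0.0:
        return 0.0                            # missing evidence scores zero
    score = evidence_coverage                 # partial evidence, partial score
    if claim_strength > evidence_coverage:
        # Strong claims without matching evidence are penalised.
        score -= 0.25 * (claim_strength - evidence_coverage)
    return max(score, 0.0)
```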
This document defines how reality is assessed.
It does not define what you should do, build, or believe.
Comfort indicates insufficient rigour.