Evidence & Evaluation Definitions
Canonical definitions used across all ProductBooks evaluators
- Problem–Solution Fit (Stage 1)
- Product–Market Fit (Stage 2)
- Business Model Fit (Stage 3)
Purpose of This Document
This document exists to prevent evaluator drift, protect scoring integrity, and ensure consistent application of evidence standards across all ProductBooks assessments. It establishes what qualifies as evidence and what does not.
- Prevents inconsistent interpretation of evidence across evaluations
- Ensures founders understand what will and will not be scored
- Provides a shared reference for disputes or clarifications
- Does not provide guidance on how to gather evidence or build products
If information does not meet the definitions below, it scores zero.
How to Use This Document
- Read before starting any ProductBooks evaluation
- Refer to when unsure whether something qualifies
- Treat as a classification system, not an instruction manual
- Definitions describe what is, not what should be done
Core Evidence Concepts (All Stages)
4.1 Evidence
Information originating from users that is externally verifiable and demonstrates observable behaviour, stated preference, or measurable outcome.
Qualifies:
- Recorded user interviews with timestamps and participant consent
- Payment receipts or transaction logs
- Product usage metrics tied to individual user sessions
Does not qualify:
- Paraphrased summaries without source material
- Beliefs or assumptions from the team
- Projected future behaviour without historical data
Rule: If it cannot be shown to an evaluator, it is not evidence.
4.2 Direct vs Indirect Evidence
Direct evidence is obtained through first-party interaction with users. Indirect evidence is obtained through third-party sources or secondary research.
Qualifies as direct:
- Interviews conducted by the team
- Product analytics from your own users
- Prototype testing sessions you ran
Qualifies as indirect:
- Published industry reports with cited methodology
- Competitor user reviews from verified platforms
- Government datasets or census information
Rule: Direct evidence is weighted higher than indirect evidence.
4.3 Validation vs Confirmation
Validation is testing a belief against reality. Confirmation is seeking information that supports a pre-existing belief.
Qualifies as validation:
- Open-ended questions that allow users to contradict your hypothesis
- A/B tests showing which option users prefer
- Recording disconfirming evidence alongside confirming evidence
Does not qualify (confirmation bias):
- Leading questions that prime users toward a desired answer
- Selectively sharing only positive feedback
- Discounting negative signals without explanation
Rule: Evidence must be gathered with the intent to falsify, not confirm.
4.4 User
An individual who matches the defined target user profile and has experienced the problem or used the product.
Qualifies:
- Someone who meets the demographic, behavioural, or contextual criteria defined for the target segment
- An individual who has experienced the problem first-hand
Does not qualify:
- Team members, advisors, or investors
- Friends or family unless they independently match the target profile
- Individuals who do not match the defined user criteria
Rule: If the individual does not match the user definition, their input scores zero.
4.5 Signal
Observable user behaviour or measurable outcome that indicates preference, intent, or commitment.
Qualifies:
- User completes a purchase
- User returns to the product without prompting
- User refers another user unprompted
Does not qualify:
- User says they like the product
- User expresses intent to purchase
- User mentions they might refer someone
Rule: Behaviour is signal. Opinion is not.
4.6 Metric vs Signal
A metric is a quantitative measure. A signal is meaningful user behaviour. Metrics without context are not signals.
Qualifies as signal:
- 40% of users return within 7 days without prompting
- 15% of trial users convert to paid
Does not qualify (metric without meaning):
- 500 website visits
- 200 email signups (without context on conversion or engagement)
Rule: A metric becomes a signal when it demonstrates user choice or commitment.
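As an illustration only, the sketch below turns raw session events into the kind of contextualised figure the rubric treats as a signal; the field names, data shape, and 7-day window are assumptions for the example, not rubric requirements.

```python
from datetime import timedelta

# Hypothetical event records: (user_id, timestamp, was_prompted).
# The raw count of events is a metric; it says nothing about user choice.
def unprompted_return_rate(events, window_days=7):
    """Share of users who returned within `window_days` of their first
    session without an email reminder or promotion."""
    first_seen = {}
    returned = set()
    for user_id, ts, was_prompted in sorted(events, key=lambda e: e[1]):
        if user_id not in first_seen:
            first_seen[user_id] = ts
        elif (not was_prompted
              and ts - first_seen[user_id] <= timedelta(days=window_days)):
            returned.add(user_id)
    return len(returned) / len(first_seen) if first_seen else 0.0
```

A figure such as "40% of users return within 7 days without prompting" produced this way carries the context the rubric requires; the underlying visit count alone does not.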
4.7 Estimate
A calculation based on evidence and stated assumptions. Estimates must show their methodology.
Qualifies:
- TAM calculation with cited sources and documented assumptions
- Revenue projection based on observed conversion rates
Does not qualify:
- Projections without stated methodology
- Estimates based on aspirational assumptions
Rule: Estimates are scored based on the quality of underlying evidence, not optimism.
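A minimal sketch of what a shown methodology looks like: every input is a labelled assumption and every figure is invented for illustration, so an evaluator can trace the output back to its sources rather than to optimism.

```python
# Illustrative only: all figures are stated assumptions, not data from this document.
assumptions = {
    "addressable_accounts": 40_000,   # source: cited industry report
    "reachable_share": 0.05,          # assumption: year-one sales capacity
    "trial_to_paid_rate": 0.15,       # observed conversion from own trials
    "annual_price": 1_200,            # current list price
}

reachable = assumptions["addressable_accounts"] * assumptions["reachable_share"]
paying = reachable * assumptions["trial_to_paid_rate"]
projected_revenue = paying * assumptions["annual_price"]

print(f"Projected year-one revenue: {projected_revenue:,.0f} "
      f"from {paying:.0f} paying accounts")
```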
Evidence Weighting (All Stages)
Evidence is weighted in descending order of strength:
1. Actual purchases, payments, or irreversible commitments
2. Observed actions such as return visits, feature usage, or referrals
3. Choices made between real alternatives under constraint
4. Verbal feedback, survey responses, or stated preferences without action
5. Beliefs, projections, or hypotheses without supporting evidence
Weighting applies only after evidence qualifies.
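One way to read the two-step logic, qualification first and weighting second, is the sketch below; the weight values are illustrative placeholders, not ProductBooks constants.

```python
# Illustrative only: weights are placeholders, not rubric constants.
WEIGHTS = {
    "purchase": 1.0,             # payments or irreversible commitments
    "behaviour": 0.8,            # return visits, feature usage, referrals
    "constrained_choice": 0.6,   # choices between real alternatives
    "stated_preference": 0.3,    # verbal feedback or surveys without action
    "belief": 0.0,               # projections or hypotheses without evidence
}

def weighted_score(item):
    """Weighting applies only after an item qualifies as evidence."""
    if not item.get("qualifies"):
        return 0.0               # unqualified information scores zero
    return WEIGHTS.get(item["kind"], 0.0)
```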
Evidence by Stage
Problem–Solution Fit
User interview
A structured conversation with an individual who matches the target user profile, focused on understanding their experience of the problem.
Qualifies:
- Recorded conversation with consent and timestamped transcript
- Interview conducted with someone who independently matches the user definition
Does not qualify:
- Paraphrased summaries without source material
- Conversations with team members, investors, or advisors
Top-3 pain
A user explicitly ranks the problem among their three highest-priority problems in the relevant context.
Qualifies:
- User ranks the problem first, second, or third when asked to prioritise competing problems
Does not qualify:
- User acknowledges the problem exists but does not rank it
- User ranks it fourth or lower
Workaround
Evidence that a user has built or adopted a non-standard solution to address the problem.
Qualifies:
- User demonstrates a custom spreadsheet, script, or manual process they created
- User combines multiple tools in a non-standard way to solve the problem
Does not qualify:
- User says they wish they had a workaround
- User uses a standard tool as intended
Secondary research
Published third-party information that provides context on the problem space or user behaviour.
Qualifies:
- Industry reports with cited methodology
- Government datasets or census information
Does not qualify:
- Blog posts or opinion articles
- Uncited market size claims
Solution exposure
A user has been shown the solution and provided feedback.
Qualifies:
- User has seen a prototype, concept, or mockup and responded
Does not qualify:
- Team members reviewing the solution
- User has not yet been shown the solution
Prototype
A testable representation of the solution that users can interact with.
Qualifies:
- Clickable mockup or functional MVP
- Physical prototype users can handle
Does not qualify:
- Static images or descriptions
- Concept art without interactivity
Concept test
A structured test where users evaluate the solution concept against alternatives or criteria.
Qualifies:
- Controlled comparison where users choose between options
- Scored evaluation against defined criteria
Does not qualify:
- Open-ended feedback without structure
Preference signal
User chooses one option over another under constraint.
Qualifies:
- User selects this solution over an alternative when both are available
Does not qualify:
- User says they prefer it without making a choice
Willingness to pay
User accepts or rejects a specific paid offer.
Qualifies:
- User is presented with a real price and accepts or declines
- Pre-order with payment commitment
Does not qualify:
- User says they would pay without seeing a price
- Email signup or waitlist without payment
Product–Market Fit
Paying user
An individual or organisation that has completed a transaction for the product.
Qualifies:
- User has paid money for the product
- Payment was processed and received
Does not qualify:
- Free trial users
- Users who committed but have not yet paid
Retention
Users return to the product without prompting or incentive.
Qualifies:
- User logs in or uses the product multiple times over a defined period
- Return visits occur without email reminders or promotions
Does not qualify:
- Users return only after being emailed or prompted
- Single-use or one-time access
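A minimal sketch of the retention check, assuming a hypothetical per-user record of sessions tagged with whether the visit was prompted; neither the data shape nor the threshold is part of the rubric.

```python
# Hypothetical per-user session records: list of (timestamp, was_prompted).
def is_retained(sessions, min_uses=2):
    """Retention per the definition above: the product is used multiple
    times over the period, and repeat use is not driven by email
    reminders or promotions."""
    unprompted = [ts for ts, was_prompted in sessions if not was_prompted]
    # Single-use access or prompted-only returns do not qualify.
    return len(unprompted) >= min_uses
```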
Advocacy
Users recommend the product to others without prompting or reward.
Qualifies:
- User refers another user unprompted
- Public testimonial or review posted voluntarily
Does not qualify:
- Referrals incentivised by discounts or rewards
- User says they would recommend but has not done so
Business Model Fit
Revenue
Money received from users in exchange for the product.
Qualifies:
- Payment received and cleared
- Verifiable transaction records
Does not qualify:
- Committed but unpaid invoices
- Grant funding or investment
Founder-assisted revenue
Revenue that required manual intervention or custom work by the founding team.
Qualifies:
- Revenue from users who required hands-on setup or support
Does not qualify:
- Fully self-service revenue with no founder involvement
Repeatability
Revenue generated without founder involvement or custom work.
Qualifies:
- Users sign up and pay through a standardised flow
- Product delivers value without manual intervention
Does not qualify:
- Each user requires custom configuration or support
Capital optionality
The business can sustain or grow without requiring additional external funding.
Qualifies:
- Revenue covers operating costs
- Growth is fundable through retained earnings or debt
Does not qualify:
- Business requires additional equity funding to continue
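As a back-of-envelope check, capital optionality reduces to two questions: does revenue cover operating costs, and is planned growth fundable from retained earnings or debt. The figures below are placeholders for illustration.

```python
# Illustrative check; all figures are placeholders.
monthly_revenue = 32_000
monthly_operating_costs = 27_000
planned_monthly_growth_spend = 4_000
available_debt_facility = 0

retained = monthly_revenue - monthly_operating_costs
covers_costs = retained >= 0
growth_fundable = planned_monthly_growth_spend <= retained + available_debt_facility

capital_optionality = covers_costs and growth_fundable
print(capital_optionality)  # True: no additional equity funding required
```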
Adoption vs Usage (All Stages)
Usage
A user interacts with the product or consumes its output.
Adoption
A user integrates the product into their workflow or behaviour pattern.
ProductBooks weights adoption more heavily.
Common Misinterpretations (Global)
The following do not qualify as evidence and are not scored:
- Effort
- Passion
- Intelligence
- Vision
- Speed of execution
- Quality of storytelling
- "We are early"
- "Everyone liked it"
Scoring Implications (Global)
- Missing evidence = zero
- Partial evidence = partial score
- Strong claims without evidence are penalised
- Disputes do not change the rubric
- The evaluator does not negotiate
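Read as a decision rule, the implications above amount to the sketch below; the penalty factor and the 0-to-1 scale are illustrative assumptions, not the actual rubric.

```python
def score_claim(claim_strength, evidence_coverage):
    """Illustrative scoring rule only.
    claim_strength and evidence_coverage both range from 0.0 to 1.0."""
    if evidence_coverage == 0.0:
        return 0.0                            # missing evidence scores zero
    score = evidence_coverage                 # partial evidence, partial score
    if claim_strength > evidence_coverage:
        # Strong claims without matching evidence are penalised.
        score -= 0.25 * (claim_strength - evidence_coverage)
    return max(score, 0.0)
```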
This document defines how reality is assessed.
It does not define what you should do, build, or believe.
Comfort indicates insufficient rigour.