Methodology · PolicyLogic

Elected Officials Methodology

How PolicyLogic scores governors, mayors, senators, and representatives on their campaign and inaugural commitments. Version 2.0 — operative standard. All prior versions superseded.

What Counts as a Promise

A statement qualifies as a trackable promise if it meets all three conditions: it is attributable directly to the official (not a surrogate) in a campaign or inaugural context; it is forward-looking (an expressed intention, not a description of existing policy); and it is verifiable — it has at least one condition that can in principle be confirmed true or false.

Statements of value without action components ("I believe in education") are excluded. Promises contingent on federal action outside the official's jurisdiction are excluded. Where a promise appears at multiple specificity levels, the more specific version is retained.

There is no cap on promise count. All qualifying promises are scored. Capping promise count introduces editorial bias — selecting 6 from 30 means 24 are excluded on undisclosed grounds. Every scorecard displays a mandatory disclosure: N promises tracked · estimated M identified · X excluded for non-verifiability.

Promise Types

PolicyLogic recognizes three promise types, each with its own delivery scoring track. All three feed into the same D0–D4 delivery buckets — the distinction is in how the bucket is determined.

Type	Definition	How Delivery Is Measured
Quantitative	Promise includes a numeric target or measurable threshold.	Percentage of stated goal achieved. D4 = 70% or more, D3 = 40% to under 70%, D2 = 10% to under 40%, D1 = under 10% with documented action, D0 = no meaningful action.
Qualitative	Promise specifies an action or outcome without a numeric target (pass a bill, make an appointment, launch a program).	Action milestone ladder: D0 = no action · D1 = public commitment only · D2 = formal action initiated (bill introduced, passed committee) · D3 = advanced past the decisive hurdle (passed a full chamber, nominee confirmed, program launched) · D4 = fully delivered.
Negative	Promise to prevent, avoid, or not do something ("I will not raise taxes").	Inverted criteria, scored binary. D4 = condition fully avoided through the term. D0 = condition occurred. Intermediate buckets (D1–D3) do not apply to negative promises — a thing is either avoided or not. The single exception: any mid-term redefinition of the condition triggers an automatic Redefined flag and caps delivery at D2.
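
The quantitative thresholds in the table above can be sketched as a small lookup. This is an illustrative sketch, not PolicyLogic's implementation: the function name `quantitative_bucket` and the `documented_action` parameter are hypothetical, and lower bounds are treated as inclusive.

```python
def quantitative_bucket(pct_achieved: float, documented_action: bool = True) -> str:
    """Map the percentage of a numeric goal achieved to a D0-D4 bucket.

    Hypothetical helper. Lower bounds are inclusive, so exactly 70%
    is D4 and exactly 40% is D3.
    """
    if pct_achieved >= 70:
        return "D4"
    if pct_achieved >= 40:
        return "D3"
    if pct_achieved >= 10:
        return "D2"
    # Under 10%: D1 only when some action is documented, otherwise D0.
    return "D1" if documented_action else "D0"


print(quantitative_bucket(55))                          # D3
print(quantitative_bucket(4, documented_action=False))  # D0
```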

Scoring Framework

Each promise is scored on three independent axes. The axes combine into a Promise Score. Promise scores are aggregated into an overall grade. Maximum per promise: 25 points.

Axis 1 — Delivery (max 12 points)

Delivery measures what the official actually accomplished relative to what was promised. It is the primary axis and carries the most weight in the final grade.

Code	Label	Description	Pts
D4	Delivered	70%+ of quantitative goal achieved, or qualitative promise fully delivered.	12
D3	Substantially Delivered	40% to under 70% achieved, or qualitative promise advanced past the decisive hurdle (D3 on the ladder) with documented evidence.	9
D2	Partial	10% to under 40% achieved, or qualitative promise at formal-action stage (D2 on the ladder). Concrete actions taken without full outcome.	6
D1	Minimal	Under 10% of a quantitative goal with documented action, or a qualitative promise at public-commitment-only stage (D1 on the ladder). No enacted policy or measurable outcome yet.	3
D0	Not Delivered	No meaningful action taken, promise abandoned, or condition of a negative promise occurred.	0

Their Role modifier (0.0–1.0): The official's direct causal contribution to the outcome is assessed and applied to Delivery points before aggregation.

Adjusted Delivery = Delivery Points × Their Role

See the Their Role lookup table below for anchor values by office-action combination.

Axis 2 — Difficulty (max 5 points, earned proportionally)

Difficulty rewards ambition — but only when delivery occurs. Zero delivery earns zero Difficulty points regardless of how ambitious the promise was.

Code	Label	Description	Max Pts
H3	Structural	Multi-year initiative requiring coalition-building, constitutional change, or federal coordination.	5
H2	Legislative	Requires passage of legislation, significant political capital, or cross-chamber negotiation.	3
H1	Executive	Achievable through executive order, budget allocation, appointment, or regulatory change within direct authority.	1

Difficulty Earned = Bucket Points × (Delivery Points / 12)

Example: H3 promise with D2 delivery = 5 × (6/12) = 2.5 pts · H3 with D0 = 0 pts
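
A minimal sketch of the Difficulty Earned formula, assuming the unadjusted Delivery Points (before the Their Role modifier) are the input, as the formula states. The dictionary and function names are hypothetical.

```python
# Point values taken from the Delivery and Difficulty tables.
DELIVERY_POINTS = {"D0": 0, "D1": 3, "D2": 6, "D3": 9, "D4": 12}
DIFFICULTY_MAX = {"H1": 1, "H2": 3, "H3": 5}


def difficulty_earned(difficulty: str, delivery: str) -> float:
    """Difficulty Earned = Bucket Points x (Delivery Points / 12)."""
    return DIFFICULTY_MAX[difficulty] * DELIVERY_POINTS[delivery] / 12


print(difficulty_earned("H3", "D2"))  # 2.5
print(difficulty_earned("H3", "D0"))  # 0.0
```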

Axis 3 — Impact (max 8 points)

Impact measures what was at stake. It is the sum of Scale points and Magnitude points, each worth up to 4. Delivery does not modify it, so a failed promise on a critical issue still carries high impact stakes. This is intentional: if you promise something that matters and don't deliver, the grade reflects how much it mattered.

Scale	Description	Pts	Magnitude	Description	Pts
S3	Systemic / Statewide	4	M3	Transformative	4
S2	Regional / Citywide	2	M2	Significant	2
S1	Neighborhood / Segment	1	M1	Minor / Symbolic	1

Specificity Cap (Clarity Rule): If a promise has a Clarity score of 2 (see Clarity Scale below), Magnitude is capped at M1. A vague promise cannot be awarded high magnitude because its intended scope was never defined.
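
The Impact axis with the Clarity Rule applied can be sketched as follows. The sketch assumes Impact is Scale points plus Magnitude points, which matches the 8-point maximum. The function and dictionary names are hypothetical.

```python
# Point values taken from the Scale and Magnitude columns of the Impact table.
SCALE_POINTS = {"S1": 1, "S2": 2, "S3": 4}
MAGNITUDE_POINTS = {"M1": 1, "M2": 2, "M3": 4}


def impact_points(scale: str, magnitude: str, clarity: int) -> int:
    """Return Scale + Magnitude points, capping Magnitude at M1 when Clarity is 2."""
    if clarity == 2:
        magnitude = "M1"  # Specificity Cap: vague promises cannot earn M2 or M3
    return SCALE_POINTS[scale] + MAGNITUDE_POINTS[magnitude]


print(impact_points("S3", "M3", clarity=5))  # 8
print(impact_points("S3", "M3", clarity=2))  # 5
```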

Promise Score

Promise Score = Adjusted Delivery + Difficulty Earned + Impact

Maximum per promise: 12 (Delivery) + 5 (Difficulty) + 8 (Impact) = 25 points

Grade Calculation

The final grade is calculated from two ratios combined in a fixed 60/40 split. This split is published on every scorecard and does not vary by official, party, or jurisdiction.

Delivery Ratio = Sum of Adjusted Delivery Points ÷ (N promises × 12)

Promise Score Ratio = Sum of Promise Scores ÷ (N promises × 25)

Grade Input = (Delivery Ratio × 0.60) + (Promise Score Ratio × 0.40)

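The 60/40 weighting ensures raw delivery dominates, while the 40% Promise Score component limits the payoff from easy promises. An official who promises only H1, S1, M1 items and fully delivers every one with a Their Role of 1.0 scores 15 of 25 on each (12 + 1 + 2), for a Grade Input of (1.0 × 0.60) + (0.60 × 0.40) = 84%, a B. An official cannot reach an A by promising only easy, low-impact things and delivering them all.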

Boundary convention: all thresholds are inclusive of their lower bound. A band stated as 70–84% means 70.0% up to but not including 85%. The same convention applies to every range in this methodology — Delivery percentages, Difficulty bands, and Time Pressure cutoffs — so no value falls into two buckets.

A+	90%+	Exceptional delivery on ambitious, high-impact promises across multiple domains.
A	85–89%	Strong delivery across most domains with documented outcomes.
B	70–84%	Above-average to solid delivery; meaningful gaps remain.
C	50–69%	Mixed to marginal delivery; structural or political barriers evident.
D	30–49%	Poor to very poor delivery; few concrete outcomes.
F	Below 30%	No meaningful delivery on tracked promises.
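
The grade calculation can be sketched in a few lines. This is an illustration under stated assumptions: `ScoredPromise`, `grade_input`, and `letter_grade` are hypothetical names, the inputs are already-computed per-promise values, and the bands use inclusive lower bounds per the boundary convention.

```python
from dataclasses import dataclass


@dataclass
class ScoredPromise:
    adjusted_delivery: float  # Delivery Points x Their Role, 0 to 12
    promise_score: float      # Adjusted Delivery + Difficulty Earned + Impact, 0 to 25


# (inclusive lower bound, letter), checked from the top band down.
GRADE_BANDS = [(0.90, "A+"), (0.85, "A"), (0.70, "B"), (0.50, "C"), (0.30, "D")]


def grade_input(promises: list[ScoredPromise]) -> float:
    """Combine the Delivery Ratio and Promise Score Ratio in a fixed 60/40 split."""
    n = len(promises)
    if n == 0:
        raise ValueError("at least one scored promise is required")
    delivery_ratio = sum(p.adjusted_delivery for p in promises) / (n * 12)
    score_ratio = sum(p.promise_score for p in promises) / (n * 25)
    return 0.60 * delivery_ratio + 0.40 * score_ratio


def letter_grade(value: float) -> str:
    """Map a Grade Input between 0 and 1 to a letter grade."""
    for lower, letter in GRADE_BANDS:
        if value >= lower:
            return letter
    return "F"


# Two made-up promises: one strong (12 adjusted, 21 total), one weak (3.6, 9.1).
record = [ScoredPromise(12, 21), ScoredPromise(3.6, 9.1)]
print(letter_grade(grade_input(record)))  # C
```

For this made-up record the Delivery Ratio is 0.65 and the Promise Score Ratio is 0.602, so the Grade Input is about 0.63.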

Worked Example

Worked Example · Housing Promise · Senator (Qualitative, H2)

"I will pass a tenant protection bill in the first session."

Field	Value
Promise Type	Qualitative
Delivery Bucket	D2: bill introduced, passed committee, stalled on floor (formal action initiated)
Delivery Points	6
Their Role	0.6 (advocated, dependent on legislature)
Adjusted Delivery	6 × 0.6 = 3.6 pts
Difficulty	H2 (Legislative) = 3 max pts
Difficulty Earned	3 × (6/12) = 1.5 pts
Impact	S2 + M2 = 2 + 2 = 4 pts
Promise Score	3.6 + 1.5 + 4 = 9.1 / 25
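
The arithmetic in this worked example can be reproduced with a short script. It is a sketch for checking the numbers, not an official scoring tool. The function name `promise_score` is hypothetical, and the point tables are copied from the scoring framework.

```python
DELIVERY_POINTS = {"D0": 0, "D1": 3, "D2": 6, "D3": 9, "D4": 12}
DIFFICULTY_MAX = {"H1": 1, "H2": 3, "H3": 5}
SCALE_POINTS = {"S1": 1, "S2": 2, "S3": 4}
MAGNITUDE_POINTS = {"M1": 1, "M2": 2, "M3": 4}


def promise_score(delivery: str, their_role: float, difficulty: str,
                  scale: str, magnitude: str) -> float:
    """Promise Score = Adjusted Delivery + Difficulty Earned + Impact."""
    base = DELIVERY_POINTS[delivery]
    adjusted_delivery = base * their_role
    # Difficulty is scaled by unadjusted Delivery Points, as in the example.
    earned = DIFFICULTY_MAX[difficulty] * base / 12
    impact = SCALE_POINTS[scale] + MAGNITUDE_POINTS[magnitude]
    # Round to two decimals to hide floating-point noise.
    return round(adjusted_delivery + earned + impact, 2)


# Tenant protection bill: D2, Their Role 0.6, H2, S2 + M2.
print(promise_score("D2", 0.6, "H2", "S2", "M2"))  # 9.1
```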

Their Role — Lookup Table

The Their Role modifier is the single most judgment-dependent element of the methodology. The following anchor values reduce inconsistency across AI runs and human reviewers.

Score	Situation	Common Examples
1.0	Sole or near-sole authority. Official acted unilaterally within clear constitutional or statutory power.	Governor signing an executive order; mayor appointing a department head.
0.8	Official championed, negotiated, and signed legislation. Legislature was a necessary co-actor but official drove the outcome.	Governor securing and signing a major budget deal; senator authoring and passing a bill with party in majority.
0.6	Official advocated consistently but was dependent on others who were not fully aligned.	Senator in majority facing moderate opposition within own caucus; governor working with a split legislature.
0.4	Supporting or facilitative role. Outcome primarily driven by other actors, market forces, or federal policy.	Mayor benefiting from a federal infrastructure grant they applied for but did not design.
0.2	Minimal causal influence. Outcome largely driven by forces entirely outside the official's sphere.	Economic improvement during a governor's term primarily driven by national trends.
0.0	No meaningful causal connection. Official's actions were irrelevant to the outcome.	Official claims credit for a federal policy they had no role in; outcome occurred despite official's opposition.

Clarity Scale

Clarity measures how specifically a promise was stated at the time it was made. It is assessed as of the original statement — it cannot be improved retroactively by subsequent clarifications.

The scale begins at 2. Pure values statements with no action component ("I believe in stronger communities") carry no verifiable condition and are screened out at qualification under What Counts as a Promise — they never enter the scored set, so there is no Clarity score of 1.

Score	Label	Definition & Example
2	Directional, no specifics	General intent stated, no mechanism, target, or timeframe. Magnitude capped at M1. "We will improve public safety."
3	Specific policy named	A named policy, program, or bill is referenced. "I will pass a tenant protection bill."
4	Specific + conditions	Named policy plus timeframe, jurisdiction, or population. "I will pass a tenant protection bill in the first year."
5	Specific + measurable target	Named policy with a quantified, verifiable outcome. "I will reduce violent crime 20% by 2026."

Time Pressure Adjustment

Every promise is assigned a timeline bucket based on the official's stated or implied delivery window. Delivery scores are adjusted for whether a promise is early, on-track, or overdue.

Condition	Status	Effect
Time Pressure < 0.5	Early	Delivery weight = Time Pressure × 2. Grade is provisional. Displayed with clock indicator.
Time Pressure 0.5–1.0	On Track	Full delivery weight applied. No adjustment.
Time Pressure 1.0–1.25	Overdue (mild)	10% delivery score reduction.
Time Pressure 1.25–1.5	Overdue (moderate)	20% delivery score reduction.
Time Pressure ≥ 1.5	Overdue (significant)	Reduction continues at 10% per 0.25 over 1.0 (30% from 1.5, 40% from 1.75), capped at 40%.
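
One way to read the table above is as a single multiplier on the delivery score. The sketch below makes several assumptions that the table does not spell out: Time Pressure is a non-negative ratio, reductions are applied multiplicatively, lower bounds are inclusive (so exactly 1.5 lands in the significant band), and the overdue reduction steps up by 10% at each 0.25 boundary from 1.0. The function name `delivery_weight` is hypothetical.

```python
import math


def delivery_weight(time_pressure: float) -> float:
    """Return a multiplier for the delivery score under the assumed reading."""
    if time_pressure < 0:
        raise ValueError("time_pressure must be non-negative")
    if time_pressure < 0.5:
        return time_pressure * 2  # Early: provisional, reduced weight
    if time_pressure < 1.0:
        return 1.0                # On Track: no adjustment
    # Overdue: 10% in [1.0, 1.25), 20% in [1.25, 1.5), and so on, capped at 40%.
    steps = math.floor((time_pressure - 1.0) / 0.25) + 1
    reduction = min(0.10 * steps, 0.40)
    return 1.0 - reduction


for tp in (0.25, 0.75, 1.1, 1.3, 2.0):
    print(tp, round(delivery_weight(tp), 2))
```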

Behavioral Flags

Behavioral flags override or cap standard scoring when the official actively distorted the promise. Each flag requires a written rationale in the scorecard JSON.

When more than one flag applies to the same promise, a cap (maximum) always takes precedence over a floor (minimum), and the lowest applicable cap controls. Example: if Externally Blocked sets a D2 floor and Scope Reduced caps Magnitude while Redefined caps delivery at D2, the delivery score resolves to D2 — the cap binds. A flag that sets the score to a fixed value (Reversed = D0) overrides both caps and floors.
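
The precedence rule can be sketched as a small resolver. This is an illustrative sketch: the function name `resolve_delivery` and its parameters are hypothetical, and the caller is assumed to pass only the caps, floors, and fixed value that actually apply to the promise.

```python
from typing import Optional, Sequence

ORDER = ["D0", "D1", "D2", "D3", "D4"]  # lowest to highest delivery bucket


def resolve_delivery(base: str,
                     caps: Sequence[str] = (),
                     floors: Sequence[str] = (),
                     fixed: Optional[str] = None) -> str:
    """Resolve a delivery bucket when several behavioral flags apply."""
    if fixed is not None:
        return fixed  # a fixed-value flag (Reversed = D0) overrides caps and floors
    level = ORDER.index(base)
    if floors:
        level = max(level, max(ORDER.index(f) for f in floors))
    if caps:
        # Caps are applied last, so the lowest cap wins over any floor.
        level = min(level, min(ORDER.index(c) for c in caps))
    return ORDER[level]


# Externally Blocked floor of D2 together with a Redefined cap of D2.
print(resolve_delivery("D1", caps=["D2"], floors=["D2"]))  # D2
```

Applying floors before caps is what makes the cap bind whenever the two conflict.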

Flag	Effect	Trigger
Reversed	Score = D0 regardless	Official explicitly reversed or repealed a previously promised policy.
Redefined	Outcome capped at D2	Definition of success materially changed mid-term, or condition of a negative promise redefined.
Externally Blocked	D2 minimum if actions taken	Promise failed due to documented external intervention meeting the Changed-Circumstances Test.
Credit Overclaimed	Their Role capped at 0.4	Official claimed credit for outcomes driven by prior administration or federal action.
Deadline Shifted	Time Pressure cap relief removed	Timeline extended without explanation after original deadline passed.
Scope Reduced	Magnitude capped at M1	Promise delivered in significantly diminished form without acknowledgment.

Transparency Flags

Transparency flags are applied at the scorecard level and communicate limitations to readers without modifying the grade. They appear as labeled badges on the scorecard.

Flag	Meaning
Contested	A classification is disputed by the official or a credible third party. Evidence and rationale are documented.
Limited Evidence	Fewer sources than the standard minimum. Judgment applied. Grade should not be cited as definitive.
AI Draft	Scorecard is AI-generated and pending human review.
Under Review	Classification actively being re-evaluated due to new evidence.
Low Promise Count	Fewer than five promises scored. Grade may not reflect the full record.
Mid-Term Departure	Official left office before term ended. Scoring rules for departure cases documented in full methodology.

Data Sources

Tier 1 — Primary: Official government records, signed legislation, executive orders, budget documents, Federal Register publications. Required for D3 or D4 delivery scores.
Tier 2 — Authoritative: Major news organizations with named sources, academic and policy research, nonpartisan watchdog reports (CBO, GAO, CRS). Acceptable for confirmation; not sufficient alone for contested outcomes.
Tier 3 — Supporting: Other credible reporting, official statements, press releases. Requires corroboration.
Inadmissible: Anonymous social media, partisan advocacy material, press releases from the official's own office as sole support for a positive outcome.