Nonpartisan Government Accountability
PolicyLogic
How We Apply Our Methodology
INDEPENDENT & TRANSPARENT METHODOLOGY
Elected Officials Methodology
How PolicyLogic scores governors, mayors, senators, and representatives on their campaign and inaugural commitments. Version 2.0 — operative standard. All prior versions superseded.
All scorecards are AI-generated drafts pending human review. Scores are generated using Claude (Anthropic) and reflect public records as of the research date. The grade formula is computed deterministically from AI-assigned bucket values — the AI does not calculate grades. Every scorecard carries an AI Draft flag until verified by a human reviewer.

What Counts as a Promise

A statement qualifies as a trackable promise if it meets all three conditions: it is attributable directly to the official (not a surrogate) in a campaign or inaugural context; it is forward-looking (an expressed intention, not a description of existing policy); and it is verifiable — it has at least one condition that can in principle be confirmed true or false.

Statements of value without action components ("I believe in education") are excluded. Promises contingent on federal action outside the official's jurisdiction are excluded. Where a promise appears at multiple specificity levels, the more specific version is retained.

There is no cap on promise count. All qualifying promises are scored. Capping promise count introduces editorial bias — selecting 6 from 30 means 24 are excluded on undisclosed grounds. Every scorecard displays a mandatory disclosure: N promises tracked · estimated M identified · X excluded for non-verifiability.

Promise Types

PolicyLogic recognizes three promise types, each with its own delivery scoring track. All three feed into the same D0–D4 delivery buckets — the distinction is in how the bucket is determined.

Type | Definition | How Delivery Is Measured
Quantitative | Promise includes a numeric target or measurable threshold. | Percentage of stated goal achieved. D4 = 70% or more, D3 = 40% to under 70%, D2 = 10% to under 40%, D1 = under 10% with documented action, D0 = no meaningful action.
Qualitative | Promise specifies an action or outcome without a numeric target (pass a bill, make an appointment, launch a program). | Action milestone ladder: D0 = no action · D1 = public commitment only · D2 = formal action initiated (bill introduced, passed committee) · D3 = advanced past the decisive hurdle (passed a full chamber, nominee confirmed, program launched) · D4 = fully delivered.
Negative | Promise to prevent, avoid, or not do something ("I will not raise taxes"). | Inverted criteria, scored binary. D4 = condition fully avoided through the term. D0 = condition occurred. Intermediate buckets (D1–D3) do not apply to negative promises: a thing is either avoided or not. The single exception: any mid-term redefinition of the condition triggers an automatic Redefined flag and caps delivery at D2.
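
The quantitative track above can be sketched as a simple threshold lookup. This is a minimal illustration, not PolicyLogic's actual code: the function name and the `documented_action` parameter are hypothetical, and lower bounds are treated as inclusive.

```python
def quantitative_bucket(pct_achieved: float, documented_action: bool) -> str:
    """Map the percentage of a numeric goal achieved to a D0-D4 bucket.

    `documented_action` is a hypothetical flag: True when at least some
    meaningful action toward the goal is on record.
    """
    if pct_achieved >= 70:
        return "D4"
    if pct_achieved >= 40:
        return "D3"
    if pct_achieved >= 10:
        return "D2"
    # Under 10%: D1 requires documented action, otherwise D0.
    return "D1" if documented_action else "D0"


print(quantitative_bucket(55, True))   # D3
print(quantitative_bucket(4, False))   # D0
```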

Scoring Framework

Each promise is scored on three axes: Delivery, Difficulty, and Impact. The bucket on each axis is assigned separately; the resulting points combine into a Promise Score, and Promise Scores are aggregated into an overall grade. Maximum per promise: 25 points.

Axis 1 — Delivery (max 12 points)

Delivery measures what the official actually accomplished relative to what was promised. It is the primary axis and carries the most weight in the final grade.

Code | Label | Description | Pts
D4 | Delivered | 70%+ of quantitative goal achieved, or qualitative promise fully delivered. | 12
D3 | Substantially Delivered | 40% to under 70% achieved, or qualitative promise advanced past the decisive hurdle (D3 on the ladder) with documented evidence. | 9
D2 | Partial | 10% to under 40% achieved, or qualitative promise at formal-action stage (D2 on the ladder). Concrete actions taken without full outcome. | 6
D1 | Minimal | Under 10% of a quantitative goal with documented action, or a qualitative promise at public-commitment-only stage (D1 on the ladder). No enacted policy or measurable outcome yet. | 3
D0 | Not Delivered | No meaningful action taken, promise abandoned, or condition of a negative promise occurred. | 0

Their Role modifier (0.0–1.0): The official's direct causal contribution to the outcome is assessed and applied to Delivery points before aggregation.

Adjusted Delivery = Delivery Points × Their Role

See the Their Role lookup table below for anchor values by office-action combination.
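
A minimal sketch of the Adjusted Delivery step, assuming the point values from the Delivery table. The constant and function names are illustrative, not part of the methodology.

```python
# Points per delivery bucket, copied from the Delivery table.
DELIVERY_POINTS = {"D4": 12, "D3": 9, "D2": 6, "D1": 3, "D0": 0}


def adjusted_delivery(bucket: str, their_role: float) -> float:
    """Scale Delivery points by the Their Role modifier (0.0 to 1.0)."""
    if not 0.0 <= their_role <= 1.0:
        raise ValueError("Their Role must be between 0.0 and 1.0")
    return DELIVERY_POINTS[bucket] * their_role


# D2 delivery with a 0.6 role, rounded to hide floating-point noise.
print(round(adjusted_delivery("D2", 0.6), 2))  # 3.6
```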

Axis 2 — Difficulty (max 5 points, earned proportionally)

Difficulty rewards ambition — but only when delivery occurs. Zero delivery earns zero Difficulty points regardless of how ambitious the promise was.

Code | Label | Description | Max Pts
H3 | Structural | Multi-year initiative requiring coalition-building, constitutional change, or federal coordination. | 5
H2 | Legislative | Requires passage of legislation, significant political capital, or cross-chamber negotiation. | 3
H1 | Executive | Achievable through executive order, budget allocation, appointment, or regulatory change within direct authority. | 1
Difficulty Earned = Bucket Points × (Delivery Points / 12)
Example: H3 promise with D2 delivery = 5 × (6/12) = 2.5 pts · H3 with D0 = 0 pts
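
The proportional rule can be expressed in a few lines. This sketch assumes the formula uses unadjusted Delivery points (before the Their Role modifier), as the formula above is written. The names are illustrative.

```python
DIFFICULTY_MAX = {"H3": 5, "H2": 3, "H1": 1}
DELIVERY_POINTS = {"D4": 12, "D3": 9, "D2": 6, "D1": 3, "D0": 0}


def difficulty_earned(difficulty: str, delivery_bucket: str) -> float:
    """Difficulty points earned in proportion to unadjusted Delivery points."""
    return DIFFICULTY_MAX[difficulty] * DELIVERY_POINTS[delivery_bucket] / 12


print(difficulty_earned("H3", "D2"))  # 2.5
print(difficulty_earned("H3", "D0"))  # 0.0
```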

Axis 3 — Impact (max 8 points)

Impact measures what was at stake. It is not modified by delivery — a failed promise on a critical issue still carries high impact stakes. This is intentional: if you promise something that matters and don't deliver, the grade reflects how much it mattered.

Scale | Description | Pts
S3 | Systemic / Statewide | 4
S2 | Regional / Citywide | 2
S1 | Neighborhood / Segment | 1

Magnitude | Description | Pts
M3 | Transformative | 4
M2 | Significant | 2
M1 | Minor / Symbolic | 1
Specificity Cap (Clarity Rule): If a promise has a Clarity score of 2 (see Clarity Scale below), Magnitude is capped at M1. A vague promise cannot be awarded high magnitude because its intended scope was never defined.
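
A short sketch of how Impact and the Specificity Cap might be computed together. It assumes Impact is the sum of Scale and Magnitude points, as in the 8-point maximum. The function and constant names are hypothetical.

```python
SCALE_POINTS = {"S3": 4, "S2": 2, "S1": 1}
MAGNITUDE_POINTS = {"M3": 4, "M2": 2, "M1": 1}


def impact_points(scale: str, magnitude: str, clarity: int) -> int:
    """Sum Scale and Magnitude points, applying the Clarity-2 cap."""
    if clarity == 2:
        # Specificity Cap: a vague promise is held to M1.
        magnitude = "M1"
    return SCALE_POINTS[scale] + MAGNITUDE_POINTS[magnitude]


print(impact_points("S3", "M3", clarity=5))  # 8
print(impact_points("S3", "M3", clarity=2))  # 5
```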

Promise Score

Promise Score = Adjusted Delivery + Difficulty Earned + Impact
Maximum per promise: 12 (Delivery) + 5 (Difficulty) + 8 (Impact) = 25 points

Grade Calculation

The final grade is calculated from two ratios combined in a fixed 60/40 split. This split is published on every scorecard and does not vary by official, party, or jurisdiction.

Delivery Ratio = Sum of Adjusted Delivery Points ÷ (N promises × 12)
Promise Score Ratio = Sum of Promise Scores ÷ (N promises × 25)
Grade Input = (Delivery Ratio × 0.60) + (Promise Score Ratio × 0.40)
The 60/40 weighting ensures raw delivery dominates. An official cannot achieve a high grade by promising only easy, low-impact things and delivering them all.
Boundary convention: all thresholds are inclusive of their lower bound. A band stated as 70–84% means 70.0% up to but not including 85%. The same convention applies to every range in this methodology — Delivery percentages, Difficulty bands, and Time Pressure cutoffs — so no value falls into two buckets.
Grade | Grade Input | Description
A+ | 90%+ | Exceptional delivery on ambitious, high-impact promises across multiple domains.
A | 85–89% | Strong delivery across most domains with documented outcomes.
B | 70–84% | Above-average to solid delivery; meaningful gaps remain.
C | 50–69% | Mixed to marginal delivery; structural or political barriers evident.
D | 30–49% | Poor to very poor delivery; few concrete outcomes.
F | Below 30% | No meaningful delivery on tracked promises.
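
The grade formula and bands can be sketched as follows. This is an illustration under stated assumptions, not the production formula code: the function names and the tuple layout `(adjusted_delivery, promise_score)` are hypothetical, and the sample official is invented to test the 60/40 claim.

```python
# Lower bounds are inclusive, per the boundary convention above.
GRADE_BANDS = [(0.90, "A+"), (0.85, "A"), (0.70, "B"), (0.50, "C"), (0.30, "D")]


def grade_input(promises: list[tuple[float, float]]) -> float:
    """Combine (adjusted_delivery, promise_score) pairs in a 60/40 split."""
    n = len(promises)
    if n == 0:
        raise ValueError("at least one scored promise is required")
    delivery_ratio = sum(adj for adj, _ in promises) / (n * 12)
    promise_score_ratio = sum(score for _, score in promises) / (n * 25)
    return delivery_ratio * 0.60 + promise_score_ratio * 0.40


def letter_grade(value: float) -> str:
    """Return the letter for the first band whose lower bound is met."""
    for lower_bound, letter in GRADE_BANDS:
        if value >= lower_bound:
            return letter
    return "F"


# Hypothetical official: three easy, low-impact promises, all fully delivered.
# Each is D4 with Their Role 1.0 (12 pts), H1 (1 pt), S1 + M1 (2 pts) = 15 / 25.
easy_only = [(12.0, 15.0)] * 3
value = grade_input(easy_only)
print(round(value, 2), letter_grade(value))  # 0.84 B
```

Under these assumptions, a record made up entirely of easy, low-impact promises tops out at 84%, a B, even with perfect delivery. The A range requires Difficulty and Impact points as well.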

Worked Example

Worked Example · Housing Promise · Senator (Qualitative, H2)

"I will pass a tenant protection bill in the first session."

Promise Type | Qualitative
Delivery Bucket | D2: Bill introduced, passed committee, stalled on floor (formal action initiated)
Delivery Points | 6
Their Role | 0.6: Advocated, dependent on legislature
Adjusted Delivery | 6 × 0.6 = 3.6 pts
Difficulty | H2 (Legislative) = 3 max pts
Difficulty Earned | 3 × (6/12) = 1.5 pts
Impact | S2 + M2 = 2 + 2 = 4 pts
Promise Score | 3.6 + 1.5 + 4 = 9.1 / 25
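
The same arithmetic can be checked in code. This is a minimal sketch that takes the point values directly as arguments. The function name and the returned tuple are illustrative choices, not part of the methodology.

```python
def promise_score(
    delivery_pts: float, their_role: float, difficulty_max: float, impact_pts: float
) -> tuple[float, float, float]:
    """Return (adjusted_delivery, difficulty_earned, promise_score)."""
    adjusted = delivery_pts * their_role
    # Difficulty is earned in proportion to unadjusted Delivery points.
    difficulty = difficulty_max * delivery_pts / 12
    return adjusted, difficulty, adjusted + difficulty + impact_pts


# Housing promise: D2 (6 pts), Their Role 0.6, H2 (3 max), S2 + M2 (4 pts).
adjusted, difficulty, total = promise_score(6, 0.6, 3, 4)
print(round(adjusted, 1), round(difficulty, 1), round(total, 1))  # 3.6 1.5 9.1
```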

Their Role — Lookup Table

The Their Role modifier is the single most judgment-dependent element of the methodology. The following anchor values reduce inconsistency across AI runs and human reviewers.

Score | Situation | Common Examples
1.0 | Sole or near-sole authority. Official acted unilaterally within clear constitutional or statutory power. | Governor signing an executive order; mayor appointing a department head.
0.8 | Official championed, negotiated, and signed legislation. Legislature was a necessary co-actor but official drove the outcome. | Governor securing and signing a major budget deal; senator authoring and passing a bill with party in majority.
0.6 | Official advocated consistently but was dependent on others who were not fully aligned. | Senator in majority facing moderate opposition within own caucus; governor working with a split legislature.
0.4 | Supporting or facilitative role. Outcome primarily driven by other actors, market forces, or federal policy. | Mayor benefiting from a federal infrastructure grant they applied for but did not design.
0.2 | Minimal causal influence. Outcome largely driven by forces entirely outside the official's sphere. | Economic improvement during a governor's term primarily driven by national trends.
0.0 | No meaningful causal connection. Official's actions were irrelevant to the outcome. | Official claims credit for a federal policy they had no role in; outcome occurred despite official's opposition.

Clarity Scale

Clarity measures how specifically a promise was stated at the time it was made. It is assessed as of the original statement — it cannot be improved retroactively by subsequent clarifications.

The scale begins at 2. Pure values statements with no action component ("I believe in stronger communities") carry no verifiable condition and are screened out at qualification under What Counts as a Promise — they never enter the scored set, so there is no Clarity score of 1.

Score | Label | Definition & Example
2 | Directional, no specifics | General intent stated, no mechanism, target, or timeframe. Magnitude capped at M1. "We will improve public safety."
3 | Specific policy named | A named policy, program, or bill is referenced. "I will pass a tenant protection bill."
4 | Specific + conditions | Named policy plus timeframe, jurisdiction, or population. "I will pass a tenant protection bill in the first year."
5 | Specific + measurable target | Named policy with a quantified, verifiable outcome. "I will reduce violent crime 20% by 2026."

Time Pressure Adjustment

Every promise is assigned a timeline bucket based on the official's stated or implied delivery window. Delivery scores are adjusted for whether a promise is early, on-track, or overdue.

Condition | Status | Effect
Time Pressure < 0.5 | Early | Delivery weight = Time Pressure × 2. Grade is provisional. Displayed with clock indicator.
Time Pressure 0.5–1.0 | On Track | Full delivery weight applied. No adjustment.
Time Pressure 1.0–1.25 | Overdue (mild) | 10% delivery score reduction.
Time Pressure 1.25–1.5 | Overdue (moderate) | 20% delivery score reduction.
Time Pressure 1.5+ | Overdue (significant) | Reduction continues at 10% per 0.25 over 1.0, capped at 40%.
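
One possible reading of the table as a delivery multiplier is sketched below. Several points are assumptions rather than stated rules: that the Early weight and the Overdue reductions both act as multipliers on the delivery score, that a value of exactly 1.5 falls in the significant band (following the lower-bound-inclusive convention), and that "10% per 0.25 over 1.0" means 30% from 1.5 and 40% from 1.75 onward. The function name is hypothetical.

```python
# (exclusive upper bound, reduction) for overdue bands, lower bound inclusive.
OVERDUE_BANDS = [(1.25, 0.10), (1.50, 0.20), (1.75, 0.30)]
MAX_REDUCTION = 0.40


def delivery_weight(time_pressure: float) -> float:
    """Return the multiplier applied to the delivery score."""
    if time_pressure < 0:
        raise ValueError("Time Pressure cannot be negative")
    if time_pressure < 0.5:
        return time_pressure * 2  # Early: provisional, partial weight
    if time_pressure < 1.0:
        return 1.0  # On Track: no adjustment
    for upper_bound, reduction in OVERDUE_BANDS:
        if time_pressure < upper_bound:
            return 1.0 - reduction
    return 1.0 - MAX_REDUCTION  # capped at 40%


for tp in (0.25, 0.8, 1.1, 1.3, 1.6, 2.5):
    print(tp, round(delivery_weight(tp), 2))
```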

Behavioral Flags

Behavioral flags override or cap standard scoring when the official actively distorted the promise. Each flag requires a written rationale in the scorecard JSON.

When more than one flag applies to the same promise, a cap (maximum) always takes precedence over a floor (minimum), and the lowest applicable cap controls. Example: if Externally Blocked sets a D2 floor and Scope Reduced caps Magnitude while Redefined caps delivery at D2, the delivery score resolves to D2 — the cap binds. A flag that sets the score to a fixed value (Reversed = D0) overrides both caps and floors.
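
The precedence rule can be sketched as a small resolver over delivery levels 0 to 4 (D0 to D4). This is an illustration only. The function signature is hypothetical, and using the highest floor when several floors apply is an assumption the text does not address.

```python
from typing import Iterable, Optional


def resolve_delivery(
    base: int,
    caps: Iterable[int] = (),
    floors: Iterable[int] = (),
    fixed: Optional[int] = None,
) -> int:
    """Resolve a delivery level (0-4) after behavioral flags are applied."""
    if fixed is not None:
        # A fixed value (e.g. Reversed = D0) overrides caps and floors.
        return fixed
    caps, floors = list(caps), list(floors)
    level = base
    if floors:
        level = max(level, max(floors))  # assumption: highest floor applies
    if caps:
        # Applying caps last means a cap always beats a floor,
        # and the lowest applicable cap controls.
        level = min(level, min(caps))
    return level


# Externally Blocked (floor D2) plus Redefined (cap D2) on a D3 promise.
print(resolve_delivery(3, caps=[2], floors=[2]))  # 2
# Reversed overrides everything.
print(resolve_delivery(4, caps=[2], floors=[2], fixed=0))  # 0
```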

Flag | Effect | Description
Reversed | Score = D0 regardless | Official explicitly reversed or repealed a previously promised policy.
Redefined | Outcome capped at D2 | Definition of success materially changed mid-term, or condition of a negative promise redefined.
Externally Blocked | D2 minimum if actions taken | Promise failed due to documented external intervention meeting the Changed-Circumstances Test.
Credit Overclaimed | Their Role capped at 0.4 | Official claimed credit for outcomes driven by prior administration or federal action.
Deadline Shifted | Time Pressure cap relief removed | Timeline extended without explanation after original deadline passed.
Scope Reduced | Magnitude capped at M1 | Promise delivered in significantly diminished form without acknowledgment.

Transparency Flags

Transparency flags are applied at the scorecard level and communicate limitations to readers without modifying the grade. They appear as labeled badges on the scorecard.

Flag | Meaning
Contested | A classification is disputed by the official or a credible third party. Evidence and rationale are documented.
Limited Evidence | Fewer sources than the standard minimum. Judgment applied. Grade should not be cited as definitive.
AI Draft | Scorecard is AI-generated and pending human review.
Under Review | Classification actively being re-evaluated due to new evidence.
Low Promise Count | Fewer than five promises scored. Grade may not reflect the full record.
Mid-Term Departure | Official left office before term ended. Scoring rules for departure cases documented in full methodology.

Data Sources