The Silent Seat That Disappeared
The assessment covered the full security posture of an election management platform used to tabulate and certify results for legislative elections. The scope included authentication controls, data integrity protections, access logging, and the correctness of the core counting logic.
Most election software assessments stop at the infrastructure layer: who can access the admin panel, whether the audit logs can be tampered with, whether the database enforces integrity constraints. These are important questions. But the software's most critical function — computing which candidates are elected — is itself a security boundary. An error in that computation is an integrity failure, regardless of whether it was introduced by accident or by intent.
The seat allocation module was a 400-line Python function. It had unit tests. The tests passed. The function contained a bug that caused one seat to disappear under specific conditions.
The Algorithm and Its Contract
The platform implemented a proportional representation system using a highest averages method. After votes are tallied by party, the algorithm distributes a fixed number of legislative seats according to each party's share of the total vote.
The contract is precise: given a total vote count per party and a fixed number of seats to allocate, the algorithm must return a dictionary mapping parties to seat counts, where the sum of all seat counts equals exactly the total seats specified.
python
def allocate_seats(votes: dict[str, int], total_seats: int) -> dict[str, int]:
# votes = {'Party A': 142000, 'Party B': 87000, 'Party C': 51000}
# total_seats = 10
# Returns: {'Party A': 5, 'Party B': 3, 'Party C': 2}
...The sum of the returned values must equal total_seats. Always. For all possible vote distributions. This is not a performance requirement or a best-effort guideline — it is the defining correctness property of the function.
What the Tests Checked
The test suite covered eight scenarios. They tested obvious cases: a single party taking all seats, two parties splitting seats evenly, three parties with clean proportional ratios. The tests passed a specific set of vote totals chosen by the developer.
None of the tests verified the sum invariant programmatically. Each test asserted the specific expected output for its specific input — but no test checked whether sum(result.values()) == total_seats held across a range of inputs that included edge cases in the allocation arithmetic.
This is the normal shape of a functional test suite. It tests the function against examples the developer thought of while writing the function. It does not systematically explore the input space.
Finding the Edge
The first step was understanding the algorithm well enough to form hypotheses about where it might break.
The implementation worked by building a priority queue of quotients. For each party and each possible seat (numbered 1 through total_seats), it computed a quotient:
python
quotient = party_votes / seat_numberIt then selected the total_seats highest quotients, with each quotient's associated party receiving one seat for each time their quotient appeared in the top set.
The natural question for any algorithm involving division and ranking is: what happens when two quotients are equal? This is a tie, and ties in seat allocation are handled by specific tiebreak rules — alphabetical order by party name, by registration date, or by random draw depending on the jurisdiction.
I searched the code for how ties were handled. The tiebreak logic existed. But before testing tiebreaks, I looked at the quotient calculation more carefully.
python
# The implementation
quotient = party_votes // seat_number # integer divisionThe double slash is Python's floor division operator. 142000 // 3 returns 47333, not 47333.333.... This discards the fractional part of every quotient.
For most vote distributions, this does not matter. The quotients are sufficiently separated that the ranking survives truncation. But at specific vote totals — where two quotients are close enough that truncation changes which one is larger — the ranking changes. And if the truncation causes the correct seat-boundary quotient to rank below the cutoff, one seat disappears from the allocation.
Constructing a Reproducing Case
I needed to find vote totals where two quotients were close enough together that floor division changed the result, and where the change affected the total seat count rather than just the distribution between parties.
A targeted search over a range of vote totals produced a reproducing case within minutes:
python
votes = {
'Party A': 100000,
'Party B': 99997,
}
total_seats = 7
result = allocate_seats(votes, total_seats)
print(result) # {'Party A': 4, 'Party B': 2}
print(sum(result.values())) # 6, not 7With these vote totals and seven seats to allocate, the function returned six. One seat was missing. No exception was raised. No warning was logged. The function returned a dictionary, the caller summed nothing, and the result was silently wrong.
The bug was in the priority queue construction. When populating the queue with quotients, the loop iterated seat_number from 1 to total_seats inclusive:
python
for seat_number in range(1, total_seats + 1):
quotient = party_votes // seat_number
heapq.heappush(queue, (-quotient, seat_number, party_name))Due to floor division, Party B's quotient for seat 7 (99997 // 7 = 14285) was calculated as equal to Party A's quotient for seat 7 (100000 // 7 = 14285). Both parties computed the same truncated integer.
The tie-breaking logic then sorted by seat_number as a secondary key, which was the same for both entries (both were computing the 7th seat). It fell to the tertiary key — party name alphabetically — which placed Party A before Party B.
The heap produced only one entry for the 7th-seat boundary instead of acknowledging that two separate quotients occupied that slot. One was silently dropped. The total seat count became 6.
Verifying the Scope of the Bug
A single reproducing case confirms existence. The next question is how common these conditions are in practice.
Property-based testing across the parameter space produced 23 additional vote distributions — out of 100,000 sampled — where the output seat count did not equal the expected total. All shared the same root cause: floor division producing identical quotients that the heap deduplicated.
The affected distributions were not pathological edge cases. Vote totals within 0.1% of each other at specific seat boundaries triggered the bug. In competitive elections with several parties within a few percentage points of each other, these conditions arise routinely.
python
# Property test that the existing test suite was missing
from hypothesis import given, strategies as st
@given(
st.dictionaries(
keys=st.text(min_size=1, max_size=10),
values=st.integers(min_value=1000, max_value=1000000),
min_size=2, max_size=8
),
st.integers(min_value=3, max_value=20)
)
def test_seat_sum_invariant(votes, total_seats):
result = allocate_seats(votes, total_seats)
assert sum(result.values()) == total_seatsRunning this test against the original implementation produced failures within seconds of execution.
Root Cause
The root cause was using // (floor division) instead of / (true division) when computing the allocation quotients.
The fix is a one-character change:
python
# Vulnerable
quotient = party_votes // seat_number
# Fixed
quotient = party_votes / seat_numberWith floating-point division, 99997 / 7 = 14285.285... and 100000 / 7 = 14285.714.... These values are distinct. The heap correctly identifies two different quotients and includes both in the top-7 set. The sum of allocated seats equals 7.
Floating-point representation introduces its own set of edge cases — two vote totals could theoretically produce floating-point quotients that are equal due to precision limits — but the probability is orders of magnitude lower than with integer truncation, and can be fully addressed by rounding quotients to a fixed number of decimal places before comparison.
The production fix also added an assertion at the end of the function:
python
assert sum(result.values()) == total_seats, (
f"Seat allocation invariant violated: allocated {sum(result.values())} "
f"seats but expected {total_seats}"
)This converts a silent correctness failure into an explicit runtime error. If the invariant is ever violated again — by this function or by a future modification — the failure surfaces immediately rather than propagating silently into certified election results.
What This Looks Like in Production
The most important aspect of this finding was not the root cause — a // vs / distinction is easy to understand and fix — but the failure mode. The function returned a plausible-looking result. The caller received a dictionary. The downstream certification step received a seat distribution that appeared well-formed.
Without an explicit invariant check on the output, no error would appear in the audit log. The certified result would show 6 seats allocated across 2 parties. An auditor checking the arithmetic by hand would find that the vote totals supported one of the party's seat counts but not both — but only if they were checking against the algorithm specification rather than against the software's own output.
In a high-stakes setting, the software's output is often treated as authoritative. The calculation happened inside a trusted system. The result looks like a number. Numbers from trusted systems are trusted.
The silence is the vulnerability. A crash is visible. A wrong answer is not.
What Proper Testing Looks Like
Unit tests with developer-chosen examples are not sufficient for testing algorithmic correctness in high-stakes domains. They verify behavior on inputs the developer thought about. They do not verify behavior on inputs they did not think about.
The right approach combines:
Invariant assertions in the production code. The sum of allocated seats must equal the total seats. This should be an assertion in the function itself, not only in the test suite. It costs a single sum() call.
Property-based testing across the input space. Rather than choosing specific inputs, define the properties the function must always satisfy and let a test framework generate thousands of random inputs to check them. Invariant violations that appear only at specific vote distributions will surface in minutes rather than in production.
Specification-based review. The algorithm has a precise mathematical definition. The implementation should be reviewed against the specification, not just against intuition. // looks reasonable. Against the specification, it is wrong.
Independent implementation comparison. For critical algorithms, implement the same function twice using different approaches and verify that both implementations agree on a large random sample. Disagreements indicate at least one implementation is wrong.
The findings from this engagement contributed to a specification review and test suite overhaul for the platform. The allocation function now includes invariant assertions, a property-based test suite, and a floating-point comparison with a documented precision budget.
For a technical overview of business logic vulnerabilities across web application layers, see the business logic flaws knowledge article.