# Polling Methodology

Decision Desk HQ is committed to providing accurate, reliable, and comprehensive poll averages that reflect the current state of public opinion, races for elective office, and ballot initiative efforts. Our methodology, detailed below, is designed to ensure that our poll averages consider all available data and time trends, while being methodologically sound and consistent over time.

### Sourcing

Our team carefully reviews and averages the polls we collect using a straightforward methodology, ensuring that our polling averages reflect a balanced, up-to-date snapshot of public opinion. Within each average, you can select a specific point in time to see how things have changed, or view any poll included in that average.

### Averaging

Decision Desk HQ has averaged polls since the 2018 midterm election cycle, and over time, we've refined our methodology to focus on simplicity, interpretability, and accuracy. It's easy to get carried away with sophisticated statistical approaches that mine large amounts of historical polling data for every possible variable, but, as has been [shown consistently](https://sites.stat.columbia.edu/gelman/research/published/polling-errors.pdf), keeping it simple pays dividends when averaging polls.

### Methodology

Most approaches to averaging polls – though not ours – involve taking a vendor's base poll and making substantial changes to the results. This typically includes work such as:

* Time-weighting polls
* Adjusting polls for house effects and quality of polling
* Weighting different populations (e.g., likely voters, registered voters) differently
* Weighting by sample size
* Treating outliers

After this, the results are averaged using one of an array of methods, e.g., a moving average, an exponentially weighted average, or more complex approaches.

These approaches are all valid, but they also introduce many subjective decisions about how to treat certain types of data, and these decisions are sometimes made without enough data to justify one decision over another.

Our goal is to provide a straightforward, consumer-facing average that accurately reflects the current state of polling, is easy to interpret, and is not overly sensitive to recent news events. If a poll [meets the standards set by the American Association for Public Opinion Research (AAPOR)](https://aapor.org/standards-and-ethics/disclosure-standards/#1667933142550-55785157-2071) and releases its methodology, field dates, and sample size, we include it in our average. The only factors that affect how DDHQ incorporates a poll into its analysis are its recency and whether it is an internal poll, meaning it was commissioned by a candidate or their campaign committee. Additionally, at least two polls are required for us to publish an average for any race.
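The inclusion rules above can be sketched as a simple eligibility check. This is a minimal illustration, not DDHQ's actual code: the `Poll` fields and function names are hypothetical, and the `meets_aapor_standards` flag is assumed to be set by a human reviewer.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

MIN_POLLS_FOR_AVERAGE = 2  # from the text: at least two polls per race


@dataclass
class Poll:
    # All field names here are hypothetical, chosen for illustration.
    pollster: str
    meets_aapor_standards: bool   # assumed to be judged by a reviewer
    methodology_released: bool
    field_start: Optional[date]   # None if field dates were not disclosed
    field_end: Optional[date]
    sample_size: Optional[int]    # None if sample size was not disclosed


def is_eligible(poll: Poll) -> bool:
    """A poll is included only if it meets the disclosure requirements."""
    return (
        poll.meets_aapor_standards
        and poll.methodology_released
        and poll.field_start is not None
        and poll.field_end is not None
        and poll.sample_size is not None
    )


def can_publish_average(polls: List[Poll]) -> bool:
    """An average is published only when enough eligible polls exist."""
    eligible = [p for p in polls if is_eligible(p)]
    return len(eligible) >= MIN_POLLS_FOR_AVERAGE
```

Note that nothing in the check adjusts a poll's result; a poll is either in the average or out of it.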

#### How We Handle Recency

If a race has too few polls to reliably estimate time trends, we use a flatter weighted-average approach, giving older polls a bit more influence than they would have in a spline-based method. This approach reduces variance by stabilizing estimates in data-sparse environments, but it comes at the cost of potential bias, as older polls may be less representative of the race’s current dynamics. Our analysis has determined that this bias-variance tradeoff is preferable for races with low polling volume. For instance, in US House races, we found no indication that polls taken earlier in the cycle are less accurate than those taken later.
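To make the idea of a "flatter" weighted average concrete, here is a minimal sketch. The text does not say which weighting function is used, so the exponential decay, the `half_life_days` parameter, and the function names are all assumptions for illustration. A longer half-life gives a flatter weighting, leaving older polls with more influence.

```python
from datetime import date
from typing import List, Tuple


def recency_weight(poll_end: date, as_of: date, half_life_days: float) -> float:
    """Assumed weighting: a poll loses half its weight every `half_life_days`."""
    age_days = (as_of - poll_end).days
    if age_days < 0:
        return 0.0  # the poll had not finished as of this day
    return 0.5 ** (age_days / half_life_days)


def weighted_average(
    polls: List[Tuple[date, float]], as_of: date, half_life_days: float
) -> float:
    """Recency-weighted mean of (end_date, value) pairs."""
    weights = [recency_weight(end, as_of, half_life_days) for end, _ in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls available as of this date")
    return sum(w * value for w, (_, value) in zip(weights, polls)) / total


# Two polls, 60 days apart; the race moved from 40 to 50.
polls = [(date(2024, 1, 1), 40.0), (date(2024, 3, 1), 50.0)]
steep = weighted_average(polls, date(2024, 3, 1), half_life_days=60.0)
flat = weighted_average(polls, date(2024, 3, 1), half_life_days=600.0)
```

With the flatter weighting, the estimate sits closer to the simple mean of 45, which is the variance-for-bias trade described above.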

As the polling volume increases and we begin to estimate time trends, we blend an adjusted cubic spline into our weighted average, gradually increasing its weight. When a race has enough polls to estimate time trends with confidence (15 or more polls), the weighted average is no longer blended into the estimate. A cubic spline is a third-degree piecewise polynomial function that allows for smooth interpolation between data points. Because the curve is smooth, it eliminates the possibility of any sudden deviation or break from earlier results. We fit a new spline every day. The cubic spline for a given day is based only on polls whose field period began on or before that day.
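The blending step can be sketched as follows. Only the 15-poll threshold comes from the text. The linear ramp, the `BLEND_START_POLL_COUNT` lower bound, and the function names are assumptions, and the spline estimate is taken as an input rather than fitted here.

```python
from datetime import date
from typing import List, Tuple

FULL_SPLINE_POLL_COUNT = 15  # from the text: 15 or more polls means spline only
BLEND_START_POLL_COUNT = 5   # hypothetical: the text gives no lower bound


def polls_available_on(
    polls: List[Tuple[date, float]], day: date
) -> List[Tuple[date, float]]:
    """Keep (field_start, value) pairs whose field period began on or before `day`."""
    return [(start, value) for start, value in polls if start <= day]


def spline_weight(n_polls: int) -> float:
    """Assumed linear ramp from 0 (weighted average only) to 1 (spline only)."""
    if n_polls <= BLEND_START_POLL_COUNT:
        return 0.0
    if n_polls >= FULL_SPLINE_POLL_COUNT:
        return 1.0
    span = FULL_SPLINE_POLL_COUNT - BLEND_START_POLL_COUNT
    return (n_polls - BLEND_START_POLL_COUNT) / span


def blended_estimate(weighted_avg: float, spline_estimate: float, n_polls: int) -> float:
    """Mix the two estimates according to how many polls the race has."""
    w = spline_weight(n_polls)
    return w * spline_estimate + (1.0 - w) * weighted_avg
```

Under these assumptions, a race with 10 polls would take half of its estimate from the spline and half from the weighted average.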

#### How We Handle Internal Polls

For horse-race polling, we combine internal polls from the same source. The first internal poll from a candidate or campaign committee enters our average as normal, but any additional internals from the same candidate are averaged with that poll rather than entered as new polls. This limits a campaign's ability to “flood” the average with polls while still reflecting the data available.
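A minimal sketch of this rule: public polls each count once, while all internals from one sponsor collapse into a single entry. The dictionary keys and the function name are hypothetical. The text does not say how the combined entry is dated for recency purposes, so the sketch ignores dates.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional


def collapse_internals(polls: List[Dict[str, Optional[object]]]) -> List[float]:
    """Return the values that enter the average.

    Each poll is a dict with hypothetical keys:
      "sponsor": campaign name for an internal poll, or None for a public poll
      "value":   the candidate's share in that poll
    """
    entries: List[float] = []
    internals_by_sponsor: Dict[str, List[float]] = defaultdict(list)

    for poll in polls:
        if poll["sponsor"] is None:
            entries.append(poll["value"])  # public polls enter one by one
        else:
            internals_by_sponsor[poll["sponsor"]].append(poll["value"])

    # Every sponsor contributes exactly one entry, however many polls it released.
    entries.extend(mean(values) for values in internals_by_sponsor.values())
    return entries


polls = [
    {"sponsor": None, "value": 45.0},
    {"sponsor": "Campaign A", "value": 50.0},
    {"sponsor": "Campaign A", "value": 52.0},
    {"sponsor": "Campaign B", "value": 40.0},
]
entries = collapse_internals(polls)
```

Here the two Campaign A internals count as one entry at their mean, so four polls yield three entries.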

#### Crosstab Averaging

We collect crosstabs from all available polls that meet AAPOR standards and apply a minimum sample size threshold for inclusion. Because crosstabs have smaller sample sizes and are released less frequently, crosstab averages can be more volatile in the short term.

The advantages of this approach are that it refrains from what has become, for many, a borderline-forecasting function while delivering results in line with those of more sophisticated approaches. All publicly available polls will be included in the Decision Desk HQ polling database. We provide averages that capture as close to a complete picture of the electorate as possible.
