Phil Swatton
philswatton.bsky.social
Work as a data scientist at the Alan Turing Institute, background in political science. Views my own and not necessarily shared by my employer.

https://philswatton.github.io/
That makes a lot of sense, thank you!
November 12, 2025 at 2:11 PM
Wilcoxon is an approach to it, but I guess my q is:

- If tests on individual datasets fail to reject the null (can't tell which is better on any given dataset)
- & a single test comparing accuracies across datasets rejects the null (A is better than B result across datasets)

What should I infer?
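For concreteness, a minimal sketch of the across-dataset test with made-up accuracies (not real results), using `scipy.stats.wilcoxon` on paired per-dataset accuracies:

```python
from scipy.stats import wilcoxon

# Hypothetical accuracies for classifiers A and B on 15 test sets
# (illustrative numbers only -- A is slightly better on 13 of 15)
acc_a = [0.81, 0.79, 0.84, 0.77, 0.82, 0.80, 0.78, 0.83,
         0.76, 0.85, 0.79, 0.81, 0.80, 0.82, 0.78]
acc_b = [0.79, 0.77, 0.83, 0.76, 0.80, 0.78, 0.79, 0.81,
         0.77, 0.83, 0.78, 0.80, 0.78, 0.80, 0.77]

# Paired signed-rank test on the per-dataset accuracy differences:
# the test uses only the signs and ranks of the differences, so
# consistent small wins can be significant even when each individual
# difference is well within per-dataset noise.
stat, p = wilcoxon(acc_a, acc_b)
print(f"Wilcoxon statistic={stat:.1f}, p={p:.4f}")
```

The point of the sketch: the signed-rank test pools evidence about the *direction* of the difference across datasets, which is a different question from whether any single dataset's difference clears its own sampling noise.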
November 12, 2025 at 2:03 PM
Paper in the brackets I forgot to link is www.jmlr.org/papers/volum...
November 12, 2025 at 1:42 PM
Do you:

1) interpret the results as being inconclusive on which classifier is better
2) interpret classifier A as being better than B

Obviously there is

3) Find some extra datasets w/ larger test sets, but I'm curious how people would approach the initial problem

(3/3)
November 12, 2025 at 1:40 PM
Classifier A has consistently better accuracy than classifier B on most test sets (say, 13/15). This is significant in a Wilcoxon signed-rank test (approach advocated by ).

However, on most _individual_ points (say, 14/15), the 95% CIs on the accuracy on each dataset overlap. (2/3)
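A quick sketch of why the per-dataset CIs can overlap, with hypothetical numbers: assuming test sets of around 500 points and a normal-approximation (Wald) binomial CI on each accuracy, a 2-point accuracy gap sits well inside the interval widths.

```python
import math

def wald_ci(acc, n, z=1.96):
    """Normal-approximation 95% CI for an accuracy estimated on n test points."""
    se = math.sqrt(acc * (1 - acc) / n)
    return acc - z * se, acc + z * se

n = 500                      # hypothetical test-set size
acc_a, acc_b = 0.82, 0.80    # illustrative accuracies on one dataset

lo_a, hi_a = wald_ci(acc_a, n)
lo_b, hi_b = wald_ci(acc_b, n)
print(f"A: [{lo_a:.3f}, {hi_a:.3f}]  B: [{lo_b:.3f}, {hi_b:.3f}]")
# With n=500 the standard error is ~0.017, so a 0.02 gap
# leaves the two intervals overlapping.
```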
November 12, 2025 at 1:40 PM
Thank you!
November 7, 2025 at 6:51 PM
Hi Ben, I sent an email about the role but fear it may have ended up in your spam
November 7, 2025 at 6:18 PM
And good shout on DKs/would not votes - will try that out later today if I remember!
November 7, 2025 at 5:32 PM
If I squint hard enough, I might interpret it as established party vs not, in that it explains both the gap between Con & Ref on the one hand, and why Green voters are on the same side of the dimension as Ref on the other. But I'm not fully convinced by it - e.g. why are Lab & LD more middling on the dimension?
November 7, 2025 at 5:32 PM
Thank you - the paper looks fascinating, will add to my reading pile, thanks for sharing!
November 7, 2025 at 12:08 PM
Based on our recent discussion, maybe or maybe not of interest to

@mariosrichards.bsky.social
@ralphscott.bsky.social
@jack-bailey.co.uk
@heinzbrandenburg.bsky.social
November 7, 2025 at 11:57 AM
I've made the code for the ANES output available at: gist.github.com/philswatton/...
November 7, 2025 at 11:55 AM
(this is for most but not all post-election feeling thermometers)
November 4, 2025 at 9:49 PM
Here are the equivalent distributions for ANES 2024. They look much spikier (but possibly you still get more out of it: if e.g. people use values ending 0 or 5, that's still 21 meaningful values vs 11 in 0-10 or 7 in 1-7)
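The "21 meaningful values" arithmetic, spelled out: even if respondents heap on multiples of 5, a 0-100 thermometer still offers more distinct response options than a 0-10 or 1-7 scale.

```python
# Count the response options under heaping on multiples of 5
heaped_0_100 = [v for v in range(0, 101) if v % 5 == 0]
points_0_10 = list(range(0, 11))
points_1_7 = list(range(1, 8))
print(len(heaped_0_100), len(points_0_10), len(points_1_7))  # 21 11 7
```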
November 4, 2025 at 9:49 PM
Made this example w/ the 2024 data, will upload a blog post or gist over the next couple of days:
November 4, 2025 at 9:42 PM
That's really interesting - what dataset is this from?
November 4, 2025 at 6:11 PM
Agreed - I'd be interested in seeing comparisons of 1-7, 0-10, and 0-100 disaggregated across ideological self placements vs warmth ratings too
November 4, 2025 at 5:43 PM
1-7 scales are on issues

0-100 are warmth, but you can recover a liberal-conservative dimension from them (I think I have a better plot somewhere):
November 4, 2025 at 5:33 PM
I don't know of any dataset that would enable comparison of different scales for warmth ratings though
November 4, 2025 at 5:31 PM
It's been a while and I don't have my laptop with me to check atm, but while I imagine a lot of respondents do use '50', '25', etc, I think I recall correctly that there was enough variation across the different stimuli being rated
November 4, 2025 at 5:31 PM
I'm not sure actually - I wouldn't use 0-100 for issue scales. When I wish for them it's more about having warmth ratings towards lots of different stimuli. I helped make a presentation to the RSS on the 2020 presidential election w/ them; they were really interesting for nonmetric unfolding:
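As a very rough sketch of how a left-right dimension can fall out of warmth ratings: here's synthetic data and plain PCA, standing in for the nonmetric unfolding actually used (all names and numbers below are made up for illustration, not from the ANES analysis).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 0-100 warmth ratings: 200 respondents with a latent
# left-right position rate 4 stimuli with known positions.
ideology = rng.uniform(-1, 1, size=200)            # latent left-right
stimulus_pos = np.array([-0.8, -0.3, 0.4, 0.9])    # hypothetical stimuli
warmth = 50 + 40 * np.outer(ideology, stimulus_pos)
warmth += rng.normal(0, 5, size=warmth.shape)
warmth = warmth.clip(0, 100)

# First principal component of the centred rating matrix
centred = warmth - warmth.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
scores = centred @ vt[0]

# The recovered component should track the latent ideology
# (up to sign, which PCA leaves arbitrary)
r = np.corrcoef(scores, ideology)[0, 1]
print(f"|correlation with latent ideology| = {abs(r):.2f}")
```

PCA is a cruder tool than unfolding here (it treats ratings as linear in the latent dimension rather than single-peaked), but it shows the basic idea that a spatial dimension is recoverable from thermometer matrices.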
November 4, 2025 at 5:31 PM