Lightnews — Scholar-powered news

Henrik Singmann

@singmann.bsky.social

See the same pattern for our Experiments 2 and 3 here. In Experiment 3, we added further topics (e.g., "Separating church from state causes more harm than good.") and more thoroughly controlled argument quality at three levels (good, internally inconsistent, and authority-based).

October 1, 2025 at 7:46 PM

Henrik Singmann

@singmann.bsky.social

The pattern in the average data also holds for each of the arguments (each line/colour per panel is one specific argument). People who think a claim (e.g., "abortion should be legal") is false find the corresponding argument is bad; people who think the claim is true think the argument is good.

Fig. 4 from the paper showing argument quality ratings as a function of belief consistency for each argument in Experiment 1. The overall pattern holds for each argument shown.
Note. Results of Experiment 1 conditional on the topic and the level of argument support. Each line and colour in each panel shows responses to exactly one argument (i.e., there is no aggregation across items within a panel). The dots show individual responses and the curved lines show predictions from the linear mixed model. Blue dots represent argument quality ratings to good arguments in the data, orange dots represent argument quality ratings to bad arguments in the data, and the size of the dots represents the number of argument quality rating responses for the corresponding belief rating. Data points are dodged so that responses for good and bad arguments do not overlap. Model predictions are based on the fixed effects of the final model and the random effects of the by-topic grouping factor. Ext. = extremely.

October 1, 2025 at 7:43 PM

Henrik Singmann

@singmann.bsky.social

Exciting #rstats news for Bayesian model comparison: bridgesampling is finally ready to support cmdstanr, see screenshot. Help us by installing the development version of bridgesampling and letting us know if it works for your model(s): pak::pkg_install("quentingronau/bridgesampling#44")

R code and output showing the new functionality:

``` r
## pak::pkg_install("quentingronau/bridgesampling#44")
## see: https://cran.r-project.org/web/packages/bridgesampling/vignettes/bridgesampling_example_stan.html
library(bridgesampling)

### generate data ###
set.seed(12345)

mu <- 0
tau2 <- 0.5
sigma2 <- 1

n <- 20
theta <- rnorm(n, mu, sqrt(tau2))
y <- rnorm(n, theta, sqrt(sigma2))

### set prior parameters ###
mu0 <- 0
tau20 <- 1
alpha <- 1
beta <- 1

stancodeH0 <- 'data {
  int<lower=1> n; // number of observations
  vector[n] y; // observations
  real<lower=0> alpha;
  real<lower=0> beta;
  real<lower=0> sigma2;
}
parameters {
  real<lower=0> tau2; // group-level variance
  vector[n] theta; // participant effects
}
model {
  target += inv_gamma_lpdf(tau2 | alpha, beta);
  target += normal_lpdf(theta | 0, sqrt(tau2));
  target += normal_lpdf(y | theta, sqrt(sigma2));
}
'

tf <- withr::local_tempfile(fileext = ".stan")
writeLines(stancodeH0, tf)
mod <- cmdstanr::cmdstan_model(tf, quiet = TRUE, force_recompile = TRUE)

fitH0 <- mod$sample(
  data = list(y = y, n = n, alpha = alpha, beta = beta, sigma2 = sigma2),
  seed = 202,
  chains = 4,
  parallel_chains = 4,
  iter_warmup = 1000,
  iter_sampling = 50000,
  refresh = 0
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 3 finished in 0.8 seconds.
#> Chain 2 finished in 0.8 seconds.
#> Chain 4 finished in 0.8 seconds.
#> Chain 1 finished in 1.1 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.9 seconds.
#> Total execution time: 1.2 seconds.

H0.bridge <- bridge_sampler(fitH0, silent = TRUE)
print(H0.bridge)
#> Bridge sampling estimate of the log marginal likelihood: -37.73301
#> Estimate obtained in 8 iteration(s) via method "normal".

#### Expected output:
## Bridge sampling estimate of the log marginal likelihood: -37.53183
## Estimate obtained in 5 iteration(s) via method "normal".
```

September 2, 2025 at 9:16 AM

Henrik Singmann

@singmann.bsky.social

Yes & we discuss some shortcomings of d_a. As shown below, d_a does not permit an ordering of participants according to performance (d' and g' do). We also compare Type I error rates for g', d', and d_a for real H/FA-pairs where only response bias differs: only g' maintains 5% Type I errors (pp. 51)

April 28, 2025 at 6:46 AM

Henrik Singmann

@singmann.bsky.social

A particularly noteworthy example of a Gumbel-min prediction is shown here. The ROC predicted from g' (calculated from a single yes/no point) closely matches the ROC reconstruction derived independently from forced-choice judgments. The Gaussian model cannot even make a prediction in this case.

April 27, 2025 at 2:46 PM

Henrik Singmann

@singmann.bsky.social

We compared the descriptive performance of both models across 35 datasets from four different recognition memory paradigms. The Gumbel-min model fits the data nearly as well as the Gaussian model. Once model complexity was penalized via AIC, the Gumbel-min model matched or outperformed the Gaussian.

April 27, 2025 at 2:46 PM

Henrik Singmann

@singmann.bsky.social

The Gumbel-min model implies a behavioural principle: the probability of choosing a new item remains constant as choice sets grow. An experiment confirms this principle with constant accuracy for new-item detection (2M-min). For old-item detection (2M-max), accuracy increases with choice set size.

April 27, 2025 at 2:46 PM

Henrik Singmann

@singmann.bsky.social

We consider an SDT model assuming Gumbel-min (i.e., minimum extreme-value) distributions. The Gumbel-min model avoids the problems of the Gaussian model, predicts asymmetric ROCs assuming equal variances, and allows calculating measures of discriminability and response bias, g′ and kappa.

April 27, 2025 at 2:46 PM

Henrik Singmann

@singmann.bsky.social

In recognition memory, ROCs are typically asymmetric, which requires Gaussian distributions with unequal variance. One problem with the unequal-variance model is that it predicts below chance performance for items with very low familiarity (i.e., studying makes some items less familiar).

April 27, 2025 at 2:46 PM

Henrik Singmann

@singmann.bsky.social

SDT is a cornerstone of recognition memory research, primarily assuming Gaussian distributions – a choice based more on tradition than necessity. The standard model assumes two equal-variance distributions, allows calculating d′ from a pair of hits and false alarms, and predicts symmetric ROCs.

April 27, 2025 at 2:46 PM
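
As a brief aside on the post above: under the equal-variance Gaussian model, d′ is commonly computed as the difference between the z-transformed hit and false-alarm rates. The sketch below uses notation that is assumed here rather than taken from the post (H for the hit rate, FA for the false-alarm rate, and Φ⁻¹ for the standard normal quantile function), and the rates 0.80 and 0.20 are purely illustrative.

```latex
\begin{align*}
d' &= \Phi^{-1}(H) - \Phi^{-1}(FA) \\
   &= \Phi^{-1}(0.80) - \Phi^{-1}(0.20) \\
   &\approx 0.8416 - (-0.8416) \approx 1.683
\end{align*}
```

Because a single (H, FA) pair yields exactly one d′ value, the equal-variance assumption is what makes the measure computable from one yes/no point.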

Henrik Singmann

@singmann.bsky.social

Results were in line with the qualitative predictions derived from sampling-based models. Predictions also held for the two types of illogical rankings we looked at. We do not know of any other (i.e. non-sampling) model that can make these qualitative predictions and predict illogical rankings.

March 20, 2025 at 9:42 AM

Henrik Singmann

@singmann.bsky.social

Simulation results show different qualitative patterns across event sets. Pr(logical ranking) is largest for mixed sets, followed by edge-event sets, followed by mid-event sets. This pattern held independently of sample size or whether there was additional read-out noise in the sampling process.

March 19, 2025 at 10:21 PM

Henrik Singmann

@singmann.bsky.social

We simulate the probability of obtaining logical and two types of illogical rankings for three different event sets: Edge events (P(A) & P(B) ≈ 1), mid-events (P(A) & P(B) ≈ .5), and mixed sets (P(A) ≈ 1 & P(B) ≈ .5).

March 19, 2025 at 10:18 PM

Henrik Singmann

@singmann.bsky.social

In each trial of the event ranking task, participants have to rank an event set consisting of four events, A, not-A, B, and not-B, in terms of their perceived likelihoods. The task contains an embedded logical constraint that allows the obtained ranking to be classified as logical or illogical.

March 19, 2025 at 9:49 PM

Henrik Singmann

@singmann.bsky.social

The follow up:

Email from Professor Brian Ripley to R devel that says:
Sent in error (and not moderated)

February 4, 2025 at 10:49 AM

Henrik Singmann

@singmann.bsky.social

For context:

Email from Professor Brian Ripley to the R devel mailing list instead of another R core member in private. The part shown here reads:
Tomas,

I am thinking of writing something for R-devel, and hope to have your
input first.

I get moderated on R-devel as I am now subscribed as
brian.ripley@R-project.org which of course I cannot send from. So I am
even more discouraged from posting there. (R-core is bad enough with
Luke discouraging all innovation except by him and Simon completely
misunderstanding the C23 status.)

Thanks,

Brian

February 4, 2025 at 7:45 AM

Henrik Singmann

@singmann.bsky.social

If you want to see a somewhat more up-to-date explanation, Macmillan & Creelman (2005, ch. 3) also describe the process.

Table 3.2 from Macmillan and Creelman (2005), Detection Theory

January 14, 2025 at 9:40 AM

Henrik Singmann

@singmann.bsky.social

Bluesky is delivering some mixed messages here

December 13, 2024 at 10:54 AM

Henrik Singmann

@singmann.bsky.social

This term in my stats teaching, I regularly included images of Moo Deng into my slides. One of my students was clearly inspired by this combination and made this super cool drawing of Moo Deng doing stats herself. I love it so much. Stats is Moo Deng Approved!

December 12, 2024 at 12:42 PM

Henrik Singmann

@singmann.bsky.social

Getting ready for my last week of teaching with a new stats meme

December 8, 2024 at 10:06 PM

Henrik Singmann

@singmann.bsky.social

Be careful how many emails you send to the CRAN maintainers (and in which format), otherwise your package might get removed from CRAN. Found on the r-package-devel mailing list. #rstats

output from running R CMD check on a package removed from CRAN. The output says:
> 0 errors | 0 warnings | 2 note
> Package was archived on CRAN
> CRAN repository db overrides: X-CRAN-Comment: Archived on 2024-11-06 for repeated policy violation. Repeatedy spamming a team member's personal email address in HTML.
> checking compilation flags used ... NOTE Compilation used the following non-portable flag(s): ‘-Wp,-D_FORTIFY_SOURCE=3’

November 12, 2024 at 6:53 PM

Henrik Singmann

@singmann.bsky.social

Finally a humble LLM paper.

Abstract
Establishing a unified theory of cognition has been a major goal of psychology [1, 2]. While there have been previous attempts to instantiate such theories by building computational models [1, 2], we currently do not have one model that captures the human mind in its entirety. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model’s internal representations become more aligned with human neural activity after finetuning. Taken together, Centaur is the first real candidate for a unified model of human cognition. We anticipate that it will have a disruptive impact on the cognitive sciences, challenging the existing paradigm for developing computational models.

Keywords: cognitive science, cognitive modeling, unified theory of cognition, large language models

October 27, 2024 at 10:58 PM

Henrik Singmann

@singmann.bsky.social

Getting ready for my stats teaching tomorrow and it looks like my meme game is on point. I really hope stats memes never go out of fashion (and if so, please no one tell me).

Image of distracted boyfriend meme with a stats context. Text on boyfriend says: "Not approximately normally distributed residuals". Text on (ignored) girlfriend says "Non-Parametric Test". Text on distracting girls says "Still use ANOVA".

Drake meme with yes-and-no image with stats context.
Text on the no image is: "Normality Test of Residuals and then Non-Parametric Test"
Text on yes-image is: "Just use ANOVA"

October 27, 2024 at 9:29 PM

Henrik Singmann

@singmann.bsky.social

In addition to finding strong evidence for the use of compensatory decision strategies, we found evidence for considerable individual differences. The figure shows both mean and individual-level thresholds between the numerical and categorical impacts.

October 17, 2024 at 3:41 PM

Henrik Singmann

@singmann.bsky.social

For both numerical and categorical judgements we found that weather scientists used compensatory decision strategies. An increase on any impact variable led to an increase in perceived severity, even when adjusting for the effect of the other impacts.

October 17, 2024 at 3:35 PM
