Daniel Arteaga
dnlrtg.bsky.social
Physicist. Audio and deep learning research at Dolby Labs. Physics, audio, AI, science, technology and society.

Personal account @contraidees.bsky.social
Am I missing something?
October 27, 2025 at 5:19 PM
Yet in AI research (which is essentially statistical modeling) we routinely abandon these basic practices. The irony is striking.
October 27, 2025 at 5:19 PM
In other scientific fields (natural and social sciences), proper statistical analysis is fundamental. You simply cannot publish without it.
October 27, 2025 at 5:19 PM
There's also the added problem that metrics often don't correlate with perception. A 0.1 dB SDR improvement might be meaningless perceptually. But this issue has been discussed more often than the statistical rigor problem.
October 27, 2025 at 5:19 PM
❌ Claims of "superior performance" based on point estimates alone

Example: Paper A reports 15.21 dB, Paper B reports 15.01 dB. Is this difference meaningful or just noise? Do those decimal places have any meaning? Usually impossible to tell from the paper.
October 27, 2025 at 5:19 PM
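A minimal sketch of how that comparison could be checked, using synthetic per-track scores (all numbers here are invented for illustration, not taken from either paper): a paired bootstrap confidence interval on the mean SDR difference.

```python
# Hypothetical sketch: is a 15.21 dB vs 15.01 dB gap meaningful?
# Synthetic per-track SDR scores stand in for real evaluation data.
import numpy as np

rng = np.random.default_rng(0)
n_tracks = 50
# Assumption: both systems were evaluated on the same tracks (paired data).
scores_a = rng.normal(15.21, 1.5, n_tracks)
scores_b = rng.normal(15.01, 1.5, n_tracks)

# Bootstrap 95% CI of the mean paired difference.
diffs = scores_a - scores_b
boot = rng.choice(diffs, size=(10_000, n_tracks), replace=True).mean(axis=1)
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"mean diff = {diffs.mean():+.2f} dB, 95% CI [{lo:+.2f}, {hi:+.2f}]")
# If the CI spans 0, the 0.2 dB gap cannot honestly be called an improvement.
```

With realistic per-track variance, a 0.2 dB gap on 50 tracks can easily be indistinguishable from noise; that is exactly what a point estimate hides.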
❌ Values without error bars/confidence intervals
❌ Standard deviations sometimes quoted but no uncertainty estimates of means
❌ No significance testing whatsoever
❌ No effect size analysis
❌ No exploratory analysis beyond the mean
October 27, 2025 at 5:19 PM
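The distinction in the checklist above, sketched on synthetic scores (all values illustrative): the standard deviation measures spread across items, while the standard error measures uncertainty of the mean, and an effect size such as Cohen's d puts a difference on a comparable scale.

```python
# Hypothetical sketch: SD vs. standard error of the mean, plus an effect size.
import numpy as np

rng = np.random.default_rng(1)
scores = rng.normal(15.2, 1.5, 40)       # synthetic per-item metric values

mean = scores.mean()
sd = scores.std(ddof=1)                  # spread across items
sem = sd / np.sqrt(len(scores))          # uncertainty of the mean
print(f"{mean:.2f} ± {1.96 * sem:.2f} dB (95% CI); per-item SD = {sd:.2f} dB")

# Effect size vs. a synthetic baseline: Cohen's d on paired differences.
baseline = rng.normal(15.0, 1.5, 40)
d = (scores - baseline).mean() / (scores - baseline).std(ddof=1)
print(f"Cohen's d = {d:.2f}")
```

Quoting the SD alone (as many papers do) says nothing about how well the mean itself is pinned down.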
We're not even applying methods from first-year undergraduate physics—like reporting results with error bars. The problems I regularly see would make any physics professor cringe.
October 27, 2025 at 5:19 PM
This work was the result of Silvia Arellano's internship in Dolby Barcelona with us.

Come explore the demo here:
🔗 silviaarellanogarcia.github.io/rir-acoustic/
📄 Paper: arxiv.org/pdf/2507.12136

Feedback & questions welcome!
July 18, 2025 at 8:13 AM
We explore 4 DAC-based models:
1️⃣ AR w/ cross-attention
2️⃣ AR w/ classifier guidance
3️⃣ MaskGIT w/ adaptive layer norm
4️⃣ Flow matching

The MaskGIT model achieves the best subjective quality (avg. MUSHRA score of 70), outperforming state-of-the-art baselines.
July 18, 2025 at 8:13 AM
Instead of simulating room geometry, we train four different generative models to produce RIRs conditioned on acoustic attributes (T30, T15, EDT, D50, C80, source-receiver distance).
July 18, 2025 at 8:09 AM
Last winner, Eloi Moliner, pioneered diffusion models for AI-based audio restoration. Could you be next?

arxiv.org/abs/2210.15228
Solving Audio Inverse Problems with a Diffusion Model
This paper presents CQT-Diff, a data-driven generative audio model that can, once trained, be used for solving various different audio inverse problems in a problem-agnostic setting. CQT-Diff is a neu...
June 18, 2025 at 10:33 AM
The key issue isn't the most likely outcome — it's the worst-case scenario we must be prepared for.

arxiv.org/abs/2401.02843
May 28, 2025 at 8:42 AM
Just as nuclear research is subject to international oversight, frontier AI development should be too. We need strong global regulatory frameworks for models with potentially vast power.

www.iaea.org
International Atomic Energy Agency | Atoms for Peace and Development
The IAEA is the world's centre for cooperation in the nuclear field, promoting the safe, secure and peaceful use of nuclear technology. It works in a wide range of areas including energy generation, h...
May 28, 2025 at 8:42 AM
Example audio samples here: metlosz.github.io/dealiasing_a...
Deep learning based spatial aliasing reduction in beamforming for audio capture
May 27, 2025 at 2:36 PM
Spatial aliasing occurs when microphone spacing in an array is too large relative to the wavelength of sound, degrading the accuracy of beamforming.

As far as we know, this is the first deep learning paper to address this problem directly (other approaches handle it only indirectly).
May 27, 2025 at 2:36 PM
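The spacing/wavelength condition above has a simple back-of-envelope form: for a uniform linear array with element spacing d, spatial aliasing appears above f = c / (2d), the spatial analogue of the Nyquist limit (a rough sketch; the exact onset depends on array geometry and steering angle).

```python
# Rough sketch: spatial aliasing onset frequency for a uniform linear array.
# Aliasing-free requires spacing d < wavelength/2, i.e. f < c / (2 * d).
c = 343.0  # speed of sound in air, m/s

for d_cm in (2, 5, 10):
    d = d_cm / 100                      # spacing in meters
    f_alias = c / (2 * d)               # frequency above which aliasing occurs
    print(f"spacing {d_cm:2d} cm -> aliasing above {f_alias:,.0f} Hz")
```

Even a modest 5 cm spacing aliases above roughly 3.4 kHz, well inside the audible band, which is why beamforming accuracy degrades at high frequencies.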