After switching the encoder to a pretrained ResNet18, freezing its layers, and training for 1 epoch, my model can (kind of) drive, having learned from 81k frames across 44 laps of me playing.
May 29, 2025 at 3:04 AM
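A minimal sketch of the setup described in the post above, assuming a PyTorch behavior-cloning model: a pretrained ResNet18 backbone with its weights frozen, feeding a small trainable head that predicts driving controls from a single frame. The head size and the two-dimensional output (e.g. steering + throttle) are assumptions, not the author's exact configuration.

# Hypothetical sketch: frozen pretrained ResNet18 encoder + small trainable head.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

class DrivingPolicy(nn.Module):
    def __init__(self, n_actions: int = 2):
        super().__init__()
        backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
        # Drop the ImageNet classification layer, keep the conv feature extractor.
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        for p in self.encoder.parameters():
            p.requires_grad = False          # freeze the pretrained layers
        self.head = nn.Sequential(           # only this part is trained
            nn.Flatten(),
            nn.Linear(512, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():                # encoder is frozen, skip its grads
            feats = self.encoder(frames)
        return self.head(feats)

policy = DrivingPolicy()
# Only the head's parameters are passed to the optimizer, so one epoch of
# training touches a small fraction of the network.
optimizer = torch.optim.Adam(policy.head.parameters(), lr=1e-4)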
I'm supposed to be writing a technical report, but I can't stop testing out my music LSTM (a tech demo for my approach to language modeling audio). Only 18M parameters, btw.
May 24, 2025 at 3:30 AM
Audio language modeling has typically meant training models to VQ the raw audio directly. But what if we quantized mel spectrograms instead, trained a vocoder like iSTFTNet, and then trained our AR prior on the mel-spectrogram indices? We can language model 44.1 kHz audio with a single 1k codebook.
May 3, 2025 at 8:10 PM
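A minimal sketch of the quantization step described in the post above, assuming a single 1024-entry codebook applied per mel-spectrogram frame. The shapes, the straight-through trick, and the commitment loss are generic VQ-VAE choices, not the author's exact recipe; the iSTFTNet-style vocoder and the AR prior themselves are not shown.

# Hypothetical sketch: snap mel frames to a 1k codebook, yielding token indices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MelVQ(nn.Module):
    def __init__(self, n_mels: int = 80, codebook_size: int = 1024):
        super().__init__()
        self.codebook = nn.Embedding(codebook_size, n_mels)

    def forward(self, mel: torch.Tensor):
        # mel: (batch, time, n_mels) -- one codebook index per mel frame
        B, T, M = mel.shape
        flat = mel.reshape(-1, M)                          # (B*T, n_mels)
        dists = torch.cdist(flat, self.codebook.weight)    # (B*T, K)
        indices = dists.argmin(dim=-1).view(B, T)          # (B, T)
        quantized = self.codebook(indices)                 # (B, T, n_mels)
        # Straight-through estimator so gradients can reach an upstream encoder.
        quantized = mel + (quantized - mel).detach()
        commit_loss = F.mse_loss(mel, quantized.detach())
        return quantized, indices, commit_loss

vq = MelVQ()
mel = torch.randn(1, 200, 80)       # made-up batch of mel frames
_, token_ids, _ = vq(mel)           # (1, 200) integer tokens

The token_ids sequence is what an autoregressive prior would be trained on; a vocoder like iSTFTNet would then map the quantized mel spectrograms back to a 44.1 kHz waveform.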