Aliss
@alissx.bsky.social
ML Researcher with a focus on making LLMs smarter. Contributor to LLVM and the Linux Kernel | 🏳️‍⚧️ she/her
I've heard it makes you feel feminine
October 24, 2025 at 2:21 AM
I wasn't aware of these, so it quickly runs down the different common approaches for me without me having to scour the net. I feel like it helps you better understand the motivation of the paper.
October 19, 2025 at 5:57 AM
I really like this actually. I was reading the Fast and Simplex paper that introduced a triton kernel for higher order attention, and it gave a good bit of context in the related work about the alternative approaches tried. (cont)
October 19, 2025 at 5:57 AM
$1000 for an OS that just sucks a little less but is still a steaming pile of dogshit?
October 15, 2025 at 10:49 AM
> more flops and layers. except it basically is

Well, at least in LLMs we see that the amount of data (while keeping it high quality) should be scaled up linearly alongside the number of parameters, which is turning out to be a problem; that's why we look at smarter architectures rather than just more params.
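To put rough numbers on that, here's a back-of-the-envelope sketch assuming the Chinchilla-style rule of thumb of roughly 20 training tokens per parameter (an approximation, not an exact law):

```python
# Back-of-the-envelope sketch (not an exact law): the Chinchilla-style rule of
# thumb says a compute-optimal model wants roughly 20 training tokens per parameter.

def optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rough estimate of how many training tokens a model of n_params 'wants'."""
    return n_params * tokens_per_param

for n_params in (7e9, 70e9, 400e9):
    print(f"{n_params / 1e9:>4.0f}B params -> ~{optimal_tokens(n_params) / 1e12:.1f}T tokens")
```

At a few hundred billion params you're already asking for several trillion tokens of high-quality data, which is exactly where the bottleneck shows up.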
October 15, 2025 at 5:21 AM
> just accessing the data imprinted in it.

Well, it's producing content that hasn't been written before. To be able to produce custom books for students, it has to learn the subject matter independent of teaching style, and learn which pedagogical techniques are effective in different scenarios.
October 14, 2025 at 5:14 PM
Also, I find it unfair to say that LLMs haven't learned any math. Some of the problems it's solved take PhDs to even understand and have very few similar problems in the data. That would be impossible with no understanding.
October 14, 2025 at 3:36 PM
to a student's preferences, and had teachers and students evaluate it, and the AI-generated books came out on top, both in terms of how well the students performed later and in terms of pedagogy. (cont)
October 14, 2025 at 3:36 PM
It's a fair point: it's much easier to train LLMs on problems where there's a single solution, and most importantly, it's much easier to evaluate their performance there. Google recently put out a paper where they trained their models to create educational content specialized (cont)
October 14, 2025 at 3:35 PM
techniques and concepts from other problems to learn how to solve these. And it does it consistently.
October 14, 2025 at 3:10 PM
Well we've shown that LLMs can solve highly complex math and programming problems, on par with or better than any human competitor in many competitions. These problems themselves don't exist in the dataset of the LLM and it has to learn (cont)
October 14, 2025 at 3:10 PM
> AI is notorious for being very bad at trying to explain subjects it's had no training in, even with related knowledge

I think this has improved significantly as of late.
October 14, 2025 at 2:59 PM
I can agree with that. It totally makes sense, because we only feed it natural-language data and the like. If we wanted to make it feel physical pain, I'm sure we could, but we don't really want to. I'm not saying that LLMs ARE people, it's that they're capable of higher-order reasoning.
October 14, 2025 at 2:02 PM
> there's no internal model of the world.

Ok, this is provably false; if you check out the article that I linked, you can see the proof. As for decision making, I mean LLMs are obviously capable of making decisions based on the input. Are you saying that these don't reflect their personal values?
October 14, 2025 at 1:58 PM
> deliberate over its choice of words in order to describe a languageless concept

I'm not sure what you mean by this. How would we even communicate such a thing through natural language?
October 14, 2025 at 1:55 PM
it can choose to deceive us when it finds an easier way to do things. I think you would find this to be quite a good explanation: www.anthropic.com/research/tra...
Tracing the thoughts of a large language model
Anthropic's latest interpretability research: a new microscope to understand Claude's internal mechanisms
www.anthropic.com
October 14, 2025 at 1:47 PM
Why do you say that the computer cannot understand it? It is able to form internal representations of human concepts in its weights, independent of language; it can plan future parts of its answers to a question ahead of time, (cont)
October 14, 2025 at 1:46 PM
And Anthropic showed that LLMs do create human-interpretable representations of concepts, in a similar manner to how we think the brain does.
October 14, 2025 at 4:37 AM
LLMs are actually capable of this. Most of the "thinking" in an LLM doesn't happen in natural language; it happens in latent-space representations, which aren't natural language by any means. They're more like a mixture of abstract concepts in the form of an internal representation that the LLM has developed.
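If you want to see this for yourself, here's a minimal sketch (assuming the Hugging Face transformers library, with GPT-2 standing in for a bigger model): the intermediate computation is a stack of continuous hidden-state vectors, and words only reappear at the output projection.

```python
# Minimal sketch: an LLM's intermediate "thinking" is hidden-state vectors,
# not words. Assumes Hugging Face `transformers`; GPT-2 is just a small stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# One hidden-state tensor per layer: shape (batch, sequence, hidden_dim).
# These continuous vectors are the latent representation, not natural language.
for i, h in enumerate(out.hidden_states):
    print(f"layer {i:2d}: {tuple(h.shape)}")

# Natural language only reappears when the final hidden state is projected
# back onto the vocabulary (the logits).
next_token_id = out.logits[0, -1].argmax()
print("predicted next token:", tokenizer.decode(int(next_token_id)))
```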
October 14, 2025 at 4:36 AM
Also it's funny you say we can fully explain emergent behaviour in LLMs because it's very much still something we can't really explain with confidence and proof.
October 14, 2025 at 2:02 AM
(cont) arises out of a need to solve something. We have LLMs that aren't autoregressive; would that fit your criteria better?
October 14, 2025 at 2:01 AM
> There is no evidence that LLMs understand anything

Could you elaborate on your criteria for this? Because otherwise it's very vague. And I'm not sure why you dismiss emergent behaviour that arises out of autoregression. Emergent behaviour, even in nature, (cont)
October 14, 2025 at 2:01 AM
You say that LLMs don't have any objective (even though I used minimising loss as the example of one). How would you train any AI model without an objective?
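Just to be concrete about what I mean by "objective", here's a toy sketch in plain PyTorch with made-up data (not an actual LLM, the names and sizes are just for illustration): you pick a loss and minimise it, which for LLMs is cross-entropy on next-token prediction.

```python
# Toy sketch of a training objective: pick a loss and minimise it.
# Made-up model and data; cross-entropy is the same objective LLMs use
# for next-token prediction.
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
model = nn.Sequential(nn.Embedding(vocab_size, hidden), nn.Linear(hidden, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Fake "current token -> next token" pairs just to exercise the loop.
tokens = torch.randint(0, vocab_size, (64,))
targets = torch.randint(0, vocab_size, (64,))

for step in range(100):
    logits = model(tokens)           # predict a distribution over the vocab
    loss = loss_fn(logits, targets)  # the objective: how wrong were we?
    optimizer.zero_grad()
    loss.backward()                  # gradients of the objective w.r.t. weights
    optimizer.step()                 # nudge weights to reduce the loss
```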
October 14, 2025 at 1:52 AM
(cont) I've heard many well-founded arguments for why LLMs aren't as smart as they appear to be on occasion, but I feel like this argument isn't really a good one. LLMs are not you and me yet, but they do do some very cool things and display emergent behaviour, which counts for something at least.
October 13, 2025 at 6:10 PM
My main issue with that argument is that it involves a lot of oversimplification and hand-waving. I only say that we can't define consciousness because that results in us creating our own ideas of thought and intent, which people then deny LLMs have, more out of bioessentialism than anything else. (cont)
October 13, 2025 at 6:10 PM