Don’t see many takers who don’t already have jobs in the field going “we barely understand how these fucking things work”
“Could you explain how it generates a sentence, after being fine-tuned on that sentence, being provided a delta after fine-tuning, for ten million dollars?”
“That would be so torturously difficult that 10 million wouldn’t make it worth it”
Great chat LLM Understander
That does not make it trivial to explain.
It does not make it explainable under any reasonable timeframe, or even an unreasonable timeframe with $10 million of compensation
Feed the model the first sentence and it will output the second.
You should be able to reason about how changes to the model will affect the output of the second sentence in a predictable manner
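To make concrete what "reasoning about changes to the model" actually means, here is a toy sketch (illustrative only — a real LLM has billions of parameters, not 16): a fine-tuning delta is nothing more than the element-wise difference between the weights before and after a gradient step, with no labels or semantics attached.

```python
import numpy as np

# Toy "model": a single 4x4 weight matrix. A fine-tuning step produces
# a new matrix; the "delta" is just their element-wise difference.
rng = np.random.default_rng(0)
w_before = rng.normal(size=(4, 4))

# Pretend one gradient step on a single sentence nudged every weight.
grad = rng.normal(size=(4, 4))
lr = 0.01
w_after = w_before - lr * grad

# This is all a fine-tuning delta is: a grid of opaque floats.
delta = w_after - w_before
print(delta.shape)  # (4, 4) -- 16 unlabeled numbers
```

The delta is trivially computable; explaining *why* those particular numbers encode the new sentence is the part nobody can currently do.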
I’m asking how long you think it would take to comprehend the weight changes that occur from fine-tuning on a single sentence.
Like, one whole sentence worth of changes, truly, deeply understood
A random number generator is not a fundamental component of an LLM
Incepted, not derived.
Just because random numbers were part of the process doesn’t mean that’s all we are dealing with, and random numbers aren’t even necessary for training or inference — just optimal.
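The point that randomness is optional at inference time can be shown with a minimal sketch: greedy decoding picks the argmax of the logits with no RNG involved, while the sampled alternative (the conventional choice) draws from the softmax distribution.

```python
import numpy as np

def greedy_next_token(logits):
    # Deterministic decoding: no RNG anywhere -- just take the argmax.
    return int(np.argmax(logits))

def sampled_next_token(logits, rng):
    # Conventional alternative: sample from the softmax distribution.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(len(logits), p=p))

logits = np.array([0.1, 2.5, -1.0, 0.3])
print(greedy_next_token(logits))  # always 1, run after run
```

Run greedy decoding twice on the same logits and you get the same token every time; the sampler only reproduces that behavior if you fix the seed.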
Even discretely, if told “we are now training/fine-tuning on a given sentence” and shown exactly which weights changed, interpreting those changes is beyond us
That is very much within comprehension, it’s just five lines, and the logic is understood.
I could write a short algorithm which would do this manually, and keeping track of each prime’s index in a 64-bit int, but that may take some time to execute.
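The five-line logic under discussion isn’t quoted in this thread, so the following is only an illustrative sketch of the stated approach: enumerate primes and track each prime’s index in a counter that comfortably fits in a 64-bit int.

```python
def prime_indices(limit):
    """Map each prime <= limit to its 1-based index (2 -> 1, 3 -> 2, ...)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    index = 0  # stays well within a 64-bit int for any feasible limit
    out = {}
    for n in range(2, limit + 1):
        if sieve[n]:
            index += 1
            out[n] = index
            # Mark composite multiples, starting at n*n.
            for m in range(n * n, limit + 1, n):
                sieve[m] = False
    return out

print(prime_indices(20))  # {2: 1, 3: 2, 5: 3, 7: 4, 11: 5, 13: 6, 17: 7, 19: 8}
```

As the poster notes, the algorithm is short and fully comprehensible — it’s only the execution time at large limits that becomes a problem.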
“We know exactly how LLMs work”
We know how the virtual machine that runs and creates them works.
With limitless time, we could understand how their weights embed logic, but we currently don’t.
Do you disagree with any of these statements?
That doesn’t mean we understand their weights. Simple as.
They can be fully understood, the same way a modern CPU die can be, albeit with orders of magnitude more complexity than a billion-transistor die
I agree, it can be.
I disagree that it is.
We do not comprehend how they are so effectively able to compress the corpus of human knowledge into gigabytes