Scary in multiple ways.
I imagine that's true. Fun experiment to try
I do think LLM writing is weird/hollow at times. Just sounding/looking right, but it might not say a lot.
But why can't future models learn the gap?
My other claim was just that Alex's CUDA implementation (for DL) was very well engineered and the most efficient of the ones I'm aware of.
I see where we crossed wires now
Catastrophic forgetting.
I do remember hearing about "Artificial Neural Networks" (the old school term no one uses anymore[?]) and ZISC when I was young and remember thinking they sounded super cool. But that's all I remember haha
Also I think Alex's code was much faster for deep conv nets.
Testing an XOR gate vs an LLM.
You can fully specify all inputs and outputs for the XOR gate. Hard with an LLM.
But maybe I'm taking safety a little too literally.
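A minimal sketch of what "fully specify all inputs and outputs" looks like in practice (plain Python; the xor_gate function here is just a hypothetical stand-in for whatever implementation you'd be testing):

```python
# Toy illustration: exhaustively test an XOR implementation over its
# entire input space. Four cases cover everything, so a passing run
# amounts to a complete check against the specification.
def xor_gate(a: int, b: int) -> int:
    return a ^ b

for a in (0, 1):
    for b in (0, 1):
        expected = (a + b) % 2  # XOR truth table
        assert xor_gate(a, b) == expected, f"failed on ({a}, {b})"

print("All 4 input combinations verified.")
```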
The difference I can imagine is that the degree of testing is very different when there's a huge combination of possible outputs (like we see in LLMs).
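A rough back-of-the-envelope sketch of that combinatorial blow-up (the vocabulary size and prompt length below are illustrative assumptions, not any particular model's numbers):

```python
# Hypothetical numbers: a 50,000-token vocabulary and 100-token prompts.
vocab_size = 50_000
prompt_length = 100

# Number of distinct prompts an exhaustive test suite would have to cover.
num_prompts = vocab_size ** prompt_length
print(f"Distinct {prompt_length}-token prompts: ~10^{len(str(num_prompts)) - 1}")
# XOR needs 4 test cases; here even enumerating the inputs is infeasible,
# never mind specifying the "correct" output for each one.
```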