Colin Doyle
colindoylelaw.bsky.social
"Great lecturer, caring teacher, but perhaps not the best person?" - anonymous student evaluation

Law Professor, Loyola Law School, Los Angeles

colin.doyle@lls.edu

https://www.colin-doyle.net/
Every day in my work with LLMs, I go back and forth between being astonished by their brilliance and astonished by their stupidity.
December 5, 2024 at 6:20 PM
In my limited experience, the faculty who understand and interact with A.I. the most have the most mercurial opinions, not just on what A.I. will be able to do in the short term but on what A.I. can do even now.
December 5, 2024 at 6:20 PM
I also wonder if it might be an example of this kind of effect:

the-decoder.com/language-mod...

Notably, o1-preview doesn't seem to share this problem with Connections puzzles. I'm very curious whether this was a particular problem OpenAI targeted, and how.
Language models know Tom Cruise's mother, but not her son
An experiment shows that language models cannot generalize the simple formula "A is B" to "B is A". But why is that?
the-decoder.com
December 3, 2024 at 11:01 PM
Yes, GPT-4o struggled the most with linguistic puzzles and puzzles in which the connection between the words was in the form of another word that could appear either immediately before or immediately after each of the four puzzle words.
December 3, 2024 at 6:13 PM
I just wrote a paper related to this: arxiv.org/abs/2411.057...

With a complicated prompt system, GPT-4o was able to solve 86% of puzzles. OpenAI's new o1 model is the strongest: when prompted to make one guess at a time and given feedback on bad guesses, it could solve 100%.
LLMs as Method Actors: A Model for Prompt Engineering and Architecture
We introduce "Method Actors" as a mental model for guiding LLM prompt engineering and prompt architecture. Under this mental model, LLMs should be thought of as actors; prompts as scripts and cues; an...
arxiv.org
December 2, 2024 at 10:53 PM