But AIs properly prompted to act like tutors, especially with instructor support, seem to be able to boost learning a lot through customized instruction.
LLMs produce funnier memes than the average human, as judged by humans. Humans working with AI get no boost (a finding that is coming up often in AI-creativity work). The best human memers still beat AI, however. arxiv.org/abs/2501.11433
Claude’s full story: docs.google.com/document/d/1...
www.bsidessd.org
#DFIR #infosec
Their future livelihoods depend on what is happening too. They deserve to understand it and to have background to read the news and talk to their friends and family (2/)
I’ve been talking to my classes about what has been going on. I explained indirect costs to them. I talked to them about what a probationary employee is in the government.
At the end of class they asked if we could talk about it more. (1/)
Everything else fails, including DeepSeek r1, o3-mini-high, and Gemini 2.0 Pro
1) Gemini 2.0 Flash Thinking sets a new high in price-performance, better than DeepSeek r1 (on Elo) and cheaper
2) The cost of GPT-4 capability dropped 1,000-fold in 18 months
3) Pace of improvement is swift
The fact that there are not 30 different benchmarks from different organizations in medicine, in law, in advice quality, etc. is a big shame. People are using systems for these things anyway & we don’t know the implications.
#ZKP
blog.succinct.xyz/introducing-...
1) How vital is 100% accuracy on a task?
2) How accurate is AI?
3) How accurate is the human who would do it?
4) How do you know 2 & 3?
5) How do you deal with the fact that humans are not 100%?
Not all tasks are the same.