Lightnews — Scholar-powered news

Kenneth Chiu

@kjw-chiu.bsky.social

180 followers 370 following 80 posts

Assoc. Prof. of Computer Science at Binghamton University. Science groupie. Also parallel computing, AI4Science. I mainly reply, haphazardly and eclectically. Expect silly jokes, random things that I found interesting, etc.

Kenneth Chiu

@kjw-chiu.bsky.social

March 14, 2025 at 6:41 PM

Kenneth Chiu

@kjw-chiu.bsky.social

More mystery.

March 14, 2025 at 6:37 PM

Kenneth Chiu

@kjw-chiu.bsky.social

I've actually found it to outperform o1. Here, R1 is right and o1 is wrong. (Apparently there is no way to link to a DeepSeek chat, so this is just a screenshot of the answer.)
o1: chatgpt.com/share/679063...

January 23, 2025 at 6:06 AM

Kenneth Chiu

@kjw-chiu.bsky.social

Are you rubbing it in?

January 23, 2025 at 5:47 AM

Kenneth Chiu

@kjw-chiu.bsky.social

This comment is funny.

January 17, 2025 at 4:13 AM

Kenneth Chiu

@kjw-chiu.bsky.social

More evidence of ChatGPT's lack of reasoning ability. This is o1. It first used a bubble sort. Then I told it twice to make it faster. Finally, it recognized that each number appears exactly once in the array. But it failed to make the simple next logical step.

January 17, 2025 at 4:02 AM

Kenneth Chiu

@kjw-chiu.bsky.social

LLMs can do some amazing things, but they still lack a lot of common sense. Here, ChatGPT thinks that people typically keep feathers in a vacuum-filled container.

chatgpt.com/share/6789cf...

January 17, 2025 at 3:56 AM

Kenneth Chiu

@kjw-chiu.bsky.social

I always liked that the UNIX 'cal' command knew this:

January 4, 2025 at 7:08 AM

Kenneth Chiu

@kjw-chiu.bsky.social

ChatGPT o1 is definitely better, but still obviously not capable of reasoning, at least in my opinion.

December 26, 2024 at 6:37 PM

Kenneth Chiu

@kjw-chiu.bsky.social

Amusing ChatGPT fail:

December 26, 2024 at 6:35 PM

Kenneth Chiu

@kjw-chiu.bsky.social

I would also consider this answer wrong. I intentionally didn't specify the types, but there are cases, depending on the types, where the answer is different. It could be argued that this is not a typical question, though.

December 26, 2024 at 6:22 PM

Kenneth Chiu

@kjw-chiu.bsky.social

It is kind of true for CS. It generally writes code that would get a significant fraction of the points, but is mediocre. Sometimes it gives answers that totally miss the point, and that I would consider wrong, such as this one: chatgpt.com/share/676d97... .

December 26, 2024 at 6:22 PM

Kenneth Chiu

@kjw-chiu.bsky.social

What if Medusa is behind one of the doors?

November 25, 2024 at 2:24 PM

Kenneth Chiu

@kjw-chiu.bsky.social

Somewhat amusing ChatGPT fail.

November 25, 2024 at 2:24 PM

Kenneth Chiu

@kjw-chiu.bsky.social

Complaining about complaining about complaining.

November 24, 2024 at 4:07 PM

Kenneth Chiu

@kjw-chiu.bsky.social

Another variation on Monty Hall for ChatGPT:😂

November 24, 2024 at 1:58 AM

Kenneth Chiu

@kjw-chiu.bsky.social

Here is Gemini.

November 24, 2024 at 1:50 AM

Kenneth Chiu

@kjw-chiu.bsky.social

More ChatGPT amusement. 😂

November 24, 2024 at 1:44 AM

Kenneth Chiu

@kjw-chiu.bsky.social

Just for amusement, here is ChatGPT struggling with a limit: chatgpt.com/share/673eae.... (The correct answer is 5/2.) At one point, it outputs garbage. Gemini did much better with it.