Ryan Angilly
@angilly.bsky.social
Applied Research @ NVIDIA
How’d it do?
December 10, 2024 at 1:06 AM
Qwen is probably the best out there right now: ollama.com/library/qwen...
qwen2.5-coder
The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
ollama.com
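And if you want to hit it from code, something like this works (a minimal sketch using the `ollama` Python package; assumes you've already run `ollama pull qwen2.5-coder`):

```python
import ollama  # pip install ollama; talks to the local Ollama server

# Ask the local qwen2.5-coder model a coding question.
response = ollama.chat(
    model="qwen2.5-coder",
    messages=[{"role": "user", "content": "Reverse a linked list in Python."}],
)
print(response["message"]["content"])
```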
December 9, 2024 at 2:02 AM
If you want a nicer UI, check out OpenWebUI. It gives you a nice ChatGPT-esque web UI with chat history and more.
December 9, 2024 at 2:00 AM
My hunch is that they can already write machine code well enough. I've never seen any evals on it, though.

One thing to consider is portability. Machine code is denser than source code, but I'd bet cross compiling source code to 50 distros is far cheaper from a compute perspective.
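Back-of-the-envelope, the cross-compile path is just one loop per target (a hypothetical sketch; assumes a hello.c on disk and zig installed, since `zig cc` ships cross toolchains for many triples):

```python
import subprocess

# Hypothetical targets; zig cc accepts arch-os-abi triples like these.
TARGETS = ["x86_64-linux-musl", "aarch64-linux-musl", "riscv64-linux-musl"]

for target in TARGETS:
    # Compile the same source once per target instead of generating
    # machine code separately for each one.
    subprocess.run(
        ["zig", "cc", "-target", target, "hello.c", "-o", f"hello-{target}"],
        check=True,
    )
```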
December 2, 2024 at 4:38 PM
But yeah, I guess the bottom line is that RAG can get you far. Unfortunately you won't know where it breaks until it does. I look forward to a world where RAG systems can monitor themselves and signal to the user: "hey, it might be time to do some fine-tuning!"
December 2, 2024 at 3:17 PM
Depends on the use case. If the query is "what is my most controversial opinion across all my notes?" then RAG can easily fall over unless you anticipated it ahead of time in the indexing pipeline. That's admittedly an extreme example, but the spectrum between that and simple fact retrieval is blurry.
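Roughly what I mean (a toy sketch; `embed` and the notes are hypothetical stand-ins for a real pipeline):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

NOTES = [f"note {i}: ..." for i in range(1_000)]
index = np.stack([embed(n) for n in NOTES])

query = "What is my most controversial opinion across all my notes?"
scores = index @ embed(query)
top_k = [NOTES[i] for i in np.argsort(scores)[-5:]]
# The answer requires comparing all 1,000 notes against each other,
# but the LLM only ever sees these 5 chunks. No single chunk
# "contains" the answer, so top-k retrieval quietly fails.
```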
December 2, 2024 at 3:12 PM
Yeah I get what you’re saying. But I’d caution against dismissing people because they don’t speak for _everyone_.

I am an expert 😂 and while I trust LLMs for many things, most of my friends and I very much would not trust machine code output from an LLM.
December 2, 2024 at 3:00 PM
What is the total dataset size in bytes? If your use case requires complex reasoning across the whole set of notes (and it might!), RAG will fall over on you.
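The back-of-the-envelope check (a sketch; the notes path and context window are assumptions, and ~4 bytes per token is just the usual rule of thumb for English text):

```python
import os

NOTES_DIR = "notes/"       # hypothetical path to the note corpus
CONTEXT_WINDOW = 128_000   # tokens; assumed long-context model

# Total corpus size on disk.
total_bytes = sum(
    os.path.getsize(os.path.join(root, f))
    for root, _, files in os.walk(NOTES_DIR)
    for f in files
)
approx_tokens = total_bytes // 4  # ~4 bytes/token for English text
print(f"{total_bytes:,} bytes ≈ {approx_tokens:,} tokens")
print("fits in context" if approx_tokens < CONTEXT_WINDOW else "needs RAG or chunking")
```

If the whole corpus fits in context, you may be able to skip retrieval entirely and just stuff the notes into the prompt.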
December 2, 2024 at 1:29 PM
Have you done any experiments with your benchmarks going from 1 to 100 examples to see if accuracy regresses?
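i.e. something like this sweep (a sketch; `run_model`, `BENCHMARK`, and `FEW_SHOT_POOL` are hypothetical stand-ins for your actual setup):

```python
def run_model(prompt: str) -> str:
    # Stand-in for a real model call (e.g. ollama.chat).
    return "expected 1"

BENCHMARK = [("input 1", "expected 1"), ("input 2", "expected 2")]
FEW_SHOT_POOL = [(f"example q{i}", f"example a{i}") for i in range(100)]

# Sweep the number of in-prompt examples and watch for accuracy regressions.
for n in (1, 5, 10, 25, 50, 100):
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in FEW_SHOT_POOL[:n])
    correct = sum(
        run_model(f"{shots}\nQ: {q}\nA:").strip() == expected
        for q, expected in BENCHMARK
    )
    print(f"{n:>3} examples -> {correct}/{len(BENCHMARK)} correct")
```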
December 2, 2024 at 1:22 PM
I think it can[1] but we don’t do it because:

1) we don’t trust the LLM enough. We want to review the code.
2) high-level languages give you a higher density of expression per token, i.e. it takes fewer tokens, so you get faster answers (see the token-count sketch below)

[1] chatgpt.com/share/674db3...
ChatGPT - x86 Hello World Code
Shared via ChatGPT
chatgpt.com
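To put a number on 2), here's a quick token count of the same program at two levels (a sketch using tiktoken's cl100k_base encoding; exact counts vary by tokenizer):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

python_src = 'print("Hello, world!")'
# The usual Linux x86-64 write/exit hello world, as assembly source.
asm_src = """section .data
msg db "Hello, world!", 10
section .text
global _start
_start:
    mov rax, 1      ; sys_write
    mov rdi, 1      ; stdout
    mov rsi, msg
    mov rdx, 14
    syscall
    mov rax, 60     ; sys_exit
    xor rdi, rdi
    syscall
"""

print("python tokens:", len(enc.encode(python_src)))
print("asm tokens:   ", len(enc.encode(asm_src)))
```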
December 2, 2024 at 1:20 PM
I work in it so I’m in a bit of a bubble. What are some of the most egregious lies you see?
November 30, 2024 at 7:46 PM
Ok very cool.

Do you run any benchmarks against your default prompt templates, and have you published them so others can compare different models or prompt/template tweaks?
November 30, 2024 at 6:42 PM
Do you fine-tune any of your models much, or do you just work with prompt templating?
November 30, 2024 at 6:15 PM
Long story short, I think the change is on a 10-year horizon, not 2.
November 30, 2024 at 6:07 PM
Only just recently have models had long enough context, and good enough recall across that context, to make retrieval work.
November 30, 2024 at 6:07 PM
It’s completely transformed how I work: writing code, tests, and design docs; less time scouring Stack Overflow or fighting with PlantUML/Mermaid to make diagrams. I’m far more productive.

But I’m a special case.

I think the real unlock is going to be agents. This promise still hasn’t been realized.
November 30, 2024 at 6:03 PM
How’s this? Can’t tell if it’s underdone.
November 28, 2024 at 7:09 PM
Flowers enjoyed. Very nice flowers.
November 27, 2024 at 3:00 PM
🙋🏻‍♂️
November 27, 2024 at 2:53 PM