Tomasz Stefaniak
@tomasz-fm.bsky.social
Lately: AI things.

Also:
http://rankanything.online - rank things
http://youtube.be/@EggPlusRadish - travel channel
http://stefaniak.cc - more links
No fine-tuning; everything is in the prompt and in the template.

Users can plug in their own models, though, and those can be fine-tuned. Although as far as I'm aware, most people use Mistral, OpenAI, Claude, or open-source models.
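For illustration, plugging in a model is pure configuration. A sketch of what a config.json entry for a local autocomplete model might look like (the provider and model values here are examples; check the docs for the current schema):

```json
{
  "tabAutocompleteModel": {
    "title": "StarCoder2 3B",
    "provider": "ollama",
    "model": "starcoder2:3b"
  }
}
```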
November 30, 2024 at 6:23 PM
Thank you for reading! If you made it this far, consider reposting this for others to see.

If you have any questions let me know in the comments!
November 30, 2024 at 5:23 PM
🔶 StreamTransformPipeline: github.com/continuedev/...

🔶 getAllSnippets.ts: github.com/continuedev/...

🔶 AutocompleteTemplate: github.com/continuedev/...

🔶 Filtering snippets: github.com/continuedev/...

🔶 Compiling the template: github.com/continuedev/...
November 30, 2024 at 5:23 PM
Here are some useful links to our repo where you can see the code.

Also note that we are always looking for contributors in case you feel like playing around with this algorithm yourself!

🔶 Unit tests: github.com/continuedev/...

🔶 CompletionProvider: github.com/continuedev/...
November 30, 2024 at 5:23 PM
And that's it - that's how Continue Autocomplete is made!

Keep reading for some GitHub links to our codebase where you can dig into this in more detail 🧑‍💻
November 30, 2024 at 5:22 PM
🔷 Smarter algorithm for selection of git snippets
🔷 Using RAG for snippet collection
🔷 Retrying completions if a completion fails
🔷 Running multiple completion requests in parallel
November 30, 2024 at 5:21 PM
We are always looking for ways to improve and have plenty of ideas on our wishlist:

🔷 Better ranking algorithm for code snippets
🔷 Collecting code snippets from all files recently touched by the user
🔷 Tree Sitter queries for more languages
November 30, 2024 at 5:21 PM
The entire process takes (a couple hundred) milliseconds and at the end of it, the user receives a completion.
November 30, 2024 at 5:20 PM
Sometimes, the model might take too long to respond.

Based on our data, if the user doesn’t receive a completion within the first 300ms they’re unlikely to accept it.

That’s why, if the stream is taking too long, we display whatever we have generated so far and fetch the rest in the background.
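A minimal sketch of that pattern in TypeScript (illustrative; the helper names and the exact deadline handling are assumptions, not our production code):

```typescript
// Show whatever has streamed in after ~300ms; keep consuming in the background.
async function streamWithEarlyDisplay(
  stream: AsyncIterable<string>,
  display: (text: string) => void,
  deadlineMs = 300,
): Promise<string> {
  let buffer = "";
  let shown = false;

  // Once the deadline passes, render the partial completion we have so far.
  const timer = setTimeout(() => {
    shown = true;
    display(buffer);
  }, deadlineMs);

  for await (const chunk of stream) {
    buffer += chunk;
    if (shown) display(buffer); // keep appending as the rest arrives
  }

  clearTimeout(timer);
  if (!shown) display(buffer); // model finished within the deadline
  return buffer;
}
```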
November 30, 2024 at 5:19 PM
We have a filter for each one of the major problems.

Some of them fix the issues; others have the authority to end the stream and return an empty response instead.
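To sketch the two kinds of filters (hypothetical names; the real ones live in our stream transform pipeline, linked elsewhere in this thread):

```typescript
type LineStream = AsyncGenerator<string>;

// A "fixer": rewrites the stream, e.g. trimming trailing whitespace.
async function* trimTrailingWhitespace(lines: LineStream): LineStream {
  for await (const line of lines) yield line.replace(/\s+$/, "");
}

// A "terminator": if the model starts repeating the code below the cursor,
// end the stream so the user gets an empty response instead of a duplicate.
async function* stopOnRepeatedSuffix(
  lines: LineStream,
  suffixLines: string[],
): LineStream {
  for await (const line of lines) {
    if (line.trim() && suffixLines.includes(line.trim())) return;
    yield line;
  }
}
```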
November 30, 2024 at 5:19 PM
Unfortunately, models make all sorts of mistakes, for example, they:

❌ Add unnecessary brackets
❌ Add too much whitespace
❌ Repeat the lines below
❌ Return too much code
November 30, 2024 at 5:18 PM
We pass the template to the correct LLM and get a stream of characters in response. This is where the filtering and processing come in.
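The shape of that processing, as a sketch: chunk the raw character stream into complete lines, then compose filters like the ones sketched above over it:

```typescript
// Turn a raw character stream into a stream of complete lines.
async function* toLines(chars: AsyncIterable<string>): AsyncGenerator<string> {
  let pending = "";
  for await (const chunk of chars) {
    pending += chunk;
    let newline: number;
    while ((newline = pending.indexOf("\n")) !== -1) {
      yield pending.slice(0, newline);
      pending = pending.slice(newline + 1);
    }
  }
  if (pending) yield pending; // flush the last partial line
}

// Composing the pipeline (filter names from the earlier sketch):
// const filtered = stopOnRepeatedSuffix(trimTrailingWhitespace(toLines(raw)), suffixLines);
```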
November 30, 2024 at 5:18 PM
The final prompts might look like this. They contain a header with snippets, and a prefix and suffix separated by a FIM marker:
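For instance (a sketch with StarCoder-style FIM tokens; the exact markers and snippet layout depend on the model):

```
<fim_prefix>// Path: UserSettingsForm.tsx
// export function UserSettingsForm(props: { user: User }) { ... }
//
// Path: UserSettingsPage.tsx
import { UserSettingsForm } from "./UserSettingsForm";

export function UserSettingsPage() {
  return (
<fim_suffix>
  );
}
<fim_middle>
```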
November 30, 2024 at 5:17 PM
Continue allows you to plug in a variety of LLMs and we need to generate the right template for each one of them.

We have some custom, model-specific templates for the most popular models, and a generic fallback template for the remaining models.
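A sketch of that selection (the FIM strings are the commonly documented StarCoder and Codestral formats; the lookup logic is simplified):

```typescript
interface AutocompleteTemplate {
  template: (prefix: string, suffix: string) => string;
  stop?: string[]; // stop sequences so the model doesn't echo the markers
}

const starcoder: AutocompleteTemplate = {
  template: (prefix, suffix) =>
    `<fim_prefix>${prefix}<fim_suffix>${suffix}<fim_middle>`,
  stop: ["<fim_prefix>", "<fim_suffix>", "<fim_middle>"],
};

const codestral: AutocompleteTemplate = {
  template: (prefix, suffix) => `[SUFFIX]${suffix}[PREFIX]${prefix}`,
};

// Model-specific templates where we have them, a generic fallback otherwise.
function templateFor(model: string): AutocompleteTemplate {
  if (model.includes("codestral")) return codestral;
  if (model.includes("starcoder")) return starcoder;
  return starcoder; // illustrative fallback choice
}
```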
November 30, 2024 at 5:17 PM
For the sake of brevity I won’t get into the details of how we construct the template and the final prompt; there will be codebase links at the end, though.

The important part is the following:
November 30, 2024 at 5:16 PM
At the moment, our algorithm is simple. We use any git and clipboard snippets we are able to collect, and if there are still tokens left, we fill the prompt with import definitions from the AST.
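As a sketch, the packing could look like this (the names, the crude token estimate, and the greedy ordering are all illustrative):

```typescript
interface Snippet {
  source: "git" | "clipboard" | "ast";
  content: string;
}

// Crude stand-in for a real tokenizer.
const countTokens = (text: string) => Math.ceil(text.length / 4);

function packSnippets(snippets: Snippet[], maxTokens: number): Snippet[] {
  // Git and clipboard snippets first; AST import definitions as filler.
  const priority = { git: 0, clipboard: 1, ast: 2 };
  const sorted = [...snippets].sort(
    (a, b) => priority[a.source] - priority[b.source],
  );

  const chosen: Snippet[] = [];
  let used = 0;
  for (const snippet of sorted) {
    const cost = countTokens(snippet.content);
    if (used + cost > maxTokens) continue; // skip what doesn't fit
    chosen.push(snippet);
    used += cost;
  }
  return chosen;
}
```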
November 30, 2024 at 5:16 PM
Every model has a token limit - anything longer than the limit gets truncated. That’s why we need to decide what context data to include.
November 30, 2024 at 5:15 PM
For a similar reason, we also sometimes use the data from the clipboard, especially if it’s fresh.
November 30, 2024 at 5:14 PM
Next, we collect all the uncommitted git changes - they provide important context on user intent.

For example, if a user just created a ‘UserSettingsForm’ React component and now they’re on ‘UserSettingsPage.tsx’, the model can guess the user most likely wants to place the form there.
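Collecting those changes can be as simple as shelling out to git (a sketch; in an IDE extension you might get this from the editor's git integration instead):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Uncommitted changes (staged and unstaged) relative to HEAD.
async function uncommittedDiff(repoPath: string): Promise<string> {
  const { stdout } = await run("git", ["diff", "HEAD"], { cwd: repoPath });
  return stdout;
}
```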
November 30, 2024 at 5:14 PM
Writing relevant Tree Sitter queries is, well, a huge pain. You need separate queries for each language and for each node type you want to capture.

If you’d like to find out more about how we do this, Nate Sesti has written about this in the past:

🔗 blog.continue.dev/root-path-co...
Root path context: The secret ingredient in Continue's autocomplete prompt
November 30, 2024 at 5:13 PM
At the moment, we detect the relevant AST nodes using Tree Sitter queries and use the built-in IDE functionality to collect the definition of each one.
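A sketch of what running one of those queries looks like with the web-tree-sitter package (the query and file paths here are illustrative; the real per-language queries live in our repo):

```typescript
import Parser from "web-tree-sitter";

// Find the names of functions called in a TypeScript source file.
async function calledFunctionNames(source: string): Promise<string[]> {
  await Parser.init();
  const lang = await Parser.Language.load("tree-sitter-typescript.wasm");
  const parser = new Parser();
  parser.setLanguage(lang);

  const tree = parser.parse(source);
  // One query per language and per node type you want to capture.
  const query = lang.query(`(call_expression function: (identifier) @fn)`);
  return query.captures(tree.rootNode).map((capture) => capture.node.text);
}
```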
November 30, 2024 at 5:12 PM
For example, if the function is ‘createUser(user: Partial<User>)’, we need to know the type of the User object, or the model is going to hallucinate a plausible-looking type on its own.
November 30, 2024 at 5:11 PM