Albert Adler
@iamalbertadler.bsky.social
I build mobile apps and websites trying to solve problems. Also, I share my daily progress here (some people call it #buildinpublic).
As an initial disclaimer, the way of doing this depends on what you are using. In this case I will use the OpenAI API, but I will publish other guides for other AIs!
March 12, 2025 at 11:47 PM
So... in the OpenAI API, prompt caching is enabled by default for prompts with 1024 tokens or more, at least in gpt-4.5-preview, gpt-4o, gpt-4o-mini...
March 12, 2025 at 11:47 PM
End of the guide?
March 12, 2025 at 11:47 PM
Joking. It is good that caching is already enabled for you, but to get the most out of it you should structure your prompts in a "special" way:
March 12, 2025 at 11:47 PM
Set the static content of the prompt always at the beginning and the variable data at the end.

Let me give an example (sad that X does not allow code formatting yet...):
March 12, 2025 at 11:47 PM
In the example below I am prompting a basic article writer.

The system prompt will define the basics of every article, so it is my "static" content; it goes at the beginning of the final prompt.
March 12, 2025 at 11:48 PM
The user prompt is the "dynamic" part: the content specific to the article I am creating now. Imagine it like the custom instructions your users provide. That goes at the end.
March 12, 2025 at 11:48 PM
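A minimal sketch of that structure with the official openai Python SDK (the article-writer instructions and the write_article helper are illustrative placeholders, not literal code from the thread):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Static part: identical on every request, so it can be cached.
# It must come first, and the full prompt needs at least 1024 tokens
# before caching kicks in.
SYSTEM_PROMPT = (
    "You are an article writer. Every article must have a title, "
    "an introduction, three sections with subheadings, and a conclusion. "
    "Write in a friendly, accessible tone and keep paragraphs short."
    # ...imagine many more static instructions here, pushing this
    # past the 1024-token minimum
)

def write_article(topic: str, custom_instructions: str) -> str:
    # Dynamic part: changes for every article, so it goes at the end.
    user_prompt = (
        f"Write an article about: {topic}\n"
        f"Extra instructions from the user: {custom_instructions}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},  # static first
            {"role": "user", "content": user_prompt},      # dynamic last
        ],
    )
    return response.choices[0].message.content

print(write_article("prompt caching", "mention the 1024-token minimum"))
```

Because the static system prompt is byte-for-byte identical across calls, every request after the first can reuse the cached prefix; only the short user prompt at the end changes.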
By just organising the prompt a bit, you get some nice benefits:
- Lower latency for big prompts (OpenAI says up to 80% less latency!!)
- Reduced costs for the input tokens (since a big part will be cached; OpenAI says up to 50% lower costs)
- And caching is free, so do it!
March 12, 2025 at 11:48 PM
A few extra details:
- Input images are also cached
- Tool definitions (like search or code) are also cached
- If you use a structured output schema, that is also cached!
- The cache is cleared from the OpenAI servers after 5-10 minutes of no requests (although it can last up to 1 hour)
March 12, 2025 at 11:48 PM
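If you want to check whether the cache is actually being hit, the usage block of the response reports the cached portion; a sketch, assuming the prompt_tokens_details.cached_tokens field from the Chat Completions API and a placeholder system prompt:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # Same static-first / dynamic-last layout as before.
        {"role": "system", "content": "...your long static instructions..."},
        {"role": "user", "content": "Write an article about prompt caching"},
    ],
)

usage = response.usage
# cached_tokens reports how many prompt tokens were served from the cache;
# it stays 0 until the prompt crosses 1024 tokens and the prefix is reused.
print(f"prompt tokens: {usage.prompt_tokens}, "
      f"cached: {usage.prompt_tokens_details.cached_tokens}")
```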