Grateful for all the support and kindness from the people around me this year 🙏
I am churning off of Lovable/Replit
OpenEdison tracks agent tool use per session and blocks write actions once a session hits the lethal trifecta
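Rough sketch of the idea in Python, not OpenEdison's actual code or API. It assumes the usual reading of the lethal trifecta (private data access + untrusted content + a way to send data out). `ToolTraits` and `SessionGuard` are hypothetical names made up for illustration:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolTraits:
    """Hypothetical per-tool labels; a real system would load these from config."""
    reads_private_data: bool = False
    reads_untrusted_content: bool = False
    writes_externally: bool = False


@dataclass
class SessionGuard:
    """Tracks what one agent session has touched so far (illustrative only)."""
    seen_private_data: bool = False
    seen_untrusted_content: bool = False
    log: list = field(default_factory=list)

    def allow(self, tool_name: str, traits: ToolTraits) -> bool:
        # A write is blocked only if it would complete the trifecta:
        # private data AND untrusted content have both been seen
        # (earlier in the session or by this very call).
        completes_trifecta = (
            traits.writes_externally
            and (self.seen_private_data or traits.reads_private_data)
            and (self.seen_untrusted_content or traits.reads_untrusted_content)
        )
        if completes_trifecta:
            self.log.append((tool_name, "blocked"))
            return False
        # Otherwise record what this call exposes the session to.
        self.seen_private_data |= traits.reads_private_data
        self.seen_untrusted_content |= traits.reads_untrusted_content
        self.log.append((tool_name, "allowed"))
        return True


guard = SessionGuard()
guard.allow("read_inbox", ToolTraits(reads_private_data=True))
guard.allow("fetch_web", ToolTraits(reads_untrusted_content=True))
print(guard.allow("send_email", ToolTraits(writes_externally=True)))
```

The check is a plain boolean over session state, so it does not depend on what the model says or was prompted with. That is the sense in which a block like this is deterministic; whether the real project works this way is an assumption.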
Agents + Tools/MCP = Data leak risk
OpenEdison is an OSS firewall that deterministically blocks data exfiltration & dangerous agent action, even if jailbroken.
👇 comment your MCP use, I'll dm how risky your use is
x.com/Eito_Miyamu...
Cannot believe how well it drives
@Konstantine's piece is incredibly well written
On-demand (e.g. Uber Eats) vs consultative shopping, MCPs for standardisation, Predictive Shipping, robot roaming stores, IoT-smart home proactive purchases, so much more
x.com/Konstantine...
- The OAI model that solved the IMO can tell if it "doesn't know" how to solve the problems, like it did on Q6 -> huge???? Combined w/ Anthropic steering research, hallucinations might be solvable???
- Emphasis on progress on non-verifiable tasks
Google is losing search market share so badly they have to start putting ads in the London Underground
First time in my life I saw a Google ad
The meme Turing test has already been solved too. What's the reason to believe it cannot be solved for human-to-human communication?
Insights:
- Deep Research is an example of non-verifiable RL & many apps exist where RL + application layer is useful
- "existing multi-agent RL is not bitter lesson pilled enough", comments on not needing explicit external agent modelling
YC clearly 3 steps behind
I swear they're going to get kicked out of the WeWork on suspicion of abusing free electricity for bitcoin mining, or for causing a fire with the radiator-levels of heat coming from these Mac minis
1) Gemini-2.0-Flash >>> gpt-4o-mini or any other models
2) (Groq) Llama-3.3-70B / DeepSeek-R1-Llama-70B distill (smarter, faster, cheaper than gpt-4o, though caveat that Groq is VC-subsidised)
source: @ArtificialAnlys