Lightnews — Scholar-powered news

Kwindla Hultman Kramer

@kwindla.bsky.social

98 followers 740 following 50 posts

Low, low, low latency. Daily.co and Pipecat.ai


Kwindla Hultman Kramer

@kwindla.bsky.social

March Voice AI Meetup - Wednesday the 5th

lu.ma/ffpyl57n

February 17, 2025 at 1:58 AM

Kwindla Hultman Kramer

@kwindla.bsky.social

Source code is here:

github.com/pipecat-ai/p...

My favorite thing about this demo is that it's a really nice example of composite function calling.

Here are the function definitions. Gemini figures out solely from the argument descriptions how to find a conversation from "a few minutes ago"!
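
The function definitions in the post were shared as an image that is not reproduced here. As a rough illustration only, declarations of this kind might look like the sketch below. The function names, the timestamped filename format, and the description wording are all assumptions, not the demo's actual code. The idea is that the model chains two calls: it lists the saved files, picks the filename whose timestamp best matches the request, then loads that file. `closest_filename` is a hypothetical stand-in for the selection step the model performs on its own.

```python
from datetime import datetime

# Hypothetical tool declarations in a generic JSON-schema style.
# The descriptions carry the information the model needs to chain the calls.
TOOLS = [
    {
        "name": "get_saved_conversation_filenames",
        "description": (
            "List saved conversation files. Each filename encodes the "
            "save time as YYYY-MM-DD_HH-MM-SS.json."
        ),
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "load_conversation",
        "description": "Load a previously saved conversation.",
        "parameters": {
            "type": "object",
            "properties": {
                "filename": {
                    "type": "string",
                    "description": (
                        "Filename to load, chosen from the list returned "
                        "by get_saved_conversation_filenames."
                    ),
                }
            },
            "required": ["filename"],
        },
    },
]

# Assumed filename convention; the literal ".json" suffix is part of the format.
FILENAME_FORMAT = "%Y-%m-%d_%H-%M-%S.json"


def closest_filename(filenames, target):
    """Pick the file whose encoded save time is nearest to `target`."""
    return min(
        filenames,
        key=lambda name: abs(datetime.strptime(name, FILENAME_FORMAT) - target),
    )
```

With files saved at 15:47:10 today and 09:00:00 yesterday, a request for the conversation from "a few minutes ago" at 15:51 resolves to the first file, which is then passed to `load_conversation`.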

February 4, 2025 at 3:51 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

Memory for voice AI agents (and composite function calling) ...

There are several ways to store (and later, retrieve) conversation state. One of the simplest is just to define a couple of functions and use your local filesystem!

Here, @chadbailey.net shows how to do that, using Gemini 2.0 Flash.
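
As a minimal sketch of the "couple of functions plus the local filesystem" idea, the pair below writes a message list to a timestamped JSON file and reads it back. The function names, the message format, and the filename convention are assumptions made for illustration; they are not taken from the demo's source.

```python
import json
from datetime import datetime
from pathlib import Path


def save_conversation(messages, directory, now=None):
    """Write the message list to a timestamped JSON file; return the filename."""
    now = now or datetime.now()
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # Assumed convention: the save time is encoded in the filename.
    path = directory / now.strftime("%Y-%m-%d_%H-%M-%S.json")
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")
    return path.name


def load_conversation(filename, directory):
    """Read a previously saved message list back from disk."""
    path = Path(directory) / filename
    return json.loads(path.read_text(encoding="utf-8"))
```

Registering both as callable functions lets the model decide when to persist or restore state. Encoding the save time in the filename is one simple way to make requests that refer to time answerable from a directory listing alone.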

February 4, 2025 at 3:51 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

Sean DuBois is one of my favorite people to talk to about WebRTC, audio and video, designing good libraries, and hacking in general.

Sean is the creator of Pion. Pion is an Open Source WebRTC implementation that is influential and very widely used (including at OpenAI, where Sean works).

February 3, 2025 at 8:25 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

My favorite part of the DeepSeek-V3 Technical Report is the stuff about the all-to-all communication kernels. (Mostly in Section 3.2.2, "Efficient Implementation of Cross-Node All-to-All Communication.")

January 30, 2025 at 8:46 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

January 24, 2025 at 1:37 AM

Kwindla Hultman Kramer

@kwindla.bsky.social

January 19, 2025 at 9:54 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

Sunday morning listening ... and hacking.

January 12, 2025 at 2:52 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

Oh, wait. I take it back.

January 10, 2025 at 11:15 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

They know what they’re doing over there in Cupertino (and Shenzhen).

January 10, 2025 at 11:13 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

iOS + Gemini Multimodal Live + WebRTC

Filipi Fuchter added an iOS example to the Pipecat "Simple Chatbot" repo. With the Pipecat iOS SDK, you can build apps that use Gemini Multimodal Live and Gemini Flash with WebRTC, WebSockets, and HTTP networking.

January 10, 2025 at 7:23 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

The voice-to-voice AI Pareto frontier ...

Gemini 1.5 Flash occupies an interesting place in the capabilities matrix for voice AI. It's fast, very inexpensive, has a long context window, and has native audio input.

I've been experimenting with Gemini a lot. Here's an interesting Pipecat pipeline:

December 5, 2024 at 4:48 PM

Kwindla Hultman Kramer

@kwindla.bsky.social

Sunset. Double overhead day.

December 3, 2024 at 12:54 AM

Kwindla Hultman Kramer

@kwindla.bsky.social

Team Suparova at the @supabase / @ycombinator hackathon.

There was a four-participant limit on the team size. We have five, but two are robots.

Last night was a very long session with lots of tiny little screws and some heavy ifconfig action.

November 23, 2024 at 7:46 PM
