Lightnews — Scholar-powered news

Kyle McDonald

@kcimc.bsky.social

sailing from lumolumo to dawson island

November 7, 2025 at 8:40 AM

Kyle McDonald

@kcimc.bsky.social

water witches and solar power

November 7, 2025 at 8:01 AM

Kyle McDonald

@kcimc.bsky.social

from the other side of the welcoming party

November 7, 2025 at 7:41 AM

Kyle McDonald

@kcimc.bsky.social

betel nut, berries, apples and coconuts

October 20, 2025 at 3:21 AM

Kyle McDonald

@kcimc.bsky.social

new cables and old ladders

October 19, 2025 at 1:39 AM

Kyle McDonald

@kcimc.bsky.social

a short trip from alotau to lumolumo with an incredible welcoming

October 18, 2025 at 3:40 AM

Kyle McDonald

@kcimc.bsky.social

i’m on day 19 of 50 days in the south pacific, helping upgrade power and internet for two traditional voyaging organizations—and trying to capture a rare flash of light called “te lapa”. i’m posting daily on instagram instagram.com/kcimc

October 6, 2025 at 12:30 AM

Kyle McDonald

@kcimc.bsky.social

the latest realtime video-to-video demos are wild

July 29, 2025 at 4:00 AM

Kyle McDonald

@kcimc.bsky.social

has anyone written about AI-assisted vibe coding? the loss of the flow state, the move away from mental modeling of computational processes, managerialization of the developer class, etc? it feels like a bellwether, but i’ve had trouble explaining it to non-programmers.

February 22, 2025 at 12:17 AM

Kyle McDonald

@kcimc.bsky.social

are there any citizen science efforts to figure out what is actually in the air and in the ash in LA right now?

January 14, 2025 at 7:54 PM

Kyle McDonald

@kcimc.bsky.social

gemini 2 allows for some absolutely bounding box prediction 😳 i don't know any other way to quickly accomplish something like this. it only misses one object, and misattributes one title.

December 12, 2024 at 3:31 PM

Kyle McDonald

@kcimc.bsky.social

on some of the pages that are upside-down, gemini sometimes transcribed text using upside down unicode characters (but with nonsense english). i'm super curious where this ability comes from—are there images annotated with upside-down unicode in the training data?

December 12, 2024 at 3:18 PM

Kyle McDonald

@kcimc.bsky.social

the OCR capabilities are honestly out of control. i went through a phase in the early 2010s of playing with this very stylized script, and i'm so impressed it managed to correctly transcribe "light leaks". and it almost gets "the nine billion names of god".

December 12, 2024 at 3:18 PM

Kyle McDonald

@kcimc.bsky.social

i thought it would be fun to have colored stickers representing the different categories of things that i'm thinking about. so i fed all the transcribed text to gemini and asked for some categories, and then asked it to tag each page with relevant categories.

December 12, 2024 at 3:18 PM

Kyle McDonald

@kcimc.bsky.social

sometimes i've got to look for an example to see how it tagged the page. there are a bunch of pages i spilled water on two decades ago, and it has decided that these are "watercolors".. very cute

December 12, 2024 at 3:17 PM

Kyle McDonald

@kcimc.bsky.social

the summarization feature is incredible because it means you can search for "recipe" even when the word "recipe" does not appear anywhere on the page

December 12, 2024 at 3:17 PM
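
A minimal sketch of why a model-written summary helps search, assuming each page is stored with a transcription, a summary, and a list of keywords. The `Page` fields, the `search` helper, and the sample pages are all hypothetical illustrations, not the tool's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """One sketchbook page plus model-generated metadata (hypothetical fields)."""
    page_id: str
    transcription: str
    summary: str
    keywords: list[str] = field(default_factory=list)


def search(pages: list[Page], query: str) -> list[str]:
    """Return ids of pages whose transcription, summary, or keywords contain the query."""
    needle = query.casefold()
    hits = []
    for page in pages:
        # Search the summary and keywords too, not only the literal text on the page.
        haystack = " ".join([page.transcription, page.summary, *page.keywords]).casefold()
        if needle in haystack:
            hits.append(page.page_id)
    return hits


# Made-up sample data: the word "recipe" never appears in p1's transcription.
pages = [
    Page("p1", "2 cups flour, 1 egg, pinch of salt",
         "a handwritten recipe for pasta dough", ["recipe", "cooking"]),
    Page("p2", "light leaks",
         "stylized lettering of a project title", ["typography"]),
]
matches = search(pages, "recipe")
print(matches)
```

Here the page matches only because the summary and keywords mention "recipe". A plain substring match is the simplest possible choice; a real tool might use fuzzy or embedding-based search instead.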

Kyle McDonald

@kcimc.bsky.social

gemini is completely different from traditional OCR here. OCR is faster, around 1s, while in my tests with gemini 1.5 pro i was seeing around 8s. and OCR will give you per-character bounding boxes! but it's almost useless as plain text—and blind to diagrams and drawings

December 12, 2024 at 3:17 PM

Kyle McDonald

@kcimc.bsky.social

i've been using a prompt like this to not just do OCR, but also provide descriptions of diagrams, and generate keywords for searching that might not even appear in the text itself

December 12, 2024 at 3:16 PM

Kyle McDonald

@kcimc.bsky.social

next step is converting to text. gemini actually has a big enough context window to just load hundreds of pages in at a time, but i'm really interested in caching analysis so i can have more of a real-time interaction.

December 12, 2024 at 3:16 PM
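
One way caching the analysis could work, sketched under assumptions: each page's result is stored as JSON on disk, keyed by a hash of the image bytes, so the slow model call only happens once per page. `cached_analysis` and `fake_analyze` are hypothetical names, and `fake_analyze` stands in for whatever actually calls the model.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_analysis(image_bytes: bytes,
                    analyze: Callable[[bytes], dict],
                    cache_dir: Path) -> dict:
    """Return the analysis for an image, calling `analyze` only on a cache miss."""
    key = hashlib.sha256(image_bytes).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = analyze(image_bytes)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result


calls = []


def fake_analyze(image_bytes: bytes) -> dict:
    """Stand-in for a slow model call; records each invocation."""
    calls.append(len(image_bytes))
    return {"text": "placeholder transcription", "keywords": ["sketch"]}


cache_dir = Path(tempfile.mkdtemp())
first = cached_analysis(b"fake image bytes", fake_analyze, cache_dir)
second = cached_analysis(b"fake image bytes", fake_analyze, cache_dir)
print(len(calls))  # the second lookup is served from disk
```

Keying on the content hash rather than the filename means a renamed photo still hits the cache, while a re-shot page is analyzed again.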

Kyle McDonald

@kcimc.bsky.social

the next step is getting each photo split into two pages and oriented right side up. after the first few years of having a sketchbook, i started to alternate the orientation of every page so my hand never ran into the spine.

December 12, 2024 at 3:16 PM
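
A rough sketch of the split-and-orient step, assuming each photo is a two-page spread, that exactly one page of the spread is upside down, and that something else has already decided which one. Images are modeled as plain nested lists of pixel values to keep the example self-contained; the function names are hypothetical.

```python
def split_spread(spread):
    """Split a 2D grid of pixels down the middle into (left_page, right_page)."""
    mid = len(spread[0]) // 2
    left = [row[:mid] for row in spread]
    right = [row[mid:] for row in spread]
    return left, right


def rotate_180(page):
    """Rotate a 2D grid of pixels by 180 degrees."""
    return [row[::-1] for row in page[::-1]]


def upright_pages(spread, left_is_upside_down):
    """Split a spread and rotate whichever page is flagged as upside down."""
    left, right = split_spread(spread)
    if left_is_upside_down:
        left = rotate_180(left)
    else:
        right = rotate_180(right)
    return left, right


# A tiny 2x4 "photo": the left half is columns 0-1, the right half is columns 2-3.
spread = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
left, right = upright_pages(spread, left_is_upside_down=True)
print(left, right)
```

The hard part, deciding which page is upside down, is left out here. With an image library such as Pillow, cropping and rotating would play the same roles as the list slicing above.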

Kyle McDonald

@kcimc.bsky.social

it feels like a lot. it's almost my whole life. one of my high school art teachers required us to keep a sketchbook, and i never stopped. it's where i keep my thoughts organized. a lot of my thinking is visual or diagrammatic.

December 12, 2024 at 3:15 PM

Kyle McDonald

@kcimc.bsky.social

the first step could not be automated—i pulled out my box of 33 sketchbooks, and took a photo of every single page. around 4300 pages in total.

December 12, 2024 at 3:15 PM

Kyle McDonald

@kcimc.bsky.social

i'm building an experimental tool for exploring 25 years of my old sketchbooks, with image and text recognition powered by gemini

December 12, 2024 at 3:14 PM

Kyle McDonald

@kcimc.bsky.social

when i first tested this with chatgpt in march 2023, i sat down with an expert fijian navigator and he laughed at how wrong it was. it's like asking chatgpt "who painted the mona lisa?" and it answered "michelangelo". an answer that is not only wrong but also reveals the deeper associative structure

December 12, 2024 at 12:44 AM

Kyle McDonald

@kcimc.bsky.social

one of my biggest concerns about LLMs is their ability to smother marginalized cultures with homogenizing hallucination. gemini 2.0 flash (released today) is one of the first models to explain the fijian wind compass clearly, succinctly, and accurately.

December 12, 2024 at 12:37 AM
