Mark Needham
markhneedham.com
@markhneedham.com
Product Marketing Engineer at ClickHouse
I make short videos on Local AI @ youtube.com/@LearnDataWithMark
TIL this week: LLMs (or at least GPT-4o) are pretty good at interpreting markdown tables. And docling makes it super easy to get PDFs/HTML pages into markdown format.

I made a short video showing how to query Wikipedia pages - www.youtube.com/watch?v=KGz-...
Do LLMs understand markdown tables?
YouTube video by Learn Data with Mark
www.youtube.com
December 8, 2024 at 10:53 AM
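To make the markdown-table idea concrete, here is a minimal sketch of rendering rows as a pipe-delimited table and dropping it into a prompt. It does not use docling or any LLM client; `to_markdown_table`, `build_prompt`, and the sample rows are hypothetical and not taken from the video.

```python
def to_markdown_table(headers, rows):
    """Render headers and rows as a pipe-delimited markdown table."""
    def fmt(cells):
        # Escape pipes so cell text cannot break the column layout.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(headers), "| " + " | ".join("---" for _ in headers) + " |"]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


def build_prompt(question, table_md):
    """Combine a question and a markdown table into a single prompt string."""
    return f"Answer using only this table:\n\n{table_md}\n\nQuestion: {question}"


# Made-up sample data, purely for illustration.
table = to_markdown_table(
    ["Item", "Price"],
    [["Widget A", 10], ["Widget B", 25]],
)
prompt = build_prompt("Which item costs more?", table)
print(prompt)
```

The resulting string can be sent to whichever model you are testing; the table stays readable for a person checking the prompt too.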
Reposted by Mark Needham
THREAD: How to check if the screenshot of a social media post is real or fake?

Fake screenshots of online posts regularly mislead people. So, here's a simple guide on how you can quickly check if a screenshot of a post attributed to an account is genuine or fake before falling for or sharing it.
December 4, 2024 at 6:13 PM
Reposted by Mark Needham
GitHub Receipts. Nice idea! 😊
gitreceipt.vercel.app
December 1, 2024 at 6:44 PM
Reposted by Mark Needham
If you're making a side project today or a personal site that needs a hero image, I made this lil hero generator a while back: hero-generator.netlify.app

Enjoy!
November 30, 2024 at 2:53 PM
Reposted by Mark Needham
Fargo, Severance, Black Mirror, Atlanta, Master of None, Euphoria – they all left their fans waiting for three years or more between seasons. In fact, we're waiting longer than ever for TV shows to return, as @lisacmuth.bsky.social shows in her Weekly Chart: blog.datawrapper.de/waittime-for...
November 28, 2024 at 2:50 PM
Reposted by Mark Needham
That you can build Twitter (bsky) with 20 people instead of 2000 means bsky can pursue monetization models that Twitter never could. This is a pattern I see across many startups now. Tech has gotten good enough to ship products at 10x-100x less cost compared to 20 years ago. 1/n
November 28, 2024 at 5:24 PM
Reposted by Mark Needham
So many products today are described as "AI x". But it should be possible to state its value without the implementation detail that is "AI", right?

Similar to how we don't necessarily prefix other software products with "software".
November 28, 2024 at 7:59 AM
Reposted by Mark Needham
ClickHouse doesn't have a PIVOT operator, but we can achieve similar behavior using aggregate function combinators. In our latest video, we use the -Map suffixed functions to explore house prices in the UK.
Can you PIVOT in ClickHouse?
ClickHouse doesn't have a PIVOT clause, but we can get close to this functionality using aggregate function combinators. This video will show how to do this using the UK housing prices dataset.
buff.ly
November 27, 2024 at 2:00 PM
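For a feel of what a pivot built from map-valued aggregates looks like, here is a rough Python sketch. As I understand it, a -Map suffixed aggregate function in ClickHouse aggregates the values of a map key by key within each group; the code below only mimics that shape in plain Python. It is not ClickHouse SQL and not the query from the video, and `pivot_avg` and the sample rows are made up.

```python
from collections import defaultdict


def pivot_avg(rows, group_key, pivot_key, value_key):
    """Group rows by group_key and average value_key per pivot_key value."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        sums[row[group_key]][row[pivot_key]] += row[value_key]
        counts[row[group_key]][row[pivot_key]] += 1
    # One map per group: pivot value -> average, similar to a row of pivoted columns.
    return {
        group: {k: sums[group][k] / counts[group][k] for k in sums[group]}
        for group in sums
    }


# Made-up rows, not real housing data.
sales = [
    {"county": "A", "type": "flat", "price": 100.0},
    {"county": "A", "type": "flat", "price": 200.0},
    {"county": "A", "type": "detached", "price": 400.0},
    {"county": "B", "type": "flat", "price": 300.0},
]
result = pivot_avg(sales, "county", "type", "price")
print(result)
```

Each group ends up with one map whose keys play the role of the pivoted columns, so groups with missing categories simply have fewer keys.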
In this week's video, I played around with a library called burr. It's a state machine library that you can use to build LLM apps where you want some human input/decision making built in.

I only scratched the surface, but it was already pretty cool.

www.youtube.com/watch?v=n6WK...
Intro to burr: A State Machine for LLM apps
YouTube video by Learn Data with Mark
www.youtube.com
November 27, 2024 at 4:54 PM
Reposted by Mark Needham
Something interesting is brewing in Iceberg-on-S3 land. 👀

lists.apache.org/thread/v7x65...

cc @eatonphil.bsky.social
lists.apache.org
November 26, 2024 at 7:26 PM
Reposted by Mark Needham
The ClickHouse 24.11 community call has just got underway.

Join us on Zoom: clickhouse.com/company/even...
Or YouTube: www.youtube.com/watch?v=0hpT...
v24.11 Community Call
Nov 26 - Every month we get together with the community (users, contributors, customers, those interested in learning more about ClickHouse) to discuss what is coming in the latest release.
clickhouse.com
November 26, 2024 at 4:02 PM
Reposted by Mark Needham
I want to see a screenshot or photo of a project that you've never published anything online about before
November 23, 2024 at 2:24 AM
Reposted by Mark Needham
Look, I love #duckdb and all, but #daft is doing a beautiful job with the Iceberg REST catalog
November 23, 2024 at 5:49 AM
Reposted by Mark Needham
The Database Capital of Europe
November 22, 2024 at 11:21 AM
Reposted by Mark Needham
I had the pleasure of joining @joereis.bsky.social on his podcast a couple of weeks ago. We discussed many topics, including big shifts in the data ecosystem and approaches to commercializing open source. Check it out! open.spotify.com/episode/3GyW...
Tanya Bragin - Clickhouse, Open Source vs Commercial, and More
Spotify video
open.spotify.com
November 21, 2024 at 10:20 PM
Reposted by Mark Needham
That's like a live sentiment tracker. You could make an emoji index. If it correlated with consumer confidence, which it should lead, it would be an interesting market indicator. #Rstats #dataBS
✨ Introducing Emoji Stats for Bluesky ✨

Watch the most used emojis live, all of them or per language, and explore why the Dutch love 🍀, the Germans ☕️ and so much more.

Tap/click on the emoji to see the latest posts with them.

A lot more things to come!

emojistats.bsky.sh
Emoji Stats for Bluesky 🦋
Live counters and stats on emoji usage on Bluesky, broken down by language.
emojistats.bsky.sh
November 21, 2024 at 2:25 PM
Reposted by Mark Needham
uv makes it so easy to go zero to MRE

say i find bug in foo==x.y.z, i can repro with:

uv run \
--with foo==x.y.z \
repro.py

RuntimeError: oh no!

test the branch with a fix:

uv run \
--with foo@git+https://{giturl}.git@hotfix-foo \
repro.py



just used this today!

github.com/pydantic/pyd...
new requirement in 2.10 to pass `validated_data` when `call_default_factory=True` · Issue #10912 · pydantic/pydantic
Initial Checks I confirm that I'm using Pydantic V2 this issue opened per #10905 (comment) Description appears to be a new requirement that validated_data is passed to Field.get_default if call_def...
github.com
November 21, 2024 at 3:28 AM
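A repro.py for this kind of workflow might start by reporting which version actually got resolved, so the same script shows whether it ran against the release or the hotfix branch. A minimal sketch, assuming the placeholder package name foo from the post; `installed_version` and `describe` are hypothetical helpers, and the failing call itself is left out.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def describe(package):
    """Build a one-line environment note to paste at the top of a bug report."""
    found = installed_version(package)
    return f"{package}=={found}" if found else f"{package} is not installed"


if __name__ == "__main__":
    # "foo" is the placeholder package from the post, not a real dependency.
    print(describe("foo"))
    # The call that triggers the bug would go below this line.
```

Run it with the same `uv run --with ...` invocations shown in the post; only the `--with` argument changes between the broken release and the fix branch.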
Reposted by Mark Needham
So on further thought: I wonder why this is happening for you: I’m following *gob-tons* of people now, and my feed is practically overwhelmed with cool AI stuff.

Maybe you should follow a bunch of people I am following ;)
November 21, 2024 at 7:24 PM
Reposted by Mark Needham
ICYMI I recently interviewed the creator of @clickhouse.com, Alexey Milovidov.

We dove deep into why ClickHouse is so fast, why AI companies love it (hint: it's had a vector data type since 2012), rewriting Zookeeper (yikes), and how math informs database design.

www.rilldata.com/blog/rill-cl...
Rill | Data Talks on the Rocks 4 - Alexey Milovidov, ClickHouse
A video interview with Alexey Milovidov, co-founder and CTO of ClickHouse, that dives deep into AI, ClickHouse’s architecture, different database categories, and pain points in the market that ClickHo...
www.rilldata.com
November 20, 2024 at 9:41 PM
Reposted by Mark Needham
Do not miss @clickhouse.com meetup on Dec 12 in SF 🎄

All 3 founders will be there, our CTO Alexey is giving a talk on ClickHouse+AI, and you'll also hear from users like Doordash.

Register now, space is limited - ty @cloudflare.social for hosting! www.meetup.com/clickhouse-s...
ClickHouse Meetup @ Cloudflare, Thu, Dec 12, 2024, 6:00 PM | Meetup
Hello ClickHouse Enthusiasts! We’re really excited to host another ClickHouse Meetup at the Cloudflare Office - San Francisco on December 12th, 2024. The creator of ClickH
www.meetup.com
November 21, 2024 at 4:33 PM
Reposted by Mark Needham
This chart is pretty insane. More of an overnight social/political-type shift than typical tech adoption imo: techcrunch.com/2024/11/19/b...
November 19, 2024 at 7:13 PM
Reposted by Mark Needham
Updates for my database starter pack and list:

- Added @clickhouse.com
- Added @memgraph.com

Welcome to Bluesky, let's rock!

go.bsky.app/Nd8hieE
November 20, 2024 at 5:18 AM
@xuanwo.io can we add @clickhouse.com to your database list?
November 19, 2024 at 7:06 PM
Reposted by Mark Needham
This week is going to be A LOT for those of us who've been along for the ride. Here is Roger Federer, a fan of Rafael Nadal
November 19, 2024 at 7:53 AM
Reposted by Mark Needham
If anyone is looking to connect with other engineering managers or leaders, @pelayoarbues.com put together a great starter pack!
November 18, 2024 at 7:57 PM