Gabor Szarnyas
szarnyasg.org
@szarnyasg.org
Head of DevRel at DuckDB Labs
Thanks! All the fiddling with SVGs was worth it! 🙌
November 17, 2025 at 9:34 PM
DuckDB does not support GraphQL. GraphQL itself is a bit of a misnomer: it is not a full-fledged graph query language, it's primarily intended to query REST endpoints. GQL and SQL/PGQ are full-fledged graph query languages, supporting both pattern matching and path finding.
October 24, 2025 at 7:31 AM
I work at DuckDB Labs so obviously I am biased but this really looks like a prime use case for @duckdb.org

Last year I reimplemented a lot of the cut / awk / csvkit examples of the “Data Science at the Command Line” book in DuckDB and got good results:

szarnyasg.org/posts/data-s...
Data Science at the Command Line Book in DuckDB
Today I solved the exercises in Chapter 5 of the Data Science at the Command Line book using the DuckDB command line client. This page documents my solutions. Prerequisites Clone the https://github.co...
szarnyasg.org
May 16, 2025 at 12:13 PM
I don't think there is such a test in DuckDB at the moment. You'd have to look at the binary code with a disassembler and try to find vector instructions.
May 9, 2025 at 8:30 AM
It would be an interesting experiment to try to make use of RISC-V RVV but I'm not aware of any attempts.

To ensure portability, the engine in the official DuckDB code base doesn't contain any platform-specific code. So it's up to the compilers to auto-vectorize the code.
May 8, 2025 at 10:12 AM
Here is a query making use of prefix aliases in all three clauses:

SELECT
"Station name": s.name_short,
"Max distance": max(d.distance)
FROM s: 's3://duckdb-blobs/stations.parquet'
JOIN d: 's3://duckdb-blobs/distances.parquet'
ON d.station1 = s.code
GROUP BY ALL
ORDER BY "Max distance" DESC;
February 25, 2025 at 3:04 PM
I recently added your instructions for building DuckDB on RISC-V to the DuckDB documentation: duckdb.org/docs/dev/bui...

Thanks for the great work on this!
Unofficial and Unsupported Platforms
Warning The platforms listed on this page are not officially supported. The build instructions are provided on a best-effort basis. Community contributions are very welcome. DuckDB is built and distri...
duckdb.org
February 21, 2025 at 8:10 PM
I don't think this is possible at the moment. I would go the other route and try to do unnest and join. To save memory, you could peel away the nested column (CREATE TEMP TABLE tmp AS SELECT column FROM original_table), do the unnest and join on this table, then join it back to the original.
February 19, 2025 at 11:19 AM
The list_reduce function iterates through the list and picks the correct category.

You can generalize this and put a MAP value into the list_reduce function to capture the mapping, then do exact matching on the MAP's keys. For more details, see list_reduce in the docs: duckdb.org/docs/sql/fun...
Lambda Functions
Lambda functions enable the use of more complex and flexible expressions in queries. DuckDB supports several scalar functions that operate on LISTs and accept lambda functions as parameters in the for...
duckdb.org
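
One possible reading of the MAP idea, sketched in Python rather than DuckDB SQL: fold over the mapping's keys to find the largest lower bound that does not exceed the length, then look that key up exactly. The labels and the names LABELS and label_for are hypothetical and only serve the illustration.

```python
from functools import reduce

# Hypothetical mapping from lower bound to label (labels are made up here).
LABELS = {0: "tiny", 40: "short", 80: "medium", 160: "long"}


def label_for(length, labels=LABELS):
    bounds = sorted(labels)
    # Fold over the bounds, keeping the largest one that does not exceed length.
    # The smallest bound seeds the accumulator.
    matched = reduce(
        lambda acc, bound: bound if bound <= length else acc,
        bounds[1:],
        bounds[0],
    )
    # Exact lookup on the mapping's keys.
    return labels[matched]
```

Changing the categories then only requires editing the mapping, not the fold.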
February 19, 2025 at 10:27 AM
I ran into a similar problem recently when I needed to categorize posts according to their length:

– 0: 0 ≤ length < 40
– 1: 40 ≤ length < 80
– 2: 80 ≤ length < 160
– 3: 160 ≤ length

I came up with this:

list_reduce([0, 40, 80, 160], (acc, x, i) -> IF(x <= length, i - 1, acc)) AS category
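
To trace how the fold arrives at the category, here is a rough Python equivalent using functools.reduce. It assumes that the index i is 1-based and that the first list element seeds the accumulator, which is what the i - 1 in the expression suggests. The names THRESHOLDS and category are made up for this sketch.

```python
from functools import reduce

THRESHOLDS = [0, 40, 80, 160]


def category(length, thresholds=THRESHOLDS):
    # Pair each threshold with a 1-based position, mirroring the index i.
    indexed = list(enumerate(thresholds, start=1))
    # The first element seeds the accumulator (assumed semantics);
    # its value 0 happens to coincide with category 0.
    seed = indexed[0][1]
    # Every later threshold that is <= length overwrites the accumulator
    # with its zero-based position, so the last match wins.
    return reduce(
        lambda acc, item: item[0] - 1 if item[1] <= length else acc,
        indexed[1:],
        seed,
    )
```

For example, category(39) stays at 0, while category(40) moves to 1 because the threshold 40 sits at position 2.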
February 19, 2025 at 10:27 AM
My post on DuckDB vs. wc received a lot of feedback. Based on this, I ran a few more experiments to see how DuckDB stacks up against parallelized wc and grep/ripgrep on Linux.

I wrote up my results in a blog post.

TL;DR: it depends but DuckDB is still pretty fast!
szarnyasg.org/posts/duckdb...
December 4, 2024 at 9:25 PM
Oops, that's the difference of reading the CSV with or without its header. Well-spotted!
December 2, 2024 at 10:58 PM
3) The ts command adds a timestamp at the beginning of each line. On macOS, it's available in the moreutils package on Homebrew.
November 30, 2024 at 7:50 PM
2) A single sed command can include multiple search and replace pairs separated by semicolons. This makes sed commands *even less readable*, so use it with caution.
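
As an illustration, a command such as sed 's/cat/dog/; s/dog/bird/' runs its pairs in order on each line, so a later pair sees the result of an earlier one. A small Python sketch of that behavior follows; apply_pairs is a hypothetical helper, and Python's regex syntax differs from sed's, so this only models the ordering.

```python
import re


def apply_pairs(line, pairs):
    # Apply each (pattern, replacement) pair in order, like chained s commands.
    for pattern, replacement in pairs:
        # count=1 mirrors an s command without the g flag: first match only.
        line = re.sub(pattern, replacement, line, count=1)
    return line
```

With the pairs ("cat", "dog") and ("dog", "bird"), the input "cat and dog" becomes "bird and dog": the first pair creates a new "dog" that the second pair then rewrites, which is the kind of interaction that makes such commands hard to read.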
November 30, 2024 at 7:50 PM
1) The bat tool – an alternative to cat – prints the newline characters if it's invoked with the -A switch. This output mode reveals whether a file is using CR/LF or LF newlines (or both).
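
Where bat is not at hand, the same question can be answered with a few lines of Python that count both kinds of line endings in a file's raw bytes. This is a minimal sketch; count_line_endings is a hypothetical helper name, and lone CR characters are not counted.

```python
def count_line_endings(data: bytes) -> dict:
    # Every CR/LF pair contains one LF, so subtract the pairs to get bare LFs.
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}
```

Calling it on the result of open(path, "rb").read() shows whether a file uses one convention or mixes both: nonzero values for both keys indicate mixed newlines.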
November 30, 2024 at 7:50 PM