Last year I reimplemented a lot of the cut / awk / csvkit examples from the “Data Science at the Command Line” book in DuckDB and got good results:
szarnyasg.org/posts/data-s...
To ensure portability, the engine in the official DuckDB code base doesn't contain any platform-specific code, so it's up to the compilers to auto-vectorize the code.
SELECT
"Station name": s.name_short,
"Max distance": max(d.distance)
FROM s: 's3://duckdb-blobs/stations.parquet'
JOIN d: 's3://duckdb-blobs/distances.parquet'
ON d.station1 = s.code
GROUP BY ALL
ORDER BY "Max distance" DESC;
Thanks for the great work on this!
You can generalize this and put a MAP value into the list_reduce function to capture the mapping, then do exact matching on the MAP's keys. For more details, see list_reduce in the docs: duckdb.org/docs/sql/fun...
– 0: 0 ≤ length < 40
– 1: 40 ≤ length < 80
– 2: 80 ≤ length < 160
– 3: 160 ≤ length
I came up with this:
list_reduce([0, 40, 80, 160], (acc, x, i) -> IF(x <= length, i - 1, acc)) AS category
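To make the fold easier to follow, here is a rough Python analogue of the expression above. It is only a sketch, not DuckDB code: it assumes that list_reduce without an initial value starts with the first list element as the accumulator and passes a 1-based index for each remaining element. The names THRESHOLDS and category are made up for this illustration.

```python
from functools import reduce

# Lower bounds of the four categories, copied from the post.
THRESHOLDS = [0, 40, 80, 160]


def category(length, thresholds=THRESHOLDS):
    """Return the category index for `length` by folding over the thresholds."""
    # Pair every threshold with its 1-based position, then skip the first
    # pair because the first threshold serves as the starting accumulator
    # (assumed to mirror list_reduce without an initial value).
    indexed = list(enumerate(thresholds, start=1))[1:]
    return reduce(
        # If the threshold x is <= length, the category becomes i - 1;
        # otherwise the accumulator is carried over unchanged.
        lambda acc, pair: pair[0] - 1 if pair[1] <= length else acc,
        indexed,
        thresholds[0],
    )


if __name__ == "__main__":
    for length in (10, 40, 79, 80, 159, 160, 1000):
        print(length, category(length))
```

The starting accumulator only doubles as category 0 because the first threshold happens to be 0. For non-negative lengths, bisect.bisect_right(THRESHOLDS, length) - 1 from the standard library gives the same categories.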
I wrote up my results in a blog post.
TL;DR: it depends but DuckDB is still pretty fast!
szarnyasg.org/posts/duckdb...