honicky.bsky.social
Working with a large set of AnnData files on S3 is tough if you don't have the metadata indexed, so I created a tool to help.

github.com/honicky/annd...

It uses partial downloads to dramatically speed up extracting the metadata without downloading the whole file.
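
To make the partial-download idea concrete, here is a minimal sketch of ranged reads in general. It is not the tool's actual implementation: the toy file layout, `FakeRemote`, `build_toy_file`, and `read_metadata` are all hypothetical names made up for illustration, and a real .h5ad file is HDF5, which is laid out differently. Against S3, the `fetch_range` step would typically be a ranged GET (for example, boto3's `get_object` with a `Range="bytes=start-end"` argument).

```python
import json
import struct

# Toy header (an assumption for this sketch): metadata offset and length,
# each stored as a little-endian unsigned 64-bit integer.
HEADER = struct.Struct("<QQ")


def build_toy_file(metadata, payload_size):
    """Build a fake 'large file': header, a big zero payload, then JSON metadata."""
    meta = json.dumps(metadata).encode("utf-8")
    payload = bytes(payload_size)
    offset = HEADER.size + len(payload)
    return HEADER.pack(offset, len(meta)) + payload + meta


class FakeRemote:
    """Stand-in for an object store: serves byte ranges and counts bytes sent."""

    def __init__(self, blob):
        self._blob = blob
        self.bytes_sent = 0

    def fetch_range(self, start, length):
        chunk = self._blob[start:start + length]
        self.bytes_sent += len(chunk)
        return chunk


def read_metadata(remote):
    """Fetch only the header, then only the metadata range it points to."""
    offset, length = HEADER.unpack(remote.fetch_range(0, HEADER.size))
    return json.loads(remote.fetch_range(offset, length).decode("utf-8"))


blob = build_toy_file({"n_obs": 1000, "n_vars": 200}, payload_size=1_000_000)
remote = FakeRemote(blob)
metadata = read_metadata(remote)
print(metadata)
print(f"fetched {remote.bytes_sent} of {len(blob)} bytes")
```

The two small range reads recover the metadata while the roughly 1 MB payload is never transferred, which is where the speedup comes from.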

`pip install anndata-metadata`
GitHub - honicky/anndata-metadata: A Python library and CLI tool for extracting metadata from AnnData .h5ad files, both locally and on S3
github.com
May 18, 2025 at 9:22 PM
I promised to explain how you can build your very own, custom GenePT embeddings. Here's another Lab Note:

learning-exhaust.hashnode.dev/lab-notes-cu...

Enjoy!
Gene Embeddings, LLMs, and a $6.83 Experiment That Might Matter
Better Gene Embeddings Through Prompt Engineering (Or, at Least, We Tried)
learning-exhaust.hashnode.dev
March 5, 2025 at 5:42 AM
Data products often scale cost like hardware, iterate like software, and scale performance like... data products.

learning-exhaust.hashnode.dev/data-product...
Data Products are Different
Why you have to manage data products differently than software and hardware.
learning-exhaust.hashnode.dev
December 28, 2024 at 9:54 PM
Oh man, writing up the Cerebras paper on Weight Streaming was a rabbit hole filled with parallel algorithms.

learning-exhaust.hashnode.dev/one-thing-i-...

@picocreator.bsky.social @eugeneyan.com @swyx.io @latentspacepod.bsky.social
Weight streaming might work well on GPUs!
I wonder why it isn't a thing...
learning-exhaust.hashnode.dev
December 13, 2024 at 5:17 AM
First post to bsky:

I've been trying to post a bit more frequently and in smaller bites, so I'm going to try to pick one interesting thing I've learned from each paper I read, and then write a quick post about it.

Here's my first one:

learning-exhaust.hashnode.dev/one-thing-i-...
Embeddings are task specific
Modern training techniques for embedding models mean that we should probably include a prompt or fine-tune for a specific type of task.
learning-exhaust.hashnode.dev
November 28, 2024 at 1:10 AM