Find out more about Xet here: huggingface.co/blog/xet-on-...
Which means versioned datasets cost only a fraction of their original storage! 🔥🤯
e.g. if you store with Xet, which deduplicates files by chunk
cc @julien.ledem.net FYI
>>> import pyarrow.parquet as pq
>>> writer = pq.ParquetWriter(
... out, schema,
... use_content_defined_chunking=True,
... )
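To see why chunk-level dedup makes a new version so cheap, here is a toy sketch in plain Python. It is not Xet's or Parquet's actual chunking algorithm: `cdc_chunks` and `digest` are hypothetical helpers, and the rolling hash, mask, and sizes are assumptions picked for illustration. The idea is only that chunk boundaries follow the content, so a small insert changes a few chunks instead of shifting all of them.

```python
import hashlib
import random


def cdc_chunks(data: bytes, mask: int = 0x3FF, min_size: int = 64) -> list:
    """Split data at content-defined boundaries (toy rolling hash, assumption)."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The low bits of h depend only on the last few bytes seen.
        h = ((h << 1) + byte) & 0xFFFFFFFF
        # Cut when the chunk is long enough and the low hash bits are all zero.
        if i + 1 - start >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def digest(chunk: bytes) -> str:
    """Content address of a chunk (SHA-256 chosen here for illustration)."""
    return hashlib.sha256(chunk).hexdigest()


rng = random.Random(0)
v1 = bytes(rng.getrandbits(8) for _ in range(200_000))
# Version 2: the same data with 100 new bytes inserted in the middle.
v2 = v1[:100_000] + bytes(rng.getrandbits(8) for _ in range(100)) + v1[100_000:]

store = {digest(c) for c in cdc_chunks(v1)}  # chunks already stored for v1
new_chunks = [c for c in cdc_chunks(v2) if digest(c) not in store]
new_bytes = sum(len(c) for c in new_chunks)

print(f"v2 is {len(v2)} bytes, but only {new_bytes} bytes are in new chunks")
```

With fixed-size chunks, the same insert would shift every later boundary, so every chunk after the insert would look new. Content-defined boundaries line up again shortly after the edit.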