Aritra Roy Gosthipaty
@arig23498.bsky.social
MLE @ Hugging Face
5/ Each of the techniques mentioned above has its own pros and cons. The processor in your system (a phone, a laptop, etc.) will use a weighted mix of all of them.

It baffles me to think about all of this. 🤗
March 3, 2025 at 6:05 PM
4/N Multi-Threading in a Single Core

In a single core, we can have multiple register blocks (execution contexts) holding the state of different threads. This way, if one thread stalls, the processor quickly switches to another.
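The same idea shows up one level up in software: while one thread is stalled, another makes progress. A minimal Python sketch, using `time.sleep` as a stand-in for a stall (e.g. waiting on memory or I/O):

```python
import threading
import time

def stalled_task(results, idx):
    # time.sleep stands in for a stall (e.g. a memory or I/O wait)
    time.sleep(0.1)
    results[idx] = idx * 2

results = {}
threads = [threading.Thread(target=stalled_task, args=(results, i)) for i in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four 0.1 s stalls overlap instead of running back to back,
# so the wall-clock time stays close to 0.1 s rather than 0.4 s.
print(results, f"{elapsed:.2f}s")
```

Hardware multi-threading does the context switch in silicon instead of in the OS scheduler, but the payoff is the same: stalls get hidden.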
March 3, 2025 at 6:05 PM
3/N The SIMD Paradigm

In a single core, if we have duplicate ALUs, we can operate on a batch of data in a single clock tick. The catch? Every lane must perform the same operation.

Single Instruction, Multiple Data
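NumPy is one place this shows up in practice: a single vectorized expression applies the same operation to every element, which NumPy's kernels can map onto SIMD instructions where the hardware supports them. A minimal sketch:

```python
import numpy as np

a = np.arange(8, dtype=np.float32)
b = np.full(8, 2.0, dtype=np.float32)

# One vectorized expression: the SAME "add" is applied to every
# element, which is exactly the shape SIMD hardware wants.
c = a + b

# The scalar equivalent: one add per loop iteration.
c_scalar = np.array([x + y for x, y in zip(a, b)], dtype=np.float32)

print(c)
```

Both produce the same result; the vectorized form is the one a SIMD unit can chew through in a few wide instructions.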
March 3, 2025 at 6:05 PM
2/N Multi-Core Processors:

A single-core processor consists of a control unit, an arithmetic logic unit (ALU), and some registers. What if we duplicate this whole block several times? That is the multi-core architecture. As a programmer, you need to explicitly specify which code runs on which core.
March 3, 2025 at 6:05 PM
1/N Superscalar Processors:

Your program is a list of instructions, and that list almost always contains independent ones. A superscalar processor identifies them and executes them in parallel within the same clock tick.
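"Independent" just means no data dependence between instructions. Superscalar scheduling happens inside the CPU, not in Python, but a sketch makes the idea concrete:

```python
# Illustrative only: the scheduling below is done by CPU hardware,
# not by the Python interpreter.

x, y, u, v = 1, 2, 3, 4

# No data dependence between these two lines: a superscalar core
# could issue both in the same clock tick.
a = x + y
b = u * v

# This line reads both results, so it must wait for them to finish.
c = a + b
print(c)
```

The hardware discovers this independence on the fly, with no help from the programmer.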
March 3, 2025 at 6:05 PM
Reposted by Aritra Roy Gosthipaty
HF model collection for transformers:
huggingface.co/collections/...

HF model collection for OpenCLIP and timm:
huggingface.co/collections/...

And of course big_vision checkpoints:
github.com/google-resea...
SigLIP2 - a google Collection
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
February 22, 2025 at 3:34 PM
Reposted by Aritra Roy Gosthipaty
Paper:
arxiv.org/abs/2502.14786

HF blog post from @arig23498.bsky.social et al. with a gentle intro to the training recipe and a demo:
huggingface.co/blog/siglip2

Thread with results overview from Xiaohua (only on X, sorry - these are all in the paper):
x.com/XiaohuaZhai/...
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
We introduce SigLIP 2, a family of new multilingual vision-language encoders that build on the success of the original SigLIP. In this second iteration, we extend the original image-text training obje...
arxiv.org
February 22, 2025 at 3:34 PM
I forgot to mention that you can use the same code to access any `warm` model on the Hub.

Here is a list of all the `warm` models: huggingface.co/models?infer...

Happy vibe checking 😇

[N/N]
Models - Hugging Face
huggingface.co
December 3, 2024 at 6:41 AM
I have created a simple, quick notebook that uses `huggingface_hub` to access the model through this Inference API.

huggingface.co/datasets/ari...

[4/N]
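The gist of it looks roughly like this (a minimal sketch, assuming `huggingface_hub` is installed; `ask_qwq` is a made-up helper name, and the actual call needs a valid HF token in your environment):

```python
from huggingface_hub import InferenceClient

def ask_qwq(prompt: str, model: str = "Qwen/QwQ-32B-Preview") -> str:
    """Send a chat request to the Serverless Inference API.

    Requires a valid HF token in your environment and the model
    to be loaded ("warm") on the API.
    """
    client = InferenceClient(model)
    response = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
    )
    return response.choices[0].message.content

# Example (hits the network, so it is commented out here):
# print(ask_qwq("How many r's are in 'strawberry'?"))
```

No local GPU involved: the client just points at the hosted model.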
qwq-inference-api.ipynb · ariG23498/quick-notebooks at main
huggingface.co
December 3, 2024 at 6:41 AM
But today was my lucky day. I noticed that the model was already loaded on the Serverless Inference API and ready to be used.

No more spinning up my GPUs and stress testing them (happy GPU noises)

[3/N]
December 3, 2024 at 6:41 AM
My usual workflow is to visit the Hugging Face Hub model card (here that was hf[dot]co/Qwen/QwQ-32B-Preview) and copy the working code sample.

I am sure this is how most of you work with a new model as well (if not, I would love to hear from you)

[2/N]
December 3, 2024 at 6:41 AM
I like the evaluation part. Are there any evals you particularly like?
November 26, 2024 at 11:18 AM
🙋‍♂️ ariG23498
November 23, 2024 at 3:38 PM
Reposted by Aritra Roy Gosthipaty
awesome, thanks a lot for sharing 🙌
November 13, 2024 at 4:37 PM