Ramchalam K R
ramchalamkr.bsky.social
ML Research Engineer at the intersection of model training and efficient inference on NPUs
September 19, 2025 at 11:41 AM
Gosh, thanks. I got confused about whether this was another pack. Apologies :)
December 14, 2024 at 12:06 AM
I'd like to be added. Thanks
December 14, 2024 at 12:02 AM
Alright thank you for the clarification.
December 11, 2024 at 10:19 PM
I worked on structured pruning of weights a few years ago, and since we were focused on deployment to edge devices, it was critical. This approach on the activations/attention heads is interesting, although the inference graph of the base model wouldn't really change. Would love to discuss further.
December 10, 2024 at 4:11 PM
I do have a few points of clarification. Maybe I will drop by the poster session.
But stage 1 looks to me like structured pruning on the activations. What I am curious about is whether this approach helps improve inference. I presume we then wouldn't need to do the compute for certain heads?
December 10, 2024 at 4:08 PM
NeurIPS Poster: Stepping Forward on the Last Mile (NeurIPS 2024)
neurips.cc
December 10, 2024 at 2:29 AM
Very interesting work on LoFiT!
December 8, 2024 at 10:07 PM
I'd like to be added to the pack, as I will be at NeurIPS 2024 as well. Thanks
December 8, 2024 at 9:06 PM
December 8, 2024 at 4:15 PM
December 6, 2024 at 3:57 PM