Raman Dutt
@ramandutt4.bsky.social
Generative AI @Noah's Ark Lab, Huawei & @TuringInstitute | PhD candidate in Biomedical AI @ University of Edinburgh | Efficient Fine-Tuning in Medical AI, Diffusion Models, Autoregressive Image Generation
This de-identification token is consistently present throughout the dataset and contributes nothing towards improving image quality.

This points to a major flaw in the dataset, given that MIMIC is one of the most significant medical datasets for T2I generation. 💔
February 22, 2025 at 3:26 PM
It turns out that this de-identification token (“___”) is the single biggest contributor to the memorization of training images.

In other words, the steps taken to protect patient information are, in fact, posing a threat to it.
February 22, 2025 at 3:26 PM
The MIMIC dataset contains pairs of images and their corresponding text reports. These are raw reports describing the images.

In the dataset, sensitive patient information is hidden, or de-identified, by replacing it with three underscores (“___”).
February 22, 2025 at 3:25 PM
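The de-identification scheme described above can be sketched as a simple substitution pass. This is a hypothetical illustration (the patterns and helper name are my own, not MIMIC's actual pipeline): PHI-like spans are replaced by the “___” placeholder token.

```python
import re

# Hypothetical PHI patterns for illustration only; real de-identification
# pipelines (e.g. the one used for MIMIC-CXR) are far more thorough.
PHI_PATTERNS = [
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",   # dates such as 03/14/2019
    r"\bDr\.\s+[A-Z][a-z]+\b",        # clinician names such as "Dr. Smith"
    r"\bMRN\s*\d+\b",                 # medical record numbers
]

def deidentify(report: str) -> str:
    """Replace each PHI-like span with the '___' placeholder token."""
    for pattern in PHI_PATTERNS:
        report = re.sub(pattern, "___", report)
    return report

report = "Seen by Dr. Smith on 03/14/2019. MRN 123456. No acute findings."
print(deidentify(report))
# -> Seen by ___ on ___. ___. No acute findings.
```

The key property exploited by the memorization result: the same literal token “___” appears in nearly every report, making it a strong, dataset-wide trigger.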
Done!
November 27, 2024 at 1:13 PM
Hence, we devise MemControl - a framework that searches for the optimal subset of parameters to fine-tune so as to:

(1) Improve image generation quality
(2) Reduce memorization!

MemControl finds the optimal model capacity to use during fine-tuning: not more, not less!
November 26, 2024 at 6:32 PM
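The search described above can be sketched as a toy bi-objective selection over candidate parameter subsets. Everything here is hypothetical (the subset names, scores, and the `memcontrol_search` helper are illustrative, not the paper's actual method): in practice each subset's scores would come from fine-tuning a diffusion model and measuring generation quality and memorization.

```python
# Hypothetical per-subset scores (higher quality is better,
# higher memorization is worse); illustrative numbers only.
SUBSET_SCORES = {
    ("attn",):                {"quality": 0.70, "memorization": 0.10},
    ("attn", "bias"):         {"quality": 0.82, "memorization": 0.15},
    ("attn", "norm"):         {"quality": 0.85, "memorization": 0.20},
    ("attn", "bias", "norm"): {"quality": 0.86, "memorization": 0.45},
}

def memcontrol_search(scores, alpha=1.0):
    """Pick the subset maximizing quality - alpha * memorization."""
    return max(scores, key=lambda s: scores[s]["quality"]
                                     - alpha * scores[s]["memorization"])

best = memcontrol_search(SUBSET_SCORES)
print(best)  # with these toy numbers: ('attn', 'bias')
```

The trade-off parameter `alpha` encodes how heavily memorization is penalized relative to quality; full fine-tuning (all subsets) loses here because its memorization cost outweighs its small quality gain.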
We also found that fine-tuning different subsets of parameters in a diffusion model affects generative quality and memorization differently!
Each marker in the figure is a diffusion model fine-tuned on the same data but with a different parameter subset.

Full FT (green) leads to high memorization!
November 26, 2024 at 6:30 PM
We provide empirical proof that reducing the model capacity (by fine-tuning fewer parameters) can lead to reduced memorization!

Q. How to fine-tune with fewer parameters? 🤔
A. Parameter-Efficient Fine-Tuning (PEFT) ✨
November 26, 2024 at 6:26 PM
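A back-of-the-envelope sketch of why PEFT reduces trainable capacity: a LoRA-style rank-r adapter replaces a trainable d × d weight update with two thin matrices (d × r and r × d). The helper names below are my own, for illustration.

```python
def full_ft_params(d: int) -> int:
    """Trainable parameters when the whole d x d weight is updated."""
    return d * d

def lora_params(d: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA update: B (d x r) + A (r x d)."""
    return 2 * d * r

d, r = 1024, 8
print(full_ft_params(d))   # 1048576
print(lora_params(d, r))   # 16384, i.e. ~1.6% of full fine-tuning
```

With orders of magnitude fewer trainable parameters, the fine-tuned model has far less capacity to store individual training images verbatim.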
Conventional fine-tuning (full fine-tuning) can cause the model to replicate artifacts from training X-rays, which can leak patient information and thus endanger patient privacy.

Artifact replication is shown in red boxes.
November 26, 2024 at 6:24 PM
Would love to join!
November 26, 2024 at 11:24 AM
Please add me 😂
November 25, 2024 at 4:06 PM
I would love to be added 😂
November 25, 2024 at 4:05 PM
Again, very sorry to hear about what you are going through. Advertising here is a great idea. I personally got some good advice about a condition I was going through. Wish I could be more helpful though. Wishing you the best!
November 25, 2024 at 3:34 PM
So sorry to hear about this @ian-goodfellow.bsky.social . Do you think any of this might be related to prolonged headphone usage, in addition to other factors? Or that it exacerbates the condition?
November 25, 2024 at 3:26 PM