Santiago Castro
@bryant1410.bsky.social
🇺🇾 Research Scientist @Netflix, working on Vision+Language research. Opinions are my own.
Reposted by Santiago Castro
Pls RT
Permanent Assistant Professor (Lecturer) position in Computer Vision @bristoluni.bsky.social [DL 6 Jan 2025]
This is a permanent research+teaching post within the MaVi group (uob-mavi.github.io) in Computer Science. Suitable for strong postdocs or exceptional PhD graduates.
t.co/k7sRRyfx9o
1/2
https://tinyurl.com/BristolCVLectureship
December 4, 2024 at 5:22 PM
HuggingFace is limiting repositories' storage 😱
December 2, 2024 at 7:21 PM
Reposted by Santiago Castro
Can all graduate programs please accept a universal letter system like Interfolio so we don’t have to upload 100 letters individually?! The time waste is insane.

Students are telling me that only *two* of their applications accept Interfolio!
December 1, 2024 at 7:20 PM
Reposted by Santiago Castro
A librarian who previously worked at the British Library created a relatively small dataset of bsky posts, hundreds of times smaller than those collected by previous researchers, to help folks create toxicity filters and stuff.

So people bullied him & posted death threats.

He took it down.

Nice one, folks.
November 28, 2024 at 5:33 AM
Reposted by Santiago Castro
Personally, reviewing for NeurIPS a couple years back changed me as a reviewer. For one paper I rejected, I kept citing it throughout the year to people for a finding it had. This made me realise it was a good paper, it just had some easy targets for rejection.
November 27, 2024 at 5:25 PM
Reposted by Santiago Castro
Do you know what rating you’ll give after reading the intro? Are your confidence scores 4 or higher? Do you not respond in rebuttal phases? Are you worried how it will look if your rating is the only 8 among 3’s? This thread is for you.
November 27, 2024 at 5:25 PM
Reposted by Santiago Castro
We just dropped CAT4D, text to dynamic 3D models that you can render in real time. Not posting a video because Bluesky is garbage in this respect; go straight to the real time viewer on a desktop browser and look around. The cat kneading dough is my favorite.
cat-4d.github.io
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
We present CAT4D, a method for creating 4D (dynamic 3D) scenes from monocular video. CAT4D leverages a multi-view video diffusion model trained on a diverse combination of datasets to enable novel vie...
November 28, 2024 at 2:50 AM
Reposted by Santiago Castro
In the HuggingFace/Bluesky incident, the problem goes deeper than whether the data is "public" or "private"

What matters to people is whether their data was collected, which data was collected, how it may be used, and who it may be used by
November 27, 2024 at 2:57 PM
Reposted by Santiago Castro
ACL syntax track reviewers >> almost any other conference.

These folks care about their sub-field and I learn something new every time!
November 27, 2024 at 7:44 PM
Reposted by Santiago Castro
On July 26th, Nancy Pelosi sold 5,000 shares of Microsoft ($MSFT).

On Nov 27, 2024, the FTC announced it is launching a wide-ranging US antitrust probe against $MSFT.

This was the largest sale in Nancy Pelosi's portfolio in two years, and $MSFT now trades below her sale price.
November 27, 2024 at 11:44 PM
Reposted by Santiago Castro
Together with Remi Cadene, we are looking for the current best multi-view full-body 3D pose estimation model/software.

Any good advice?

It should preferably include hand pose estimation in addition to body.

Better if it can use multiple cameras as inputs (multi-view).

This is for open-source, low-cost robot teleop.
November 27, 2024 at 10:17 PM
Reposted by Santiago Castro
Today I presented my MSc work "Exploring Approaches to Improvisational Interactive Storytelling" in the student seminar.

I narrated a basic setting and used a die to explain the gamemastering mechanisms to the committee ☺️

OK, now I have to write my thesis! 😅
November 27, 2024 at 4:31 PM
Reposted by Santiago Castro
Check out our new work on video-guided audio gen with a focus on fine-grained creative control! Done by @czyang.bsky.social during an internship with our group at Adobe Research. Super fun model!
🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️Make a typewriter sound like a piano 🎹
🐱Make a cat meow like a lion roars! 🦁
⏱️Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/
November 27, 2024 at 3:00 AM
Reposted by Santiago Castro
Rare personal tweet:
Subletting our furnished apartment in Brooklyn for the spring at a significant discount. It's quite nice and in a fun location. Email me if you are interested, and I will send pictures.
November 25, 2024 at 8:39 PM
Reposted by Santiago Castro
The FATE group at @msftresearch.bsky.social NYC is accepting applications for 2025 interns. 🥳🎉

For full consideration, apply by 12/18.

jobs.careers.microsoft.com/global/en/jo...

Interested in AI evaluation? Apply for the STAC internship too!

jobs.careers.microsoft.com/global/en/jo...
November 25, 2024 at 1:31 PM
Reposted by Santiago Castro
If you want to help improve peer review, we are looking for a new Co-CTO for ACL Rolling Review!

Requirements:
- Post-PhD
- Experienced with Python (including command line use)
- Time commitment of 3 hours a week on average (but note that you are not expected to review while serving)

Contact me!
November 24, 2024 at 7:36 PM
Reposted by Santiago Castro
🌍✨Announcing the 4th edition of the NLP for Positive Impact workshop at #ACL2025 in Vienna!
Come join us and explore various social applications of NLP!
📢 Call for papers & more details coming soon!
🔗https://sites.google.com/view/nlp4positiveimpact/acl-2025-workshop
November 21, 2024 at 8:51 PM
Reposted by Santiago Castro
WhatsApp will soon transcribe your voice messages
Finally, an easy way to skim through lengthy voice clips.
buff.ly
November 21, 2024 at 5:10 PM
Reposted by Santiago Castro
If you're interested in embeddings and SQLite you should be paying attention to sqlite-vec

Lots of neat stuff in this release - and the blog post provides a very clear explanation of what it can do
sqlite-vec v0.1.6 is now out, with metadata support!

SQLite vector search w/ metadata filters 👀

- Perform extra filtering w/ WHERE clause in KNN queries
- Internally shard vector indexes with partition keys
- Aux columns for easy lookups

read more: alexgarcia.xyz/blog/2024/sq...
sqlite-vec now supports metadata columns and filtering
Metadata, partition key, and auxiliary column support in sqlite-vec
alexgarcia.xyz
November 20, 2024 at 5:53 PM
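
For readers unfamiliar with the idea in the sqlite-vec post above, here is a minimal sketch of a nearest-neighbour (KNN) query combined with a metadata filter in a WHERE clause. It deliberately does not use sqlite-vec: it is a brute-force stand-in built on Python's standard `sqlite3` module, so it shows the shape of the query rather than the library's API or performance. The `docs` table, its `topic` and `embedding` columns, and the helpers `l2_distance`, `build_demo_db`, and `knn` are all hypothetical names chosen for illustration.

```python
import json
import math
import sqlite3


def l2_distance(a_json: str, b_json: str) -> float:
    """Euclidean distance between two vectors stored as JSON arrays."""
    a, b = json.loads(a_json), json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table of toy embeddings (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call it; this is plain
    # sqlite3, not sqlite-vec, and it scans every row (no vector index).
    db.create_function("l2_distance", 2, l2_distance)
    db.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, topic TEXT, embedding TEXT)"
    )
    rows = [
        (1, "vision", [1.0, 0.0]),
        (2, "vision", [0.0, 1.0]),
        (3, "nlp", [0.9, 0.1]),
        (4, "nlp", [0.0, 0.9]),
    ]
    db.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        [(i, topic, json.dumps(vec)) for i, topic, vec in rows],
    )
    return db


def knn(db: sqlite3.Connection, query: list, k: int, topic: str) -> list:
    """Return the k rows nearest to `query`, restricted to one topic."""
    sql = (
        "SELECT id, l2_distance(embedding, ?) AS distance "
        "FROM docs WHERE topic = ? "  # the metadata filter
        "ORDER BY distance LIMIT ?"   # the KNN part
    )
    return db.execute(sql, (json.dumps(query), topic, k)).fetchall()
```

Calling `knn(build_demo_db(), [1.0, 0.0], k=1, topic="nlp")` returns the row with id 3, even though id 1 is closer overall, because the filter is applied before the nearest rows are picked. According to the post, sqlite-vec supports this kind of filtering inside its own KNN queries; see the linked blog post for the actual syntax.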