David Callies
@dcallies.bsky.social
Open Source Trust & Safety Software Engineer
I've been pilled on negative offsets through python: last = val[-1]
October 17, 2025 at 6:39 PM
How many social surveys did you complete this week??? We noticed you aren't sharing your personal hobbies in work slack, what's with that, you know your performance is tied to internal likes right?

Vs

5 stars or death, working extremely late hours to clear the support queue, etc :P
May 30, 2025 at 1:14 PM
I enjoyed the book, but many of its premises are so disconnected from workability that it was distracting. Maybe that's what I get for working in the industry. However, the onsite apartments were rad, and the real life equivalent of the CitizenM in Menlo Park near MetaHQ is net value to the world imho
May 28, 2025 at 3:07 AM
As a "graphql is the required API" fully Stockholm syndromed employee, what stinks about graphql? We have oodles of integrated tooling, so guessing /path/to/thing?param=yes is way easier than the DSL query/finding fetch__the_thing()?
April 11, 2025 at 9:48 PM
The Q in PDQ stands for quality, and it's an attempt by the algorithm to identify images that it's not very good at discerning between before it discards all the information by turning it into a hash.

Classic low quality images are highly padded images, regular patterns, or even blank squares.
April 10, 2025 at 4:09 PM
Awesome work! I've been poking around the edges of fediverse land to see if PDQ would be valuable to use for providing image based features (especially harm detection). Let me know if you think the idea has legs, happy to chat with you about it!
April 10, 2025 at 2:49 PM
Sure did! Over time I've tried to copy out key portions into the various readmes. Now that hashing.pdf is cross linked in a lot of other places, I've been reluctant to touch it...
April 10, 2025 at 1:13 PM
@hailey.at - make sure to filter out low quality score (<50) PDQ hashes! Most common mistake using PDQ, otherwise the results are random instead of perceptually clustered!
April 10, 2025 at 1:08 PM
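A minimal sketch of the quality filter suggested in the post above. It assumes each hashing result is already available as a (hash, quality) pair; the `PdqResult` type and `usable_hashes` helper are hypothetical names for illustration, not part of any PDQ library. The cutoff of 50 comes from the post.

```python
from typing import List, NamedTuple

# Cutoff from the post: hashes with quality below 50 should be dropped.
QUALITY_THRESHOLD = 50


class PdqResult(NamedTuple):
    """Hypothetical container for one hashing result."""
    hash_hex: str  # 256-bit PDQ hash written as 64 hex characters
    quality: int   # PDQ quality score


def usable_hashes(results: List[PdqResult]) -> List[PdqResult]:
    """Keep only results whose quality meets the threshold."""
    return [r for r in results if r.quality >= QUALITY_THRESHOLD]


if __name__ == "__main__":
    results = [
        PdqResult("a" * 64, 100),  # detailed photo
        PdqResult("0" * 64, 3),    # blank square, near-zero quality
        PdqResult("f" * 64, 50),   # exactly at the cutoff, kept
    ]
    for r in usable_hashes(results):
        print(r.hash_hex[:8], r.quality)
```

Filtering before indexing or matching keeps padded images, regular patterns, and blank squares from producing matches that look random.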
Awesome work, and super exciting! We debated whether to build this exact thing when we did HMA 1.0, tested it with Workplace instead (less exciting).

cc @julietshen.bsky.social
March 26, 2025 at 11:31 PM
I think my meeting load was lighter when I was a manager, but not by much :P
March 24, 2025 at 1:42 AM
Earlier on in our team's work on HMA, we did some theorizing on this, we were calling it the "Safety Stack" (had logos picked out and everything).

AT/AP are both good models because they imagine interoperable social media. Why not interoperable T&S?
March 24, 2025 at 1:38 AM
We should connect on a potential roadmap. Maybe if we highlighted where the gaps were, focusing how theoretical systems might communicate, we can make some space for more tools to pop up that will be interoperable with the ROOST-verse. [1/2]
March 24, 2025 at 1:38 AM
Pick something that you might conceivably use. I got my start editing the configs of games I played, then learned more to make my own games. My default rec is always python, but it should come with a project - raspberry pi is a good gateway for physical devices that uses it.
March 23, 2025 at 11:21 PM
Not a lot of HMA or not a lot of other tools in the same vein?
March 23, 2025 at 11:16 PM
Fridays are basically my only light day, 1h, leaving me mostly to learning my job is safe from LLMs, who are much worse at react than I am (and I'm bad).

Don't talk to me about Tuesdays
March 21, 2025 at 10:19 PM
The benefit of trimming the record by default for new partners catching up seemed to be worth optimizing for, and a belief that integrators sophisticated enough to use full history would be limited led to my belief that only sharing active records was preferable. History-less also seemed simpler 🤷
March 19, 2025 at 5:09 PM
Agree on attack potentials, I think we have different conclusions on if we need a full immutable record history as a native functionality. We have some time limited history in ThreatExchange (only on harmful/not harmful), but we prune records after 90d. [1/2]
March 19, 2025 at 5:09 PM
Full transitions add more transparency, but I've yet to want that from someone else from a T&S perspective, though I suppose a different background (academic) might. You can synthesize the full transition record if you have made it up to date on the replication.
March 19, 2025 at 3:10 AM
To add more context to the question, retaining history seemed undesirable - if we are undoing a mistaken report on benign content, it seemed better for that content to be fully forgotten. The transitions add length to the record without necessarily adding functionality on the goal of replication.
March 19, 2025 at 3:08 AM
Read some of the UUID docs, but why does it make sense to make the dataset append only?

To make sure I have the right use case, are the datasets we are talking about replicating trust and safety data?
March 18, 2025 at 2:14 PM
When people ask me, I say (update_time, record_id) as the sort order, which relies on the record having both of those things. Is UUID being used to mask the underlying id?
March 18, 2025 at 10:35 AM
The interface makes date pagination optional, but the ability to sort by date falls out of the requirement that the API detect updates and to have a correct output even if the items are being updated underneath you. The easiest way to do that is put new updates at the end.
March 18, 2025 at 10:34 AM
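A rough sketch of the (update_time, record_id) ordering described in the two posts above, assuming an in-memory list of records and a cursor made of the last pair seen. The `Record` type, `fetch_page` function, and cursor shape are hypothetical and are not taken from ThreatExchange or any real API.

```python
from typing import List, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    record_id: int
    update_time: int  # e.g. a unix timestamp of the last update
    payload: str


# Cursor is the (update_time, record_id) of the last record returned.
Cursor = Tuple[int, int]


def fetch_page(
    records: List[Record], after: Optional[Cursor], limit: int
) -> Tuple[List[Record], Optional[Cursor]]:
    """Return up to `limit` records strictly after the cursor."""
    ordered = sorted(records, key=lambda r: (r.update_time, r.record_id))
    if after is not None:
        ordered = [r for r in ordered if (r.update_time, r.record_id) > after]
    page = ordered[:limit]
    # With no new records, keep the old cursor so the caller can poll again.
    next_cursor = (page[-1].update_time, page[-1].record_id) if page else after
    return page, next_cursor


if __name__ == "__main__":
    store = [Record(1, 100, "a"), Record(2, 100, "b"), Record(3, 105, "c")]
    page, cursor = fetch_page(store, None, 2)
    # Record 1 is updated mid-iteration: its new update_time moves it to the end.
    store[0] = Record(1, 110, "a2")
    page, cursor = fetch_page(store, cursor, 10)
    print([r.record_id for r in page])
```

Because an update bumps `update_time`, the changed record reappears after the cursor, so a client that keeps paging picks up the update without rescanning earlier pages. The `record_id` tiebreaker keeps the order stable when several records share a timestamp.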
E.g. here's the line of code for NCMEC: report.cybertip.org/hashsharing/...

See also this issue: github.com/facebook/Thr...
March 18, 2025 at 2:54 AM