Lightnews — Scholar-powered news

David Callies

@dcallies.bsky.social

I've been pilled on negative offsets through python: last = val[-1]
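For readers unfamiliar with the idiom, here is a minimal sketch of negative offsets in Python. The list `val` and its contents are made up for illustration.

```python
# Hypothetical sample data, only for illustration.
val = ["a", "b", "c", "d"]

last = val[-1]         # last element, same as val[len(val) - 1]
second_last = val[-2]  # negative offsets count backwards from the end
tail = val[-2:]        # they also work as slice bounds

print(last, second_last, tail)
```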

October 17, 2025 at 6:39 PM

David Callies

@dcallies.bsky.social

How many social surveys did you complete this week??? We noticed you aren't sharing your personal hobbies in work slack, what's with that, you know your performance is tied to internal likes right?

Vs

5 stars or death, working extremely late hours to clear the support queue, etc :P

May 30, 2025 at 1:14 PM

David Callies

@dcallies.bsky.social

I enjoyed the book, but many of its premises are so disconnected from workability that it was distracting; maybe that's what I get for working in the industry. However, the onsite apartments were rad, and the real-life equivalent, the CitizenM in Menlo Park near Meta HQ, is net value to the world imho

May 28, 2025 at 3:07 AM

David Callies

@dcallies.bsky.social

As a "graphql is the required API" fully Stockholm syndromed employee, what stinks about graphql? We have oodles of integrated tooling, so guessing /path/to/thing?param=yes is way easier than the DSL query/finding fetch__the_thing()?

April 11, 2025 at 9:48 PM

David Callies

@dcallies.bsky.social

The Q in PDQ stands for quality, and it's an attempt by the algorithm to identify images that it's not very good at discerning between before it discards all the information by turning it into a hash.

Classic low quality images are highly padded images, regular patterns, or even blank squares.

April 10, 2025 at 4:09 PM

David Callies

@dcallies.bsky.social

Awesome work! I've been poking around the edges of fediverse land to see if PDQ would be valuable to use for providing image based features (especially harm detection). Let me know if you think the idea has legs, happy to chat with you about it!

April 10, 2025 at 2:49 PM

David Callies

@dcallies.bsky.social

Sure did! Over time I've tried to copy out key portions into the various readmes. Now that hashing.pdf is cross linked in a lot of other places, I've been reluctant to touch it...

April 10, 2025 at 1:13 PM

David Callies

@dcallies.bsky.social

@hailey.at - make sure to filter out low quality score (<50) PDQ hashes! Most common mistake using PDQ, otherwise the results are random instead of perceptually clustered!
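A rough sketch of the filtering step described above, assuming each image already has a PDQ hash and quality score from some PDQ implementation. The `HashedImage` type, `keep_usable` helper, and sample values are hypothetical and not part of any PDQ library; only the threshold of 50 comes from the post.

```python
from typing import NamedTuple

# Threshold from the post above: hashes scoring below 50 are dropped.
MIN_QUALITY = 50


class HashedImage(NamedTuple):
    """Hypothetical record type for this sketch, not a PDQ library type."""
    name: str
    pdq_hex: str   # 256-bit PDQ hash as 64 hex characters
    quality: int   # PDQ quality score, 0 to 100


def keep_usable(items: list[HashedImage], min_quality: int = MIN_QUALITY) -> list[HashedImage]:
    """Return only the items whose quality score meets the threshold."""
    return [item for item in items if item.quality >= min_quality]


# Made-up sample values; real hashes come from a PDQ implementation.
samples = [
    HashedImage("photo.jpg", "f" * 64, 100),
    HashedImage("blank.png", "0" * 64, 0),
    HashedImage("padded.png", "a" * 64, 37),
]

usable = keep_usable(samples)
print([item.name for item in usable])
```

Filtering before any matching or clustering keeps low-information hashes (blank or heavily padded images, for example) out of the comparison set.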

April 10, 2025 at 1:08 PM

David Callies

@dcallies.bsky.social

Awesome work, and super exciting! We debated whether to build this exact thing when we did HMA 1.0, tested it with Workplace instead (less exciting).

cc @julietshen.bsky.social

March 26, 2025 at 11:31 PM

David Callies

@dcallies.bsky.social

I think my meeting load was lighter when I was a manager, but not by much :P

March 24, 2025 at 1:42 AM

David Callies

@dcallies.bsky.social

Earlier on in our team's work on HMA, we did some theorizing on this, we were calling it the "Safety Stack" (had logos picked out and everything).

AT/AP are both good models because they imagine interoperable social media. Why not interoperable T&S?

March 24, 2025 at 1:38 AM

David Callies

@dcallies.bsky.social

We should connect on a potential roadmap. Maybe if we highlighted where the gaps were, focusing on how theoretical systems might communicate, we can make some space for more tools to pop up that will be interoperable with the ROOST-verse. [1/2]

March 24, 2025 at 1:38 AM

David Callies

@dcallies.bsky.social

Pick something that you might conceivably use. I got my start editing the configs of games I played, then learned more to make my own games. My default rec is always python, but it should come with a project - raspberry pi is a good gateway for physical devices that uses it.

March 23, 2025 at 11:21 PM

David Callies

@dcallies.bsky.social

Not a lot of HMA or not a lot of other tools in the same vein?

March 23, 2025 at 11:16 PM

David Callies

@dcallies.bsky.social

Fridays are basically my only light day, 1h, leaving me mostly to learning my job is safe from LLMs, who are much worse at react than I am (and I'm bad).

Don't talk to me about Tuesdays

March 21, 2025 at 10:19 PM

David Callies

@dcallies.bsky.social

The benefit of trimming the record by default for new partners catching up seemed to be worth optimizing for, and a belief that integrators sophisticated enough to use full history would be limited led to my belief that only sharing active records was preferable. History-less also seemed simpler 🤷

March 19, 2025 at 5:09 PM

David Callies

@dcallies.bsky.social

Agree on attack potentials, I think we have different conclusions on if we need a full immutable record history as a native functionality. We have some time limited history in ThreatExchange (only on harmful/not harmful), but we prune records after 90d. [1/2]

March 19, 2025 at 5:09 PM

David Callies

@dcallies.bsky.social

Full transitions add more transparency, but I've yet to want that from someone else from a T&S perspective, though I suppose someone with a different background (academic) might. You can synthesize the full transition record if you have stayed up to date on the replication.

March 19, 2025 at 3:10 AM

David Callies

@dcallies.bsky.social

To add more context to the question, retaining history seemed undesirable - if we are undoing a mistaken report on benign content, it seemed better for that content to be fully forgotten. The transitions add length to the record without necessarily adding functionality toward the goal of replication.

March 19, 2025 at 3:08 AM

David Callies

@dcallies.bsky.social

Read some of the UUID docs, but why does it make sense to make the dataset append only?

To make sure I have the right use case, are the datasets we are talking about replicating trust and safety data?

March 18, 2025 at 2:14 PM

David Callies

@dcallies.bsky.social

When people ask me, I say (update_time, record_id) as the sort order, which relies on the record having both of those things. Is UUID being used to mask the underlying id?
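A minimal sketch of the `(update_time, record_id)` sort order mentioned above. The `Record` class, its `payload` field, and the sample values are assumptions made for the example; only the two key fields come from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record shape; key field names follow the post above."""
    record_id: int
    update_time: int  # e.g. a Unix timestamp
    payload: str


def sort_key(record: Record) -> tuple[int, int]:
    # update_time orders the feed; record_id breaks ties between
    # records updated at the same instant, so the order is total.
    return (record.update_time, record.record_id)


records = [
    Record(record_id=7, update_time=200, payload="c"),
    Record(record_id=3, update_time=100, payload="a"),
    Record(record_id=5, update_time=200, payload="b"),
]

ordered = sorted(records, key=sort_key)
print([r.record_id for r in ordered])
```

Without the `record_id` tie-break, two records sharing an `update_time` would have no guaranteed relative order between requests.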

March 18, 2025 at 10:35 AM

David Callies

@dcallies.bsky.social

The interface makes date pagination optional, but the ability to sort by date falls out of the requirement that the API detect updates and produce correct output even if the items are being updated underneath you. The easiest way to do that is to put new updates at the end.
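To illustrate why putting updates at the end keeps output correct, here is a toy sketch using an in-memory dict as a stand-in for the server. The `store`, `fetch_page`, and cursor shape are all hypothetical and do not reflect any real API.

```python
# In-memory stand-in for a server-side dataset: record_id -> (update_time, value).
store = {1: (10, "a"), 2: (20, "b"), 3: (30, "c")}


def fetch_page(cursor, limit=2):
    """Return up to `limit` records strictly after `cursor`, plus the next cursor."""
    rows = sorted((t, rid, v) for rid, (t, v) in store.items())
    page = [row for row in rows if (row[0], row[1]) > cursor][:limit]
    next_cursor = (page[-1][0], page[-1][1]) if page else cursor
    return page, next_cursor


cursor = (0, 0)
seen = {}

page, cursor = fetch_page(cursor)  # first page: records 1 and 2
seen.update({rid: v for _, rid, v in page})

# Record 1 is updated while the client is mid-pagination. Its new
# update_time moves it to the end of the ordering.
store[1] = (40, "a2")

while True:
    page, cursor = fetch_page(cursor)
    if not page:
        break
    seen.update({rid: v for _, rid, v in page})

print(seen)
```

Because the updated record reappears after the cursor, the client picks up the new value without restarting, and nothing already fetched is skipped.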

March 18, 2025 at 10:34 AM

David Callies

@dcallies.bsky.social

E.g. here's the line of code for NCMEC: report.cybertip.org/hashsharing/...

See also this issue: github.com/facebook/Thr...

March 18, 2025 at 2:54 AM
