jon esperanza
jonathanesperanza.com
@jonathanesperanza.com
data and ml @creditkarma
airbnb's ml feature platform: Chronon
medium.com/airbnb-engin...

goal: feature engineering velocity and training-serving data consistency

takeaways:
- centralized definition and computation of data
- online and offline compute
- flexible freshness per feature
- several backfill options
Chronon — A Declarative Feature Engineering Framework
A framework for developing production grade features for machine learning models. The purpose of the blog is to provide an overview of…
medium.com
December 7, 2024 at 10:06 PM
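a minimal sketch of the "define once, compute online and offline" takeaway above. this is not Chronon's API: `Event`, `Feature`, `backfill`, and `serve` are hypothetical names, and the in-memory event list stands in for both the warehouse and the online store. the point is only that one definition feeds both paths, so training and serving values agree.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    user: str
    ts: int        # event time in seconds
    amount: float


@dataclass(frozen=True)
class Feature:
    """One central definition, shared by the offline and online paths (hypothetical)."""
    name: str
    window: int                                   # look-back window in seconds
    agg: Callable[[Iterable[float]], float]

    def compute(self, events: List[Event], user: str, as_of: int) -> float:
        # Aggregate the user's events that fall in (as_of - window, as_of].
        vals = [e.amount for e in events
                if e.user == user and as_of - self.window < e.ts <= as_of]
        return self.agg(vals)


def backfill(feature: Feature, events: List[Event],
             rows: List[Tuple[str, int]]) -> List[float]:
    # Offline path: point-in-time values for historical (user, timestamp) rows.
    return [feature.compute(events, user, ts) for user, ts in rows]


def serve(feature: Feature, events: List[Event], user: str, now: int) -> float:
    # Online path: same definition, evaluated at request time.
    return feature.compute(events, user, now)


spend_1h = Feature("spend_1h", window=3600, agg=sum)
events = [Event("u1", 100, 5.0), Event("u1", 2000, 7.0),
          Event("u1", 5000, 1.0), Event("u2", 2500, 9.0)]

offline = backfill(spend_1h, events, [("u1", 3000), ("u1", 6000)])
online = serve(spend_1h, events, "u1", 6000)
```

because both paths call the same `compute`, the value served at `now=6000` matches the backfilled training row for the same timestamp.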
www.zillow.com/tech/contain...
- goal: enable ML predictions at scale with containers

takeaways:
- centralizing repetitive heavy lifting code and abstracting plumbing between components
- same image for model training and serving
- image has several exposed endpoints
Improving the home selling & buying experience by containerizing ML deployments
With Zillow Offers we’re transforming how real estate is bought and sold. Underpinning it is a process we follow that ensures every seller and buyer receives a delightful and consistent experience. Un...
www.zillow.com
November 29, 2024 at 6:19 PM
www.zillow.com/tech/buildin...
- goal: transform raw events into "easy-to-use" data that drives business decisions

real-world example of data warehousing and modeling

takeaways:
- 80/20 analytics for self-service -- build dashboards to answer 80% of questions
Building a strong foundation to accelerate StreetEasy’s data science efforts
Introduction Data is at the foundation of everything we do at StreetEasy and Zillow. Users of our websites and services can get recommendations through our homepage and tailored emails; they can read...
www.zillow.com
November 29, 2024 at 5:39 PM
eng.lyft.com/building-rea...
- goal: enable quicker time to market for real-time ML applications

takeaways:
- understanding customer needs leads to effective solutions
- simple and intuitive design to drive adoption
- importance of documentation
Building Real-time Machine Learning Foundations at Lyft
In early 2022, Lyft already had a comprehensive Machine Learning Platform called LyftLearn composed of model serving, training, CI/CD…
eng.lyft.com
November 29, 2024 at 5:10 PM
careersatdoordash.com/blog/leverag...
- goal: send timely notifications to users who abandon their carts
- problem: hard to determine if users truly abandoned their cart or are still browsing
- solution: group real-time events into user sessions and trigger notifications upon inactivity
Leveraging Flink to Detect User Sessions and Engage DoorDash Consumers with Real-Time Notifications - DoorDash
Doordash optimizes real-time notifications with the frontend events by leveraging streaming processing.
careersatdoordash.com
November 23, 2024 at 5:02 AM
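a rough sketch of the session idea in the DoorDash note above: split each user's events into sessions whenever the gap between events exceeds an inactivity threshold, then flag users whose latest session has gone quiet. this is a plain-Python batch stand-in, not Flink code. the 30-minute `SESSION_GAP`, the `(user, ts)` event shape, and both function names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

SESSION_GAP = 30 * 60  # assumed inactivity gap, in seconds


def sessionize(events: Iterable[Tuple[str, int]],
               gap: int = SESSION_GAP) -> Dict[str, List[List[int]]]:
    """Group (user, timestamp) events into per-user sessions split by inactivity."""
    by_user: Dict[str, List[List[int]]] = defaultdict(list)
    for user, ts in sorted(events, key=lambda e: e[1]):
        sessions = by_user[user]
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)      # still inside the current session
        else:
            sessions.append([ts])        # gap exceeded: start a new session
    return dict(by_user)


def users_to_notify(sessions: Dict[str, List[List[int]]], now: int,
                    gap: int = SESSION_GAP) -> List[str]:
    """Users whose most recent session has been idle for longer than the gap."""
    return sorted(u for u, s in sessions.items() if now - s[-1][-1] > gap)


events = [("a", 0), ("a", 600), ("b", 100), ("a", 4000), ("b", 5000)]
sessions = sessionize(events)
idle = users_to_notify(sessions, now=6000)
```

in a streaming job the "session closed" signal would come from a timer or session window rather than from a `now` argument. the batch version just makes the grouping rule easy to see.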
durability: web.archive.org/web/20180422...
- definition of durability has evolved over time
- basic replication helps, but not enough for "real" durability
- proposed solution: replication in distributed systems
- multiple nodes can work together as a cohesive system to manage data
Whatever happened to Durability? - DriveScale
Durability is one of the fundamental properties that people expect from data bases and file systems. From Wikipedia: "The durability property ensures that..
web.archive.org
November 21, 2024 at 3:55 AM
reading atproto.com/specs/crypto...
- atproto supports p256 and k256 elliptic curves, bluesky defaults to k256
- public keys on both curves can be stored in a lossless compressed form
- common signing pattern: encode data in DAG-CBOR ➡️ SHA-256 ➡️ sign hash bytes
- encode public keys using multibase and multicodec
November 19, 2024 at 5:33 AM
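a small sketch of the public key encoding from the notes above, assuming the did:key layout i took from the spec: a 33-byte compressed key, prefixed with the curve's multicodec varint, then base58btc-encoded with the multibase `z` prefix. the base58 helpers are hand-rolled to keep this self-contained. the key bytes below are placeholder bytes, not a real curve point, and no signing happens here.

```python
from typing import Tuple

# base58btc alphabet (no 0, O, I, l)
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# varint-encoded multicodec prefixes for compressed public keys
MULTICODEC = {"p256": bytes([0x80, 0x24]), "k256": bytes([0xE7, 0x01])}

DID_KEY_PREFIX = "did:key:z"  # "z" is the multibase marker for base58btc


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))   # leading zero bytes map to "1"
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def encode_did_key(curve: str, compressed_key: bytes) -> str:
    if len(compressed_key) != 33:
        raise ValueError("expected a 33-byte compressed public key")
    return DID_KEY_PREFIX + b58encode(MULTICODEC[curve] + compressed_key)


def decode_did_key(did: str) -> Tuple[str, bytes]:
    if not did.startswith(DID_KEY_PREFIX):
        raise ValueError("not a base58btc did:key")
    raw = b58decode(did[len(DID_KEY_PREFIX):])
    for curve, prefix in MULTICODEC.items():
        if raw.startswith(prefix):
            return curve, raw[len(prefix):]
    raise ValueError("unknown multicodec prefix")


placeholder_key = bytes([0x02]) + bytes(range(32))   # NOT a real key
did = encode_did_key("k256", placeholder_key)
curve, key = decode_did_key(did)
```

the multicodec prefix is what lets a reader tell a p256 key from a k256 key without any out-of-band metadata.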
reading atproto.com/specs/reposi...
- data repo structure is a merkle search tree (mst)
- top level node is a signed commit object pointing to the mst root node
- all mutations to records result in a new mst root node ➡️ a new signed commit object

💡: efficient, verifiable, portable
November 19, 2024 at 4:18 AM
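a toy sketch of why any record mutation produces a new root and therefore a new commit. this is not a real merkle search tree: `tree_root` is a flat hash over sorted records, `cid` uses SHA-256 over canonical JSON instead of DAG-CBOR CIDs, and the commit is unsigned. the record keys and the commit fields are simplified assumptions.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def cid(obj: Any) -> str:
    """Content address: SHA-256 of a deterministic JSON encoding (stand-in for a CID)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def tree_root(records: Dict[str, Any]) -> str:
    """Hash the sorted (key, record hash) pairs; a flat stand-in for the MST root."""
    leaves = [[key, cid(value)] for key, value in sorted(records.items())]
    return cid(leaves)


def commit(records: Dict[str, Any], prev: Optional[str] = None,
           rev: int = 0) -> Dict[str, Any]:
    """Unsigned commit pointing at the tree root (real commits carry a signature)."""
    body = {"data": tree_root(records), "prev": prev, "rev": rev}
    return {**body, "cid": cid(body)}


records_v1 = {"app.bsky.feed.post/1": {"text": "hello"}}
c1 = commit(records_v1)

# Any mutation (here: adding a record) changes the root, so the commit changes too.
records_v2 = {**records_v1, "app.bsky.feed.post/2": {"text": "atproto notes"}}
c2 = commit(records_v2, prev=c1["cid"], rev=1)
```

since everything is content-addressed, two parties holding the same records derive the same root, which is what makes the repo verifiable and portable.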
as a backend engineer, i started with atproto.com/articles/atp....
awesome article. easy to digest. once you understand the decentralized system you can move on to the details of the protocol, atproto.com/specs/atp.
November 18, 2024 at 3:41 AM
i'll be using bluesky as a repository of my notes, thoughts, and ideas. i'll start with diving into atproto
November 18, 2024 at 3:34 AM