#NYC_Taxi
Ok, just looked at the benchmark overview and I’m a little disappointed. Comparing performance on a dataset of ~30 GB tells me very little, given it could all fit into RAM on commodity hardware. Like, at that point the difference mostly comes down to chunking.
January 2, 2025 at 3:29 AM
open_dataset("nyc-taxi/")

nyc_taxi |>
  filter(payment_type == "Credit card") |>
  group_by(year, month) |>
  write_dataset("nyc-taxi-credit")

Input is 1.7 billion rows (70 GB), output is 500 million (15 GB). Takes 3-4 mins on my laptop 🙂

#rstats (2/3)
November 23, 2024 at 12:55 AM
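
[Editor's note: figures like the row counts above can be checked with the same lazy arrow/dplyr machinery, without pulling the data into R. A minimal sketch, assuming the partitioned "nyc-taxi-credit" output written by the pipeline in the post above:]

library(arrow)
library(dplyr)

# Open the partitioned output lazily and count rows; the query runs in
# the arrow engine and only the one-row summary is collected into R.
open_dataset("nyc-taxi-credit") |>
  summarise(rows = n()) |>
  collect()

# nrow() on an arrow Dataset also reports the row count without
# materialising the data in memory.
nrow(open_dataset("nyc-taxi-credit"))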
RT @djnavarro@fosstodon.org
My favourite trick for working with huge datasets in R: if your dataset is larger than memory and the query result is also larger than memory, you can still use dplyr/arrow pipelines. Example:

library(arrow)
library(dplyr)

nyc_taxi <- (1/3)
November 23, 2024 at 12:49 AM
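
[Editor's note: read bottom-to-top, the code fragments split across posts (1/3) and (2/3) assemble into a single pipeline. A sketch of the combined block, assuming "nyc-taxi/" is a directory of Parquet files:]

library(arrow)
library(dplyr)

# Open the 1.7-billion-row dataset lazily; nothing is read into RAM yet.
nyc_taxi <- open_dataset("nyc-taxi/")

# Filter, then write the result back out as a new dataset partitioned by
# year and month. The query streams through in batches, so neither the
# input nor the output ever has to fit in memory.
nyc_taxi |>
  filter(payment_type == "Credit card") |>
  group_by(year, month) |>
  write_dataset("nyc-taxi-credit")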