Morris Alper
@malper.bsky.social
PhD student researching multimodal learning (language, vision, ...).
Also a linguistics enthusiast.
morrisalp.github.io
ConlangCrafter could potentially be used in pedagogy, typological and NLP work, and many entertainment applications. Imagine a video game where aliens can speak countless new procedurally-generated languages.
October 11, 2025 at 5:35 AM
To enhance consistency and diversity, our pipeline incorporates randomness injection and self-refinement mechanisms. We measure these properties with our novel evaluation framework, providing rigorous evaluation for the new task of computational conlanging.
October 11, 2025 at 5:35 AM
The ConlangCrafter pipeline harnesses an LLM to generate a description of a constructed language, self-refining it in the process. We decompose language creation into phonology, grammar, and lexicon, then translate sentences while constructing new grammar points as needed.
October 11, 2025 at 5:35 AM
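The staged decomposition with randomness injection and self-refinement might be sketched roughly as below. The `llm` function is a hypothetical stand-in for a real model call, and the prompts, stage order, and refinement count are illustrative assumptions, not the paper's actual implementation.

```python
import random

def llm(prompt: str) -> str:
    # Stand-in for a real LLM API call (hypothetical; swap in your provider).
    return f"<generated from: {prompt[:40]}...>"

def refine(description: str, rounds: int = 2) -> str:
    # Self-refinement: ask the model to critique and revise its own output.
    for _ in range(rounds):
        description = llm(f"Critique and revise for consistency:\n{description}")
    return description

def craft_conlang(seed_idea: str) -> dict:
    # Randomness injection: a sampled seed nudges each stage toward diversity.
    seed = random.randrange(10**6)
    phonology = refine(llm(f"[seed {seed}] Design a phonology for: {seed_idea}"))
    grammar = refine(llm(f"[seed {seed}] Design a grammar given:\n{phonology}"))
    lexicon = refine(llm(f"[seed {seed}] Build a lexicon given:\n{grammar}"))
    return {"phonology": phonology, "grammar": grammar, "lexicon": lexicon}
```

Each stage conditions on the previous stages' output, which is what lets the lexicon stay consistent with the grammar and phonology.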
Conlangs (Constructed Languages), from Tolkien’s Elvish to Esperanto, have long been created for artistic, philosophical, or practical purposes.
As generative AI proves its creative power, we ask:
Can it also take on the laborious art of conlang creation?
October 11, 2025 at 5:35 AM
The number of languages in the world just got a lot higher! At least constructed ones.
Meet ConlangCrafter - a pipeline for creating novel languages with LLMs.
A Japanese-Esperanto creole? An alien cephalopod color-based language?
Enter your idea and see a conlang emerge. 🧵👇
October 11, 2025 at 5:35 AM
At inference time, we inject the appearance of the observed view to get consistent novel views. This also enables cool applications like appearance-conditioned NVS! (4/5)
June 17, 2025 at 4:16 PM
To learn from this data, we use a novel multi-view diffusion architecture adapted from CAT3D, modeling appearance variations with a bottleneck encoder applied to VAE latents and disambiguating scene scale via warping. (3/5)
June 17, 2025 at 4:16 PM
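The bottleneck idea can be illustrated with a toy NumPy sketch: pooling a view's VAE latents and squeezing them through a few dimensions so only global appearance (lighting, tone) can pass, not scene geometry. The shapes, pooling, and random weights here are illustrative assumptions, not the actual WildCAT3D architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def bottleneck_encode(latents: np.ndarray, dim: int = 4) -> np.ndarray:
    # Pool a view's VAE latents to one vector, then project it through a
    # low-dimensional bottleneck: only a few numbers survive, so global
    # appearance fits but per-pixel scene structure cannot.
    pooled = latents.mean(axis=(1, 2))                # (C,) global average pool
    W_down = rng.standard_normal((dim, pooled.size))  # stand-ins for learned weights
    W_up = rng.standard_normal((pooled.size, dim))
    code = W_down @ pooled                            # (dim,) appearance code
    return W_up @ code                                # embedding fed to the denoiser

latents = rng.standard_normal((8, 32, 32))            # (C, H, W) VAE latent grid
emb = bottleneck_encode(latents)
```

At inference, conditioning on a chosen view's code is what enables appearance-conditioned novel view synthesis.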
Photos like the ones below differ in global appearance (day vs. night, lighting), aspect ratio, and even weather. But they give clues to how scenes are built in 3D. (2/5)
June 17, 2025 at 4:16 PM
💥New preprint! WildCAT3D uses tourist photos in-the-wild as supervision to learn to generate novel, consistent views of scenes like the one shown below. h/t Tom Monnier and all collaborators (1/5)
June 17, 2025 at 4:16 PM
Finally we show that ProtoSnap-aligned skeletons can be used as conditions for a ControlNet model to generate synthetic OCR training data. By controlling the shapes of signs in training, we can achieve SOTA on cuneiform sign recognition. (Bottom: synthetically generated sign images)
February 4, 2025 at 6:24 PM
Our results show that ProtoSnap effectively aligns wedge-based skeletons to scans of real cuneiform signs, with global and local refinement steps. We provide a new expert-annotated test set to quantify these results.
February 4, 2025 at 6:24 PM
ProtoSnap uses features from a fine-tuned diffusion model to optimize for the correct alignment between a skeleton matched with a prototype font image and a scanned sign. Perhaps surprising that image generation models can be applied to this sort of discriminative task!
February 4, 2025 at 6:24 PM
We tackle this by directly measuring the internal configuration of characters. Our approach ProtoSnap "snaps" a prototype (font)-based skeleton onto a scanned cuneiform sign using a multi-stage pipeline with SOTA methods from computer vision and generative AI.
February 4, 2025 at 6:24 PM
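The global-alignment step can be sketched as a toy search: slide the prototype skeleton over the scan's feature map and keep the placement whose sampled features best match the prototype's (cosine similarity). This is a simplified illustration; ProtoSnap itself uses diffusion-model features, richer transforms than a pixel shift, and a per-wedge local refinement stage.

```python
import numpy as np

def align_skeleton(proto_pts, feat_proto, feat_scan):
    # proto_pts: (y, x) skeleton keypoints from the font prototype.
    # feat_proto: one feature vector per keypoint; feat_scan: (H, W, D) map.
    H, W, _ = feat_scan.shape
    best_shift, best_score = (0, 0), -np.inf
    for dy in range(-4, 5):            # coarse search over translations
        for dx in range(-4, 5):
            score = 0.0
            for (y, x), f in zip(proto_pts, feat_proto):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    g = feat_scan[yy, xx]
                    score += f @ g / (np.linalg.norm(f) * np.linalg.norm(g) + 1e-8)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    # Snap the skeleton to the best-scoring placement on the scan.
    return [(y + best_shift[0], x + best_shift[1]) for y, x in proto_pts]
```

A real system would optimize a full affine or thin-plate warp rather than grid-searching shifts, but the objective — feature agreement between prototype and scan — is the same.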
Some prior work has tried to classify scans of signs categorically, but signs' shapes differ drastically across time periods and regions, making this less effective. E.g. both signs below are AN, from different eras. (Top: font prototype; bottom: scan of a sign from a real tablet)
February 4, 2025 at 6:24 PM
Cuneiform at #ICLR2025! ProtoSnap finds the configuration of wedges in scanned cuneiform signs for downstream applications like OCR. A new tool for understanding the ancient world!
tau-vailab.github.io/ProtoSnap/
h/t Rachel Mikulinsky @ShGordin @ElorHadar and all collaborators.
🧵👇
February 4, 2025 at 6:24 PM
We show that our dataset serves as a new, challenging benchmark for common floorplan understanding tasks such as semantic segmentation. We also show it can be used to enable new tasks such as floorplan generation conditioned on building type and boundary.
December 10, 2024 at 4:20 PM
We use modern foundation models (LLMs, vision-language models) to filter and structure raw, noisy open data to identify floorplan images and extract structured metadata, including global properties (e.g. floorplan type) and grounded architectural features within images.
December 10, 2024 at 4:20 PM
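The filter-and-structure step might look something like the sketch below: a vision-language model captions each candidate image, and an LLM turns the caption into structured metadata used to keep or drop it. Both model calls are hypothetical deterministic stubs here; a real system would prompt actual foundation models for JSON output and parse their responses.

```python
def vlm_describe(image_path: str) -> str:
    # Stand-in for a vision-language model captioning call (hypothetical API).
    return "A labeled floorplan of a two-story house with rooms marked."

def llm_extract(caption: str) -> dict:
    # Stand-in for an LLM that converts a free-text caption into structured
    # metadata (illustrative keyword rules in place of a real model).
    text = caption.lower()
    return {
        "is_floorplan": "floorplan" in text,
        "building_type": "house" if "house" in text else "unknown",
    }

def curate(image_paths):
    # Filter noisy open data down to floorplan records with metadata attached.
    records = []
    for path in image_paths:
        meta = llm_extract(vlm_describe(path))
        if meta["is_floorplan"]:
            records.append({"image": path, **meta})
    return records
```

Grounded architectural features within images would need an additional localization step (e.g. a detector or a grounding-capable VLM) on top of this global filtering.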
WAFFLE (WikipediA-Fueled FLoorplan Ensemble) is a multimodal dataset of ~20K diverse floorplans, of many building types (e.g. homes, churches, hospitals, schools, ...), regions, eras, and data formats, along with structured metadata.
December 10, 2024 at 4:20 PM
Bite into WAFFLE 🧇, our new multimodal floorplan dataset and paper - now accepted to #WACV2025!
Work with Keren Ganon, Rachel Mikulinsky, Hadar Elor.
More info below👇
December 10, 2024 at 4:20 PM