Erik Arakelyan
@kirekara.bsky.social
Researcher @Nvidia | PhD from @CopeNLU | Formerly doing magic at @Amazon Alexa AI and @ARM. ML MSc graduate from @UCL. Research is the name of the game. ᓚᘏᗢ

http://osoblanco.github.io
By comparing relations in code with those in search traces, we measure emergent hallucinations and unused relations, highlighting areas of sub-optimal reasoning. We also assess the uniqueness of emergent facts per inference hop, indicating the extent of problem-space exploration.
November 8, 2024 at 2:19 PM
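A minimal sketch of the relation comparison described in the post above: relations that appear in the search trace but were never defined in the code are flagged as hallucinated, and defined relations that never appear in the trace are flagged as unused. The relation names here are hypothetical; in FLARE they would be parsed from the generated Prolog program and the simulated search trace.

```python
def compare_relations(code_relations: set[str], trace_relations: set[str]):
    """Contrast relations defined in the code with those used in the search trace."""
    hallucinated = trace_relations - code_relations  # used in search, never defined
    unused = code_relations - trace_relations        # defined, never explored
    return hallucinated, unused

# Hypothetical example: relations extracted from a program and its trace.
code = {"parent", "grandparent", "sibling"}
trace = {"parent", "grandparent", "cousin"}

hallucinated, unused = compare_relations(code, trace)
print(sorted(hallucinated))  # ['cousin']  -> emergent hallucination
print(sorted(unused))        # ['sibling'] -> unused relation
```

Simple set differences suffice here because only the *identity* of a relation matters for this diagnostic, not how often it fires.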
We find a strong correlation between the faithfulness of the search with respect to the code and model performance, across all of the models.
November 8, 2024 at 2:18 PM
Using FLARE also allows evaluating the faithfulness of the completed search w.r.t. the defined facts, relations, and search logic (taken from Prolog). We simply compare (via ROUGE-Lsum) the simulated search with the actual code execution when available.
November 8, 2024 at 2:17 PM
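The comparison above can be illustrated with a simplified token-level ROUGE-L F1 over the two traces (the paper uses ROUGE-Lsum; in practice one would use a library such as rouge-score). The traces below are hypothetical examples, not taken from the paper.

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    # Classic dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l_f1(simulated: str, reference: str) -> float:
    """Token-level ROUGE-L F1 between a simulated and an executed trace."""
    sim, ref = simulated.split(), reference.split()
    lcs = lcs_len(sim, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(sim), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical traces: the model's simulated search vs. the real Prolog run.
sim_trace = "call parent(tom, X) X = bob call parent(bob, Y) Y = ann"
exec_trace = "call parent(tom, X) X = bob call parent(bob, Y) fail"
print(round(rouge_l_f1(sim_trace, exec_trace), 3))  # 0.818
```

A score near 1.0 means the simulated search closely mirrors the real execution; divergence signals unfaithful reasoning steps.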
The method boosts the performance of various LLMs across scales (8B -> 100B+) compared to CoT and Faithful CoT on a range of mathematical, multi-hop, and relation-inference tasks.
November 8, 2024 at 2:16 PM
The LLM formalizes the task in Prolog as facts, relations, and search logic, then simulates an exhaustive search by iteratively exploring the problem space with backtracking.
November 8, 2024 at 2:15 PM
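A toy illustration of the idea (not the paper's implementation): facts and a relation are written down Prolog-style, and an exhaustive backtracking search enumerates every binding while logging each hop. The family-tree facts are hypothetical.

```python
# Prolog-style facts: parent(tom, bob). parent(bob, ann). parent(bob, liz).
FACTS = {("parent", "tom", "bob"), ("parent", "bob", "ann"),
         ("parent", "bob", "liz")}

def parent(x, y, log):
    """Try every parent/2 fact; yield bindings consistent with x and y (None = unbound)."""
    for rel, a, b in sorted(FACTS):
        if rel != "parent":
            continue
        if (x is None or x == a) and (y is None or y == b):
            log.append(f"parent({a}, {b})")
            yield a, b
    log.append("backtrack")  # this choice point is exhausted

def grandparent(x, z, log):
    # grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    for a, y in parent(x, None, log):
        for _, c in parent(y, z, log):
            yield a, c

log = []
answers = list(grandparent("tom", None, log))
print(answers)  # [('tom', 'ann'), ('tom', 'liz')]
```

Generators make the backtracking explicit: each `yield` is a choice point, and exhausting a generator corresponds to Prolog backtracking to the previous goal, so the full problem space is explored rather than a single greedy path.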
👋Psst! Want more faithful, verifiable and robust #LLM reasoning than with CoT, but using external solvers is meh? Our FLARE💫uses Logic Prog with Exhaustive Simulated Search to achieve this.🧵
@pminervini.bsky.social, Patrick Lewis, Pat Verga and @iaugenstein.bsky.social

arxiv.org/abs/2410.11900
November 8, 2024 at 2:13 PM