Presentation

Track: VoiceTech

How Susceptible are LLMs to Logical Fallacies?

This work investigates the rational thinking capability of Large Language Models (LLMs) in multi-round argumentative
debates by exploring the impact of fallacious arguments on their logical reasoning performance. More specifically, we
present the Logic Competence Measurement Benchmark (LOGICOM), a diagnostic benchmark to assess the robustness
of LLMs against logical fallacies. LOGICOM involves two agents, a persuader and a debater, engaging in a multi-round
debate on a controversial topic, where the persuader tries to convince the debater of the correctness of its claim.
First, LOGICOM assesses the potential of LLMs to change their opinions through reasoning. Then, it evaluates the
debater’s performance in logical reasoning by contrasting the scenario where the persuader employs logical fallacies
against one where logical reasoning is used. We use this benchmark to evaluate the performance of GPT-3.5 and
GPT-4 using a dataset containing controversial topics, claims, and reasons supporting them. Our findings indicate
that both GPT-3.5 and GPT-4 can adjust their opinion through reasoning. However, when presented with logical
fallacies, GPT-3.5 and GPT-4 are erroneously convinced 41% and 69% more often, respectively, compared to when
logical reasoning is used. Finally, we introduce a new dataset containing over 5k pairs of logical vs. fallacious
arguments. The source code is publicly available.
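
As a rough illustration of the setup described above, the following Python sketch alternates turns between a persuader and a debater and then compares how often the debater is convinced under two conditions. It is not the LOGICOM source code. The `Agent` signature, the `AGREE` marker, and the function names are hypothetical, and reading "more often" as a relative increase in the convinced rate is an assumption.

```python
from typing import Callable, List

# Hypothetical agent interface: takes the debate history, returns the next message.
# In a real benchmark each agent would wrap an LLM call; stubs are enough here.
Agent = Callable[[List[str]], str]


def run_debate(persuader: Agent, debater: Agent, max_rounds: int = 5) -> bool:
    """Alternate persuader and debater turns; return True if the debater agrees."""
    history: List[str] = []
    for _ in range(max_rounds):
        history.append(persuader(history))
        reply = debater(history)
        history.append(reply)
        # Assumption: agreement is signalled by a reply starting with "AGREE".
        if reply.startswith("AGREE"):
            return True
    return False


def convinced_rate(outcomes: List[bool]) -> float:
    """Fraction of debates in which the debater ended up convinced."""
    return sum(outcomes) / len(outcomes)


def relative_increase(fallacious_rate: float, logical_rate: float) -> float:
    """Relative change in the convinced rate when fallacies replace logical reasoning."""
    return (fallacious_rate - logical_rate) / logical_rate
```

Under this reading, one would run `run_debate` once per claim with a logical persuader and once with a fallacious persuader, pass each list of outcomes to `convinced_rate`, and compare the two rates with `relative_increase`.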

Dan Pluth - Speaker

Presentation Video

Presentation Notes

rtc_2024.pdf

Real Time Communications Conference & Expo at Illinois Tech

IEEE International Conference
