Towards Combining the Cognitive Abilities of Large Language Models with the Rigor of Deductive Progam Verification

27. Oct 2024·

Bernhard Beckert

Jonas Klamroth

Wolfram Pfeifer

Patrick Röper

Samuel Teuber

· 0 min read

PDF Cite DOI

Abstract

Recent investigations hint at the ability of large language models (LLMs) to generate formal specifications for given program code. In this work, we systematically discuss and categorize different use cases and application scenarios that combine specification synthesis via LLMs with deductive program verification. We present preliminary quantitative experiments on the capabilities of LLMs to generate correct specifications. To this end, we use a prototypical integration of GPT (versions 3.5 and 4o) with the deductive program verifier KeY and the bounded model checker JJBMC. We evaluated our prototype on a set of Java programs that are partially annotated with specifications written in the Java Modeling Language (JML). We show that GPT 4o generates correct annotations in approximately half of all instances across the investigated scenarios. For the case of faulty specifications, we investigate how a feedback loop can help to improve the original answer. Finally, we present a vision of how Large Language Models may support rigorous formal verification of software systems and describe the necessary next steps in this direction.

Type

Conference paper

Publication

12th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation

Last updated on 27. Oct 2024

Formal Methods Program Analysis Deductive Program Verification Large Language Models Specification Generation Design by Contract Java Modeling Language

Authors

Samuel Teuber

PhD Student

Interested in formal methods for software and machine learning verification with a focus on cyber-physical systems and algorithmic fairness.

← Provably Safe Neural Network Controllers via Differential Dynamic Logic 10. Dec 2024

An Information-Flow Perspective on Algorithmic Fairness 20. Feb 2024 →