What I understood so far: The starting point must be an existing C codebase with good unit tests. The LLM is used as a possibly buggy code translator that also has the ability to sometimes fix errors and repeat. It is hoped that the feedback loop converges.
From their paper:
We restrict the input C program to the following conditions: (1) Acyclic data structures: data structures [should have no] pointer cycles ... (2) No multithreading ... (3) No type punning: performing raw memory accesses on untyped or multi-typed memory regions would hinder the best effort type analysis that our approach aims to perform.
no subject
https://syzygy-project.github.io/assets/paper.pdf
> combine the generative capabilities of LLMs with semantic execution information collected by dynamic analyses on the source C codebase
> test translation approach for reliable equivalence testing
no subject
From their paper:
We restrict the input C program to the following conditions:
(1) Acyclic data structures: data structures [should have no] pointer cycles ...
(2) No multithreading ...
(3) No type punning: performing raw memory accesses on untyped or multi-typed memory regions would hinder the best effort type analysis that our approach aims to perform.
no subject