Paper reading: ChatDev: Communicative Agents for Software Development [extensive reading]

ChatDev: Communicative Agents for Software Development

arXiv 2024, THUNLP

A chat-powered software development framework that integrates multiple “software agents”, which take an active part in three core phases of the software lifecycle: design, coding, and testing.

Communicative De-hallucination


  • Novel Communicative De-hallucination
  • Agent-based self-consistency?
  • Defines dataset, goal, and metric


  • Mainly on evaluation metrics.
  • Completeness – Why would the generated code be incomplete?
  • How do you define executability?
    • How much code can be compiled isn’t a good metric
    • Python?
  • Consistency – weird definition
    • Semantic embedding of software code with textual requirements
    • WHY?
  • Quality:
    • Product of completeness, executability, and consistency
    • Okay
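
To make the consistency and quality bullets concrete, here is a minimal Python sketch of how I read them: consistency as the cosine similarity between an embedding of the requirement text and an embedding of the generated code, and quality as the product of the three scores. The function names and the toy vectors are my own assumptions, not taken from the paper's code, and the embedding model itself is left out (any text encoder that returns a vector would slot in).

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: treat an all-zero embedding as "no similarity".
        return 0.0
    return dot / (norm_u * norm_v)


def quality(completeness, executability, consistency):
    """Quality as the plain product of the three component scores."""
    return completeness * executability * consistency


# Toy stand-ins for real embeddings (hypothetical values).
requirement_embedding = [0.2, 0.7, 0.1]
code_embedding = [0.25, 0.6, 0.3]

consistency = cosine_similarity(requirement_embedding, code_embedding)
score = quality(completeness=0.5, executability=0.8, consistency=consistency)
```

Because the score is a product, a zero on any one component drives quality to zero, no matter how high the other two are.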