Paper reading: RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents

RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents

NUS, arXiv preprint

Motivation:

Reflecting past experiences in current decision-making, an innate human behavior, remains a significant challenge for LLM agents. To address this, the authors propose the Retrieval-Augmented Planning (RAP) framework, which dynamically retrieves past experiences that match the current situation and context and uses them to improve the agent’s planning.

Strength

  • Retrieves related past experiences and uses them as new in-context learning (ICL) examples
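To make the retrieval idea concrete, here is a minimal sketch of selecting past experiences as ICL examples. It is not the paper's implementation: the `Experience` record, the `retrieve` and `build_prompt` helpers, and the sample tasks are all hypothetical, and the bag-of-words cosine similarity is only a toy stand-in for whatever learned text or image encoder a real system would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical memory record: a past task and its logged trajectory."""
    task: str
    trajectory: str


def bow(text: str) -> Counter:
    """Lowercased bag-of-words vector (toy stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(memory: list[Experience], query: str, k: int = 2) -> list[Experience]:
    """Return the k stored experiences whose task text is most similar to the query."""
    q = bow(query)
    ranked = sorted(memory, key=lambda e: cosine(bow(e.task), q), reverse=True)
    return ranked[:k]


def build_prompt(memory: list[Experience], task: str, k: int = 2) -> str:
    """Prepend the retrieved experiences to the current task as ICL examples."""
    parts = [
        f"Example task: {e.task}\nExample trajectory: {e.trajectory}"
        for e in retrieve(memory, task, k)
    ]
    parts.append(f"Current task: {task}\nPlan:")
    return "\n\n".join(parts)


# Made-up memory entries, for illustration only.
memory = [
    Experience("put a clean mug on the shelf",
               "go to sink; clean mug; go to shelf; put mug"),
    Experience("heat an egg and put it on the table",
               "take egg; go to microwave; heat egg; go to table; put egg"),
    Experience("find a book and read it under the lamp",
               "take book; go to lamp; turn on lamp"),
]

prompt = build_prompt(memory, "put a clean plate on the shelf", k=1)
print(prompt)
```

With `k=1`, the mug experience is the one retrieved, since its task text overlaps most with the query. The prompt is rebuilt for each new task, so the examples change with the current situation instead of being fixed in advance.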

Challenges

  • Lacks a comparison against fine-tuning the model on the memory database