Paper reading: WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents [extensive reading]

From UC Berkeley and Bardeen; it seems to have been rejected by CoLM 2024.

WILBUR is an approach that uses a differentiable ranking model and a novel instruction synthesis technique to optimally populate a black-box large language model's prompt with task demonstrations from previous runs.
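A minimal sketch of how I read the core idea: a learned ranking model scores stored demonstrations against the current task and page, and the top-scoring ones are packed into the LLM prompt. The names, the `score` interface, and the character budget are my own assumptions, not the paper's API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Demo:
    task: str        # natural-language goal of the stored run
    page: str        # (simplified) DOM/text of the page it acted on
    trajectory: str  # actions taken in that run, successful or not

def populate_prompt(
    query_task: str,
    query_page: str,
    bank: List[Demo],
    score: Callable[[str, str, Demo], float],  # learned ranking model (assumed interface)
    k: int = 3,
    budget_chars: int = 6000,
) -> str:
    """Pick the k highest-scoring demonstrations that fit the prompt budget."""
    ranked = sorted(bank, key=lambda d: score(query_task, query_page, d), reverse=True)
    chosen, used = [], 0
    for d in ranked:
        block = f"Task: {d.task}\nPage: {d.page}\nActions: {d.trajectory}\n"
        if used + len(block) > budget_chars:
            continue
        chosen.append(block)
        used += len(block)
        if len(chosen) == k:
            break
    return (
        "\n---\n".join(chosen)
        + f"\n---\nCurrent task: {query_task}\nCurrent page: {query_page}\n"
    )
```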

Strengths:

  • Interesting claim: learning how specific websites work is necessary for both humans and LLMs.
  • Implementation:
    • explore, reflect, and backtrack: verify whether each action succeeded; if not, backtrack to a previous successful state while storing the failure in the model's context (see the first sketch after this list)
    • retrieve demonstrations from a scalable knowledge bank: demonstrations teach the agent to perform a similar task on a potentially unseen website, and to act on a similar web page regardless of the task (see the second sketch after this list)
      • demonstration ranking model
    • state of the art on WebVoyager (53%)
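The explore/reflect/backtrack loop (second strength bullet) roughly amounts to the control flow below. The verifier, the environment interface, and the snapshot/restore mechanism are my guesses at a minimal shape, not the paper's implementation.

```python
from typing import Callable, List

def run_with_backtracking(
    propose: Callable[[str, List[str]], str],   # LLM proposes next action given goal + context
    execute: Callable[[str], str],              # apply the action, return the new page state
    verify: Callable[[str, str], bool],         # did the action move us toward the goal?
    snapshot: Callable[[], object],             # save the browser state
    restore: Callable[[object], None],          # revert to a saved browser state
    goal: str,
    max_steps: int = 20,
) -> List[str]:
    """Greedy loop: act, verify, and on failure backtrack while keeping the failure in context."""
    context: List[str] = []       # running notes fed back to the model
    trajectory: List[str] = []    # successful actions only
    for _ in range(max_steps):
        checkpoint = snapshot()
        action = propose(goal, context)
        if action.strip() == "DONE":            # model signals task completion
            break
        page = execute(action)
        if verify(goal, page):
            trajectory.append(action)
            context.append(f"OK: {action}")
        else:
            restore(checkpoint)                  # back to the last good state
            context.append(f"FAILED (do not repeat): {action}")
    return trajectory
```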
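The knowledge-bank retrieval (third strength bullet) pulls two kinds of demonstrations: goal-similar runs that may come from other websites, and page-similar runs regardless of the goal. A hedged sketch, reusing the `Demo` dataclass from the first sketch and leaving the text embedder abstract:

```python
from __future__ import annotations  # defer evaluation of the Demo annotation
from typing import Callable, List, Sequence

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb + 1e-9)

def retrieve(
    query_task: str,
    query_page: str,
    bank: List[Demo],                           # Demo: dataclass from the prompt-population sketch
    embed: Callable[[str], Sequence[float]],    # any text embedder (assumption)
    k_task: int = 2,
    k_page: int = 2,
) -> List[Demo]:
    """Return demos similar by goal (cross-website) plus demos similar by page (cross-task)."""
    qt, qp = embed(query_task), embed(query_page)
    by_task = sorted(bank, key=lambda d: cosine(qt, embed(d.task)), reverse=True)[:k_task]
    by_page = sorted(bank, key=lambda d: cosine(qp, embed(d.page)), reverse=True)[:k_page]
    seen, out = set(), []
    for d in by_task + by_page:                 # de-duplicate while preserving order
        if id(d) not in seen:
            seen.add(id(d))
            out.append(d)
    return out
```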

Drawbacks:

  • no structural information in their DOM representation: why?
  • I would like to first see experiments that verify the core hypothesis (that models need to learn how to use unseen websites)
  • too similar to the RAP and Agent Hospital papers