Paper reading: MIND2WEB: Towards a Generalist Agent for the Web
OSU-NLP, NeurIPS 2023 Datasets and Benchmarks Track
A dataset of real tasks on real-world websites. Each task comes with the user interaction trace that completes it.
Dataset Format
- Action Traces
- cleaned HTML and raw HTML snapshots at each step
- a textual representation (repr) of each action
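To make the format above concrete, here is a minimal sketch of how one task and its action trace could be held in memory. The class and field names (`ActionStep`, `TaskTrace`, `cleaned_html`, `raw_html`, `action_repr`) and the sample values are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionStep:
    """One point in an action trace (field names are hypothetical)."""
    cleaned_html: str   # pruned page snapshot at this step
    raw_html: str       # full page snapshot at this step
    action_repr: str    # human-readable repr of the action taken


@dataclass
class TaskTrace:
    """A task description plus its ordered action trace."""
    task: str
    steps: List[ActionStep] = field(default_factory=list)


# Made-up example values, only to show the shape of a record.
trace = TaskTrace(
    task="Find a one-way flight to Tokyo",
    steps=[
        ActionStep("<input id='dest'>", "<html>...</html>",
                   "[input] Destination -> TYPE: Tokyo"),
        ActionStep("<button id='go'>Search</button>", "<html>...</html>",
                   "[button] Search -> CLICK"),
    ],
)

for i, step in enumerate(trace.steps):
    print(i, step.action_repr)
```

Keeping both HTML snapshots per step means a model can be trained on the cleaned version while the raw version stays available for inspection.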
Strengths
- Very good project homepage – https://osu-nlp-group.github.io/Mind2Web/
- annotation process:
- human-annotated dataset
- annotators first select the target element, then select the action to perform on it
- Method:
- two-step:
- Step 1: select candidate DOM elements
- a 0-1 score for each candidate element
- trained with random negative samples
- Step 2: generate the action
- a multiple-choice question over the candidate elements
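The two-step method above can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: `overlap_score` is a token-overlap stand-in for a learned ranker, and every function name, the prompt wording, and the sample elements are assumptions made for this example.

```python
import random
from typing import Callable, List, Tuple


def overlap_score(task: str, element: str) -> float:
    """Toy stand-in for a learned ranker: a score in [0, 1] equal to the
    fraction of the element's tokens that also appear in the task."""
    task_tokens = set(task.lower().split())
    elem_tokens = element.lower().split()
    if not elem_tokens:
        return 0.0
    return sum(t in task_tokens for t in elem_tokens) / len(elem_tokens)


def sample_training_pairs(positive: str, all_elements: List[str],
                          num_negatives: int,
                          rng: random.Random) -> List[Tuple[str, int]]:
    """Build ranker training pairs: the positive element (label 1) plus
    randomly sampled negative elements (label 0) from the same page."""
    negatives = [e for e in all_elements if e != positive]
    chosen = rng.sample(negatives, min(num_negatives, len(negatives)))
    return [(positive, 1)] + [(e, 0) for e in chosen]


def top_k_candidates(task: str, elements: List[str], k: int,
                     score_fn: Callable[[str, str], float] = overlap_score
                     ) -> List[str]:
    """Step 1: score every element and keep the k highest-scoring ones."""
    ranked = sorted(elements, key=lambda e: score_fn(task, e), reverse=True)
    return ranked[:k]


def build_multichoice_prompt(task: str, candidates: List[str]) -> str:
    """Step 2: pose action generation as a multiple-choice question."""
    lines = [f"Task: {task}", "Which element should be acted on next?"]
    for i, cand in enumerate(candidates):
        lines.append(f"{chr(ord('A') + i)}. {cand}")
    return "\n".join(lines)


task = "search for flights to tokyo"
elements = ["button search flights", "link careers",
            "input destination city", "link privacy policy"]

pairs = sample_training_pairs("button search flights", elements, 2,
                              random.Random(0))
top = top_k_candidates(task, elements, k=2)
prompt = build_multichoice_prompt(task, top)
print(prompt)
```

The point of the split is that the expensive generation step only ever sees a short list of options, while the cheaper scorer handles the full page.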
Challenges
- producing a 0-1 score for each candidate element is too complex
- a single action requires many model calls
- infeasible in practice
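A back-of-the-envelope sketch of the call count behind this concern. It assumes one scoring call per candidate element with no batching, plus one generation call per action; the function names and the per-page candidate counts are hypothetical, chosen only to show the arithmetic.

```python
from typing import Iterable


def calls_per_action(num_candidates: int, generation_calls: int = 1) -> int:
    """Model calls for one action: one scoring call per candidate element
    (assumed, no batching) plus the generation call(s)."""
    return num_candidates + generation_calls


def calls_per_task(candidates_per_step: Iterable[int]) -> int:
    """Total model calls for a task, summed over its steps."""
    return sum(calls_per_action(n) for n in candidates_per_step)


# Hypothetical three-step task on pages with 500, 800, and 650 candidates.
steps = [500, 800, 650]
print(calls_per_task(steps))  # 1953
```

Under these assumptions the scoring calls dominate: the total grows linearly with page size, while the generation step adds only one call per action.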