[Feature] Add WFGY 16-problem RAG / agent failure map as a debugging lens for Generative Agents #201

@onestardao

Hi, thanks for releasing Generative Agents. It has become a reference point for multi-agent simulations of human-like behaviour.

I maintain WFGY, an MIT-licensed framework that focuses on how RAG and agent systems fail in practice. Its core is a 16-problem map covering retrieval, reasoning, memory, multi-agent chaos and deployment issues.

This map is already used or cited in several research contexts, for example:

  • Harvard MIMS Lab ToolUniverse (LLM tools benchmark; WFGY listed in the robustness / RAG debugging section)
  • QCRI LLM Lab Multimodal RAG Survey (survey repo that includes WFGY as an open-source diagnostic reference)
  • University of Innsbruck Data Science Group Rankify (research RAG toolkit that links to the WFGY ProblemMap for troubleshooting)

Why this is relevant for Generative Agents

In generative-society simulations, people often hit patterns like:

  • believable local behaviour, but globally unstable or collapsing dynamics
  • long-run memory incoherence (agents forget important past events or contradict themselves)
  • retrieval or note-taking that looks fine in a unit test but leads to wrong long-term “beliefs”

These patterns map directly to several WFGY problems, for example:

  • No.3 long reasoning chain drift
  • No.7 memory coherence breaks
  • No.11 symbolic collapse (abstract prompts no longer map to consistent structure)
  • No.13 multi-agent chaos
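
To make the idea of a shared vocabulary concrete, here is a minimal Python sketch of how observed symptoms could be tagged with problem numbers. Only the four problem names and numbers come from the list above. The symptom keys, the pairing of symptoms to problems, and the `label_symptoms` helper are hypothetical and exist only for illustration; they are not part of WFGY or Generative Agents.

```python
from enum import Enum


class WfgyProblem(Enum):
    # Names and numbers as quoted in this issue; the other 12 entries are omitted.
    LONG_REASONING_CHAIN_DRIFT = 3
    MEMORY_COHERENCE_BREAKS = 7
    SYMBOLIC_COLLAPSE = 11
    MULTI_AGENT_CHAOS = 13


# Hypothetical symptom keys. The pairing below is an assumption made for
# this example, not an official mapping.
SYMPTOM_TO_PROBLEMS = {
    "globally_unstable_dynamics": [WfgyProblem.MULTI_AGENT_CHAOS],
    "long_run_memory_incoherence": [WfgyProblem.MEMORY_COHERENCE_BREAKS],
    "wrong_long_term_beliefs": [
        WfgyProblem.LONG_REASONING_CHAIN_DRIFT,
        WfgyProblem.MEMORY_COHERENCE_BREAKS,
    ],
}


def label_symptoms(symptoms):
    """Return the sorted, de-duplicated problem numbers for the given symptoms."""
    numbers = set()
    for symptom in symptoms:
        # Unknown symptoms contribute nothing instead of raising an error.
        for problem in SYMPTOM_TO_PROBLEMS.get(symptom, []):
            numbers.add(problem.value)
    return sorted(numbers)
```

A simulation log reviewer could then call `label_symptoms(["long_run_memory_incoherence", "globally_unstable_dynamics"])` and attach the resulting numbers to a bug report, so that separate runs are described with the same labels.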

Proposal

I would like to propose a small addition to the docs or the “limitations / debugging” section:

  1. Introduce the WFGY 16-problem map at a very high level, as a vocabulary for failure modes in generative-society setups.
  2. Provide a short table mapping common Generative Agents issues to specific WFGY problem numbers.
  3. Link back to the ProblemMap README so researchers can adopt the taxonomy for their own analyses.

If this sounds useful, I can draft the text in the style of the existing documentation and open a PR for review.
