Correct way of combining policy/entropy losses from multiple actions and agents #2

We should have a good theoretical understanding and/or empirical answers to the following questions:
- When a policy controls multiple entities, should the policy/entropy losses from each entity be summed, averaged, or combined in some other way?
- How do we combine losses from different actions?
- Do losses from different action types need different weights, and if so, is there a good way to find those weights?
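
To make the options above concrete, here is a minimal sketch of the reductions in question. It is an illustration only, not this project's implementation: the function names, the `mask` argument, and the action names `move` and `attack` are all hypothetical, and it assumes the per-entity and per-action losses have already been computed as plain floats.

```python
from typing import Dict, List


def combine_entity_losses(
    losses: List[float], mask: List[bool], reduction: str = "mean"
) -> float:
    """Combine per-entity losses for one step, skipping masked-out entities.

    Hypothetical helper: `mask[i]` is True when entity i actually acted.
    """
    active = [loss for loss, is_active in zip(losses, mask) if is_active]
    if not active:
        return 0.0
    total = sum(active)
    if reduction == "sum":
        # Loss (and gradient) magnitude grows with the number of active entities.
        return total
    if reduction == "mean":
        # Each entity's contribution shrinks as the number of active entities grows.
        return total / len(active)
    raise ValueError(f"unknown reduction: {reduction!r}")


def combine_action_losses(
    per_action: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Weighted sum over action types; a missing weight is assumed to be 1.0."""
    return sum(weights.get(name, 1.0) * loss for name, loss in per_action.items())


# Three entities, the third one inactive this step.
entity_losses = [0.5, 1.5, 2.0]
entity_mask = [True, True, False]
summed = combine_entity_losses(entity_losses, entity_mask, "sum")
averaged = combine_entity_losses(entity_losses, entity_mask, "mean")

# Two hypothetical action types with a down-weighted "attack" loss.
combined = combine_action_losses({"move": 1.0, "attack": 2.0}, {"attack": 0.5})
```

The sketch does not say which choice is right. It only shows that summing ties the effective loss scale to the number of controlled entities, averaging removes that dependence, and per-action weights add one more set of hyperparameters to tune.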
