OfflineRL
Question about cql_loss calculation in COMBO
Given that COMBO is derived from CQL, why is cql_loss calculated differently in COMBO than in CQL?