stack_compress: unexpected behavior

Hello! As I understand it, I should call the on_episode_end method at the end of each episode so that the stacked values of the current episode are not overwritten by values from the next one. However, they are still overwritten:

import numpy as np
from cpprb import PrioritizedReplayBuffer

rb = PrioritizedReplayBuffer(32, env_dict={'done': {'dtype': 'bool'},
                                           'a' : {'shape': (3)}},
                             stack_compress='a')

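# One episode of three steps: a = [0, 1, 2], [1, 2, 3], [2, 3, 4]
a = np.array([0, 1, 2])
for i in range(3):
    done = i == 2
    rb.add(a=a, done=done)
    if done:
        # Mark the episode boundary before anything else is added
        rb.on_episode_end()
    a += 1
# First transition of the next episode
rb.add(a=np.ones(3), done=False)
print(rb.get_all_transitions())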

Output:

{'done': array([[False],
       [False],
       [ True],
       [False]]), 'a': array([[0., 1., 2.],
       [1., 2., 1.], # Should be [1, 2, 3]
       [2., 3., 4.],
       [1., 1., 1.]], dtype=float32)}
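
To pinpoint the discrepancy, here is a small NumPy-only sketch that compares the reported 'a' array with the values I expected. The arrays are copied by hand from the output above, and the names `reported`, `expected` and `mismatch` are only illustrative. It does not call cpprb at all.

```python
import numpy as np

# Values of 'a' copied by hand from the output above.
reported = np.array([[0., 1., 2.],
                     [1., 2., 1.],
                     [2., 3., 4.],
                     [1., 1., 1.]], dtype=np.float32)

# Expected values: three episode steps (each row shifted by one),
# followed by the single transition of ones added after the episode.
steps = np.arange(3, dtype=np.float32)
episode = steps + steps[:, None]
expected = np.vstack([episode, np.ones((1, 3), dtype=np.float32)])

# (row, column) indices of every entry that differs.
mismatch = np.argwhere(reported != expected)
print(mismatch)
```

Only one entry differs: the last element of the second row. It holds 1 instead of 3, which is consistent with it having been overwritten by the transition added after on_episode_end.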