A Hierarchical Deep Reinforcement Learning Algorithm with Stochastic Policy Gradient for Robust Robotic Manipulation

Keywords

Hierarchical Reinforcement Learning
Deep Reinforcement Learning
Stochastic Policy Gradient
Robotic Manipulation
Autonomous Robotics
Policy Optimization

Abstract

The domain of robotic manipulation has witnessed significant advancements through the application of deep reinforcement learning, yet substantial challenges remain regarding sample efficiency, generalization, and robustness against environmental perturbations. This paper introduces a novel Hierarchical Deep Reinforcement Learning framework integrated with a Stochastic Policy Gradient mechanism designed specifically to address the high-dimensional state-action spaces inherent in multi-joint robotic control. By decomposing complex manipulation tasks into temporally extended sub-goals managed by a high-level policy, and executing primitive motor commands via a low-level controller, the proposed architecture effectively mitigates the sparse reward problem. Furthermore, the incorporation of a stochastic policy gradient enables the agent to maintain extensive exploration capabilities while ensuring robust performance in the presence of sensor noise and dynamic friction changes. We demonstrate the efficacy of this approach through rigorous simulation experiments involving complex pick-and-place and stacking tasks. The results indicate that our method significantly outperforms several state-of-the-art baselines in convergence speed and success rate under adversarial conditions.
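The stochastic-policy-gradient idea the abstract refers to can be illustrated with a minimal REINFORCE-style sketch. Everything below is a hypothetical toy example, not the paper's actual algorithm or environment: a 1-D task where the optimal action for state `s` is `2*s`, and a linear Gaussian policy whose fixed exploration noise keeps it stochastic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task (for illustration only): the optimal action
# for state s is a* = 2*s, and reward penalizes the squared error.
def reward(s, a):
    return -(a - 2.0 * s) ** 2

theta = 0.0   # parameter of the linear Gaussian policy: a ~ N(theta * s, sigma^2)
sigma = 0.5   # fixed exploration noise, which keeps the policy stochastic
lr = 0.1

for update in range(500):
    s = rng.uniform(0.5, 1.5, size=64)               # batch of sampled states
    a = theta * s + sigma * rng.standard_normal(64)  # sample actions from the policy
    r = reward(s, a)
    # Score function: d/dtheta log N(a; theta*s, sigma^2) = (a - theta*s) * s / sigma^2
    grad_log_pi = (a - theta * s) * s / sigma**2
    baseline = r.mean()                              # simple variance-reduction baseline
    theta += lr * np.mean((r - baseline) * grad_log_pi)  # stochastic gradient ascent
```

After training, `theta` settles near the optimal value of 2.0. In a hierarchical setup of the kind the abstract describes, an update like this would be applied at two levels: a high-level policy sampling sub-goals and a low-level controller sampling motor commands conditioned on them.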


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Copyright (c) 2026 Ha-Eun Yoon, Ji-A Jang (Authors)