This repository contains an implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm applied to the Pendulum-v1 environment from Gymnasium (the Farama Foundation's maintained successor to OpenAI Gym). The aim is to train an agent to swing the pendulum upright and keep it balanced using DDPG, an off-policy actor-critic algorithm.
The project is organized as follows:
- Train Expert (DDPG).ipynb: The Jupyter notebook that demonstrates the training of the agent and tests its performance.
- algorithms/ddpg.py: The implementation of the DDPG algorithm, including training and testing functions.
- networks/actor_critic.py: Defines the neural networks for the Actor and Critic models used in DDPG.
- utils/normalize_env.py: A utility to normalize the action space of the environment.
- utils/ou_noise.py: Implements the Ornstein-Uhlenbeck noise process for exploration.
- utils/replay.py: Implements a ReplayBuffer to store and sample experiences during training.

To run the project, you need to install the following dependencies:
pip install gymnasium matplotlib numpy torch imageio tqdm
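
For orientation, here is a minimal sketch of what the three utilities listed above typically do. It is not the code in utils/ and uses only the standard library so it runs without the dependencies. The class names OUNoise and ReplayBuffer follow the descriptions above, but the function scale_action, all method names, and the default parameters (theta=0.15, sigma=0.2) are assumptions chosen for illustration. It also assumes the actor outputs actions in [-1, 1] that are rescaled to the environment bounds, which are [-2, 2] for Pendulum-v1.

```python
import random
from collections import deque


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated noise pulled back toward mu."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=None):
        # theta and sigma are common illustrative defaults, not the repository's values.
        self.size, self.mu, self.theta, self.sigma = size, mu, theta, sigma
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start each episode at the long-run mean.
        self.state = [self.mu] * self.size

    def sample(self):
        # Discretized update: x <- x + theta * (mu - x) + sigma * N(0, 1)
        self.state = [
            x + self.theta * (self.mu - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


class ReplayBuffer:
    """Fixed-capacity FIFO store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, regrouped field by field.
        batch = self.rng.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


def scale_action(action, low, high):
    """Map an action from [-1, 1] to [low, high], clipping out-of-range inputs."""
    action = max(-1.0, min(1.0, action))
    return low + (action + 1.0) * 0.5 * (high - low)
```

During training, a loop built on these pieces would add a noise sample to the actor's output, rescale the result with scale_action, step the environment, and push the resulting transition into the buffer before sampling a batch for the update.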
After training the agent, the following files are generated: