This repository contains a project for my master's in Data Science at City University of London, completed for the Deep Learning 3: Optimization module.
The q_learing_policies directory contains four .ipynb notebooks, each of which applies a different exploration policy to a simple Q-learning algorithm. The algorithm runs on a self-made environment based on the Frozen Lake principles. We investigate how the choice of exploration technique affects the efficiency of the model. The compared exploration strategies are:
- random policy
- epsilon greedy-constant policy
- epsilon greedy-decay policy
- Boltzmann policy
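
To make the comparison concrete, here is a minimal sketch of how the four strategies could pick an action from one row of a Q-table, together with the standard tabular Q-learning update. It is not taken from the notebooks: the function names, the exponential decay schedule, and all default values (`eps_start`, `eps_min`, `decay`, `alpha`, `gamma`) are assumptions chosen for illustration.

```python
import math
import random


def random_policy(q_row, rng):
    """Ignore the Q-values and pick any action uniformly at random."""
    return rng.randrange(len(q_row))


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def decayed_epsilon(episode, eps_start=1.0, eps_min=0.01, decay=0.995):
    """Exponential decay schedule (assumed; other schedules are possible)."""
    return max(eps_min, eps_start * decay ** episode)


def boltzmann_probs(q_row, temperature):
    """Softmax over Q-values; a higher temperature gives more exploration."""
    q_max = max(q_row)  # subtract the max for numerical stability
    exps = [math.exp((q - q_max) / temperature) for q in q_row]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann(q_row, temperature, rng):
    """Sample an action in proportion to its softmax probability."""
    probs = boltzmann_probs(q_row, temperature)
    return rng.choices(range(len(q_row)), weights=probs, k=1)[0]


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99):
    """Standard tabular update: move Q(s, a) toward r + gamma * max Q(s', .)."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
```

The constant epsilon-greedy policy calls `epsilon_greedy` with a fixed `epsilon`, while the decaying variant would pass `decayed_epsilon(episode)` instead, so exploration shrinks as training progresses.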
dddqn_LunarLander.ipynb contains a Dueling Double Deep Q Network (DDDQN) implementation for the Lunar Lander environment from OpenAI. Information about the environment can be found both in the report in this repo and on the official website of the environment.
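
For readers unfamiliar with the two ideas combined in a Dueling Double DQN, the sketch below shows them in plain Python, without any neural network. It assumes the common mean-subtracted form of the dueling aggregation and the usual Double DQN target; the function names are hypothetical and the notebook's actual implementation may differ.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN target for a single transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    if done:
        return reward  # no bootstrapping from a terminal state
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

In a full agent, `next_q_online` and `next_q_target` would be the outputs of the online and target networks for the next state, each produced by a dueling head like `dueling_q_values`.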
Kimon Iliopoulos and Konstantinos Gkolias contributed equally to the completion of this project.