Deep-RL: DQN — Regression or Classification?

Setup: We assume a Deep-RL example of a DQN with 4 discrete available actions, and we want to select one of them depending on our estimated state.

Introduction

Q-Learning

Monte-Carlo Methods

Temporal Difference

Sarsa

SarsaMax (Q-Learning)

Updating Q-Table using Sarsa
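As a rough illustration of the two update rules named in the headings above, here is a minimal tabular sketch. The table size, the learning rate `ALPHA`, the discount factor `GAMMA`, and the function names are all assumptions made for this example, not values from the article.

```python
# Hypothetical tabular example: 3 states, 4 actions.
ALPHA = 0.5   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)


def sarsa_update(q, s, a, r, s_next, a_next):
    """Sarsa: bootstrap from the action actually taken in the next state."""
    target = r + GAMMA * q[s_next][a_next]
    q[s][a] += ALPHA * (target - q[s][a])


def sarsamax_update(q, s, a, r, s_next):
    """Sarsamax (Q-Learning): bootstrap from the best action in the next state."""
    target = r + GAMMA * max(q[s_next])
    q[s][a] += ALPHA * (target - q[s][a])


# Q-table initialised to zero, with some made-up estimates for state 1.
q_table = [[0.0] * 4 for _ in range(3)]
q_table[1] = [1.0, 2.0, 0.0, -1.0]

sarsa_update(q_table, s=0, a=2, r=1.0, s_next=1, a_next=0)
sarsamax_update(q_table, s=0, a=3, r=1.0, s_next=1)
print(q_table[0])
```

The only difference between the two functions is the bootstrapped term: Sarsa uses the value of the next action that was actually chosen, while Sarsamax uses the maximum over all next actions.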

Deep Q Networks

In the Neural Network, we have: “Input=State” and “Output=one estimated value per Action”, so with our 4 available actions the network has 4 outputs.
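To make the input/output shape concrete, here is a minimal sketch of such a network as a single linear layer in plain Python. The number of state features, the random weights, and the names `make_q_network` and `q_values` are hypothetical; a real DQN would use a deep network in a framework of your choice.

```python
import random

N_STATE_FEATURES = 3  # assumed size of the state vector
N_ACTIONS = 4         # the 4 discrete actions from the setup


def make_q_network(seed=0):
    """Build a toy one-layer 'network' with random weights (illustration only)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(N_STATE_FEATURES)]
               for _ in range(N_ACTIONS)]
    biases = [0.0] * N_ACTIONS

    def q_values(state):
        # One real-valued output per action; no softmax is applied.
        return [sum(w * x for w, x in zip(row, state)) + b
                for row, b in zip(weights, biases)]

    return q_values


q_network = make_q_network()
q = q_network([0.2, -0.5, 1.0])
print(q)  # four unbounded numbers, one per action
```

Note that the outputs are plain numbers that need not lie between 0 and 1 or sum to 1, which already hints at how they should be interpreted.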

Classification or Regression?

The important thing to note here is that each of these outputs (one for each possible action) measures the same thing: the estimated total reward!

So step back and think about it…

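They all measure reward scores! Since every output measures the same quantity on the same scale, we can compare them directly, and the action whose output has the highest estimated reward is the top candidate. In other words, the network performs regression (it predicts a number for each action), and picking the maximum is simply how we use those numbers to choose an action.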
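A small sketch may help show both halves of this idea: choosing the action with the highest estimate, and training the chosen output toward a numeric target with a squared error, which is a regression loss. The Q-values, the reward, the discount factor `GAMMA`, and the function names below are made up for illustration.

```python
GAMMA = 0.99  # discount factor (assumed value)


def greedy_action(q_values):
    """Index of the output with the highest estimated total reward."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done):
    """Numeric target the chosen output is regressed toward."""
    if done:
        return reward
    return reward + GAMMA * max(next_q_values)


def squared_error(q_values, action, target):
    """Regression loss on the single output for the action that was taken."""
    return (target - q_values[action]) ** 2


q_now = [1.2, 3.4, -0.7, 2.9]   # hypothetical network outputs for the current state
q_next = [0.5, 1.0, 2.0, 0.0]   # hypothetical network outputs for the next state

action = greedy_action(q_now)
target = td_target(reward=1.0, next_q_values=q_next, done=False)
loss = squared_error(q_now, action, target)
print(action, target, loss)
```

There is no class label anywhere in this loop: the target is a number built from the reward, and the comparison between outputs happens only when an action has to be picked.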

MSc Computer Science. — Software engineer and programming instructor. Actively involved in Android Development and Deep Learning.

Ioannis Anifantakis
