Home » Blog » Reinforcement Learning | Machine Learning Beginners

Reinforcement Learning | Machine Learning Beginners

Hello guys,

In this article we are going to understand last flavor of machine learning that is reinforcement learning.

Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize the notion of cumulative reward.

Ohh no, what definition says!!!! Didn’t understand?? Don’t worry I am going to tell you what is reinforcement learning in easy words.

Basically, Reinforcement learning is learning by interacting with the environment. We can take it as a human experience, like we (human being) learn something from experience.

When we are doing something for the first time, then at that time we don’t know how we can do it, what consequences are there, what is the output for this. But when we we were doing it for the couple of day or month we understand that if we do this type of thing then these are the consequences, or if we do this thing in this manner then we can done that task faster and etc.

As same manner in Reinforcement learning, we are training our model such that it can learn by interacting with environment. Model can be rewarded or penalize.

Machine take decisions on their own, after that if the decision is right then machine will be rewarded for that decision or if the decision is wrong then machine will be penalize for that decision, based on this scenario we train our model.

The goal of Reinforcement learning is to maximize the right decision(get maximum rewards).

Real World Example Of Reinforcement Learning

Traffic Light Controller

Researchers tries to design traffic light controller to solve the congestion problem. Tested only on simulated environment though, their methods gives superior results than the traditional methods.

Robotics

There are tremendous work on applying RL in Robotics. The RGB images were fed to a CNN and outputs were the motor torques. The RL component was the guided policy search to generate training data that came from its own state distribution.

Leave a Reply

Your email address will not be published.