Neural networks

Activation function

If no activation function is applied, the neural network is just a fancy linear regression model. The activation function brings non-linearity into the model, and that non-linearity is what makes the model far more expressive and powerful.
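A quick numerical sketch of why this is true: stacking two linear layers with no activation collapses into a single linear map, while inserting even a simple ReLU between them breaks that collapse. (The matrix sizes and seed here are arbitrary, just for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation: W2 @ (W1 @ x) == (W2 @ W1) @ x,
# so the stack is equivalent to one linear layer -- fancy linear regression.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

stacked = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
assert np.allclose(stacked, collapsed)

# Insert a ReLU between the layers and the equivalence no longer holds:
def relu(z):
    return np.maximum(z, 0.0)

nonlinear = W2 @ relu(W1 @ x)
```

No matter how many linear layers you stack, without activations the whole network can be rewritten as one matrix; the ReLU is what prevents that.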

Convolutional Neural Networks

Recognizing a car or a human is fairly challenging if you're looking at an image one pixel at a time. Instead, a convolutional neural network (CNN) slides a small window (a kernel) across the image, taking in a group of pixels at once, to hopefully pick up small features like "edges" or "curves". A later layer can then take combinations of edges or curves to detect shapes like squares or circles, then combinations of those for more complex shapes, and so on.
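The sliding-window idea can be shown in a few lines of numpy: a 3x3 edge-detecting kernel (a Sobel-style vertical-edge filter, chosen here just as an example) is slid over a toy image, producing a feature map that lights up where the image changes from dark to bright.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (valid padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise multiply the window by the kernel and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Sobel-style vertical-edge kernel: responds where intensity changes left-to-right.
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

feature_map = convolve2d(image, kernel)
```

The feature map is zero in the flat regions and non-zero near the edge; in a real CNN the kernel values are learned rather than hand-picked, and later layers convolve over feature maps instead of raw pixels.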

From ChatGPT:

Convolutional neural networks (CNNs) are commonly used in deep reinforcement learning (RL) for processing high-dimensional input data such as images, which are often encountered in visual RL tasks such as Atari games and robotics. The integration of CNNs in deep RL typically involves using them as function approximators to learn the optimal policy or value function for the RL agent.

In a deep RL setting, the CNN takes in the raw state observations as input and produces a low-dimensional feature representation that is then fed into the RL algorithm. The CNN parameters are learned through backpropagation of the RL objective function, which typically involves minimizing the mean squared error between the predicted values and the target values (in Q-learning, a bootstrapped target built from the observed reward, rather than a known "true" value).
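The training signal described above can be sketched with a stand-in network. This is a minimal illustration, not DQN itself: a single linear map plays the role of the CNN, and all sizes, seeds, and reward values are made up for the example.

```python
import numpy as np

# Hypothetical tiny "network": a linear map from a flattened observation
# to one Q-value per action, standing in for a real CNN.
rng = np.random.default_rng(1)
n_features, n_actions = 8, 4
W = rng.normal(scale=0.1, size=(n_actions, n_features))

gamma = 0.99                              # discount factor
state = rng.normal(size=n_features)       # illustrative observations
next_state = rng.normal(size=n_features)
action, reward = 2, 1.0                   # illustrative transition

q_pred = W @ state                                     # predicted Q-values
td_target = reward + gamma * np.max(W @ next_state)    # bootstrapped target

# Squared TD error for the action actually taken -- the "MSE between
# predicted and target values" that backpropagation minimizes.
loss = (q_pred[action] - td_target) ** 2

# One gradient step on W (backprop worked out by hand for the linear map).
lr = 0.01
grad = 2 * (q_pred[action] - td_target) * state
W[action] -= lr * grad
```

After the update, the prediction for that action moves toward the target; in a real CNN the same gradient flows back through every convolutional layer instead of one matrix.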

One common approach is to use a CNN as a function approximator for the Q-value function in Q-learning. The CNN takes as input the state observation, and outputs a Q-value for each possible action. The action with the highest Q-value is then selected as the action to be taken by the RL agent.
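The action-selection step is just an argmax over the network's outputs. The Q-values below are made-up placeholders for whatever the network emits; the epsilon-greedy variant is included because pure greedy selection is rarely used during training.

```python
import numpy as np

# Hypothetical Q-values produced by the network for a 4-action agent.
q_values = np.array([0.1, 1.3, -0.5, 0.8])

greedy_action = int(np.argmax(q_values))  # pick the highest-valued action

# In practice, exploration is mixed in, e.g. epsilon-greedy:
def epsilon_greedy(q_values, epsilon, rng):
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: random action
    return int(np.argmax(q_values))              # exploit: greedy action

rng = np.random.default_rng(0)
action = epsilon_greedy(q_values, epsilon=0.1, rng=rng)
```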

Another approach is to use a CNN as a policy network, where the network directly maps state observations to action probabilities. The policy network is trained using policy gradient methods such as REINFORCE or actor-critic algorithms, where the gradients are computed via backpropagation (the chain rule of calculus).
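A policy-network version of the same sketch, again with a linear map standing in for the CNN and illustrative sizes and returns: the network outputs a softmax over actions, and a single REINFORCE step nudges the parameters to make the sampled action more likely when the return is positive.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical linear policy "network": features -> action logits.
rng = np.random.default_rng(0)
n_features, n_actions = 8, 4
theta = rng.normal(scale=0.1, size=(n_actions, n_features))

state = rng.normal(size=n_features)
probs = softmax(theta @ state)           # action probabilities
action = rng.choice(n_actions, p=probs)  # sample an action from the policy

# REINFORCE update for one step: grad log pi(a|s) * return.
# For a softmax over linear logits, the chain rule gives
# d log pi(a|s) / d theta = outer(one_hot(a) - probs, state).
G = 1.0                                  # sampled return (illustrative)
one_hot = np.eye(n_actions)[action]
grad_log_pi = np.outer(one_hot - probs, state)
theta += 0.01 * G * grad_log_pi          # gradient ascent on expected return
```

After the step, the probability of the sampled action increases (since G > 0); with a CNN, `theta @ state` is replaced by the network's forward pass and the gradient is obtained by backpropagation.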

Overall, the integration of CNNs in deep RL has led to significant advances in solving complex RL problems that involve high-dimensional input data, and has enabled the development of RL agents that can learn to play Atari games, control robots, and perform other challenging tasks.