
ReLU

ReLU is one of the most commonly used activation functions. It behaves like a linear function when the input is greater than 0; otherwise, its output is always 0. It is the analog of half-wave rectification in electrical engineering. Its formula is as follows:

f(x) = max(0, x)

The ReLU function

The range of this function is from 0 to infinity. The issue is that all negative inputs become zero; therefore, the derivative in that region is a constant zero, and no gradient flows back through those units. This is clearly an issue for backpropagation, but in practical cases it does not have a noticeable effect.
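To make the behavior concrete, the following is a minimal NumPy sketch of ReLU (the function name and sample inputs are illustrative and not taken from the book):

import numpy as np

def relu(x):
    # ReLU: keep positive inputs unchanged and clip negative inputs to zero
    return np.maximum(0, x)

# Negative values map to 0, positive values pass through unchanged
print(relu(np.array([-2.0, -0.5, 0.0, 1.0, 3.0])))  # [0. 0. 0. 1. 3.]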

There are a few variants of ReLU; one of the most common is Leaky ReLU, which aims to allow a small positive gradient when the unit is not active. Its formula is as follows:

f(x) = x if x > 0, and f(x) = αx otherwise

Here, α is typically 0.01, as shown in the following diagram:

The Leaky ReLU function
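As a sketch of the same idea in code (assuming NumPy, with the function name and the default value of alpha chosen for illustration), Leaky ReLU differs from ReLU only in how it treats negative inputs:

import numpy as np

def leaky_relu(x, alpha=0.01):
    # Pass positive inputs through; scale negative inputs by a small slope alpha
    return np.where(x > 0, x, alpha * x)

# Negative values are scaled by alpha instead of being zeroed out
print(leaky_relu(np.array([-2.0, -0.5, 0.0, 1.0, 3.0])))
# -> approximately [-0.02, -0.005, 0.0, 1.0, 3.0]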