Welcome to the Echo AI documentation!

About

The EchoAI package provides implementations of promising mathematical algorithms that are missing from the most popular deep learning libraries, such as PyTorch, Keras, and TensorFlow.

Implemented Activation Functions

The list of activation functions implemented in Echo:

#  | Activation       | PyTorch               | Keras                 | TensorFlow Keras
---+------------------+-----------------------+-----------------------+---------------------------------
1  | Weighted Tanh    | Torch.WeightedTanh    | Keras.WeightedTanh    | Tensorflow_Keras.WeightedTanh
2  | Aria2            | Torch.aria2           | Keras.Aria2           | Tensorflow_Keras.Aria2
3  | SiLU             | Torch.Silu            | -                     | -
4  | E-Swish          | Torch.Eswish          | Keras.Eswish          | Tensorflow_Keras.ESwish
5  | Swish            | Torch.Swish           | Keras.Swish           | Tensorflow_Keras.Swish
6  | ELiSH            | Torch.Elish           | Keras.Elish           | Tensorflow_Keras.ELiSH
7  | Hard ELiSH       | Torch.HardElish       | Keras.HardElish       | Tensorflow_Keras.HardELiSH
8  | Mila             | Torch.Mila            | Keras.Mila            | Tensorflow_Keras.Mila
9  | SineReLU         | Torch.SineReLU        | Keras.SineReLU        | Tensorflow_Keras.SineReLU
10 | Flatten T-Swish  | Torch.FTS             | Keras.FTS             | Tensorflow_Keras.FlattenTSwish
11 | SQNL             | Torch.SQNL            | Keras.SQNL            | Tensorflow_Keras.SQNL
12 | Mish             | Torch.Mish            | Keras.Mish            | Tensorflow_Keras.Mish
13 | Beta Mish        | Torch.BetaMish        | Keras.BetaMish        | Tensorflow_Keras.BetaMish
14 | ISRU             | Torch.ISRU            | Keras.ISRU            | Tensorflow_Keras.ISRU
15 | ISRLU            | Torch.ISRLU           | Keras.ISRLU           | Tensorflow_Keras.ISRLU
16 | Bent’s Identity  | Torch.BentID          | Keras.BentID          | Tensorflow_Keras.BentIdentity
17 | Soft Clipping    | Torch.SoftClipping    | Keras.SoftClipping    | Tensorflow_Keras.SoftClipping
18 | SReLU            | Torch.SReLU           | Keras.SReLU           | Tensorflow_Keras.SReLU
19 | BReLU            | Torch.BReLU           | -                     | Tensorflow_Keras.BReLU
20 | APL              | Torch.APL             | -                     | Tensorflow_Keras.APL
21 | Soft Exponential | Torch.SoftExponential | Keras.SoftExponential | Tensorflow_Keras.SoftExponential
22 | Maxout           | Torch.Maxout          | -                     | Tensorflow_Keras.MaxOut
23 | CELU             | -                     | Keras.Celu            | Tensorflow_Keras.CELU
24 | ReLU6            | -                     | Keras.ReLU6           | -
25 | Hard Tanh        | -                     | Keras.HardTanh        | Tensorflow_Keras.HardTanh
26 | Log Sigmoid      | -                     | Keras.LogSigmoid      | Tensorflow_Keras.LogSigmoid
27 | Tanh Shrink      | -                     | Keras.TanhShrink      | Tensorflow_Keras.TanhShrink
28 | Hard Shrink      | -                     | Keras.HardShrink      | Tensorflow_Keras.HardShrink
29 | Soft Shrink      | -                     | Keras.SoftShrink      | Tensorflow_Keras.SoftShrink
30 | Softmin          | -                     | Keras.SoftMin         | Tensorflow_Keras.SoftMin
31 | LogSoftmax       | -                     | Keras.LogSoftmax      | Tensorflow_Keras.LogSoftMax

A dash (-) marks a framework for which the function is not implemented.

Installation

To install the EchoAI package from source, follow the instructions below:

  1. Clone or download the GitHub repository.
  2. Navigate to the cloned repository folder:
$ cd Echo
  3. Install the package with pip:
$ pip install .

To install the EchoAI package from PyPI, run:

$ pip install echoAI
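
To verify the installation, you can import one of the activation classes documented below and apply it to a random tensor. This is a minimal check and assumes PyTorch is already installed:

>>> import torch
>>> from echoAI.Activation.Torch.mish import Mish
>>> m = Mish()
>>> output = m(torch.randn(3))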

Examples

Torch Activation Functions

The following code block shows an example of using a PyTorch activation function from the EchoAI package:

# import the required PyTorch modules
import torch
import torch.nn as nn
import torch.nn.functional as F
from collections import OrderedDict

# import activations from EchoAI
from echoAI.Activation.Torch.weightedTanh import WeightedTanh
import echoAI.Activation.Torch.functional as Func

# use activations in layers of model defined in class
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()

        # initialize layers
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, x):
        # make sure the input tensor is flattened
        x = x.view(x.shape[0], -1)

        # apply activation function from Echo
        x = Func.weighted_tanh(self.fc1(x), weight = 1)

        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.log_softmax(self.fc4(x), dim=1)

        return x

def main():
    # Initialize the model using defined Classifier class
    model = Classifier()

    # Alternatively, build an equivalent model with nn.Sequential
    model = nn.Sequential(OrderedDict([
                         ('fc1', nn.Linear(784, 256)),
                         # use activation function from Echo
                         ('wtanh1',  WeightedTanh(weight = 1)),
                         ('fc2', nn.Linear(256, 128)),
                         ('bn2', nn.BatchNorm1d(num_features=128)),
                         ('relu2', nn.ReLU()),
                         ('dropout', nn.Dropout(0.3)),
                         ('fc3', nn.Linear(128, 64)),
                         ('bn3', nn.BatchNorm1d(num_features=64)),
                         ('relu3', nn.ReLU()),
                         ('logits', nn.Linear(64, 10)),
                         ('logsoftmax', nn.LogSoftmax(dim=1))]))
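
    # A quick way to sanity-check the model (a minimal sketch appended inside main();
    # the batch size of 64 is arbitrary): run a random batch of flattened 28x28 images through it.
    x = torch.randn(64, 784)
    log_probs = model(x)      # log-probabilities with shape (64, 10)
    print(log_probs.shape)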

Keras Activation Functions

The following code block shows an example of using a Keras activation function from the EchoAI package:

# Import the Keras building blocks used below
from keras.layers import Input, ZeroPadding2D, Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense
from keras.models import Model

# Import the activation function from the EchoAI package
from echoAI.Activation.Keras.custom_activations import WeightedTanh

# Define the CNN model
def CNNModel(input_shape):
    """
    Implementation of a simple CNN.

    INPUT:
        input_shape -- shape of the images of the dataset

    OUTPUT:
        model -- a Model() instance in Keras
    """

    # Define the input placeholder as a tensor with shape input_shape.
    X_input = Input(input_shape)

    # Zero-Padding: pads the border of X_input with zeroes
    X = ZeroPadding2D((3, 3))(X_input)

    # CONV -> BN -> Activation Block applied to X
    X = Conv2D(32, (3, 3), strides = (1, 1), name = 'conv0')(X)
    X = BatchNormalization(axis = 3, name = 'bn0')(X)

    # Use custom activation function from Echo package
    X = WeightedTanh()(X)

    # MAXPOOL
    X = MaxPooling2D((2, 2), name='max_pool')(X)

    # FLATTEN X (means convert it to a vector) + FULLYCONNECTED
    X = Flatten()(X)
    X = Dense(10, activation='softmax', name='fc')(X)

    # Create model
    model = Model(inputs = X_input, outputs = X, name='CNNModel')

    return model

# Create the model
model = CNNModel((28,28,1))

# Compile the model
model.compile(optimizer = "adam", loss = "mean_squared_error", metrics = ["accuracy"])
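
After compiling, the model can be trained like any other Keras model. The call below is an illustrative sketch only: the random x_train/y_train arrays are hypothetical stand-ins for real 28x28x1 images and one-hot labels, matching the compilation settings above.

import numpy as np

# hypothetical data: 512 random grayscale images and one-hot labels (for illustration only)
x_train = np.random.rand(512, 28, 28, 1).astype("float32")
y_train = np.eye(10)[np.random.randint(0, 10, size=512)].astype("float32")

model.fit(x_train, y_train, batch_size=64, epochs=1, validation_split=0.1)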

PyTorch Extensions

Torch.aria2

Applies the Aria-2 function element-wise:

\[Aria2(x, \alpha, \beta) = (1+e^{-\beta*x})^{-\alpha}\]

See Aria paper: https://arxiv.org/abs/1805.08878

class echoAI.Activation.Torch.aria2.Aria2(beta=1.0, alpha=1.5)[source]

Applies the Aria-2 function element-wise:

\[Aria2(x, \alpha, \beta) = (1+e^{-\beta*x})^{-\alpha}\]

Plot:

_images/aria2.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • alpha: hyper-parameter which has a two-fold effect; it reduces the curvature in the 3rd quadrant as well as increases the curvature in the 1st quadrant while lowering the value of activation (default = 1.5)
  • beta: the exponential growth rate (default = 1.0)
References:
  • Aria paper:

https://arxiv.org/abs/1805.08878

Examples:
>>> m = Aria2(beta=0.5, alpha=1)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Mish

Applies the mish function element-wise:

\[mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))\]

Heng’s optimized implementation of mish: https://www.kaggle.com/c/severstal-steel-defect-detection/discussion/111457#651223

class echoAI.Activation.Torch.mish.Mish(*args, **kwargs)[source]

Applies the mish function element-wise:

\[mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))\]

Plot:

_images/mish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Examples:
>>> m = Mish()
>>> input = torch.randn(2)
>>> output = m(input)
class echoAI.Activation.Torch.mish.MishFunction(*args, **kwargs)[source]

Torch.BetaMish

Applies the β mish function element-wise:

\[\beta mish(x) = x * tanh(ln((1 + e^{x})^{\beta}))\]
class echoAI.Activation.Torch.beta_mish.BetaMish(beta=1.5)[source]

Applies the β mish function element-wise:

\[\beta mish(x) = x * tanh(ln((1 + e^{x})^{\beta}))\]

Plot:

_images/beta_mish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • beta: hyperparameter (default = 1.5)
References:
  • β-Mish: A uni-parametric adaptive activation function derived from Mish:

https://github.com/digantamisra98/Beta-Mish

Examples:
>>> m = BetaMish(beta=1.5)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Silu

Applies the Sigmoid Linear Unit (SiLU) function element-wise:

\[silu(x) = x * sigmoid(x)\]

See related paper: https://arxiv.org/pdf/1606.08415.pdf

class echoAI.Activation.Torch.silu.Silu(inplace=False)[source]

Applies the Sigmoid Linear Unit (SiLU) function element-wise:

\[silu(x) = x * sigmoid(x)\]

Plot:

_images/silu.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • inplace: (bool) if True, the operation is performed in place (default = False)
References:
  • Related paper:

https://arxiv.org/pdf/1606.08415.pdf

Examples:
>>> m = Silu(inplace = False)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Eswish

Applies the E-Swish function element-wise:

\[ESwish(x, \beta) = \beta*x*sigmoid(x)\]

See E-Swish paper: https://arxiv.org/abs/1801.07145

class echoAI.Activation.Torch.eswish.Eswish(beta=1.75)[source]

Applies the E-Swish function element-wise:

\[ESwish(x, \beta) = \beta*x*sigmoid(x)\]

Plot:

_images/eswish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • beta: a constant parameter (default = 1.75)
References:
  • See related paper:

https://arxiv.org/abs/1801.07145

Examples:
>>> m = Eswish(beta=1.375)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Swish

Applies the Swish function element-wise:

\[Swish(x, \beta) = x*sigmoid(\beta*x) = \frac{x}{(1+e^{-\beta*x})}\]

See Swish paper: https://arxiv.org/pdf/1710.05941.pdf

class echoAI.Activation.Torch.swish.Swish(beta=1.25)[source]

Applies the Swish function element-wise:

\[Swish(x, \beta) = x*sigmoid(\beta*x) = \frac{x}{(1+e^{-\beta*x})}\]

Plot:

_images/swish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • beta: hyperparameter, which controls the shape of the bump (default = 1.25)
References:
  • See Swish paper:

https://arxiv.org/pdf/1710.05941.pdf

Examples:
>>> m = Swish(beta=1.25)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Elish

Applies the ELiSH (Exponential Linear Sigmoid SquasHing) function element-wise:

\[\begin{split}ELiSH(x) = \left\{\begin{matrix} x / (1+e^{-x}), x \geq 0 \\ (e^{x} - 1) / (1 + e^{-x}), x < 0 \end{matrix}\right.\end{split}\]

See ELiSH paper: https://arxiv.org/pdf/1808.00783.pdf

class echoAI.Activation.Torch.elish.Elish[source]

Applies the ELiSH (Exponential Linear Sigmoid SquasHing) function element-wise:

\[\begin{split}ELiSH(x) = \left\{\begin{matrix} x / (1+e^{-x}), x \geq 0 \\ (e^{x} - 1) / (1 + e^{-x}), x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/elish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
References:
  • See ELiSH paper:

https://arxiv.org/pdf/1808.00783.pdf

Examples:
>>> m = Elish()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.HardElish

Applies the HardELiSH function element-wise:

\[\begin{split}HardELiSH(x) = \left\{\begin{matrix} x \times max(0, min(1, (x + 1) / 2)), x \geq 0 \\ (e^{x} - 1)\times max(0, min(1, (x + 1) / 2)), x < 0 \end{matrix}\right.\end{split}\]

See HardELiSH paper: https://arxiv.org/pdf/1808.00783.pdf

class echoAI.Activation.Torch.hard_elish.HardElish[source]

Applies the HardELiSH function element-wise:

\[\begin{split}HardELiSH(x) = \left\{\begin{matrix} x \times max(0, min(1, (x + 1) / 2)), x \geq 0 \\ (e^{x} - 1)\times max(0, min(1, (x + 1) / 2)), x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/hard_elish.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
References:
  • See HardELiSH paper:

https://arxiv.org/pdf/1808.00783.pdf

Examples:
>>> m = HardElish()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.Mila

Applies the Mila function element-wise:

\[mila(x) = x * tanh(ln(1 + e^{\beta + x})) = x * tanh(softplus(\beta + x))\]

Refer to: https://github.com/digantamisra98/Mila

class echoAI.Activation.Torch.mila.Mila(beta=-0.25)[source]

Applies the Mila function element-wise:

\[mila(x) = x * tanh(ln(1 + e^{\beta + x})) = x * tanh(softplus(\beta + x))\]

Plot:

_images/mila.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • beta: scale to control the concavity of the global minima of the function (default = -0.25)
References:
  • Mila repository:

https://github.com/digantamisra98/Mila

Examples:
>>> m = Mila(beta=-0.25)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.SineReLU

Applies the SineReLU function element-wise:

\[\begin{split}SineReLU(x, \epsilon) = \left\{\begin{matrix} x , x > 0 \\ \epsilon * (sin(x) - cos(x)), x \leq 0 \end{matrix}\right.\end{split}\]

See related Medium article: https://medium.com/@wilder.rodrigues/sinerelu-an-alternative-to-the-relu-activation-function-e46a6199997d

class echoAI.Activation.Torch.sine_relu.SineReLU(epsilon=0.01)[source]

Applies the SineReLU function element-wise:

\[\begin{split}SineReLU(x, \epsilon) = \left\{\begin{matrix} x , x > 0 \\ \epsilon * (sin(x) - cos(x)), x \leq 0 \end{matrix}\right.\end{split}\]

Plot:

_images/sine_relu.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • epsilon: hyperparameter (default = 0.01) used to control the wave amplitude
References:
  • See related Medium article:

https://medium.com/@wilder.rodrigues/sinerelu-an-alternative-to-the-relu-activation-function-e46a6199997d

Examples:
>>> m = SineReLU()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.FTS

Applies the FTS (Flatten T-Swish) function element-wise:

\[\begin{split}FTS(x) = \left\{\begin{matrix} \frac{x}{1 + e^{-x}} , x \geq 0 \\ 0, x < 0 \end{matrix}\right.\end{split}\]

See Flatten T-Swish paper: https://arxiv.org/pdf/1812.06247.pdf

class echoAI.Activation.Torch.fts.FTS[source]

Applies the FTS (Flatten T-Swish) function element-wise:

\[\begin{split}FTS(x) = \left\{\begin{matrix} \frac{x}{1 + e^{-x}} , x \geq 0 \\ 0, x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/fts.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
References:
  • See Flattened T-Swish paper:

https://arxiv.org/pdf/1812.06247.pdf

Examples:
>>> m = FTS()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.SQNL

Applies the SQNL function element-wise:

\[\begin{split}SQNL(x) = \left\{\begin{matrix} 1, x > 2 \\ x - \frac{x^2}{4}, 0 \leq x \leq 2 \\ x + \frac{x^2}{4}, -2 \leq x < 0 \\ -1, x < -2 \end{matrix}\right.\end{split}\]

See SQNL paper: https://ieeexplore.ieee.org/document/8489043

class echoAI.Activation.Torch.sqnl.SQNL[source]

Applies the SQNL function element-wise:

\[\begin{split}SQNL(x) = \left\{\begin{matrix} 1, x > 2 \\ x - \frac{x^2}{4}, 0 \leq x \leq 2 \\ x + \frac{x^2}{4}, -2 \leq x < 0 \\ -1, x < -2 \end{matrix}\right.\end{split}\]

Plot:

_images/sqnl.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
References:
  • See SQNL paper:

https://ieeexplore.ieee.org/document/8489043

Examples:
>>> m = SQNL()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.ISRU

Applies the ISRU (Inverse Square Root Unit) function element-wise:

\[ISRU(x) = \frac{x}{\sqrt{1 + \alpha * x^2}}\]

ISRU paper: https://arxiv.org/pdf/1710.09967.pdf

class echoAI.Activation.Torch.isru.ISRU(alpha=1.0)[source]

Applies the ISRU function element-wise:

\[ISRU(x) = \frac{x}{\sqrt{1 + \alpha * x^2}}\]

Plot:

_images/isru.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • alpha: A constant (default = 1.0)
References:
  • ISRU paper:

https://arxiv.org/pdf/1710.09967.pdf

Examples:
>>> m = ISRU(alpha=1.0)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.ISRLU

Applies the ISRLU (Inverse Square Root Linear Unit) function element-wise:

\[\begin{split}ISRLU(x)=\left\{\begin{matrix} x, x\geq 0 \\ x * (\frac{1}{\sqrt{1 + \alpha*x^2}}), x <0 \end{matrix}\right.\end{split}\]

ISRLU paper: https://arxiv.org/pdf/1710.09967.pdf

class echoAI.Activation.Torch.isrlu.ISRLU(alpha=1.0)[source]

Applies the ISRLU function element-wise:

\[\begin{split}ISRLU(x)=\left\{\begin{matrix} x, x\geq 0 \\ x * (\frac{1}{\sqrt{1 + \alpha*x^2}}), x <0 \end{matrix}\right.\end{split}\]

Plot:

_images/isrlu.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • alpha: hyperparameter α controls the value to which an ISRLU saturates for negative inputs (default = 1)
References:
  • ISRLU paper:

https://arxiv.org/pdf/1710.09967.pdf

Examples:
>>> m = ISRLU(alpha=1.0)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.BentID

Applies the Bent’s Identity function element-wise:

\[bentId(x) = x + \frac{\sqrt{x^{2}+1}-1}{2}\]
class echoAI.Activation.Torch.bent_id.BentID[source]

Applies the Bent’s Identity function element-wise:

\[bentId(x) = x + \frac{\sqrt{x^{2}+1}-1}{2}\]

Plot:

_images/bent_id.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Examples:
>>> m = BentID()
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.SoftClipping

Applies Soft Clipping function element-wise:

\[SC(x) = 1 / \alpha * log(\frac{1 + e^{\alpha * x}}{1 + e^{\alpha * (x-1)}})\]

See SC paper: https://arxiv.org/pdf/1810.11509.pdf

class echoAI.Activation.Torch.soft_clipping.SoftClipping(alpha=0.5)[source]

Applies the Soft Clipping function element-wise:

\[SC(x) = 1 / \alpha * log(\frac{1 + e^{\alpha * x}}{1 + e^{\alpha * (x-1)}})\]

Plot:

_images/sc.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • alpha: hyper-parameter, which determines how close to linear the central region is and how sharply the linear region turns to the asymptotic values
References:
  • See SC paper:

https://arxiv.org/pdf/1810.11509.pdf

Examples:
>>> m = SoftClipping(alpha=0.5)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.WeightedTanh

Applies the weighted tanh function element-wise:

\[weightedtanh(x) = tanh(x * weight)\]
class echoAI.Activation.Torch.weightedTanh.WeightedTanh(weight=1, inplace=False)[source]

Applies the weighted tanh function element-wise:

\[weightedtanh(x) = tanh(x * weight)\]

Plot:

_images/weighted_tanh.png
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input
Arguments:
  • weight: hyperparameter (default = 1.0)
  • inplace: perform inplace operation (default = False)
Examples:
>>> m = WeightedTanh(weight = 1)
>>> input = torch.randn(2)
>>> output = m(input)
forward(input)[source]

Forward pass of the function.

Torch.SReLU

Script defines the SReLU (S-shaped Rectified Linear Activation Unit):

\[\begin{split}h(x_i) = \left\{\begin{matrix} t_i^r + a_i^r(x_i - t_i^r), x_i \geq t_i^r \\ x_i, t_i^r > x_i > t_i^l\\ t_i^l + a_i^l(x_i - t_i^l), x_i \leq t_i^l \\ \end{matrix}\right.\end{split}\]

See SReLU paper: https://arxiv.org/pdf/1512.07030.pdf

class echoAI.Activation.Torch.srelu.SReLU(in_features, parameters=None)[source]

SReLU (S-shaped Rectified Linear Activation Unit): a combination of three linear functions, which perform mapping R → R with the following formulation:

\[\begin{split}h(x_i) = \left\{\begin{matrix} t_i^r + a_i^r(x_i - t_i^r), x_i \geq t_i^r \\ x_i, t_i^r > x_i > t_i^l\\ t_i^l + a_i^l(x_i - t_i^l), x_i \leq t_i^l \\ \end{matrix}\right.\end{split}\]

with 4 trainable parameters.

Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input

Parameters:

\[\{t_i^r, a_i^r, t_i^l, a_i^l\}\]

4 trainable parameters, which model an individual SReLU activation unit. The subscript i indicates that we allow SReLU to vary in different channels. Parameters can be initialized manually or randomly.

References:
  • See SReLU paper:

https://arxiv.org/pdf/1512.07030.pdf

Examples:
>>> srelu_activation = SReLU((2,2))
>>> t = torch.randn((2,2), dtype=torch.float, requires_grad = True)
>>> output = srelu_activation(t)
forward(x)[source]

Forward pass of the function.

Torch.BReLU

Torch.APL

Torch.SoftExponential

Torch.Maxout

echoAI.Activation.Torch.functional

This script provides a functional interface for the custom activation functions.

echoAI.Activation.Torch.functional.aria2(input, beta=1, alpha=1.5)[source]

Applies the Aria-2 function element-wise:

\[Aria2(x, \alpha, \beta) = (1+e^{-\beta*x})^{-\alpha}\]

See additional documentation for echoAI.Activation.Torch.aria2.

echoAI.Activation.Torch.functional.bent_id(input)[source]

Applies the Bent’s Identity function element-wise:

\[bentId(x) = x + \frac{\sqrt{x^{2}+1}-1}{2}\]

See additional documentation for echoAI.Activation.Torch.bent_id.

echoAI.Activation.Torch.functional.beta_mish(input, beta=1.5)[source]

Applies the β mish function element-wise:

\[\beta mish(x) = x * tanh(ln((1 + e^{x})^{\beta}))\]

See additional documentation for echoAI.Activation.Torch.beta_mish.

echoAI.Activation.Torch.functional.elish(input)[source]

Applies the ELiSH (Exponential Linear Sigmoid SquasHing) function element-wise:

\[\begin{split}ELiSH(x) = \left\{\begin{matrix} x / (1+e^{-x}), x \geq 0 \\ (e^{x} - 1) / (1 + e^{-x}), x < 0 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.elish.

echoAI.Activation.Torch.functional.eswish(input, beta=1.75)[source]

Applies the E-Swish function element-wise:

\[ESwish(x, \beta) = \beta*x*sigmoid(x)\]

See additional documentation for echoAI.Activation.Torch.eswish.

echoAI.Activation.Torch.functional.fts(input)[source]

Applies the FTS (Flatten T-Swish) activation function element-wise:

\[\begin{split}FTS(x) = \left\{\begin{matrix} \frac{x}{1 + e^{-x}} , x \geq 0 \\ 0, x < 0 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.fts.

echoAI.Activation.Torch.functional.hard_elish(input)[source]

Applies the HardELiSH (Exponential Linear Sigmoid SquasHing) function element-wise:

\[\begin{split}HardELiSH(x) = \left\{\begin{matrix} x \times max(0, min(1, (x + 1) / 2)), x \geq 0 \\ (e^{x} - 1)\times max(0, min(1, (x + 1) / 2)), x < 0 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.hard_elish.

echoAI.Activation.Torch.functional.isrlu(input, alpha=1.0)[source]

Applies the ISRLU function element-wise:

\[\begin{split}ISRLU(x)=\left\{\begin{matrix} x, x\geq 0 \\ x * (\frac{1}{\sqrt{1 + \alpha*x^2}}), x <0 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.isrlu.

echoAI.Activation.Torch.functional.isru(input, alpha=1.0)[source]

Applies the ISRU function element-wise:

\[ISRU(x) = \frac{x}{\sqrt{1 + \alpha * x^2}}\]

See additional documentation for echoAI.Activation.Torch.isru.

echoAI.Activation.Torch.functional.lecun_tanh(input)[source]

Applies the Le Cun’s Tanh function element-wise:

\[lecun_tanh(x) = 1.7159 * tanh((2/3) * x)\]

See additional documentation for echoAI.Activation.Torch.lecun_tanh.

echoAI.Activation.Torch.functional.mila(input, beta=-0.25)[source]

Applies the mila function element-wise:

\[mila(x) = x * tanh(softplus(\beta + x)) = x * tanh(ln(1 + e^{\beta + x}))\]

See additional documentation for echoAI.Activation.Torch.mila.

echoAI.Activation.Torch.functional.mish(input, inplace=False)[source]

Applies the mish function element-wise:

\[mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))\]

See additional documentation for echoAI.Activation.Torch.mish.

echoAI.Activation.Torch.functional.silu(input, inplace=False)[source]

Applies the Sigmoid Linear Unit (SiLU) function element-wise:

\[SiLU(x) = x * sigmoid(x)\]

See additional documentation for echoAI.Activation.Torch.silu.

echoAI.Activation.Torch.functional.sineReLU(input, eps=0.01)[source]

Applies the SineReLU activation function element-wise:

\[\begin{split}SineReLU(x, \epsilon) = \left\{\begin{matrix} x , x > 0 \\ \epsilon * (sin(x) - cos(x)), x \leq 0 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.sine_relu.

echoAI.Activation.Torch.functional.soft_clipping(input, alpha=0.5)[source]

Applies the Soft Clipping function element-wise:

\[SC(x) = 1 / \alpha * log(\frac{1 + e^{\alpha * x}}{1 + e^{\alpha * (x-1)}})\]

See additional documentation for echoAI.Activation.Torch.soft_clipping.

echoAI.Activation.Torch.functional.sqnl(input)[source]

Applies the SQNL activation function element-wise:

\[\begin{split}SQNL(x) = \left\{\begin{matrix} 1, x > 2 \\ x - \frac{x^2}{4}, 0 \leq x \leq 2 \\ x + \frac{x^2}{4}, -2 \leq x < 0 \\ -1, x < -2 \end{matrix}\right.\end{split}\]

See additional documentation for echoAI.Activation.Torch.sqnl.

echoAI.Activation.Torch.functional.swish(input, beta=1.25)[source]

Applies the Swish function element-wise:

\[Swish(x, \beta) = x*sigmoid(\beta*x) = \frac{x}{(1+e^{-\beta*x})}\]

See additional documentation for echoAI.Activation.Torch.swish.

echoAI.Activation.Torch.functional.weighted_tanh(input, weight=1, inplace=False)[source]

Applies the weighted tanh function element-wise:

\[weightedtanh(x) = tanh(x * weight)\]

See additional documentation for echoAI.Activation.Torch.weightedTanh.
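
As with the class-based modules above, the functional forms can be applied directly to tensors, for example inside a forward pass. A short sketch using the signatures documented in this section:

>>> import torch
>>> import echoAI.Activation.Torch.functional as Func
>>> x = torch.randn(4)
>>> out = Func.swish(x, beta=1.25)
>>> out = Func.mila(x, beta=-0.25)
>>> out = Func.weighted_tanh(x, weight=1)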

Keras Extensions

Keras.Mish

Keras.WeightedTanh

Keras.Aria2

Keras.Eswish

Keras.Swish

Keras.Elish

Keras.HardElish

Keras.Mila

Keras.SineReLU

Keras.FTS

Keras.SQNL

Keras.BetaMish

Keras.ISRU

Keras.ISRLU

Keras.BentID

Keras.SoftClipping

Keras.Celu

Keras.ReLU6

Keras.HardTanh

Keras.LogSigmoid

Keras.TanhShrink

Keras.HardShrink

Keras.SoftShrink

Keras.SoftMin

Keras.LogSoftmax

Keras.SoftExponential

Keras.SReLU

Tensorflow Keras Extensions

Tensorflow_Keras.Mish

Mish Activation Function.

\[mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))\]

Plot:

_images/mish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.
References:
  • Mish paper:

https://arxiv.org/abs/1908.08681
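
The layer below is not part of EchoAI; it is an illustrative re-implementation of the formula above with plain TensorFlow ops, shown only to make the expression x * tanh(softplus(x)) concrete in tf.keras terms.

import tensorflow as tf

class MishSketch(tf.keras.layers.Layer):
    """Illustrative layer computing mish(x) = x * tanh(softplus(x))."""
    def call(self, inputs):
        return inputs * tf.math.tanh(tf.math.softplus(inputs))

# usage sketch:
# inputs = tf.keras.Input(shape=(128,))
# outputs = MishSketch()(inputs)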

Tensorflow_Keras.WeightedTanh

Weighted TanH Activation Function.

\[Weighted TanH(x, weight) = tanh(x * weight)\]

Plot:

_images/weighted_tanh.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • weight: hyperparameter (default=1.0)

Tensorflow_Keras.Swish

Swish Activation Function.

\[Swish(x, \beta) = x*sigmoid(\beta*x) = \frac{x}{(1+e^{-\beta*x})}\]

Plot:

_images/swish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • beta: a constant or a trainable parameter (default = 1, which makes Swish equivalent to the Sigmoid-weighted Linear Unit (SiL))

References:

Tensorflow_Keras.ESwish

E-Swish Activation Function.

\[ESwish(x, \beta) = \beta*x*sigmoid(x)\]

Plot:

_images/eswish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • beta: a constant parameter (default value = 1.375)

References:

Tensorflow_Keras.Aria2

Aria-2 Activation Function.

\[Aria2(x, \alpha, \beta) = (1+e^{-\beta*x})^{-\alpha}\]

Plot:

_images/aria2.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • alpha: hyper-parameter which has a two-fold effect; it reduces the curvature in 3rd quadrant as well as increases the curvature in first quadrant while lowering the value of activation (default = 1)
  • beta: the exponential growth rate (default = 0.5)

References:

Tensorflow_Keras.Mila

Mila Activation Function.

\[Mila(x) = x * tanh(ln(1 + e^{\beta + x})) = x * tanh(softplus(\beta + x))\]

Plot:

_images/mila.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • beta: scale to control the concavity of the global minima of the function (default = -0.25)

References:

Tensorflow_Keras.ISRU

ISRU (Inverse Square Root Unit) Activation Function.

\[ISRU(x) = \frac{x}{\sqrt{1 + \alpha * x^2}}\]

Plot:

_images/isru.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • alpha: A constant (default = 1.0)

References:

Tensorflow_Keras.BentIdentity

Bent’s Identity Activation Function.

\[bentId(x) = x + \frac{\sqrt{x^{2}+1}-1}{2}\]

Plot:

_images/bent_id.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.SoftClipping

Soft Clipping Activation Function.

\[SC(x) = 1 / \alpha * log(\frac{1 + e^{\alpha * x}}{1 + e^{\alpha * (x-1)}})\]

Plot:

_images/sc.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • alpha: hyper-parameter, which determines how close to linear the central region is and how sharply the linear region turns to the asymptotic values

References:

Tensorflow_Keras.BetaMish

β mish activation function.

\[\beta mish(x) = x * tanh(ln((1 + e^{x})^{\beta}))\]

Plot:

_images/beta_mish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • beta: A constant or a trainable parameter (default = 1.5)

References:

  • β-Mish: A uni-parametric adaptive activation function derived from Mish:

https://github.com/digantamisra98/Beta-Mish

Tensorflow_Keras.ELiSH

ELiSH (Exponential Linear Sigmoid SquasHing) Activation Function.

\[\begin{split}ELiSH(x) = \left\{\begin{matrix} x / (1+e^{-x}), x \geq 0 \\ (e^{x} - 1) / (1 + e^{-x}), x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/elish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

References:

Tensorflow_Keras.HardELiSH

Hard ELiSH Activation Function.

\[\begin{split}HardELiSH(x) = \left\{\begin{matrix} x \times max(0, min(1, (x + 1) / 2)), x \geq 0 \\ (e^{x} - 1)\times max(0, min(1, (x + 1) / 2)), x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/hard_elish.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

References:

Tensorflow_Keras.SineReLU

Sine ReLU Activation Function.

\[\begin{split}SineReLU(x, \epsilon) = \left\{\begin{matrix} x , x > 0 \\ \epsilon * (sin(x)-cos(x)), x \leq 0 \end{matrix}\right.\end{split}\]

Plot:

_images/sine_relu.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

References:

Arguments:

  • epsilon: hyperparameter (default=0.01)

Tensorflow_Keras.FlattenTSwish

FTS (Flatten T-Swish) Activation Function.

\[\begin{split}FTS(x) = \left\{\begin{matrix} \frac{x}{1 + e^{-x}} , x \geq 0 \\ 0, x < 0 \end{matrix}\right.\end{split}\]

Plot:

_images/fts.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

References:

  • Flatten T-Swish paper:

https://arxiv.org/pdf/1812.06247.pdf

Tensorflow_Keras.SQNL

SQNL Activation Function.

\[\begin{split}SQNL(x) = \left\{\begin{matrix} 1, x > 2 \\ x - \frac{x^2}{4}, 0 \leq x \leq 2 \\ x + \frac{x^2}{4}, -2 \leq x < 0 \\ -1, x < -2 \end{matrix}\right.\end{split}\]

Plot:

_images/sqnl.png
Shape:
  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

References:

Tensorflow_Keras.ISRLU

ISRLU Activation Function.

\[\begin{split}ISRLU(x)=\left\{\begin{matrix} x, x\geq 0 \\ x * (\frac{1}{\sqrt{1 + \alpha*x^2}}), x <0 \end{matrix}\right.\end{split}\]

Plot:

_images/isrlu.png

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • alpha: hyperparameter α controls the value to which an ISRLU saturates for negative inputs (default = 1)

References:

Tensorflow_Keras.SoftExponential

Soft-Exponential Activation Function with one trainable parameter.

\[\begin{split}SoftExponential(x, \alpha) = \left\{\begin{matrix} - \frac{log(1 - \alpha(x + \alpha))}{\alpha}, \alpha < 0\\ x, \alpha = 0\\ \frac{e^{\alpha * x} - 1}{\alpha} + \alpha, \alpha > 0 \end{matrix}\right.\end{split}\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Parameters:

  • alpha - trainable parameter

References:

  • See Soft-Exponential paper:

https://arxiv.org/pdf/1602.01321.pdf

Tensorflow_Keras.CELU

CELU Activation Function.

\[CELU(x, \alpha) = max(0,x) + min(0,\alpha * (exp(x/ \alpha)-1))\]
Shape:
  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • alpha: the α value for the CELU formulation (default=1.0)

References:

Tensorflow_Keras.HardTanh

Hard-TanH Activation Function.

\[\begin{split}Hard-TanH(x) = \left\{\begin{matrix} 1, x > 1 \\ x , -1 \leq x \leq 1 \\ -1, x < -1 \end{matrix}\right.\end{split}\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.LogSigmoid

Log-Sigmoid Activation Function.

\[Log-Sigmoid(x) = log (\frac{1}{1+e^{-x}})\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.TanhShrink

TanH-Shrink Activation Function.

\[TanH-Shrink(x) = x - tanh(x)\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.HardShrink

Hard-Shrink Activation Function.

\[\begin{split}Hard-Shrink(x) = \left\{\begin{matrix} x, x > \lambda \\ 0 , - \lambda \leq x \leq \lambda \\ x, x < -\lambda \end{matrix}\right.\end{split}\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • lambda: the λ value for the Hardshrink formulation (default=0.5)
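
The piecewise definition above maps directly onto tf.where; the function below is an illustrative sketch of the formula (independent of the EchoAI layer), with the λ threshold exposed as the lambd argument:

import tensorflow as tf

def hard_shrink_sketch(x, lambd=0.5):
    # keep values outside [-lambd, lambd], zero out everything inside the band
    return tf.where(tf.abs(x) > lambd, x, tf.zeros_like(x))

# usage sketch: hard_shrink_sketch(tf.constant([-1.0, -0.2, 0.3, 0.9]))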

Tensorflow_Keras.SoftShrink

Soft-Shrink Activation Function.

\[\begin{split}Soft-Shrink(x) = \left\{\begin{matrix} x - \lambda , x > \lambda \\ 0 , - \lambda \leq x \leq \lambda \\ x + \lambda , x < -\lambda \end{matrix}\right.\end{split}\]
Shape:
  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Arguments:

  • lambda: the λ value for the Softshrink formulation (default=0.5)

Tensorflow_Keras.SoftMin

SoftMin Activation Function.

\[SoftMin(x) = Softmax(-x)\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.LogSoftMax

Log-SoftMax Activation Function.

\[Log-SoftMax(x) = log(Softmax(x))\]

Shape:

  • Input: Arbitrary. Use the keyword argument input_shape

(tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

  • Output: Same shape as the input.

Tensorflow_Keras.MaxOut

Implementation of Maxout:

\[maxout(\vec{x}) = max_i(x_i)\]

Shape:

  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input

References:

Tensorflow_Keras.SReLU

SReLU (S-shaped Rectified Linear Activation Unit): a combination of three linear functions, which perform mapping R → R with the following formulation:

\[\begin{split}h(x_i) = \left\{\begin{matrix} t_i^r + a_i^r(x_i - t_i^r), x_i \geq t_i^r \\ x_i, t_i^r > x_i > t_i^l\\ t_i^l + a_i^l(x_i - t_i^l), x_i \leq t_i^l \\ \end{matrix}\right.\end{split}\]
Shape:
  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input

Parameters:

\[\{t_i^r, a_i^r, t_i^l, a_i^l\}\]

4 trainable parameters, which model an individual SReLU activation unit. The subscript i indicates that we allow SReLU to vary in different channels. Parameters can be initialized manually or randomly.

References:

Tensorflow_Keras.BReLU

Implementation of BReLU activation function:

\[\begin{split}BReLU(x_i) = \left\{\begin{matrix} f(x_i), i \mod 2 = 0\\ - f(-x_i), i \mod 2 \neq 0 \end{matrix}\right.\end{split}\]

Plot:

_images/brelu.png

Shape:

  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input

References:

Tensorflow_Keras.APL

Implementation of APL (ADAPTIVE PIECEWISE LINEAR UNITS) activation function:

\[APL(x_i) = max(0,x_i) + \sum_{s=1}^{S}{a_i^s * max(0, -x_i + b_i^s)}\]

Shape:

  • Input: (N, *) where * means, any number of additional dimensions
  • Output: (N, *), same shape as the input

Arguments:

  • a: variables control the slopes of the linear segments
  • b: variables determine the locations of the hinges

References: