Layers

BaseLayer

Bases: ABC

Template for neural network layers.

Layers must define a trainable property that returns a list of (name_param, param, grad) tuples.

Layers must also define a name attribute that specifies the base name of the layer. The name can be arbitrary, but it must be unique across layer types.

Source code in src/nnfs/layers.py
class BaseLayer(ABC):
    """Template for neural network layers.

    Layers have to define a `trainable` property that returns a
    list of (name_param, param, grad) tuples.

    Layers also have to define a `name` attribute which specifies
    the base name of the layer. The name can be arbitrary but it has
    to be unique for each of the layer types.
    """

    @abstractmethod
    def __init__(self):
        self.layer_name: str = "UNINITIALIZED_NAME"
        self.index: int = -1

    @abstractmethod
    def forward(self, X_input: np.ndarray) -> np.ndarray:
        pass

    @abstractmethod
    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        pass

    @property
    @abstractmethod
    def trainable(self):
        return None

    @property
    def name(self) -> str:
        """Returns the layer's name.

        It is used by the model to summarize layer architecture, as well as to cache
        layer-specific gradients (for example, to implement momentum).

        Returns:
            A layer identifier of the form `{layer_name}_{index}`, e.g. `Dense_0`.
        """
        return f"{self.layer_name}_{self.index}"

name property

Returns the layer's name.

It is used by the model to summarize layer architecture, as well as to cache layer-specific gradients (for example, to implement momentum).

Returns:

    str: A layer identifier of the form `{layer_name}_{index}`, e.g. `Dense_0`.
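As a sketch of the contract above, a minimal concrete subclass (a hypothetical identity layer, not part of nnfs) might look like the following; the `BaseLayer` body is condensed here only so the example is self-contained:

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseLayer(ABC):
    # condensed copy of the template above, for a self-contained example
    @abstractmethod
    def __init__(self):
        self.layer_name: str = "UNINITIALIZED_NAME"
        self.index: int = -1

    @abstractmethod
    def forward(self, X_input: np.ndarray) -> np.ndarray: ...

    @abstractmethod
    def backward(self, grad_next: np.ndarray) -> np.ndarray: ...

    @property
    @abstractmethod
    def trainable(self): ...

    @property
    def name(self) -> str:
        return f"{self.layer_name}_{self.index}"


class Identity(BaseLayer):
    """Passes its input through unchanged; has no trainable parameters."""

    def __init__(self):
        self.layer_name = "Identity"
        self.index = 0

    def forward(self, X_input: np.ndarray) -> np.ndarray:
        return X_input

    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        return grad_next

    @property
    def trainable(self):
        return []  # nothing for the optimizer to update


layer = Identity()
print(layer.name)  # Identity_0
```

Because `name` is inherited from `BaseLayer`, every subclass only needs to set `layer_name` and `index` in its `__init__`.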

Dense

Bases: BaseLayer

A fully connected neural network layer.

Parameters:

    input_size (int): Dimensionality of the input to the layer. Required.
    output_size (int): Dimensionality of the output of the layer. Required.

Attributes:

    W (np.ndarray): Weight parameters.
    b (np.ndarray): Bias parameters.
    dW (np.ndarray): Gradient of the loss w.r.t. the weight parameters.
    db (np.ndarray): Gradient of the loss w.r.t. the bias parameters.
    X_input (np.ndarray): Cached input to the layer.
    layer_name (str): Short name for the layer type.
    index (int): Position of the layer within the full model (initialized to 0).

Source code in src/nnfs/layers.py
class Dense(BaseLayer):
    """
    A fully connected neural network layer.

    Args:
        input_size (int): Dimensionality of the input to the layer.
        output_size (int): Dimensionality of the output of the layer.

    Attributes:
        W (np.ndarray): weight parameters.
        b (np.ndarray): bias parameters.
        dW (np.ndarray): gradient of loss w.r.t weight parameters.
        db (np.ndarray): gradient of loss w.r.t bias parameters.
        X_input (np.ndarray): cached input to the layer.
        layer_name (str): Short name for the layer type.
        index (int): Position of the layer within the full model (initialized to 0).
    """

    def __init__(self, input_size: int, output_size: int):
        # layer parameters
        self.W = np.random.uniform(size=(input_size, output_size))
        self.b = np.zeros(shape=(1, output_size))

        # cache for gradients and inputs
        self.dW = np.zeros(shape=self.W.shape)
        self.db = np.zeros(shape=self.b.shape)
        self.X_input = np.zeros(1)

        # attributes for layer navigation
        self.layer_name: str = "Dense"
        self.index: int = 0

    def forward(self, X_input: np.ndarray) -> np.ndarray:
        """Computes the forward pass for the layer.

        Args:
            X_input (np.ndarray): Input data to be transformed by the layer.

        Returns:
            Output of the layer.
        """
        self.X_input = X_input
        output = np.matmul(X_input, self.W) + self.b
        return output

    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        """Computes the backward pass for the layer.

        This function updates the `dW` and `db` attributes of the layer.

        Args:
            grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation.

        Returns:
            Gradient of the loss w.r.t the input to the layer.
        """
        # Gradients w.r.t parameters
        self.dW = np.matmul(self.X_input.T, grad_next)  # shape: (input_dim, output_dim)
        self.db = np.sum(grad_next, axis=0, keepdims=True)  # shape: (1, output_dim)

        # Gradient w.r.t input (to propagate backward)
        d_input = np.matmul(grad_next, self.W.T)  # shape: (batch_size, input_dim)
        return d_input

    @property
    def trainable(self) -> list[tuple[str, np.ndarray, np.ndarray]]:
        """Returns the layer's trainable parameters and their gradients.

        Each element is a tuple (name, param, grad) representing a parameter name,
        value, and corresponding gradient. These are used by the optimizer during training.

        Returns:
            A list of tuples containing `(name, parameter, grad_parameters)` for each trainable parameter.
        """
        return [
            ("W", self.W, self.dW),
            ("b", self.b, self.db),
        ]

trainable property

Returns the layer's trainable parameters and their gradients.

Each element is a tuple (name, param, grad) representing a parameter name, value, and corresponding gradient. These are used by the optimizer during training.

Returns:

    list[tuple[str, np.ndarray, np.ndarray]]: A list of `(name, parameter, gradient)` tuples, one per trainable parameter.
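The tuples returned by `trainable` hold references to the layer's own arrays, so an optimizer can update parameters in place. A minimal SGD sketch (a hypothetical optimizer loop, not part of the documented API), using stand-in arrays shaped like a Dense layer's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-ins for a layer's parameters and cached gradients
W = rng.standard_normal((4, 3))
dW = np.ones((4, 3))
b = np.zeros((1, 3))
db = np.full((1, 3), 0.5)
W0 = W.copy()  # snapshot to show the update below

trainable = [("W", W, dW), ("b", b, db)]  # shape of Dense.trainable's return value

lr = 0.1
for name, param, grad in trainable:
    param -= lr * grad  # in-place: mutates the layer's own arrays

print(b)  # each bias entry moved by -lr * 0.5 = -0.05
```

The in-place `-=` matters: assigning `param = param - lr * grad` would rebind the loop variable to a new array and leave the layer's parameters untouched.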

forward(X_input)

Computes the forward pass for the layer.

Parameters:

    X_input (np.ndarray): Input data to be transformed by the layer. Required.

Returns:

    np.ndarray: Output of the layer.

Source code in src/nnfs/layers.py
def forward(self, X_input: np.ndarray) -> np.ndarray:
    """Computes the forward pass for the layer.

    Args:
        X_input (np.ndarray): Input data to be transformed by the layer.

    Returns:
        Output of the layer.
    """
    self.X_input = X_input
    output = np.matmul(X_input, self.W) + self.b
    return output
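In plain NumPy terms, the forward pass is a matrix product plus a broadcast bias. A small shape walk-through for a batch of 4 inputs with `input_size=5` and `output_size=3`:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((4, 5))   # (batch_size, input_size)
W = rng.uniform(size=(5, 3))      # (input_size, output_size)
b = np.zeros((1, 3))              # (1, output_size), broadcast over the batch

output = np.matmul(X, W) + b      # (batch_size, output_size)
print(output.shape)  # (4, 3)
```

The `(1, output_size)` bias broadcasts across the batch dimension, which is why `b` is stored as a row vector rather than a 1-D array.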

backward(grad_next)

Computes the backward pass for the layer.

This function updates the dW and db attributes of the layer.

Parameters:

    grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation. Required.

Returns:

    np.ndarray: Gradient of the loss w.r.t. the input to the layer.

Source code in src/nnfs/layers.py
def backward(self, grad_next: np.ndarray) -> np.ndarray:
    """Computes the backward pass for the layer.

    This function updates the `dW` and `db` attributes of the layer.

    Args:
        grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation.

    Returns:
        Gradient of the loss w.r.t the input to the layer.
    """
    # Gradients w.r.t parameters
    self.dW = np.matmul(self.X_input.T, grad_next)  # shape: (input_dim, output_dim)
    self.db = np.sum(grad_next, axis=0, keepdims=True)  # shape: (1, output_dim)

    # Gradient w.r.t input (to propagate backward)
    d_input = np.matmul(grad_next, self.W.T)  # shape: (batch_size, input_dim)
    return d_input
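The analytic gradients in `backward` can be sanity-checked against finite differences. A sketch for the simple scalar loss `sum(output)`, where `grad_next` is a matrix of ones; this reproduces the `dW` formula above in plain NumPy and compares one entry to a central-difference estimate:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((4, 5))
W = rng.standard_normal((5, 3))
b = np.zeros((1, 3))

def loss(W_, b_):
    # simple scalar loss: sum of the layer's output
    return np.sum(np.matmul(X, W_) + b_)

# analytic gradients, as computed in backward() with grad_next = ones
grad_next = np.ones((4, 3))  # d(loss)/d(output) for loss = sum(output)
dW = np.matmul(X.T, grad_next)
db = np.sum(grad_next, axis=0, keepdims=True)

# numerical gradient of one weight entry via central differences
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[2, 1] += eps
W_minus[2, 1] -= eps
num_grad = (loss(W_plus, b) - loss(W_minus, b)) / (2 * eps)

print(abs(num_grad - dW[2, 1]) < 1e-4)  # True
```

The same check can be repeated for `db` and for the returned input gradient; agreement to within a few orders of magnitude of `eps` indicates the backward formulas are consistent with the forward pass.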