Layers

BaseLayer

Bases: ABC

Template for neural network layers.

Layers must define a trainable property that returns a list of (name_param, param, grad) tuples.

Layers must also define a name attribute that specifies the base name of the layer. The name can be arbitrary, but it must be unique across layer types.

Source code in src/nnfs/layers.py
class BaseLayer(ABC):
    """Template for neural network layers.

    Layers have to define a `trainable` property that returns a
    list of (name_param, param, grad) tuples.

    Layers also have to define a `name` attribute which specifies
    the base name of the layer. The name can be arbitrary but it has
    to be unique for each of the layer types.
    """

    @abstractmethod
    def __init__(self):
        self.layer_name: str = "UNINITIALIZED_NAME"
        self.index: int = -1

    @abstractmethod
    def forward(self, X_input: np.ndarray) -> np.ndarray:
        pass

    @abstractmethod
    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        pass

    @property
    @abstractmethod
    def trainable(self):
        return None

    @property
    def name(self) -> str:
        """Returns the layer's name.

        It is used by the model to summarize layer architecture, as well as to cache
        layer-specific gradients (for example, to implement momentum).

        Returns:
            A layer identifier of the form `{layer_name}_{index}`, e.g. `Dense_0`.
        """
        return f"{self.layer_name}_{self.index}"

name property

Returns the layer's name.

It is used by the model to summarize layer architecture, as well as to cache layer-specific gradients (for example, to implement momentum).

Returns:

    str: A layer identifier of the form `{layer_name}_{index}`, e.g. `Dense_0`.
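As a sketch of the contract above, a minimal concrete subclass (a hypothetical identity layer, not part of nnfs) might look like the following; the `BaseLayer` body is condensed here only so the example is self-contained:

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseLayer(ABC):
    # condensed copy of the template above, for a self-contained example
    @abstractmethod
    def __init__(self):
        self.layer_name: str = "UNINITIALIZED_NAME"
        self.index: int = -1

    @abstractmethod
    def forward(self, X_input: np.ndarray) -> np.ndarray: ...

    @abstractmethod
    def backward(self, grad_next: np.ndarray) -> np.ndarray: ...

    @property
    @abstractmethod
    def trainable(self): ...

    @property
    def name(self) -> str:
        return f"{self.layer_name}_{self.index}"


class Identity(BaseLayer):
    """Passes its input through unchanged; has no trainable parameters."""

    def __init__(self):
        self.layer_name = "Identity"
        self.index = 0

    def forward(self, X_input: np.ndarray) -> np.ndarray:
        return X_input

    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        return grad_next

    @property
    def trainable(self):
        return []  # nothing for the optimizer to update


layer = Identity()
print(layer.name)  # Identity_0
```

Because `name` is inherited from `BaseLayer`, every subclass only needs to set `layer_name` and `index` in its `__init__`.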

Dense

Bases: BaseLayer

A fully connected neural network layer.

Parameters:

    input_size (int): Dimensionality of the input to the layer. Required.
    output_size (int): Dimensionality of the output of the layer. Required.

Attributes:

    W (np.ndarray): Weight parameters.
    b (np.ndarray): Bias parameters.
    dW (np.ndarray): Gradient of the loss w.r.t. the weight parameters.
    db (np.ndarray): Gradient of the loss w.r.t. the bias parameters.
    X_input (np.ndarray): Cached input to the layer.
    layer_name (str): Short name for the layer type.
    index (int): Position of the layer within the full model (initialized to 0).

Source code in src/nnfs/layers.py
class Dense(BaseLayer):
    """
    A fully connected neural network layer.

    Args:
        input_size (int): Dimensionality of the input to the layer.
        output_size (int): Dimensionality of the output of the layer.

    Attributes:
        W (np.ndarray): weight parameters.
        b (np.ndarray): bias parameters.
        dW (np.ndarray): gradient of loss w.r.t weight parameters.
        db (np.ndarray): gradient of loss w.r.t bias parameters.
        X_input (np.ndarray): cached input to the layer.
        layer_name (str): Short name for the layer type.
        index (int): Position of the layer within the full model (initialized to 0).
    """

    def __init__(self, input_size: int, output_size: int):
        # layer parameters
        self.W = np.random.uniform(size=(input_size, output_size))
        self.b = np.zeros(shape=(1, output_size))

        # cache for gradients and inputs
        self.dW = np.zeros(shape=self.W.shape)
        self.db = np.zeros(shape=self.b.shape)
        self.X_input = np.zeros(1)

        # attributes for layer navigation
        self.layer_name: str = "Dense"
        self.index: int = 0

    def forward(self, X_input: np.ndarray) -> np.ndarray:
        """Computes the forward pass for the layer.

        Args:
            X_input (np.ndarray): Input data to be transformed by the layer.

        Returns:
            Output of the layer.
        """
        self.X_input = X_input
        output = np.matmul(X_input, self.W) + self.b
        return output

    def backward(self, grad_next: np.ndarray) -> np.ndarray:
        """Computes the backward pass for the layer.

        This function updates the `dW` and `db` attributes of the layer.

        Args:
            grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation.

        Returns:
            Gradient of the loss w.r.t the input to the layer.
        """
        # Gradients w.r.t parameters
        self.dW = np.matmul(self.X_input.T, grad_next)  # shape: (input_dim, output_dim)
        self.db = np.sum(grad_next, axis=0, keepdims=True)  # shape: (1, output_dim)

        # Gradient w.r.t input (to propagate backward)
        d_input = np.matmul(grad_next, self.W.T)  # shape: (batch_size, input_dim)
        return d_input

    @property
    def trainable(self) -> list[tuple[str, np.ndarray, np.ndarray]]:
        """Returns the layer's trainable parameters and their gradients.

        Each element is a tuple (name, param, grad) representing a parameter name,
        value, and corresponding gradient. These are used by the optimizer during training.

        Returns:
            A list of tuples containing `(name, parameter, grad_parameters)` for each trainable parameter.
        """
        return [
            ("W", self.W, self.dW),
            ("b", self.b, self.db),
        ]

trainable property

Returns the layer's trainable parameters and their gradients.

Each element is a tuple (name, param, grad) representing a parameter name, value, and corresponding gradient. These are used by the optimizer during training.

Returns:

    list[tuple[str, np.ndarray, np.ndarray]]: A list of `(name, parameter, gradient)` tuples, one per trainable parameter.
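The tuples returned by `trainable` hold references to the layer's own arrays, so an optimizer can update parameters in place. A minimal SGD sketch (a hypothetical optimizer loop, not part of the documented API), using stand-in arrays shaped like a Dense layer's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-ins for a layer's parameters and cached gradients
W = rng.standard_normal((4, 3))
dW = np.ones((4, 3))
b = np.zeros((1, 3))
db = np.full((1, 3), 0.5)
W0 = W.copy()  # snapshot to show the update below

trainable = [("W", W, dW), ("b", b, db)]  # shape of Dense.trainable's return value

lr = 0.1
for name, param, grad in trainable:
    param -= lr * grad  # in-place: mutates the layer's own arrays

print(b)  # each bias entry moved by -lr * 0.5 = -0.05
```

The in-place `-=` matters: assigning `param = param - lr * grad` would rebind the loop variable to a new array and leave the layer's parameters untouched.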

forward(X_input)

Computes the forward pass for the layer.

Parameters:

    X_input (np.ndarray): Input data to be transformed by the layer. Required.

Returns:

    np.ndarray: Output of the layer.

Source code in src/nnfs/layers.py
def forward(self, X_input: np.ndarray) -> np.ndarray:
    """Computes the forward pass for the layer.

    Args:
        X_input (np.ndarray): Input data to be transformed by the layer.

    Returns:
        Output of the layer.
    """
    self.X_input = X_input
    output = np.matmul(X_input, self.W) + self.b
    return output
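In plain NumPy terms, the forward pass is a matrix product plus a broadcast bias. A small shape walk-through for a batch of 4 inputs with `input_size=5` and `output_size=3`:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((4, 5))   # (batch_size, input_size)
W = rng.uniform(size=(5, 3))      # (input_size, output_size)
b = np.zeros((1, 3))              # (1, output_size), broadcast over the batch

output = np.matmul(X, W) + b      # (batch_size, output_size)
print(output.shape)  # (4, 3)
```

The `(1, output_size)` bias broadcasts across the batch dimension, which is why `b` is stored as a row vector rather than a 1-D array.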

backward(grad_next)

Computes the backward pass for the layer.

This function updates the dW and db attributes of the layer.

Parameters:

    grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation. Required.

Returns:

    np.ndarray: Gradient of the loss w.r.t. the input to the layer.

Source code in src/nnfs/layers.py
def backward(self, grad_next: np.ndarray) -> np.ndarray:
    """Computes the backward pass for the layer.

    This function updates the `dW` and `db` attributes of the layer.

    Args:
        grad_next (np.ndarray): Gradients fed back from the next layer during backpropagation.

    Returns:
        Gradient of the loss w.r.t the input to the layer.
    """
    # Gradients w.r.t parameters
    self.dW = np.matmul(self.X_input.T, grad_next)  # shape: (input_dim, output_dim)
    self.db = np.sum(grad_next, axis=0, keepdims=True)  # shape: (1, output_dim)

    # Gradient w.r.t input (to propagate backward)
    d_input = np.matmul(grad_next, self.W.T)  # shape: (batch_size, input_dim)
    return d_input
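The analytic gradients in `backward` can be sanity-checked against finite differences. A sketch for the simple scalar loss `sum(output)`, where `grad_next` is a matrix of ones; this reproduces the `dW` formula above in plain NumPy and compares one entry to a central-difference estimate:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((4, 5))
W = rng.standard_normal((5, 3))
b = np.zeros((1, 3))

def loss(W_, b_):
    # simple scalar loss: sum of the layer's output
    return np.sum(np.matmul(X, W_) + b_)

# analytic gradients, as computed in backward() with grad_next = ones
grad_next = np.ones((4, 3))  # d(loss)/d(output) for loss = sum(output)
dW = np.matmul(X.T, grad_next)
db = np.sum(grad_next, axis=0, keepdims=True)

# numerical gradient of one weight entry via central differences
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[2, 1] += eps
W_minus[2, 1] -= eps
num_grad = (loss(W_plus, b) - loss(W_minus, b)) / (2 * eps)

print(abs(num_grad - dW[2, 1]) < 1e-4)  # True
```

The same check can be repeated for `db` and for the returned input gradient; agreement to within a few orders of magnitude of `eps` indicates the backward formulas are consistent with the forward pass.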