Algorithm

Softmax Regression

Description

An extension of Logistic Regression to multi-class problems. It models the probability that a sample belongs to class $k$ using the Softmax function. The implementation relies on the 'Log-Sum-Exp' trick (subtracting the maximum logit before exponentiating) to prevent numerical instability (overflow/underflow) during the exponentiation step. This is critical for stable training on datasets with unscaled features.

$$ P(y=j|\mathbf{x}) = \frac{e^{\mathbf{w}_j^T \mathbf{x}}}{\sum_{k=1}^K e^{\mathbf{w}_k^T \mathbf{x}}} $$

Algorithm Workflow

START Identify $K$ unique classes and initialize $W$ matrix ($P \times K$).
EPOCH LOOP Iterate through optimization steps.
LOGITS Compute raw scores $Z = XW$.
MAX TRICK Compute the per-sample (row-wise) maximum $M = \max_k(Z)$ for numerical stability.
PROBABILITIES Compute Softmax $P(k|x) = \exp(Z_k - M) / \sum \exp(Z_j - M)$.
LOSS GRAD Compute gradients $\nabla W$ based on Cross-Entropy Loss.
UPDATE Adjust weights $W \leftarrow W - \eta \nabla W$.
END Return trained weights for $K$ classes.

Implementation Details

Implemented in `SoftmaxRegression.cpp`.

// Softmax with Log-Sum-Exp (max-subtraction) trick
double max_z = *std::max_element(logits.begin(), logits.end());
double sum_exp = 0.0;
for (double z : logits) sum_exp += std::exp(z - max_z);
for (size_t k = 0; k < logits.size(); ++k)
    probs[k] = std::exp(logits[k] - max_z) / sum_exp;

Complexity & Optimization

Time Complexity

O(Epochs * N * P * K), where N is the number of samples, P the number of features, and K the number of classes.

Space Complexity

O(P * K) for the weight matrix $W$.

Optimizations

Log-Sum-Exp (max-subtraction) trick: keeps every exponent at or below zero, preventing overflow on large logits and loss of precision on very negative ones.

Limitations

Decision boundaries between classes are linear in the feature space; non-linearly separable data requires feature engineering or a different model.

Use Cases

Multi-class classification with mutually exclusive classes (e.g., digit recognition, document topic labeling).