Algorithm
Linear Regression
Description
A fundamental statistical method that models the relationship between a scalar response and one or more explanatory variables using a linear predictor function. This implementation solves the Ordinary Least Squares (OLS) problem using the Normal Equation approach. It is heavily optimized with BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) for native hardware acceleration, making it suitable for high-performance computing tasks.
$$ \hat{\beta} = (X^T X)^{-1} X^T y $$
Algorithm Workflow
INPUT
Matrix $X$ (Features) and Vector $y$ (Targets).
VALIDATE
Check for row/column consistency and dimensions.
PARALLEL MEAN
Compute $\bar{x}$ for each column via OpenMP SIMD reduction.
GRAM MATRIX
Calculate $X^TX$ using vectorized BLAS `dgemm` (Matrix-Matrix
Multiplication).
TARGET PROJECTION
Calculate $X^Ty$ using BLAS `dgemv` (Matrix-Vector
Multiplication).
CENTERING
Adjust $X^TX$ and $X^Ty$ using $\bar{x}$ to effectively center the
data.
STABILIZE
Add a small $10^{-8}$ Ridge penalty to the diagonal for numerical
stability.
SOLVE
Apply LAPACK `dposv` (Cholesky Decomposition) to solve $A\beta =
b$.
RECOVERY
If Cholesky fails (singular matrix), trigger Custom Fallback Solver
(LU/QR).
INTERCEPT
Compute $\beta_0 = \bar{y} - \sum (\beta_j \cdot \bar{x}_j)$ derived
from means.
OUTPUT
Store coefficients $\beta$ and intercept $\beta_0$.
PREDICT
Compute standard dot product $X_{new} \cdot \beta + \beta_0$.
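The CENTERING step rests on rank-one update identities. Writing $X_c = X - \mathbf{1}\bar{x}^T$ and $y_c = y - \bar{y}\mathbf{1}$:

$$ X_c^T X_c = X^T X - n\,\bar{x}\,\bar{x}^T, \qquad X_c^T y_c = X^T y - n\,\bar{y}\,\bar{x} $$

so the already-computed $X^TX$ and $X^Ty$ can be adjusted in place using the column means, with no second pass over the data.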
Implementation Details
Implemented in `LinearRegression.cpp` using BLAS/LAPACK.
// Compute XtX = X^T X and Xty = X^T y using BLAS dgemm/dgemv
// (x is n x p, column-major; one = 1.0, zero = 0.0, inc = 1 assumed declared)
F77_CALL(dgemm)("T", "N", &p, &p, &n, &one, x, &n, x, &n, &zero, XtX, &p);
F77_CALL(dgemv)("T", &n, &p, &one, x, &n, y, &inc, &zero, Xty, &inc);
// Solve XtX * coef = Xty using Cholesky (dposv); the system is p x p,
// so the order and leading dimensions are p, not n
F77_CALL(dposv)("L", &p, &nrhs, XtX, &p, Xty, &p, &info);
Complexity & Optimization
Time Complexity
$O(N \cdot P^2)$ to form the Gram matrix $X^TX$, plus $O(P^3)$ for the Cholesky solve; the Gram term dominates when $N \gg P$.
Space Complexity
$O(P^2)$ auxiliary space for the Gram matrix $X^TX$ and its Cholesky factor.
Optimizations
BLAS Level 3 `dgemm` for the Gram matrix, BLAS Level 2 `dgemv` for the target projection, and OpenMP SIMD reductions for the column means.
Limitations
Assumes a linear relationship between features and target. The normal-equation approach also squares the condition number of $X$, so strongly collinear features degrade accuracy; the $10^{-8}$ ridge penalty mitigates but does not eliminate this.
Use Cases
A fast, interpretable baseline for regression tasks and a reference point for more complex models.