Regularization - Cost Function

Intuition:

Penalize the parameters by making their contribution to the cost function large. Minimizing the cost then drives those \(\theta\) toward small values, which shrinks the corresponding features' contribution/magnitude/value.
Result: a simpler hypothesis that is less prone to overfitting.
Since we don't know in advance which features to shrink, we penalize all the parameters by adding a regularization term to the end of the cost function. \(\lambda\), the regularization parameter, controls the trade-off between fitting the training data well and keeping the parameters small to reduce overfitting. By convention, the bias term \(\theta_0\) is not regularized, so the sum starts at \(j = 1\):
\(\min_\theta\ \dfrac{1}{2m}\ \left[ \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})^2 + \lambda\ \sum_{j=1}^n \theta_j^2 \right]\)
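
As a concrete illustration, here is a minimal sketch in Python/NumPy that computes this regularized cost for linear regression. The function name `regularized_cost` and the array shapes are assumptions for this sketch, not something defined in the course notes:

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X    : (m, n+1) design matrix with a leading column of ones
    y    : (m,) target values
    theta: (n+1,) parameter vector
    lam  : the regularization parameter lambda
    """
    m = len(y)
    errors = X @ theta - y                  # h_theta(x^(i)) - y^(i)
    penalty = lam * np.sum(theta[1:] ** 2)  # theta_0 is not regularized
    return (errors @ errors + penalty) / (2 * m)
```

With `lam = 0` this reduces to the ordinary unregularized cost; a very large `lam` makes the penalty dominate the fit term, driving all \(\theta_j\) toward zero and risking underfitting.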

Resources:
