Naveen Mathew Nathan S.
1 min read · Jan 24, 2024

--

Very clean explanation of piecewise-linear approximation using a ReLU network. While there is no difference with respect to the training set, the predictions of a continuous piecewise-linear (CPWL) function and a continuous-curve (CC) function differ whenever the input vector is not in the training set (often the case in the real world). This becomes a problem especially when the input lies outside the range of the training examples (e.g., x1 → infinity while the expected output is known to be bounded). Splines do a decent job of mitigating 'exploding predictions' when the input is outside the training range, while simultaneously fitting a good model inside the 'convex hull' that envelops the training input-output vectors.
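To make the 'exploding predictions' point concrete, here is a quick toy sketch (my own construction, not code from the article): a hand-built CPWL function made of ReLU hinges keeps extrapolating linearly as the input moves far beyond the training range, while an otherwise similar combination of sigmoids saturates to a bounded value — the kind of tame extrapolation one would also hope to get from a well-chosen spline basis. The specific coefficients are arbitrary and only for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-built CPWL function: a linear combination of ReLU hinges.
# Outside the last knot it keeps extrapolating linearly, so it is unbounded.
def cpwl(x):
    return 1.0 + 0.8 * relu(x - 1.0) - 1.5 * relu(-x - 1.0)

# Bounded alternative: the same shape built from sigmoids, which level off
# once the input is far outside the region covered by the training data.
def bounded(x):
    return 1.0 + 0.8 * sigmoid(2.0 * (x - 1.0)) - 1.5 * sigmoid(2.0 * (-x - 1.0))

for x in [0.0, 3.0, 100.0, 1e6]:
    print(f"x={x:>12.1f}  CPWL={cpwl(x):>12.1f}  bounded={bounded(x):.3f}")
```

The CPWL output grows without bound as x increases, while the sigmoid combination stays bounded — which is exactly the failure mode and the fix described above.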

Smooth non-linear activations, in conjunction with linear or ReLU neurons, can unlock continuous-curve approximations. Here's my bit on the topic (though the sigmoid is not the most efficient at polynomial approximation): https://medium.com/@snaveenmathew/manufacturing-polynomials-using-a-sigmoid-neural-network-693f6abc2aee
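One way to 'manufacture' a polynomial term from sigmoid neurons is via a Taylor expansion around a non-zero bias: a symmetric combination of two sigmoid units isolates the quadratic term. The sketch below is my own minimal numerical illustration of that idea; the exact construction in the linked post may differ, and the bias a and scale eps are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Taylor expansion around a bias a != 0:
#   sigmoid(a + eps*x) ≈ sigmoid(a) + sigmoid'(a)*eps*x + 0.5*sigmoid''(a)*(eps*x)**2
# so sigmoid(a + eps*x) + sigmoid(a - eps*x) - 2*sigmoid(a) ≈ sigmoid''(a)*(eps*x)**2.
a, eps = 1.0, 1e-2
s = sigmoid(a)
second_deriv = s * (1.0 - s) * (1.0 - 2.0 * s)   # sigmoid''(a)

def approx_square(x):
    # Two sigmoid neurons plus a bias term, combined to approximate x**2.
    return (sigmoid(a + eps * x) + sigmoid(a - eps * x) - 2.0 * s) / (second_deriv * eps**2)

x = np.linspace(-5.0, 5.0, 5)
print(np.round(approx_square(x), 3))  # close to x**2 on a modest range
print(x**2)
```

The approximation is only good while eps*x stays small, which is one reason the sigmoid is not the most efficient route to polynomial behaviour.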

When we combine arbitrary nonlinear 'basis functions' (nonlinear activations) with piecewise-linear transformations (ReLU), we can fit arbitrary CC functions using hand-designed neural networks.
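As a concrete example of the hand-designed ReLU half of this recipe, the sketch below sets the weights of a one-hidden-layer ReLU network analytically (no training) so that it reproduces the piecewise-linear interpolant of a smooth target at a few knots; swapping some hidden units for smooth activations is how one would then move from a CPWL fit to a genuinely curved one. The target function and knot placement are my own choices for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hand-designed one-hidden-layer ReLU network that reproduces the piecewise-linear
# interpolant of a smooth target at chosen knots.
target = np.sin
knots = np.linspace(0.0, np.pi, 6)
values = target(knots)

slopes = np.diff(values) / np.diff(knots)                  # slope on each segment
weights = np.concatenate([[slopes[0]], np.diff(slopes)])   # slope changes at interior knots
biases = -knots[:-1]                                       # each ReLU unit "turns on" at a knot

def hand_designed_net(x):
    # output = f(knot_0) + sum_i weights[i] * relu(x + biases[i])
    x = np.asarray(x, dtype=float)
    hidden = relu(x[..., None] + biases)                   # one ReLU unit per segment
    return values[0] + hidden @ weights

x = np.linspace(0.0, np.pi, 9)
print(np.round(hand_designed_net(x), 3))  # CPWL approximation of the target
print(np.round(target(x), 3))             # the smooth target itself
```

The network matches the target exactly at the knots and linearly in between; adding smooth basis functions on top of the same hand-designed skeleton is what closes the remaining gap to the curved function.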
