Hypothesis
$$ \begin{align*} h_{\theta}(x) &= g(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}} \\ &= P(y=1|x;\theta) \end{align*} $$ where
\(g\) is the sigmoid (logistic) function.
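A minimal NumPy sketch of the hypothesis, assuming \(X\) is an \(m \times n\) design matrix whose first column is all ones (the names `sigmoid` and `hypothesis` are illustrative, not from the source):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x), computed row-wise for the design matrix X."""
    return sigmoid(X @ theta)
```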
Cost Function
$$ J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_{\theta}(x^{(i)}))+(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2} $$
where \(\lambda\) is the regularization parameter. Note that the penalty sum starts at \(j = 1\), so the intercept \(\theta_0\) is not regularized.
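A sketch of this cost in NumPy, under the same assumptions as above (function and argument names are illustrative; `lam` stands in for \(\lambda\)):

```python
import numpy as np

def cost(theta, X, y, lam):
    """Regularized logistic-regression cost J(theta).

    X: m x n design matrix (first column all ones); y: 0/1 labels.
    theta[0] (the intercept) is excluded from the penalty, matching
    the sum over j = 1..n in the formula above.
    """
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    unreg = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return unreg + penalty
```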
Algorithms
  1. Gradient Descent $$ \begin{align*} \theta_0 &:= \theta_0 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m}(h_\theta(x^{(i)}) - y^{(i)}) x_0^{(i)} \\ \theta_j &:= \theta_j(1-\alpha\frac{\lambda}{m}) - \alpha \frac{1}{m} \sum\limits_{i=1}^{m}(h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} \\ &(j > 0) \end{align*} $$
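A sketch of one update step in NumPy, assuming the same design-matrix conventions as above (names are illustrative):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update.

    theta_0 is updated without the shrinkage factor; for j > 0 the
    regularization term folds into the factor (1 - alpha * lam / m).
    """
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    grad = X.T @ (h - y) / m                  # unregularized gradient
    new_theta = theta * (1 - alpha * lam / m) - alpha * grad
    new_theta[0] = theta[0] - alpha * grad[0]  # no regularization on theta_0
    return new_theta
```

The factor \(1-\alpha\frac{\lambda}{m}\) is slightly less than 1, so each step first shrinks \(\theta_j\) toward zero (weight decay) and then applies the usual unregularized update.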