Notations:
m: number of training examples
n = \(n_x\): dimension of the input feature \(x\), so \(x \in R^{n_x}\)
y \(\in\) {0, 1} for binary classification
Training set: {\((x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), ..., (x^{(m)}, y^{(m)})\)} where \((x^{(i)}, y^{(i)})\) is the \(i^{th}\) training example
Stacking data by columns (see the sketch after this list):
- X = [\(x^{(1)}, x^{(2)}, ..., x^{(m)}\)] \(\in R^{n_x \times m}\) is a matrix with each feature vector as a column
- Y = [\(y^{(1)}, y^{(2)}, ..., y^{(m)}\)] \(\in R^{1 \times m}\) is a row matrix with each label as a column
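As a quick shape check, here is a minimal NumPy sketch of this column-stacking convention; the sizes and label values are made up for illustration:

```python
import numpy as np

# Made-up sizes: n_x = 3 features, m = 4 training examples.
n_x, m = 3, 4
rng = np.random.default_rng(0)

# Each x^(i) is a column vector of shape (n_x, 1); stack them as columns of X.
examples = [rng.standard_normal((n_x, 1)) for _ in range(m)]
X = np.hstack(examples)        # shape (n_x, m)
Y = np.array([[0, 1, 1, 0]])   # shape (1, m), one label y^(i) per column

assert X.shape == (n_x, m) and Y.shape == (1, m)
```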
Sigmoid function: \(\sigma(z) = \frac{1}{1 + e^{-z}}\) where \(z = w^T x + b\) with \(w \in R^{n_x}\), \(b \in R\)
w: weight vector; since \(w \in R^{n_x}\) and \(z = w^T x + b\), it is a column vector of shape \(n_x \times 1\)
b: bias term, \(b \in R\)
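To make this concrete, a minimal sketch (with made-up values) that computes \(\sigma(w^T x + b)\) for all m examples at once, using the column-stacked matrix X from above:

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

n_x, m = 3, 4                       # made-up sizes
rng = np.random.default_rng(1)
X = rng.standard_normal((n_x, m))   # m feature vectors stacked as columns
w = rng.standard_normal((n_x, 1))   # w in R^{n_x}, kept as a column vector
b = 0.5                             # scalar bias

Z = w.T @ X + b                     # shape (1, m): z^(i) = w^T x^(i) + b
A = sigmoid(Z)                      # predicted probabilities, each in (0, 1)
```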
\(n^{[i]}\): number of units in the \(i^{th}\) layer
\(W^{[i]}, b^{[i]}\): the weights and bias term of the \(i^{th}\) layer of the neural network
\(a^{[i]}\): activations, the values that layer \(i\) passes on to the subsequent layer
\(z_i^{[j]}\): value of the \(i^{th}\) node in the \(j^{th}\) layer (the input layer is layer 0)
\(a^{[j](i)}\): activation of layer \(j\) for the \(i^{th}\) training example
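Tying the layer notation to array shapes, here is a minimal forward-pass sketch with made-up layer sizes (\(n^{[0]} = 3\) input units, \(n^{[1]} = 4\), \(n^{[2]} = 1\)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up sizes: n^[0] = 3 (input), n^[1] = 4 (hidden), n^[2] = 1 (output).
n = [3, 4, 1]
m = 5                                   # number of training examples
rng = np.random.default_rng(2)

A = rng.standard_normal((n[0], m))      # a^[0] = X, inputs stacked as columns
for layer in (1, 2):
    W = rng.standard_normal((n[layer], n[layer - 1]))  # W^[layer]: shape (n^[layer], n^[layer-1])
    b = np.zeros((n[layer], 1))                        # b^[layer]: shape (n^[layer], 1), broadcast over m
    Z = W @ A + b                                      # z^[layer], shape (n^[layer], m)
    A = sigmoid(Z)                                     # a^[layer], passed to the next layer

# Column i-1 of the final A is a^[2](i): the activation for the i-th example.
```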