INPUT → [WEIGHTS] → HIDDEN → [WEIGHTS] → OUTPUT
 (4)      (W1)       (10)      (W2)       (3)
Input Layer: Receives raw data (size = # features)
Hidden Layer: Extracts patterns (size = flexible)
Output Layer: Makes predictions (size = # classes)
z = (x₁ × w₁) + (x₂ × w₂) + ... + (xₙ × wₙ) + bias
z = X · W + b
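The vectorized form gives the same number as the expanded sum; a quick check with made-up values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # one input sample (4 features)
w = np.array([0.1, 0.2, 0.3, 0.4])   # weights into a single neuron
b = 0.5

z_expanded = sum(xi * wi for xi, wi in zip(x, w)) + b
z_vector = np.dot(x, w) + b          # same result, vectorized
print(round(float(z_vector), 6))     # 3.5
```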
ReLU (Hidden Layers):
f(x) = max(0, x)
Sigmoid:
f(x) = 1 / (1 + e^(-x))
Softmax (Output Layer):
f(xᵢ) = e^(xᵢ) / Σ(e^(xⱼ))
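A minimal NumPy sketch of these three activations (function names are illustrative; the max subtraction in softmax is a standard stability trick, not part of the formula above):

```python
import numpy as np

def relu(x):
    # Pass positives through, zero out negatives
    return np.maximum(0, x)

def sigmoid(x):
    # Squash any real number into (0, 1)
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Subtract the row-wise max before exponentiating for numerical stability
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)
```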
Loss = -Σ(y_true × log(y_pred))
weight_new = weight_old - learning_rate × gradient
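One numeric step of this update rule (all values made up for illustration):

```python
learning_rate = 0.1
weight_old = 0.50
gradient = 0.20          # slope of the loss w.r.t. this weight

# Step against the gradient to reduce the loss
weight_new = weight_old - learning_rate * gradient
print(round(weight_new, 2))   # 0.48
```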
from sklearn.datasets import load_iris
iris = load_iris()
X, y = iris.data, iris.target
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
from sklearn.preprocessing import OneHotEncoder
encoder = OneHotEncoder(sparse_output=False)
y_train_enc = encoder.fit_transform(y_train.reshape(-1, 1))
import numpy as np
W1 = np.random.randn(input_size, hidden_size) * 0.01  # small random values break symmetry
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size) * 0.01
b2 = np.zeros((1, output_size))
z1 = np.dot(X, W1) + b1
a1 = relu(z1)
z2 = np.dot(a1, W2) + b2
a2 = softmax(z2)
loss = -np.sum(y_true * np.log(a2 + 1e-8)) / m
W1 -= learning_rate * dW1
b1 -= learning_rate * db1
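The gradients dW1, db1 (and dW2, db2) in the updates above come from the backward pass. A self-contained sketch with toy data (sizes and names chosen to match the card); the softmax + cross-entropy combination makes the output-layer gradient simplify to (a2 - y_true):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_in, n_hid, n_out = 8, 4, 10, 3                 # toy sizes

X = rng.standard_normal((m, n_in))
y_true = np.eye(n_out)[rng.integers(0, n_out, m)]   # one-hot labels

W1 = rng.standard_normal((n_in, n_hid)) * 0.01
b1 = np.zeros((1, n_hid))
W2 = rng.standard_normal((n_hid, n_out)) * 0.01
b2 = np.zeros((1, n_out))

# Forward pass (same steps as above)
z1 = X @ W1 + b1
a1 = np.maximum(0, z1)                              # ReLU
z2 = a1 @ W2 + b2
e = np.exp(z2 - z2.max(axis=1, keepdims=True))
a2 = e / e.sum(axis=1, keepdims=True)               # softmax

# Backward pass
dz2 = (a2 - y_true) / m                             # softmax + cross-entropy gradient
dW2 = a1.T @ dz2
db2 = dz2.sum(axis=0, keepdims=True)
dz1 = (dz2 @ W2.T) * (z1 > 0)                       # ReLU derivative: 1 where z1 > 0
dW1 = X.T @ dz1
db1 = dz1.sum(axis=0, keepdims=True)
```

Each gradient has the same shape as the parameter it updates, which is what makes the `W -= learning_rate * dW` step line up.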
# During training, monitor:
print(f"Epoch {epoch}: Loss={loss:.4f}, Acc={acc:.4f}")
# Plot after training:
import matplotlib.pyplot as plt
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()
| Parameter | Typical Range | Good Starting Point |
|---|---|---|
| Learning Rate | 0.001 - 0.1 | 0.01 or 0.1 |
| Hidden Neurons | 10 - 100 | 10-20 for Iris |
| Epochs | 100 - 2000 | 500-1000 |
| Batch Size | 16 - 128 | 32 |
Learning Rate:
Hidden Neurons:
Epochs:
# Check dimensions
print(X.shape, W.shape)
# Ensure: (samples, features) × (features, neurons)
# Clip values to prevent overflow
x = np.clip(x, -500, 500)  # clip returns a new array; reassign it
# Or reduce learning rate
# Check weights initialized (not zeros)
# Verify learning rate not too small
# Ensure sufficient training epochs
# Try different learning rate
# Add more hidden neurons
# Train for more epochs
# Check data is normalized
from sklearn.metrics import accuracy_score
acc = accuracy_score(y_true, y_pred)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true, y_pred)
# Plot predictions vs actual
plt.scatter(range(len(y_test)), y_test, label='Actual')
plt.scatter(range(len(y_pred)), y_pred, label='Predicted')
plt.legend()
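Both metrics need hard class labels, not softmax probabilities; a small sketch (values made up), where the mean comparison gives the same number as accuracy_score:

```python
import numpy as np

# Toy softmax outputs for 4 samples over 3 classes (made-up values)
a2 = np.array([[0.8, 0.1, 0.1],
               [0.2, 0.7, 0.1],
               [0.1, 0.2, 0.7],
               [0.3, 0.4, 0.3]])
y_true = np.array([0, 1, 2, 2])

# Hard prediction = index of the highest probability per row
y_pred = np.argmax(a2, axis=1)    # [0, 1, 2, 1]

acc = (y_pred == y_true).mean()
print(acc)                        # 0.75
```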
Before asking for help, check:
✓ DO:
✗ DON'T:
Input(4) → Hidden(10) → Output(3)
         ReLU         Softmax
Input(64) → Hidden(50) → Output(10)
          ReLU         Softmax
Input(n) → Hidden(20) → Output(1)
         ReLU         Sigmoid
Input → Hidden1 → Hidden2 → Hidden3 → Output
      ReLU     ReLU      ReLU      Softmax
Epoch: One complete pass through training data
Batch: Subset of data processed at once
Learning Rate (η): Step size for weight updates
Loss: Measure of prediction error
Gradient: Direction to adjust weights
Activation: Non-linear transformation function
Overfitting: Memorizing training data
Underfitting: Failing to learn patterns
Backpropagation: Algorithm to compute gradients
Forward Pass: Computing predictions
Backward Pass: Computing gradients
After mastering basics:
During Class:
Outside Class:
Visualizations:
Tutorials:
Documentation:
Print this card and keep it handy during labs!
Evolve AI Institute
Free AI Education for All
info@evolveaiinstitute.com | www.evolveaiinstitute.com