260212_Linear_Regression
Summary
- Let’s build Linear Regression from scratch in Rust and briefly explain the math.
- We’ll implement 1D Linear Regression using Batch Gradient Descent.
📌 1️⃣ Mathematical Model
- Hypothesis (model): $\hat{y} = w x + b$
- $w$ → weight (slope)
- $b$ → bias (intercept)
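As a quick check of the notation: plugging in the parameters that generate the training data used below ($w = 2$, $b = 1$) reproduces each sample exactly, e.g. for $x = 3$:

$$\hat{y} = 2 \cdot 3 + 1 = 7 = y$$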
📌 2️⃣ Loss Function (Mean Squared Error)
- $L(w, b) = \dfrac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$
- Goal: minimize $L(w, b)$ (a hand-computed value follows below)
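For intuition, evaluating the loss at the initial parameters $w = 0$, $b = 0$ (so every prediction is 0) on the dataset used below gives

$$L(0, 0) = \frac{3^2 + 5^2 + 7^2 + 9^2 + 11^2}{5} = \frac{285}{5} = 57,$$

which is exactly the loss printed at iteration 0 in the run further down.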
📌 3️⃣ Gradients
- Partial derivatives:

$$\frac{\partial L}{\partial w} = \frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)x_i, \qquad \frac{\partial L}{\partial b} = \frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)$$
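These follow from the chain rule applied to each squared-error term:

$$\frac{\partial}{\partial w}\left(\hat{y}_i - y_i\right)^2 = 2\left(\hat{y}_i - y_i\right)\frac{\partial \hat{y}_i}{\partial w} = 2\left(\hat{y}_i - y_i\right)x_i, \qquad \frac{\partial}{\partial b}\left(\hat{y}_i - y_i\right)^2 = 2\left(\hat{y}_i - y_i\right)$$

Averaging over the $n$ samples gives the expressions above.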
📌 4️⃣ Update Rule (Gradient Descent)
- $w \leftarrow w - \eta\,\dfrac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta\,\dfrac{\partial L}{\partial b}$
- $\eta$ → learning rate (one step is traced by hand below)
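Tracing a single update by hand with $w = b = 0$ and $\eta = 0.01$ (the settings used in the code below), every error is $\hat{y}_i - y_i = -y_i$, so

$$\frac{\partial L}{\partial w} = \frac{2}{5}\left(-3\cdot 1 - 5\cdot 2 - 7\cdot 3 - 9\cdot 4 - 11\cdot 5\right) = -50, \qquad \frac{\partial L}{\partial b} = \frac{2}{5}\left(-3 - 5 - 7 - 9 - 11\right) = -14$$

$$w \leftarrow 0 - 0.01\cdot(-50) = 0.5, \qquad b \leftarrow 0 - 0.01\cdot(-14) = 0.14$$

These are exactly the values printed at iteration 0 in the run below.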
🦀 Rust Implementation (From Scratch)
- No external crates. The parameter state ($w$, $b$, gradients) lives on the stack; the only heap allocations are the two training vectors.
```rust
fn main() {
    // Training data (y = 2x + 1)
    let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
    let y = vec![3.0, 5.0, 7.0, 9.0, 11.0];
    let n = x.len() as f64;

    let mut w: f64 = 0.0; // weight
    let mut b: f64 = 0.0; // bias

    let learning_rate = 0.01;
    let iterations = 1000;

    for iter in 0..iterations {
        let mut dw = 0.0;
        let mut db = 0.0;
        let mut loss = 0.0;

        for i in 0..x.len() {
            let y_pred = w * x[i] + b;
            let error = y_pred - y[i];

            loss += error * error;
            dw += error * x[i];
            db += error;
        }

        // compute gradients
        dw = (2.0 / n) * dw;
        db = (2.0 / n) * db;
        loss /= n;

        // update parameters
        w -= learning_rate * dw;
        b -= learning_rate * db;

        if iter % 100 == 0 {
            println!(
                "iter {:4} | w={:.4} b={:.4} loss={:.6}",
                iter, w, b, loss
            );
        }
    }

    println!("\nFinal Model: y = {:.4}x + {:.4}", w, b);
}
```

- Result:
```text
iter    0 | w=0.5000 b=0.1400 loss=57.000000
iter  100 | w=2.0815 b=0.7058 loss=0.015866
iter  200 | w=2.0581 b=0.7903 loss=0.008060
iter  300 | w=2.0414 b=0.8505 loss=0.004094
iter  400 | w=2.0295 b=0.8935 loss=0.002080
iter  500 | w=2.0210 b=0.9241 loss=0.001056
iter  600 | w=2.0150 b=0.9459 loss=0.000537
iter  700 | w=2.0107 b=0.9614 loss=0.000273
iter  800 | w=2.0076 b=0.9725 loss=0.000138
iter  900 | w=2.0054 b=0.9804 loss=0.000070

Final Model: y = 2.0039x + 0.9860
```

📈 What Happens Internally
- Dataset:

```text
x: 1 2 3 4 5
y: 3 5 7 9 11
```

- True relationship: $y = 2x + 1$
- Gradient descent gradually updates:

```text
w: 0 → 0.8 → 1.5 → 1.9 → 2.0
b: 0 → 0.5 → 0.9 → 1.0
```

- Eventually:

```text
w ≈ 2
b ≈ 1
```
- Loss → almost 0.
🧠 Principle Behind It
Why it works:
- Compute prediction error
- Measure how sensitive loss is to each parameter
- Move parameters in opposite direction of gradient
- Repeat until convergence
Geometrically:
- Loss surface is a bowl (convex)
- Gradient always points uphill
- We step downhill
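The same loop structure works for any differentiable loss. As a minimal sketch (this toy function is illustrative, not part of the post's model), here is gradient descent stepping down the convex bowl $f(w) = (w - 2)^2$:

```rust
fn main() {
    // Convex "bowl": f(w) = (w - 2)^2, minimized at w = 2.
    let f = |w: f64| (w - 2.0).powi(2);
    // Its derivative (the 1-D gradient): f'(w) = 2(w - 2).
    let grad = |w: f64| 2.0 * (w - 2.0);

    let mut w = 0.0_f64;
    let learning_rate = 0.1;
    for iter in 0..50 {
        // The gradient points uphill, so we step in the opposite direction.
        w -= learning_rate * grad(w);
        if iter % 10 == 0 {
            println!("iter {:2} | w={:.4} f(w)={:.6}", iter, w, f(w));
        }
    }
    println!("converged near w = {:.4}", w); // approaches the minimum at 2.0
}
```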
🔬 Computational Complexity
- Each iteration: O(n)
- Total: O(n × iterations)
- No heap allocations inside the loop; the only heap memory is the two input vectors allocated up front.
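A rough way to check the O(n × iterations) behavior is to time the training loop for increasing dataset sizes. This is only a sketch (the sizes, learning rate, and synthetic data are illustrative, not from the post); elapsed time should grow roughly linearly with n:

```rust
use std::time::Instant;

fn main() {
    let iterations = 1000;
    let learning_rate = 1e-4;
    for &n in &[1_000usize, 10_000, 100_000] {
        // Synthetic data following the same y = 2x + 1 relationship.
        let x: Vec<f64> = (0..n).map(|i| i as f64 / n as f64).collect();
        let y: Vec<f64> = x.iter().map(|v| 2.0 * v + 1.0).collect();
        let (mut w, mut b) = (0.0_f64, 0.0_f64);

        let start = Instant::now();
        for _ in 0..iterations {
            let (mut dw, mut db) = (0.0, 0.0);
            for i in 0..n {
                let error = w * x[i] + b - y[i];
                dw += error * x[i];
                db += error;
            }
            w -= learning_rate * (2.0 / n as f64) * dw;
            b -= learning_rate * (2.0 / n as f64) * db;
        }
        // Print w so the optimizer cannot discard the training loop.
        println!("n = {:6} | elapsed = {:?} | w = {:.3}", n, start.elapsed(), w);
    }
}
```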
[260212_Linear_Regression](https://younghakim7.github.io/blog/posts/260212_linear_regression/)