Latent Mixture-of-Experts (Latent MoE), Clearly Explained
A lesson on NVIDIA's Latent Mixture-of-Experts (MoE) architecture that powers the Nemotron-3 Super and Ultra models.
Jun 12
•
Dr. Ashish Bamania
2
2
I'm not writing another cringe "Fable 5" post
Instead, I'm teaching other "uncool" but useful concepts that actually matter.
Jun 11
•
Dr. Ashish Bamania
6
How LLMs are Actually Trained
In the last lesson, we learned how the Transformer architecture powers an LLM.
Published on AlgoMaster Newsletter
•
Jun 11
This Week In AI Research (1-6 June 26) 🗓️
The top 10 AI research papers that you must know about this week.
Jun 8
•
Dr. Ashish Bamania
5
1
Distributed Training of Llama, Explained Simply
A short and simple lesson on the techniques used to train LLMs like Meta Llama on massive GPU clusters.
Jun 5
•
Dr. Ashish Bamania
13
2
This Week In AI Research (24-31 May 26) 🗓️
The top 10 AI research papers that you must know about this week.
Jun 1
•
Dr. Ashish Bamania
6
2
May 2026
'Tensorwise' goes live!
Ready to give it a try? You'll love it!
May 31
•
Dr. Ashish Bamania
7
1
10 Confusing LLM Concepts, Explained Simply
The role of CPU/GPU/TPU in LLM workflows, pruning, quantization, and more
May 30
•
Dr. Ashish Bamania
9
2
This Week In AI Research (17-23 May 26) 🗓️
The top 10 AI research papers that you must know about this week.
May 26
•
Dr. Ashish Bamania
6
Tensorwise: Early release for paid subscribers 🥳
You're getting to try this first!
May 21
•
Dr. Ashish Bamania
6
Cross-Entropy Loss in LLMs, Explained Visually
A visual guide to understanding how LLMs are trained using the cross-entropy loss, step by step.
May 20
•
Dr. Ashish Bamania
7
This Week In AI Research (10-16 May 26) 🗓️
The top 10 AI research papers that you must know about this week.
May 18
•
Dr. Ashish Bamania
7