Show HN: Running Gemma-4 26B at 124 tokens/sec on a CPU, no GPU
(apeg.dev)
10 points | by arun-prasath | 9 hours ago
1 comment
pmb_developer | 5 hours ago
The output head byte budget is surprising. Did you try any tradeoff where the head is compressed more aggressively but experts stay mostly untouched?