# gemma-4
All posts tagged "gemma-4".
- A week of benches on the Olares One: Gemma 4 MTP, Lucebox regression, vLLM no-Genesis hitting the workspace lock
From May 5 to May 8, 2026, I benched everything that fit on a 24GB RTX 5090M. Three findings: Gemma 4 MTP via vLLM lands at 178 t/s 24 hours after the merge; Lucebox v1.9.0 mysteriously regresses from 88 to 69 t/s; and vLLM no-Genesis validates PR #39931 but stalls on P65/P22/P38. Plus housekeeping: 8 Qwen3.6 27B apps cut down to 2.
Read →
- Gemma 4 E4B MTP on the RTX 5090M: 178 t/s, 24h after the vLLM upstream merge
On May 6 at 14:39 UTC, lucianommartins merges PR #41745 into vLLM main: native support for Gemma 4 Multi-Token Prediction drafters. On May 7 at 06:13 UTC, the nightly Docker image drops. At 06:35 UTC, my Olares One hits 178.6 t/s with 77.3% acceptance, the first public Gemma 4 MTP bench on consumer mobile Blackwell.
Read →