# olares-one
All posts tagged "olares-one".
-
Qwen3.6-27B on upstream llama.cpp: +123% free with MTP, zero fork to maintain
MTP finally lands in llama.cpp upstream (PR #22673 by am17an, May 4). Bench on Olares One RTX 5090M sm_120: 78 t/s with an MTP-enabled GGUF, +123% vs baseline. No Lucebox, no Genesis, no permanent custom fork.
Read →
-
Lucebox on Olares One — Episode 9: the PR that promised +57% and delivered +0.2%
Last night Lucebox crossed 88.5 t/s on Olares One and became the new champion. This morning PR #94 reports +57% on RTX 4090. If it scales, we hit 120 t/s. Spoiler: 88.7 t/s. Full DDTree sweep, three hypotheses, the honest lesson on upstream benches that don't reproduce.
Read →
-
Lucebox on Olares One — Episode 8: seven days of waiting, one lib swapped by hand, 88.5 t/s
Seven days after my PR #188 to HAMi-core, still no review. The saga had its cliffhanger — I was waiting on someone else. Then a stupid idea: compile my patched lib and swap it myself. Three new bugs, one night, and at the end Lucebox hits 88.5 t/s. First llama.cpp-based path to pass vLLM Turbo on this hardware.
Read →
-
My personal Olares Market — 28 apps hand-tuned for the Olares One, one click away
A custom Olares Market hand-tuned for the RTX 5090M of the Olares One. 28 ready-to-install apps: llama.cpp, vLLM, DFlash, Voxtral ASR/TTS, vision, music. How to add it to your device in 30 seconds.
Read →
-
DFlash unblocked on 24GB consumer Blackwell — 80 t/s, 4 days after the "impossible" post
Four days ago I wrote that DFlash didn't fit on 24GB consumer Blackwell. On April 28, a dev publishes a quantized drafter. On April 30, I build, I test, I get 0.97 t/s. On May 1, I file an issue and the dev fixes it within 24 hours. Tonight: 80 t/s. The story of a thesis that lasted 72 hours.
Read →
-
Lucebox on Olares One — Episode 1: 134 t/s on RTX 3090, what about my rig?
You're scrolling r/LocalLLaMA, you see a post claiming 134 t/s on Qwen3.6-27B with an RTX 3090 thanks to Lucebox. Of course you want to try it on your Olares One. Spoiler: it'll take 12 hours of compile time and 6 Docker builds. Episode 1.
Read →