# saga
All posts tagged "saga".
-
Lucebox on Olares One — Episode 9: the PR that promised +57% and delivered +0.2%
Last night Lucebox crossed 88.5 t/s on Olares One and became the new champion. This morning PR #94 reports +57% on RTX 4090. If it scales, we hit 120 t/s. Spoiler: 88.7 t/s. Full DDTree sweep, three hypotheses, the honest lesson on upstream benches that don't reproduce.
Read → -
Lucebox on Olares One — Episode 8: seven days of waiting, one lib swapped by hand, 88.5 t/s
Seven days after my PR #188 to HAMi-core, still no review. The saga had its cliffhanger — I was waiting on someone else. Then a stupid idea: compile my patched lib and swap it myself. Three new bugs, one night, and at the end Lucebox hits 88.5 t/s. First llama.cpp-based path to pass vLLM Turbo on this hardware.
Read → -
Lucebox on Olares One — Episode 7: six HAMi hooks fixed upstream in one go
The bug is identified: 6 hooks in HAMi-core ignore the return value of cuCtxGetDevice. The fix is 50 lines. But for the entire HAMi community to benefit, it has to go upstream. Here's how that played out.
Read → -
Lucebox on Olares One — Episode 6: We read the HAMi-core source and we find 6 bugs
NO_VMM doesn't fix anything. The `Illegal device id` bug comes back every run. Time to read the HAMi-core source. And what we find is not a single bug — it's a systemic pattern across 6 different hooks.
Read → -
Lucebox on Olares One — Episode 5: The runtime slams the door with a negative device id
Image pushed, pod deployed, models downloaded. Everything is ready. Then HAMi vGPU dumps `Illegal device id: -644371744` on every boot, with a random number that changes each run. Smells like uninitialized stack from a mile away.
Read → -
Lucebox on Olares One — Episode 4: The llama-server submodule serves it up to you 1h later
test_dflash compiles, great. But to serve over HTTP I need llama-server, which compiles from the submodule. And the submodule has its own cmake invocation — where I forgot to add -rpath-link. And boom, 1h later, here we go again.
Read →