Lucebox on Olares One — Episode 3: LIBRARY_PATH isn't what you think it is

Episode 1: we discovered Lucebox and decided to package it for Olares. Episode 2: first build, a 2h compile that ended in 11 undefined references to cuMemCreate, cuMemMap, etc.

Fix applied: LIBRARY_PATH pointing at /usr/local/cuda/lib64/stubs + symlink libcuda.so → libcuda.so.1. Logical. I rerun.

2h later

Recompile. Re-link. And then…

/usr/bin/ld: warning: libcuda.so.1, needed by libggml-cuda.so.0.9.11, not found
/usr/bin/ld: libggml-cuda.so.0.9.11: undefined reference to `cuMemCreate'
... 11 identical undefined references

The exact same error. Letter for letter. As if I had done nothing.

Here’s debug rule #1: if you think you fixed the problem but it comes back identical, you didn’t fix the actual problem. Time to read the manual.

What `LIBRARY_PATH` actually does

LIBRARY_PATH is an environment variable that gcc/clang use to resolve libraries directly referenced by the link command. If you do gcc main.c -lfoo and libfoo.so lives in a directory listed in LIBRARY_PATH, ld will find it. OK.

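But if you do gcc main.c -lbar, and libbar.so itself depends on another lib libfoo.so, then LIBRARY_PATH doesn’t help. To locate libfoo.so, GNU ld does not reuse the directories it searched for -l. It looks in the -rpath-link and -rpath directories, in LD_LIBRARY_PATH (for a native linker), in the RUNPATH/RPATH recorded in libbar.so, and in its standard system search path (/lib, /usr/lib, /usr/lib/x86_64-linux-gnu, etc.).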

That’s an indirect dependency. And it’s exactly our case: we link test_dflash which depends on libggml-cuda.so which depends on libcuda.so.1. ld will find libggml-cuda.so (direct) but not libcuda.so.1 (indirect) — because it doesn’t look in LIBRARY_PATH for indirects.
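To make the direct/indirect distinction concrete, here is a small reproduction, assuming a Linux box with gcc and GNU ld. Every name in it is made up for the demo: libfoo.so plays the role of libcuda.so.1, libbar.so the role of libggml-cuda.so, and main the role of test_dflash. With another linker (lld, gold) the first attempt may behave differently.

```shell
#!/bin/sh
# Toy reproduction: main -> libbar.so (direct) -> libfoo.so (indirect).
# All file and directory names are hypothetical; assumes gcc + GNU ld on Linux.
work=$(mktemp -d)
mkdir -p "$work/deps" "$work/libs"

printf 'int foo(void) { return 42; }\n' > "$work/foo.c"
printf 'int foo(void);\nint bar(void) { return foo() + 1; }\n' > "$work/bar.c"
printf 'int bar(void);\nint main(void) { return bar() == 43 ? 0 : 1; }\n' > "$work/main.c"

# libfoo.so lives in deps/, libbar.so (which needs libfoo.so) lives in libs/.
gcc -shared -fPIC -o "$work/deps/libfoo.so" "$work/foo.c"
gcc -shared -fPIC -o "$work/libs/libbar.so" "$work/bar.c" -L"$work/deps" -lfoo

# Attempt 1: both directories in LIBRARY_PATH. -lbar is found, but ld
# does not use these directories to locate the indirect libfoo.so.
if LIBRARY_PATH="$work/libs:$work/deps" \
    gcc -o "$work/main1" "$work/main.c" -lbar 2> "$work/attempt1.log"
then library_path_only=ok
else library_path_only=fail
fi

# Attempt 2: LIBRARY_PATH for the direct lib, -rpath-link for the indirect one.
if LIBRARY_PATH="$work/libs" \
    gcc -o "$work/main2" "$work/main.c" -lbar \
        -Wl,-rpath-link,"$work/deps" 2> "$work/attempt2.log"
then with_rpath_link=ok
else with_rpath_link=fail
fi

echo "LIBRARY_PATH only: $library_path_only"
echo "with -rpath-link:  $with_rpath_link"
cat "$work/attempt1.log"   # linker messages from attempt 1
rm -rf "$work"
```

On a GNU ld setup I would expect attempt 1 to fail with the same "not found (try using -rpath or -rpath-link)" warning as above, and attempt 2 to link. Note that -rpath-link only matters at link time: running the resulting binary still requires the dynamic loader to find both libraries by its own means, just as test_dflash needs the real libcuda.so.1 from the driver at runtime.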

The ld warning literally said so:

not found (try using -rpath or -rpath-link)

I had read it but not really understood it. The ld docs confirm:

-rpath-link=DIR: When using ELF or SunOS, one shared library may require another. […] In such a case, the -rpath-link option specifies the first set of directories to search.

Bingo.
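Since -rpath-link is an ld option, it has to go through the compiler driver with the -Wl, prefix when gcc or g++ does the link. Here is a small sketch of a helper that builds that flag from one or more directories. rpath_link_flag is a hypothetical name, not something from Lucebox or CMake, and it relies on ld accepting a colon-separated list of directories for -rpath-link.

```shell
# Build a gcc-style linker flag from one or more directories.
# rpath_link_flag is a hypothetical helper name used for illustration.
# The function body runs in a subshell so the IFS change stays local.
rpath_link_flag() (
    IFS=:
    # "$*" joins all arguments with the first character of IFS, here ':'.
    printf '%s\n' "-Wl,-rpath-link,$*"
)

rpath_link_flag /usr/local/cuda/lib64/stubs
# prints -Wl,-rpath-link,/usr/local/cuda/lib64/stubs
```

The commas after -Wl are what split the string into separate arguments for ld, so a directory containing a comma would break this sketch.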

The real fix

Pass the stubs path to the linker via CMAKE_EXE_LINKER_FLAGS and CMAKE_SHARED_LINKER_FLAGS, not via LIBRARY_PATH:

RUN cmake -B build -S . \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_CUDA=ON \
    -DCMAKE_CUDA_ARCHITECTURES="120" \
    -DCMAKE_EXE_LINKER_FLAGS="-Wl,-rpath-link,/usr/local/cuda/lib64/stubs" \
    -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath-link,/usr/local/cuda/lib64/stubs" \
    && cmake --build build --target test_dflash -j $(nproc)

Note that I also went from "86;89;120" to just "120". Why? Because I’m not distributing this image to the world: it’ll only run on my Olares One (sm_120). That cuts compile time by ~3× without losing anything on this machine. If we ever need a wider target, we add it back.
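The "120" here is the GPU's compute capability, 12.0, written without the dot. If you are not sure which value applies to your own card, it can be read on the host and converted. This is a sketch, assuming a driver recent enough for nvidia-smi to support the compute_cap query field. cap_to_arch and detect_cuda_arch are hypothetical helper names. It has to run on the machine that has the GPU, since a docker build typically does not see one.

```shell
# Convert a compute capability such as "12.0" into the CMake form "120".
# cap_to_arch and detect_cuda_arch are hypothetical helper names.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '. '
}

# Ask the driver for the first GPU's compute capability (needs nvidia-smi
# with support for the compute_cap query field).
detect_cuda_arch() {
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    cap_to_arch "$cap"
}

cap_to_arch 12.0   # prints 120
cap_to_arch 8.6    # prints 86
```

The result could then be passed to the build as -DCMAKE_CUDA_ARCHITECTURES="$(detect_cuda_arch)", for example through a build argument.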

56 minutes later

[ 98%] Built target dflash27b
[100%] Building CXX object CMakeFiles/test_dflash.dir/test/test_dflash.cpp.o
[100%] Linking CXX executable test_dflash
[100%] Built target test_dflash
#13 DONE 3337.7s

Yes! Built target test_dflash. The DFlash CLI binary is compiled. 56 min this time (CUDA cache partially reused from the previous build + a single arch). Not bad.

Except test_dflash is just the Lucebox bench CLI. To do real OpenAI-compatible HTTP serving, I need llama-server, which compiles from the deps/llama.cpp submodule of the Lucebox fork. Build #2.

And surprise — it’s not the same cmake invocation, it’s in a sub-project, and I haven’t carried over the -rpath-link. So when I kick off the llama-server build…

(You see where this is going.)

Episode 4: 1h later, ld dumps the exact same 11 undefined references. See you next time!

Disclosure — All the benchmarks in this post run on my own Olares One. If the content was useful and you’re considering one, ordering through this referral link gets you $400 off ($3,599 instead of $3,999) and pays me $200. I’m mentioning this out of transparency — and yes, incidentally, it helps keep the blog alive (hosting, domain, and the time I spend writing here). Link valid until late June 2026.
