airelien.dev — AI for developers

Field notes · Local · Claude · Agents

AI for developers.

Hi there. This is where I write about what AI actually gives a working developer: my findings, my tests, the way I use it day to day. No marketing reviews, no Twitter hot takes, just things I actually try, integrate into real code, and keep or discard with eyes open.

Read the posts

My rig · Local LLMs

All local. Zero cloud.

The box I run on every day: an Olares One (RTX 5090 Laptop GPU with 24 GB of VRAM, 96 GB DDR5). Spoiler: I genuinely recommend it. I picked it specifically to get a serious GPU at home, and it does the job. So when I publish local-inference numbers, this is the rig behind them: llama.cpp tuned to the bone, vLLM with speculative decoding, Qwen3.6-27B at 85-100 t/s. No third-party API, no quota that drops you mid-session, no prompt landing in a training set. You keep the keys, and the bill stops at electricity.

The harness · Agents · MCP · Tools

Cloud. And the toolchain.

Claude Code, Cursor, persistent agents, MCP servers, validation hooks, prompts that hold for a month. The real dev loop with AI in the editor: what to keep, what to throw away, and how to plug it into a real codebase without everything falling apart at the first serious refactor.

My numbers, my rig

My own numbers.

Everything I write here, I measured myself: tokens per second, latency, VRAM use, prompt-processing time, MTP acceptance rate, cost per API call. No benchmark tossed onto Twitter without the command behind it, no "they say it's fast". If I publish a number, you'll find the exact stack to reproduce it. Promise.

Posts, right below

On to the posts.

Scroll on — the latest posts are waiting. If a finding saves you time, even better. If something feels off, tell me and I'll fix it. That's what an open blog is for.

Featured

Pinned.

This week

Latest posts.

All posts

Pinned.

Lucebox on Olares One — Episode 1: 134 t/s on RTX 3090, what about my rig?

Why I picked an Olares One to run my LLMs

Genesis on consumer Blackwell — TurboQuant unlocked for Qwen3.6-27B on 24GB

Qwen3.6-27B at 85-100 t/s on a 24GB RTX 5090 Laptop GPU

Latest posts.

Lucebox on Olares One — Episode 7: Issue #187, PR #188, and 6 hooks fixed in one go

Lucebox on Olares One — Episode 6: We read the HAMi-core source and we find 6 bugs

Lucebox on Olares One — Episode 5: The runtime slams the door with a negative device id

Lucebox on Olares One — Episode 4: The llama-server submodule serves it up to you 1h later