Manually managing strings can be chaotic. Don’t worry, f-strings are here to help.
Building a Vision API with Magma8B
Vision models describe what’s in an image, but they can’t handle spatial references. Point at an object and ask “What color is this car?” and the model doesn’t know what you’re talking about. In this post we’ll learn about Set-of-Mark prompting and how vision models can see what you’re seeing.
Building a Production Ready Text-to-Speech API
Set yourself apart from other MLEs by learning how to work with audio and serve Text-to-Speech models.
Your AI Agents Are Racing: A Guide to Multi-Agent Concurrency
Many modern APIs (like AsyncOpenAI) have built-in concurrent execution features, but understanding the underlying concepts lets you add your own concurrency controls. You’re not reinventing the wheel; you’re adapting it to your use case.
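The kind of concurrency control the teaser hints at can be sketched with an `asyncio.Semaphore` capping how many calls run at once. This is a minimal illustration, not the post’s actual code: `call_model` is a hypothetical stand-in for a real async API call (e.g. an AsyncOpenAI completion), with `asyncio.sleep` simulating network latency.

```python
import asyncio

async def call_model(prompt: str, sem: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical stand-in for an async LLM API call."""
    async with sem:                       # blocks once max_concurrent calls are in flight
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)         # simulate network latency
        stats["in_flight"] -= 1
        return f"response to {prompt!r}"

async def run_agents(prompts, max_concurrent=3):
    sem = asyncio.Semaphore(max_concurrent)
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(*(call_model(p, sem, stats) for p in prompts))
    return results, stats["peak"]

results, peak = asyncio.run(run_agents([f"task {i}" for i in range(10)]))
```

All ten tasks still complete, but the semaphore guarantees no more than three are in flight at any moment.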
Random Sampling Is Sabotaging Your Models!
Did you know answer diversity can help detect hallucinations? Ask an LLM “what’s the capital of France” five times and you’ll get some variation of “Paris” over and over again. But ask that same LLM about the “boiling point of a dragon’s scale” and you’ll get five different made-up answers.
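The idea in this teaser can be sketched in a few lines: sample the model several times and flag prompts whose answers disagree too much. This is an assumed, simplified version of the technique, with hardcoded lists standing in for real sampled completions and naive lowercase normalization.

```python
def diversity_score(answers: list[str]) -> float:
    """Fraction of distinct (case-normalized) answers; 0.0 means total agreement."""
    normalized = [a.strip().lower() for a in answers]
    return (len(set(normalized)) - 1) / max(len(normalized) - 1, 1)

def looks_hallucinated(answers: list[str], threshold: float = 0.5) -> bool:
    """Flag a prompt as suspect when repeated samples disagree too much."""
    return diversity_score(answers) >= threshold

# Stand-ins for five sampled completions per prompt:
capital_answers = ["Paris", "paris", "Paris", "PARIS", "Paris"]
dragon_answers = [
    "3,000 degrees", "It depends on the dragon", "Dragons aren't real",
    "500 kelvin", "Over 9000 degrees",
]
```

Here `diversity_score(capital_answers)` is 0.0 (total agreement) while the dragon answers are all distinct, so only the dragon prompt gets flagged.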
Self-Hosting LLMs with vLLM
Get ready to save some money. In this post, you’ll learn how to set up your own LLM server using vLLM, choose the right models, and build an architecture that fits your use case.
Testing Your LLM Applications (Without Going Broke)
Traditional software is deterministic: same input, same output, every time. LLMs? Non-deterministic by design: same input, different output. Even with temperature=0, you get variations. There are still ways to test your code reliably, though, and in this post you’ll learn them.
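One common way around non-determinism (a sketch under my own assumptions, not necessarily the post’s approach) is to assert on properties of the output rather than exact strings. The `fake_llm_summary` below is a hypothetical stand-in for a model reply:

```python
def check_summary(summary: str, source: str) -> bool:
    """Property-based checks: assert invariants, not exact wording,
    since the same prompt can yield different phrasings each run."""
    return (
        0 < len(summary) < len(source)   # non-empty and shorter than the input
        and summary.strip() != ""        # not just whitespace
    )

source = (
    "LLMs are non-deterministic, so tests should check properties of the "
    "output rather than compare against one exact expected string."
)
fake_llm_summary = "Tests for LLM output should check properties."  # stand-in reply
```

Any reworded summary of roughly the right shape passes, while an empty reply fails, so the test survives run-to-run variation.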
Stop Guessing What Your LLM Returns
If you don’t know what datatypes your LLM function returns, you’re basically playing Russian roulette at runtime.
Your LLM Works in a Notebook. Now What?
Right now, there’s probably a data scientist waking up to a $3,000 OpenAI bill because a bot found their exposed API key and has been making calls to GPT-4 for 12 hours straight.
When State Space Models Learned to See Globally
Mamba promised linear time complexity. Vision benchmarks said “prove it”, and pure Mamba models fumbled badly. Transformers kept winning until MambaVision showed up and rewrote the rules entirely.
