Vision models can describe what’s in an image, but they struggle with spatial references. Point at an object and ask “What color is this car?” and the model has no idea what you’re referring to. In this post we’ll explore Set-of-Mark prompting and how it lets vision models see what you’re seeing 👀
Building a Production-Ready Text-to-Speech API
Set yourself apart from other MLEs by learning how to work with audio and serve text-to-speech models.
Self-Hosting LLMs with vLLM
Get ready to save some money 💰. In this post, you’ll learn how to set up your own LLM server with vLLM, choose the right models, and design an architecture that fits your use case.
