In this post we will build an MLOps system to fine-tune a Small Language Model to be better at function calling. We will use the Llama-3-8B-Instruct model and perform Supervised Fine-Tuning (SFT) on the Salesforce/xlam-function-calling-60k dataset. By the end we will have a home AI lab that works the way a production system should: version-controlled, automated, reproducible, and scalable.
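As a taste of the data-preparation step, here is a minimal sketch (the helper name and the system-prompt wording are assumptions, not the post's exact code) that converts an xlam-style record into the chat-message format most SFT trainers expect:

```python
import json

def xlam_to_messages(record: dict) -> list[dict]:
    """Convert one xlam-function-calling-60k style record into a chat
    transcript usable for supervised fine-tuning (SFT).

    The record layout (query/tools/answers as JSON strings) follows the
    public dataset card; the system-prompt wording is our own.
    """
    tools = record["tools"]  # JSON string describing the available functions
    system = f"You are a function-calling assistant. Available tools:\n{tools}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": record["query"]},
        # Target completion: the function call(s) the model should emit.
        {"role": "assistant", "content": record["answers"]},
    ]

# A tiny synthetic record in the dataset's shape (not a real dataset row).
sample = {
    "query": "What is the weather in Paris?",
    "tools": json.dumps([{"name": "get_weather", "parameters": {"city": "string"}}]),
    "answers": json.dumps([{"name": "get_weather", "arguments": {"city": "Paris"}}]),
}
messages = xlam_to_messages(sample)
```

A list of such message transcripts can then be handed to an SFT trainer that applies the model's chat template.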
In this blog post we'll explore Chronos-2, a foundation model for time series forecasting, and compare it with statistical methods.
In this post, we'll build an agentic RAG system that analyzes queries, breaks down complex questions, searches and then reasons over the results, and knows when it needs more context. By the end, you'll have a complete implementation that you can adapt to your own document collections.
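The core loop such a system runs can be sketched roughly like this; the retriever and decision function are stubbed callables here (in the real post the decision step is an LLM call), so treat this as an illustration of the control flow, not the post's implementation:

```python
from typing import Callable

def agentic_rag(question: str,
                retrieve: Callable[[str], list[str]],
                decide: Callable[[str, list[str]], tuple[str, str]],
                max_steps: int = 3) -> str:
    """Skeleton of an agentic RAG loop: retrieve, reason over the
    accumulated context, and either answer or issue a refined sub-query
    when more context is needed."""
    context: list[str] = []
    query = question
    for _ in range(max_steps):
        context.extend(retrieve(query))
        # In a real system this is an LLM deciding "answer" vs "search".
        action, payload = decide(question, context)
        if action == "answer":
            return payload
        query = payload  # "search": payload is the refined sub-query
    return "insufficient context"
```

The key design point is that the model, not fixed code, decides when retrieval has gathered enough to answer.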
In this article we will explore how to use llama.cpp to deploy large language models (LLMs) and vision language models (VLMs) on consumer hardware, and show how the community has been using local models to create and consume AI applications.
It's a strategic framework that acknowledges a fundamental truth about modern engineering: sometimes you need to move fast, and sometimes you need to build things right. Rather than choosing one at the expense of the other, two-stage delivery separates the development lifecycle into two deliberate phases.
In this article, we'll explore how to implement MCP agents from scratch, showing how easy and powerful it is to develop custom agents and extend their capabilities with the many tools already available.
Presenting Anthill 🐜, a multi-agent framework that implements OpenAI's Routines and Handoffs design patterns. It also supports many LLMs, has a built-in multi-step reasoning system, and allows developers to guide and validate agent steps.
In this article, we will explore how to build an advanced Retrieval-Augmented Generation (RAG) application that uses PostgreSQL to store and query legal case documents with hybrid search.
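Hybrid search means fusing a lexical ranking (e.g. Postgres full-text search) with a vector-similarity ranking. One common fusion strategy is Reciprocal Rank Fusion (RRF); a minimal sketch follows, though the article's actual merging logic may differ:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids with RRF.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in; k=60 is the constant used in the original RRF paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# Lexical (full-text) and semantic (vector) result lists for one query;
# the case ids are made up for illustration.
keyword_hits = ["case_12", "case_7", "case_3"]
vector_hits = ["case_7", "case_9", "case_12"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

RRF needs only ranks, not raw scores, which sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales.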
This article provides a step-by-step guide to building a cold email generator powered by LLMs (Llama 3.1), along with recipes for creating production-friendly AI projects.
The last post of the “LLM Prompter” series, which teaches how to use the power of the modern dragons (LLMs/Generative AI).
The third post of the “LLM Prompter” series, which teaches how to use the power of the modern dragons (LLMs/Generative AI).
The second post of the “LLM Prompter” series, which teaches how to use the power of the modern dragons (LLMs/Generative AI).
The first post of the “LLM Prompter” series, which teaches how to use the power of the modern dragons (LLMs/Generative AI).
The first non-technical post, where I point out some issues that come up when trying to implement a machine learning project.
Just released QAFS, another machine learning engineering tool to help build and maintain ML products.
Announcing the first release of Quick-Deploy, a tool to optimize, convert, and deploy machine learning models as fast inference APIs.
The last post of the K8s dev/lab tutorial series, in which we'll set up Metrics Server, Kube State Metrics, Prometheus, and Grafana.
The second post of the K8s lab/dev setup series, covering ingress, storage, and the Kubernetes Dashboard for resource management.
The first post of a three-part series on setting up a local Kubernetes lab/dev environment, using Ubuntu Server 20.04 as master and node.