DEV Community

# llm

Posts

Becoming the Prediction Machine

2
Comments
10 min read
Weaviate for RAG: When It Shines (and When It Doesn’t)

Comments
2 min read
We Evaluated 13 LLM Gateways for Production. Here's What We Found

5
Comments
5 min read
Meet TOON: A Token-First Data Format Built for AI

Comments
3 min read
Watching the LLM's Thinking

Comments
4 min read
Docify: Building a Production RAG System for Knowledge Management

Comments
4 min read
Stop Clicking, Start Talking: Control Your Unity Editor with AI Using MCP for Unity

Comments
3 min read
The Common Roadblocks for AI Storytelling

1
Comments
2 min read
RAG for Developers — Built for Code, Not Just Text (Review Requested)

Comments
1 min read
Python LLM: reasoning is disabled by default in llm-api-adapter

Comments
2 min read
RAG Chunking Strategies Deep Dive

Comments
7 min read
An Intro to Large Language Models and the Transformer Architecture: Talking to a calculator

Comments
4 min read
The Architecture of Agent Memory: How LangGraph Really Works

1
Comments
11 min read
⚛ MCP Explained: A Simple Guide 📜 to AI 🤖 Agents

Comments
3 min read
The AI Agent Reality Check: What Actually Works in Production (And What Doesn't)

Comments
6 min read
TPU: Why Google Doesn't Wait in Line for NVIDIA GPUs (1/2)

Comments
8 min read
Personal Identity Agent for your Agent

Comments
1 min read
Release my PR for the project Bifrost

Comments
2 min read
Code Generation for Ablation Technique — Documentation

Comments
3 min read
How Bifrost Integrates With Your Existing LLM Stack (No Refactoring Required)

Comments
4 min read
Semantic Caching Cut Our LLM Costs by 40%

Comments
3 min read
Uncounted Tokens: The Game of Attack and Defense in AI Gateway Rate Limiting

Comments
3 min read
The Observability Tax: What You're Actually Paying for AI Agents (2026 Cost Reality)

Comments
2 min read
Building Your First Agentic AI: Complete Guide to MCP + Ollama Tool Calling

1
Comments
14 min read
Why Your API's Error Messages Fail When Called by an LLM (And How to Fix Them)

Comments
9 min read