Implementing Prompt Compression to Reduce Agentic Loop Costs
Agentic loops in production can be expensive, especially when they call LLMs and external applications via APIs, where billing is often closely tied to token usage.
Implementing Permission-Gated Tool Calling in Python Agents
AI agents have evolved beyond passive chatbots.
The Roadmap to Mastering Tool Calling in AI Agents
Most
Implementing Statistical Guardrails for Non-Deterministic Agents
Non-deterministic agents are those where the same input can lead to distinct outputs across multiple runs.
Agentic RAG Explained in 3 Levels of Difficulty
Traditional
Effective KV Compression with TurboQuant
Google recently launched TurboQuant, a novel algorithmic suite and library for applying advanced quantization and compression to large language models (LLMs) and vector search engines, the latter being an indispensable element of RAG systems.
Building AI Agents in Python with Pydantic AI
Effective Context Engineering for AI Agents: A Developer’s Guide
When
Text Summarization with Scikit-LLM
In a
Building AI Agents with Local Small Language Models
The idea of building your own AI agent used to feel like something only big tech companies could pull off.