Published on December 25, 2025
Hybrid Context Compaction: Managing Token Growth in Agentic Loops
Tags: llm, context-engineering, token-efficiency, ai
How adding observation masking to our existing LLM summarization reduced token costs, improved response latency, and eased rate limit pressure. Adapted from JetBrains research.
Published on October 26, 2025
Compressing LLM Context Windows: Efficient Data Formats and Context Management
Tags: llm, context-compression, token-efficiency, context-engineering
Explore techniques to reduce token usage in LLM contexts by using compact data formats, summarization, and intelligent retrieval.