Changes and Releases
Updates on Giant Swarm workload cluster releases, apps, UI improvements and documentation changes.
Added
- Make AI chat sampling parameters configurable per installation. The plugin previously called streamText() without temperature, topP, topK, seed, minP, or maxOutputTokens, so the server’s defaults applied; for vLLM that means temperature=1.0, top_p=1.0, top_k=-1, seed=null. That is far too loose for a tool-using agent backed by a reasoning model, and it was the dominant cause of token-cost variance in production agent loops: the same prompt in a fresh chat produced total-token counts of 22k, 607k, and 22k across three runs against the same Qwen3 endpoint. Config now accepts an aiChat.sampling block with temperature, topP, topK, minP, seed, and maxOutputTokens. All fields are optional, and behaviour is unchanged when no sampling block is configured. temperature, topP, topK, seed, and maxOutputTokens are forwarded through the AI SDK to every provider that supports them. minP is spliced into the request body via the OpenAI-compatible provider’s transformRequestBody hook, because vLLM accepts it as a top-level field but it is not part of the AI SDK call settings. The ai-chat-backend README documents recommended values per model family (Qwen3 thinking/non-thinking, Qwen3-Coder, GPT-4 / GPT-4o, Anthropic Claude).
See ./docs/releases/v0.129.0-changelog.md for more information.
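The split described in the aiChat.sampling entry above (most fields go to the AI SDK call, minP goes into the request body) can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `SamplingConfig`, `splitSampling`, and `makeTransformRequestBody` are hypothetical names, the wire name `min_p` is assumed to match vLLM's top-level field, and wiring the transformer into the provider is not shown.

```typescript
// Hypothetical shape of the aiChat.sampling block; field names follow the changelog entry.
interface SamplingConfig {
  temperature?: number;
  topP?: number;
  topK?: number;
  minP?: number;
  seed?: number;
  maxOutputTokens?: number;
}

// Fields the changelog says are forwarded as AI SDK call settings.
type CallSettings = Omit<SamplingConfig, "minP">;

const CALL_SETTING_KEYS = [
  "temperature",
  "topP",
  "topK",
  "seed",
  "maxOutputTokens",
] as const;

// Split the config into call settings and extra body fields.
// Unset fields are omitted so the server defaults keep applying to them.
function splitSampling(sampling: SamplingConfig = {}): {
  callSettings: CallSettings;
  extraBody: { min_p?: number };
} {
  const callSettings: CallSettings = {};
  for (const key of CALL_SETTING_KEYS) {
    const value = sampling[key];
    if (value !== undefined) callSettings[key] = value;
  }
  // Assumption: vLLM reads the value from a top-level "min_p" field.
  const extraBody = sampling.minP === undefined ? {} : { min_p: sampling.minP };
  return { callSettings, extraBody };
}

// Hypothetical body transformer: returns a copy of the request body
// with the extra top-level fields merged in.
function makeTransformRequestBody(extraBody: { min_p?: number }) {
  return (body: Record<string, unknown>): Record<string, unknown> => ({
    ...body,
    ...extraBody,
  });
}
```

With an empty or missing config, both returned objects are empty, which matches the "behaviour is unchanged" guarantee in the entry. Any values passed in are the operator's choice; the README remains the place for per-model recommendations.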
Fixed
- Fix AI chat backend crash on every chat send. The tools field in the request body is optional, so it arrives as undefined whenever the frontend does not register any client-side tools. The previous implementation called Object.entries(tools) unconditionally, which threw TypeError: Cannot convert undefined or null to object and returned a 500 from POST /api/ai-chat/chat on every request. The browser surfaced this as “network error”, and because the request never reached the tool-merge step, MCP-provided tools (muster, prometheus, kubernetes, …), skill tools, getDate, and the context-usage tool were also unreachable from chat. frontendTools now accepts null/undefined and treats them as an empty registry.
- Rebuild MCP clients when the underlying transport closes. The AI chat backend caches one MCPClient per server for 30 minutes, but the cache was holding on to clients whose StreamableHTTP transport had already been torn down by muster (idle timeout, server reset, …). Tool calls then failed with MCPClientError: Attempted to send a request from a closed client until the TTL expired, surfacing in the browser as a “network error” banner. The cache now chains into the transport’s onclose callback and evicts the entry as soon as the connection drops, so the next request rebuilds the client cleanly. Tool execution also detects the same SDK error eagerly and marks the entry dead, so a single failure, rather than a 30-minute window, triggers reconnection.
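
The null/undefined handling described in the first fix can be illustrated with a small sketch. The `FrontendTool` and `ToolRegistry` types are hypothetical stand-ins, and the body of `frontendTools` here is an assumption about the shape of the guard, not the plugin's implementation.

```typescript
// Hypothetical tool shape; the plugin's real type is not shown in the changelog.
interface FrontendTool {
  description: string;
}

type ToolRegistry = Record<string, FrontendTool>;

// Treat a missing or null tools field as an empty registry.
// Object.entries(undefined) and Object.entries(null) both throw a TypeError,
// so the fallback to {} has to happen before the call.
function frontendTools(tools?: ToolRegistry | null): ToolRegistry {
  const registry: ToolRegistry = {};
  for (const [name, tool] of Object.entries(tools ?? {})) {
    registry[name] = { description: tool.description };
  }
  return registry;
}
```

Returning an empty registry rather than failing lets the later merge with server-side tools proceed unchanged when the frontend registers nothing.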
See ./docs/releases/v0.128.1-changelog.md for more information.
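
The eviction behaviour described in the MCP client fix above might look roughly like the sketch below. It assumes a transport object exposing an optional `onclose` callback; `Transport`, `Client`, `ClientCache`, and `reportError` are hypothetical names, and matching on the error message text is an assumption about how the closed-client error could be recognised.

```typescript
// Minimal stand-ins for the SDK types; only the parts this sketch needs.
interface Transport {
  onclose?: () => void;
}

interface Client {
  transport: Transport;
}

interface CacheEntry {
  client: Client;
  expiresAt: number;
}

const TTL_MS = 30 * 60 * 1000; // 30-minute TTL, as stated in the changelog

class ClientCache {
  private entries = new Map<string, CacheEntry>();
  private createClient: (server: string) => Client;
  private now: () => number;

  constructor(
    createClient: (server: string) => Client,
    now: () => number = () => Date.now(),
  ) {
    this.createClient = createClient;
    this.now = now;
  }

  get(server: string): Client {
    const cached = this.entries.get(server);
    if (cached && cached.expiresAt > this.now()) return cached.client;

    const client = this.createClient(server);
    const previous = client.transport.onclose;
    // Chain into any existing onclose handler, then evict this entry.
    client.transport.onclose = () => {
      if (previous) previous();
      this.evict(server, client);
    };
    this.entries.set(server, { client, expiresAt: this.now() + TTL_MS });
    return client;
  }

  // Evict only if the cached client is still the one that closed,
  // so a late callback cannot remove a freshly rebuilt client.
  evict(server: string, client: Client): void {
    const entry = this.entries.get(server);
    if (entry && entry.client === client) this.entries.delete(server);
  }

  // Eager path: drop the entry when a tool call reports the closed-client error.
  // Assumption: the error is recognised by its message text.
  reportError(server: string, client: Client, error: unknown): void {
    if (error instanceof Error && error.message.includes("closed client")) {
      this.evict(server, client);
    }
  }
}
```

Comparing the cached client by identity before deleting is a design choice of this sketch: it keeps a stale close event from evicting a client that was already rebuilt.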
Changed
- Updating to the v1.3.2 version.