Changes and Releases
Updates on Giant Swarm workload cluster releases, apps, UI improvements and documentation changes.
Fixed
- AI chat instrumentation: render the SSE stream summary inline in the log message (bytes=... events=... sawFinish=... lastEventType=...) instead of attaching it as a trailing object argument, so it stays readable in browser-console consumers that flatten the args array (devtools-snapshotters, log shippers, the Cursor IDE browser, etc.) rather than collapsing to “[object Object]”. Also classify a stream read error that arrives after a finish event was already parsed as a post-completion teardown (console.warn, message already committed) instead of a network failure (console.error); this matches the actual semantics of the AI SDK aborting the underlying fetch once it has finished consuming the stream.
See ./docs/releases/v0.129.2-changelog.md for more information.
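The two changes in this release can be illustrated with a minimal sketch. It is not the plugin's actual code: the StreamSummary shape, the function names, and the "[ai-chat]" log prefix are assumptions made for illustration, and only the four summary keys and the warn/error split come from the entry above.

```typescript
// Hypothetical shape of the per-stream counters; field names mirror the log keys.
interface StreamSummary {
  bytes: number;
  events: number;
  sawFinish: boolean;
  lastEventType: string | null;
}

// Render the summary as part of the message string, so consumers that
// flatten console arguments still see every field.
function formatStreamSummary(s: StreamSummary): string {
  const last = s.lastEventType ?? "none";
  return `bytes=${s.bytes} events=${s.events} sawFinish=${s.sawFinish} lastEventType=${last}`;
}

type Severity = "warn" | "error";

// A read error after a finish event is treated as teardown, not as a failure.
function classifyStreamReadError(s: StreamSummary): Severity {
  return s.sawFinish ? "warn" : "error";
}

// Log one self-contained line and return the severity that was used.
function logStreamReadError(err: unknown, s: StreamSummary): Severity {
  const severity = classifyStreamReadError(s);
  const reason = err instanceof Error ? err.message : String(err);
  const label = severity === "warn" ? "post-completion teardown" : "network failure";
  const line = `[ai-chat] SSE ${label}: ${reason} (${formatStreamSummary(s)})`;
  if (severity === "warn") {
    console.warn(line);
  } else {
    console.error(line);
  }
  return severity;
}
```

Because the summary is part of the string itself, a consumer that stringifies each console argument separately has no object left to collapse.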
Fixed
- AI chat: always log network-level diagnostics (request URL, response status / headers, fetch errors, SSE stream lifecycle including premature termination) to the browser console so a “Network error” banner can be triaged from a user report without requiring the ai-chat-verbose-debugging feature flag. Verbose payload-level logging (messages, system prompt, tool schemas, per-event SSE detail) remains gated on the feature flag in non-production builds.
See ./docs/releases/v0.129.1-changelog.md for more information.
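The split between always-on network diagnostics and flag-gated payload logging could look roughly like the sketch below. The ChatLogger interface, the createChatLogger factory, and the log prefixes are hypothetical; the verbose argument stands in for the combination of the ai-chat-verbose-debugging flag and a non-production build check.

```typescript
// Hypothetical logger interface: one channel per diagnostic level.
interface ChatLogger {
  network(message: string): void;
  payload(message: string): void;
}

// `verbose` is an assumption standing in for "feature flag enabled AND
// non-production build"; `sink` defaults to the browser console.
function createChatLogger(
  verbose: boolean,
  sink: (line: string) => void = (line) => console.info(line),
): ChatLogger {
  return {
    // Network-level diagnostics are always emitted.
    network: (message) => sink(`[ai-chat:net] ${message}`),
    // Payload-level detail is emitted only when verbose logging is on.
    payload: (message) => {
      if (verbose) sink(`[ai-chat:payload] ${message}`);
    },
  };
}
```

With verbose set to false, a call such as logger.network("POST /api/ai-chat/chat -> 500") still reaches the console, while logger.payload(...) calls are dropped.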
Added
- Make AI chat sampling parameters configurable per installation. The plugin previously called streamText() without temperature, topP, topK, seed, minP, or maxOutputTokens, so the server’s defaults applied, which for vLLM means temperature=1.0, top_p=1.0, top_k=-1, seed=null. That is far too loose for a tool-using agent backed by a reasoning model and was the dominant cause of token-cost variance in production agent loops (same prompt, fresh chat, observed total-token spread of 22k / 607k / 22k across three runs against the same Qwen3 endpoint). Config now accepts an aiChat.sampling block with temperature, topP, topK, minP, seed, and maxOutputTokens; all fields are optional, and behaviour is unchanged when no sampling block is present. temperature, topP, topK, seed, and maxOutputTokens are forwarded through the AI SDK to every provider that supports them. minP is spliced into the request body via the OpenAI-compatible provider’s transformRequestBody hook, since vLLM accepts it as a top-level field but it is not part of the AI SDK call settings. The ai-chat-backend README documents recommended values per model family (Qwen3 thinking/non-thinking, Qwen3-Coder, GPT-4 / GPT-4o, Anthropic Claude).
See ./docs/releases/v0.129.0-changelog.md for more information.
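As a rough illustration of how such a sampling block might be split between AI SDK call settings and the raw request body, consider the sketch below. The helper names toCallSettings and withMinP are hypothetical, and the snake_case body key min_p is an assumption based on vLLM's parameter naming; the plugin's real implementation may differ.

```typescript
// Field names follow the aiChat.sampling block described above.
interface SamplingConfig {
  temperature?: number;
  topP?: number;
  topK?: number;
  minP?: number;
  seed?: number;
  maxOutputTokens?: number;
}

// Everything except minP maps onto call settings.
type CallSettings = Omit<SamplingConfig, "minP">;

// Copy only the fields that were set, so an absent block changes nothing.
function toCallSettings(sampling: SamplingConfig = {}): CallSettings {
  const settings: CallSettings = {};
  const keys = ["temperature", "topP", "topK", "seed", "maxOutputTokens"] as const;
  for (const key of keys) {
    const value = sampling[key];
    if (value !== undefined) settings[key] = value;
  }
  return settings;
}

// Build a request-body transformer that adds min_p as a top-level field
// (key name assumed). With minP unset, the body passes through untouched.
function withMinP(minP: number | undefined) {
  return (body: Record<string, unknown>): Record<string, unknown> =>
    minP === undefined ? body : { ...body, min_p: minP };
}
```

Under these assumptions, the result of toCallSettings(config) would be spread into the streamText() call, and withMinP(config.minP) would be supplied as the provider's transformRequestBody hook.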
Fixed
- Fix AI chat backend crash on every chat send. The tools field in the request body is optional, so it arrives as undefined whenever the frontend does not register any client-side tools. The previous implementation called Object.entries(tools) unconditionally, which threw TypeError: Cannot convert undefined or null to object, returning a 500 from POST /api/ai-chat/chat on every request. The browser surfaced this as “network error” and the LLM never reached the tool-merge step, so MCP-provided tools (muster, prometheus, kubernetes, …), skill tools, getDate, and the context-usage tool were also unreachable from chat. frontendTools now accepts null/undefined and treats them as an empty registry.
- Rebuild MCP clients when the underlying transport closes. The AI chat backend caches one MCPClient per server for 30 minutes, but the cache was holding on to clients whose StreamableHTTP transport had already been torn down by muster (idle timeout, server reset, …). Tool calls then failed with MCPClientError: Attempted to send a request from a closed client until the TTL expired, surfacing in the browser as a “network error” banner. The cache now chains into the transport’s onclose callback and evicts the entry as soon as the connection drops, so the next request rebuilds the client cleanly. Tool execution also detects the same SDK error eagerly and marks the entry dead, so a single failure, not a 30-minute window, triggers reconnection.
See ./docs/releases/v0.128.1-changelog.md for more information.
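Both fixes in this release follow patterns that can be sketched in a few lines. The types and names below (Transport, CachedClient, normalizeTools, ClientCache, markDead) are hypothetical stand-ins rather than the backend's real code or the MCP SDK's types, and the client factory is synchronous only to keep the sketch short.

```typescript
// Hypothetical minimal shapes; the real MCP client and transport types differ.
interface Transport {
  onclose?: () => void;
}
interface CachedClient {
  transport: Transport;
}

// Treat a missing tools field as an empty registry instead of letting
// Object.entries throw on undefined or null.
function normalizeTools<T>(tools: Record<string, T> | null | undefined): [string, T][] {
  return Object.entries(tools ?? {});
}

interface CacheEntry<C> {
  client: C;
  expiresAt: number;
}

class ClientCache<C extends CachedClient> {
  private entries = new Map<string, CacheEntry<C>>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  get(server: string, build: () => C): C {
    const hit = this.entries.get(server);
    if (hit && hit.expiresAt > Date.now()) return hit.client;

    const client = build();
    const entry: CacheEntry<C> = { client, expiresAt: Date.now() + this.ttlMs };
    this.entries.set(server, entry);

    // Chain into any existing onclose handler rather than replacing it.
    const previous = client.transport.onclose;
    client.transport.onclose = () => {
      previous?.();
      // Evict only if this entry is still the one cached for the server.
      if (this.entries.get(server) === entry) this.entries.delete(server);
    };
    return client;
  }

  // Called when tool execution observes a closed-client error.
  markDead(server: string): void {
    this.entries.delete(server);
  }
}
```

The identity check inside the onclose handler guards against a late close event from an old client evicting a freshly rebuilt one.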
Changed
- Updating to the v1.3.2 version.