IT之家 reported on February 8 that Claude Code has introduced a new Fast Mode research preview. This feature is designed to significantly reduce response latency by optimizing the API configuration of Opus 4.6, all while maintaining the same level of model quality.
According to the developers, Fast Mode is not a separate model; it runs on the same Opus 4.6 engine with a different API configuration that prioritizes speed over cost efficiency. Response quality and overall functionality remain unchanged, with the only difference being faster response times.
Available to subscribers of Pro and Team plans as well as Console users, Fast Mode incurs additional usage fees. Pricing is currently set at $30 per million input tokens and $150 per million output tokens. Notably, third-party providers such as Amazon Bedrock, Google Vertex AI, and Microsoft Azure Foundry do not support this feature.
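To make the quoted rates concrete, a quick cost estimate can be computed per request. The figures below are the article's quoted Fast Mode prices; the function and example numbers are purely illustrative, so check current pricing before relying on them.

```python
# Illustrative cost estimate at the quoted Fast Mode rates
# ($30 per million input tokens, $150 per million output tokens).
FAST_MODE_INPUT_PER_M = 30.0    # USD per 1M input tokens
FAST_MODE_OUTPUT_PER_M = 150.0  # USD per 1M output tokens

def fast_mode_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens / 1_000_000 * FAST_MODE_INPUT_PER_M
            + output_tokens / 1_000_000 * FAST_MODE_OUTPUT_PER_M)

# Example: a request with 50K input tokens and 8K output tokens
print(round(fast_mode_cost(50_000, 8_000), 2))  # → 2.7
```

At these rates, output tokens dominate the bill for long generations: the 8K output above costs $1.20, nearly as much as the 50K of input ($1.50).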
In addition, users who subscribe before the deadline—11:59 PM Pacific Time on February 16—can enjoy a limited-time discount of 50% on all plans.
This development follows the release of Claude Opus 4.6 by Anthropic on February 6. The new model supports a 200K context window (with a testing version allowing up to 1 million tokens) and can output up to 128K tokens—doubling the previous 64K limit. The update also introduces an adaptive thinking mode, which adjusts its depth based on the complexity of the questions, along with a new maximum effort parameter for more advanced responses. Additionally, the new version features a context compression function that summarizes earlier parts of the conversation when nearing the window limit, enabling near-infinite dialogue lengths.
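The context-compression idea described above can be sketched in a few lines: once a conversation approaches the window limit, older turns are replaced by a summary so the dialogue can continue. The token heuristic, threshold, and `summarize()` stub below are assumptions for illustration only, not Anthropic's actual implementation.

```python
# Minimal sketch of context compaction: when the running conversation
# approaches a token budget, replace the oldest turns with a summary.
# All constants and helpers here are illustrative assumptions.

CONTEXT_LIMIT = 200_000      # tokens, per the Opus 4.6 window
COMPACT_THRESHOLD = 0.8      # compact when ~80% full (assumed heuristic)

def count_tokens(text: str) -> int:
    # Crude stand-in: roughly 4 characters per token.
    return max(1, len(text) // 4)

def summarize(turns: list[str]) -> str:
    # Placeholder: a real system would ask the model for a summary.
    return f"[summary of {len(turns)} earlier turns]"

def compact(history: list[str]) -> list[str]:
    """Return history unchanged, or with its older half summarized."""
    total = sum(count_tokens(t) for t in history)
    if total < CONTEXT_LIMIT * COMPACT_THRESHOLD:
        return history
    keep = max(1, len(history) // 2)     # keep the most recent half
    older, recent = history[:-keep], history[-keep:]
    return [summarize(older)] + recent
```

Running `compact()` after each turn bounds the prompt size while preserving recent context verbatim, which is what allows the effectively unlimited conversation length the article describes.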