Claude 3.5 Sonnet

Today marks the launch of Claude 3.5 Sonnet, the first model in the forthcoming Claude 3.5 family. Claude 3.5 Sonnet raises the standard for intelligence in AI models, outperforming both competitor models and its predecessor, Claude 3 Opus, across a wide range of evaluations, while retaining the speed and cost efficiency of the mid-tier Claude 3 Sonnet.

Claude 3.5 Sonnet is available for free on Claude.ai and the Claude iOS app. Subscribers to Claude Pro and Team plans benefit from significantly higher rate limits. Additionally, the model is accessible via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Pricing is set at $3 per million input tokens and $15 per million output tokens, with a 200K token context window.
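
As a rough illustration of the pricing above, the following sketch estimates the cost of a single request from its token counts. The rates are the ones stated in this post; the function name and the example token counts are hypothetical, chosen only for illustration, and the sketch does not call any API.

```python
# Rates as stated in this announcement (USD per million tokens).
INPUT_USD_PER_MILLION_TOKENS = 3.00
OUTPUT_USD_PER_MILLION_TOKENS = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request from its token counts.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION_TOKENS
        + output_tokens * OUTPUT_USD_PER_MILLION_TOKENS
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt that produces a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.2f}")
```

Because output tokens cost five times as much as input tokens, the 2,000-token reply in this example contributes as much to the total as the 10,000-token prompt.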

Frontier Intelligence at Double the Speed

Claude 3.5 Sonnet sets new benchmarks in graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). This model excels in understanding nuance, humor, and complex instructions, producing high-quality content with a natural tone.

Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus. This performance boost, combined with cost-effective pricing, makes it well suited to complex tasks such as context-sensitive customer support and multi-step workflow orchestration.

In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus, which solved 38%. This evaluation tested the model’s ability to fix bugs or add functionality to an open-source codebase based on natural language descriptions. Equipped with relevant tools, Claude 3.5 Sonnet can independently write, edit, and execute code, demonstrating advanced reasoning and troubleshooting capabilities. It is highly effective for updating legacy applications and migrating codebases.

State-of-the-Art Vision

Claude 3.5 Sonnet is our most advanced vision model, surpassing Claude 3 Opus on standard vision benchmarks. The improvements are most noticeable in tasks that require visual reasoning, such as interpreting charts and graphs. Claude 3.5 Sonnet can also accurately transcribe text from imperfect images, a crucial capability for sectors like retail, logistics, and financial services, where AI can often extract more insight from an image or graphic than from text alone.

Artifacts—A New Way to Use Claude

We are excited to introduce Artifacts on Claude.ai, a new feature that expands how users interact with Claude. When a user asks Claude to generate content like code snippets, text documents, or website designs, these Artifacts appear in a dedicated window alongside the conversation. This dynamic workspace allows users to see, edit, and build upon Claude’s creations in real time, integrating AI-generated content directly into their projects and workflows.

This preview feature signifies Claude’s evolution from a conversational AI to a collaborative work environment. It marks the beginning of a broader vision for Claude.ai, which will soon support team collaboration. Eventually, entire organizations will be able to centralize their knowledge, documents, and ongoing work in one shared space, with Claude acting as an on-demand teammate.

Commitment to Safety and Privacy

Our models undergo rigorous testing to minimize misuse. Despite Claude 3.5 Sonnet’s advancements, our red teaming assessments have determined that it remains at AI Safety Level 2 (ASL-2). More details can be found in the model card addendum.

In our commitment to safety and transparency, we have engaged with external experts to test and refine the safety mechanisms of this model. Claude 3.5 Sonnet was provided to the UK’s Artificial Intelligence Safety Institute (UK AISI) for pre-deployment safety evaluation. The UK AISI’s results were shared with the US AI Safety Institute (US AISI) under a Memorandum of Understanding, made possible by the partnership between the US and UK AISIs announced earlier this year.

We have incorporated feedback from external experts to ensure robust evaluations and address new trends in misuse. For instance, feedback from child safety experts at Thorn (https://www.thorn.org/) helped us update our classifiers and fine-tune our models.

One of our core principles in AI model development is privacy. We do not train our generative models on user-submitted data unless explicitly permitted by the user. To date, no customer or user-submitted data has been used to train our generative models.

Coming Soon

We aim to continually improve the balance between intelligence, speed, and cost. To complete the Claude 3.5 family, Claude 3.5 Haiku and Claude 3.5 Opus will be released later this year.

In addition to developing the next-generation model family, we are creating new modalities and features to support more business use cases, including integrations with enterprise applications. We are also exploring features like Memory, which will enable Claude to remember user preferences and interaction history, making the user experience even more personalized and efficient.

We continuously work to enhance Claude and appreciate user feedback. You can submit feedback on Claude 3.5 Sonnet directly in the product to help shape our development roadmap and improve your experience. We look forward to seeing how you build, create, and discover with Claude.