Claude Sonnet 4.5

It's Gotten Cold

I woke up in the middle of the night because it was cold.
It may be about time to switch my bedding over to the autumn and winter setup.

Of course, as soon as I do that, it will probably get warm again...

Claude Sonnet 4.5

So, with that said (?),
Anthropic has released its latest AI model, Claude Sonnet 4.5.

🔗 Introducing Claude Sonnet 4.5

It reportedly took the top spot on OSWorld, a benchmark that tests AI models on computer use.

Benchmark Results

Pretty impressive, isn't it?

Introducing Claude Sonnet 4.5—the best coding model in the world.

It's the strongest model for building complex agents. It's the best model at using computers. And it shows substantial gains on tests of reasoning and math. pic.twitter.com/7LwV9WPNAv
— Claude (@claudeai) September 29, 2025

The Above as a Table

Benchmark	Claude Sonnet 4.5	Claude Opus 4.1	Claude Sonnet 4	GPT-5	Gemini 2.5 Pro
Agentic coding (SWE-bench Verified)	77.2% / 82.0% (parallel test-time)	74.5% / 79.4% (parallel test-time)	72.7% / 80.2% (parallel test-time)	72.8% (GPT-5) / 74.5% (GPT-5-Codex)	67.2%
Agentic terminal coding (Terminal-Bench)	50.0%	46.5%	36.4%	43.8%	25.3%
Agentic tool use (τ2-bench)	Retail: 86.2% / Airline: 70.0% / Telecom: 98.0%	Retail: 86.8% / Airline: 63.0% / Telecom: 71.5%	Retail: 83.8% / Airline: 63.0% / Telecom: 49.6%	Retail: 81.1% / Airline: 62.6% / Telecom: 96.7%	—
Computer use (OSWorld)	61.4%	44.4%	42.2%	—	—
High school math competition (AIME 2025)	100% (python) / 87.0% (no tools)	78.0%	70.5%	99.6% (python) / 94.6% (no tools)	88.0%
Graduate-level reasoning (GPQA Diamond)	83.4%	81.0%	76.1%	85.7%	86.4%
Multilingual Q&A (MMMLU)	89.1%	89.5%	86.5%	89.4%	—
Visual reasoning (MMMU, validation)	77.8%	77.1%	74.4%	84.2%	82.0%
Financial analysis (Finance Agent)	55.3%	50.9%	44.5%	46.9%	29.4%

Now Available in Notion AI

Although it has only just been released, it is already available in Notion.
That is much appreciated.

It seems like a good fit for Notion.
It is very easy to use.

Claude Sonnet 4.5

Claude Sonnet 4.5: Notion with Claude Sonnet 4.5 selected.

AI models currently available in Notion
