Model Insights

claude-3-5-sonnet-20240620

Details

Developer

Anthropic

License

NA (private model)

Model parameters

NA (private model)

Supported context length

200k

Price for prompt token

$3/Million tokens

Price for response token

$15/Million tokens
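
As a rough illustration of what the two prices above mean per request, here is a minimal sketch. The function name `request_cost` and the example token counts are hypothetical. Treating the reported average response length of 576 as a token count is also an assumption, since the page does not state its unit.

```python
# Prices taken from the Details section above (USD per million tokens).
PROMPT_PRICE_PER_MILLION = 3.00
RESPONSE_PRICE_PER_MILLION = 15.00


def request_cost(prompt_tokens: int, response_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    total = (
        prompt_tokens * PROMPT_PRICE_PER_MILLION
        + response_tokens * RESPONSE_PRICE_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: a 25,000-token prompt and a 576-token response
# (assuming the reported average response length is measured in tokens).
example_cost = request_cost(25_000, 576)
print(f"${example_cost:.5f} per request")
```

With these assumed numbers the prompt dominates the cost (75,000 of the 83,640 price units), which is typical for RAG workloads where the retrieved context is much longer than the answer.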

Model Performance Across Task-Types

ChainPoll Score

Short Context

0.97

Medium Context

1

Long Context

1

Model Insights Across Task-Types

Digging deeper, here’s a look at how claude-3-5-sonnet-20240620 performed across specific datasets.

Short Context RAG

Medium Context RAG

This heatmap indicates the model's success in recalling information at different locations in the context. Green signifies success, while red indicates failure.

Long Context RAG

This heatmap indicates the model's success in recalling information at different locations in the context. Green signifies success, while red indicates failure.

Performance Summary

Short context RAG

Task insight

The model demonstrates exceptional reasoning and comprehension skills, excelling at short context RAG. It outperforms other models in mathematical proficiency, as evidenced by its strong performance on the DROP and ConvFinQA benchmarks. This makes it the most affordable top-tier model for RAG.

Cost insight

It is the best-performing model, and it is nearly 5x cheaper than Opus and 2x cheaper than GPT-4o, making it the preferred choice among closed-source models. If cost is your main concern, however, it's better to try Gemini-1.5-Pro or Llama-3-70b.

Dataset results

DROP: context adherence 0.98, avg response length 576
Hotpot: context adherence 0.96, avg response length 576
MS Marco: context adherence 0.96, avg response length 576
ConvFinQA: context adherence 0.99, avg response length 576

Medium context RAG

Task insight

Flawless performance, making it suitable for any context length up to 25,000 tokens.

Cost insight

Great performance, but we recommend using the 30x cheaper Gemini Flash.

Dataset results

Medium context RAG: context adherence 1.00, avg response length 576

Long context RAG

Task insight

Flawless performance, making it suitable for any context length up to 100,000 tokens.

Cost insight

We recommend this model for this task due to its perfect score and lowest price.

Dataset results

Long context RAG: context adherence 1.00, avg response length 576
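
The four short context dataset scores above can be related to the headline short context score of 0.97. The sketch below assumes the headline figure is a simple unweighted mean of the per-dataset context adherence values. The page does not say how the score is aggregated, so this is only a plausibility check, and the variable names are illustrative.

```python
from statistics import mean

# Context adherence per short context dataset, as listed in the summary.
short_context_adherence = {
    "Drop": 0.98,
    "Hotpot": 0.96,
    "MS Marco": 0.96,
    "ConvFinQA": 0.99,
}

# Assumption: the headline score is an unweighted mean across datasets.
average = mean(list(short_context_adherence.values()))
print(f"Mean context adherence: {average:.4f}")
```

The mean comes out at about 0.9725, which is consistent with the reported 0.97 under this assumption.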
