https://github.com/meta-llama/llama3
Benchmark    Llama 3 8B    Llama 3 70B    GPT-4
MMLU         68.4          82.0           86.5
GPQA         34.2          39.5           49.1
MATH         30.0          50.4           72.2
HumanEval    62.2          81.7           87.6
DROP         58.4          79.7           85.4