OpenAI: gpt-oss-120b (free)

by openai

1,965 claims submitted by 146 reviewers

Monitored by HumanJudge · Endpoint registered, 76 traces logged

Maintained by HumanJudge Admin

Enrolled in: Spanish Culture Challenge: GPT-5.2, Japanese Culture Challenge: GPT-5.2, Mexican Culture Challenge: GPT-5.2, AP Biology Challenge: GPT-5.2, Spanish Cinema Challenge: GPT-5.2, Korean Culture Challenge: GPT-5.2, Taiwanese Culture Challenge: GPT-5.2, Brazilian Culture Challenge: GPT-5.2, Argentine Culture Challenge: GPT-5.2, Japanese Language Challenge: GPT-5.2, Arabic Music Challenge: GPT-5.2, Gulf Culture Challenge: GPT-5.2, Latin Music Challenge: GPT-5.2, Arabic Language Challenge: GPT-5.2, Chinese Cinema Challenge: GPT-5.2, Mexican Cinema Challenge: GPT-5.2, Korean Language Challenge: GPT-5.2, AP Government Challenge: GPT-5.2, Chinese Language Challenge: GPT-5.2, Korean Cinema Challenge: GPT-5.2, Chinese Culture Challenge: GPT-5.2, J-drama Challenge: GPT-5.2, AP English Language Challenge: GPT-5.2, C-drama Challenge: GPT-5.2, 日本文化のヒーロー | Japanese Culture Hero, Levantine Culture Challenge: GPT-5.2, Egyptian Culture Challenge: GPT-5.2, Arab Cinema Challenge: GPT-5.2, AP English Literature Challenge: GPT-5.2, AP US History Challenge: GPT-5.2, K-pop Challenge: GPT-5.2, AP Calculus AB Challenge: GPT-5.2, Spanish Language Challenge: GPT-5.2, K-drama Challenge: GPT-5.2, Spanish Music Challenge: GPT-5.2, Humans Evaluation Benchmark for AI Marketing and Content Generation, C-pop Challenge: GPT-5.2

Performance

Humans Evaluation Benchmark for AI Marketing and Content Generation 88%

1697 votes 205 flags 140 reviewers

AP Calculus AB Challenge: GPT-5.2 99%

81 votes 1 flag 11 reviewers

日本文化のヒーロー | Japanese Culture Hero 90%

70 votes 7 flags 19 reviewers

AP Biology Challenge: GPT-5.2 90%

41 votes 4 flags 41 reviewers

AP English Language Challenge: GPT-5.2 100%

23 votes 0 flags 8 reviewers

K-pop Challenge: GPT-5.2 100%

18 votes 0 flags 2 reviewers

AP English Literature Challenge: GPT-5.2 100%

15 votes 0 flags 5 reviewers

AP US History Challenge: GPT-5.2 93%

15 votes 1 flag 5 reviewers

AP Government Challenge: GPT-5.2 100%

5 votes 0 flags 5 reviewers
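
The page does not say how each percentage is computed. One reading that agrees with every row listed above is the share of votes that were not flags, rounded to the nearest percent. The sketch below checks that assumption against the listed figures; the formula and the names `ROWS` and `pass_rate_percent` are illustrative, not taken from HumanJudge.

```python
# (benchmark, votes, flags, reported percent), copied from the Performance list.
ROWS = [
    ("Humans Evaluation Benchmark for AI Marketing and Content Generation", 1697, 205, 88),
    ("AP Calculus AB Challenge: GPT-5.2", 81, 1, 99),
    ("日本文化のヒーロー | Japanese Culture Hero", 70, 7, 90),
    ("AP Biology Challenge: GPT-5.2", 41, 4, 90),
    ("AP English Language Challenge: GPT-5.2", 23, 0, 100),
    ("K-pop Challenge: GPT-5.2", 18, 0, 100),
    ("AP English Literature Challenge: GPT-5.2", 15, 0, 100),
    ("AP US History Challenge: GPT-5.2", 15, 1, 93),
    ("AP Government Challenge: GPT-5.2", 5, 0, 100),
]


def pass_rate_percent(votes: int, flags: int) -> int:
    """Assumed formula: non-flag votes as a share of all votes, nearest percent."""
    return round(100 * (votes - flags) / votes)


# Benchmarks where the assumed formula disagrees with the reported percentage.
mismatches = [name for name, v, f, pct in ROWS if pass_rate_percent(v, f) != pct]

# Sum of votes across the listed benchmarks.
total_votes = sum(v for _, v, _, _ in ROWS)

print("mismatches:", mismatches)
print("total votes:", total_votes)
```

Under this assumption, the vote counts of the nine listed benchmarks also add up to the 1,965 claims reported at the top of the page.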

Independent Claims

flag AI Marketing & Content Generation 6/22/2026

Rainbow tag is unnecessary and unrelated

— Rosario kileiry

flag AI Marketing & Content Generation 6/21/2026

The response stops before delivering the key message

— Alex Maina

flag Does AI know AP Calculus AB? 6/12/2026

Explanation may be difficult

— Rosario kileiry

pass Does AI know AP Calculus AB? 6/12/2026

The explanation is appropriate for AP Calculus AB and matches the level of detail and precision expected in the course.

— Rosario kileiry

pass AP US History Challenge: GPT-5.2 6/8/2026

Factual relation and facts

— Rosario kileiry

This evaluation was conducted independently. OpenAI: gpt-oss-120b (free) did not participate in or pay for this evaluation. All verdicts come from double-blind evaluation — reviewers did not know which AI produced each response.

We help people define what trustworthy AI looks like: publicly, transparently, together.