Public AI Evaluation Reports
Transparent quality assessments from verified AI reviewers. See how AI systems perform across different domains.
Showing 50 public reports
Humans Evaluation Benchmark for AI Marketing and Content Generation
8.5% flagged
10813
Reviews
916
Flags
8.5%
Flag Rate
Last review 3/18/2026

Humanize
17.9% flagged
Everything today feels so AI-generated that even humans are losing their human-ness. We want to protect that at all costs, so we help AI understand what human-ness feels like so it can be kinder and more empathetic. In this challenge, we give AI some prompts and you judge whether its response is human enough. Words can heal or kill, and we want to ensure technology grows kinder, not colder.
2843
Reviews
508
Flags
17.9%
Flag Rate
Last review 3/18/2026

J-pop Challenge: GPT-5.2
4.5% flagged
Test AI knowledge on Japanese pop music - YOASOBI, Ado, King Gnu, Perfume, and more
111
Reviews
5
Flags
4.5%
Flag Rate
Last review 3/18/2026

Anime Challenge: GPT-5.2
12.5% flagged
Test AI knowledge of anime series, studios, and culture
432
Reviews
54
Flags
12.5%
Flag Rate
Last review 3/18/2026

AP Biology Challenge: GPT-5.2
6.0% flagged
Test AI knowledge on AP Biology - from cellular processes to ecology and evolution
232
Reviews
14
Flags
6.0%
Flag Rate
Last review 3/18/2026

Christmas GPT
34.4% flagged
64
Reviews
22
Flags
34.4%
Flag Rate
Last review 3/16/2026

AP English Language Challenge: GPT-5.2
0.0% flagged
Test AI knowledge on AP English Language - rhetoric, composition, and argumentation
75
Reviews
0
Flags
0.0%
Flag Rate
Last review 3/14/2026

AP Calculus AB Challenge: GPT-5.2
0.4% flagged
Test AI knowledge on AP Calculus AB - limits, derivatives, and integrals
456
Reviews
2
Flags
0.4%
Flag Rate
Last review 3/14/2026

Animal Lovers Challenge: GPT-5.2
0.0% flagged
Test AI knowledge on dogs, cats, wildlife, and pet care - from breed facts to animal behavior
16
Reviews
0
Flags
0.0%
Flag Rate
Last review 3/14/2026

Chinese Culture Challenge: GPT-5.2
50.0% flagged
Test AI knowledge of Chinese culture and traditions
6
Reviews
3
Flags
50.0%
Flag Rate
Last review 3/11/2026

AP US History Challenge: GPT-5.2
3.3% flagged
Test AI knowledge on AP US History - from Colonial America to modern times
30
Reviews
1
Flags
3.3%
Flag Rate
Last review 3/7/2026

AP English Literature Challenge: GPT-5.2
3.3% flagged
Test AI knowledge on AP English Literature - literary analysis and interpretation
30
Reviews
1
Flags
3.3%
Flag Rate
Last review 3/7/2026

AP Government Challenge: GPT-5.2
11.1% flagged
Test AI knowledge on AP Government and Politics - US political system and civic processes
9
Reviews
1
Flags
11.1%
Flag Rate
Last review 3/7/2026

K-pop Challenge: GPT-5.2
100.0% flagged
2
Reviews
2
Flags
100.0%
Flag Rate
Last review 1/26/2026

Japanese Cinema Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Japanese films and filmmakers
0
Reviews
0
Flags
0.0%
Flag Rate
Last review 1/25/2026

Does AI understand Korean culture?
0.0% flagged
Evaluate AI accuracy on Korean traditions, history, food, and cultural knowledge.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Arabic Language Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Arabic language
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know K-pop?
0.0% flagged
Test AI accuracy on K-pop facts about BTS, BLACKPINK, TWICE, and more Korean pop groups.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Mexican Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Mexican culture and traditions
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

C-drama Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Chinese TV dramas
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI know Mexican cinema?
0.0% flagged
Test AI knowledge on Mexican films and cinema history.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Egyptian Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Egyptian culture and traditions
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Mexican Cinema Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Mexican films and filmmakers
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI understand Spanish culture?
0.0% flagged
Evaluate AI accuracy on Spanish traditions and culture.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Spanish Cinema Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Spanish films and filmmakers
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI know Chinese cinema?
0.0% flagged
Evaluate AI accuracy on Chinese films and cinema.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI understand Argentine culture?
0.0% flagged
Evaluate AI accuracy on Argentine traditions and culture.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know Korean?
0.0% flagged
Test AI on Korean language accuracy, grammar, and vocabulary.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Arab Cinema Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Arab films and filmmakers
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI know Japanese cinema?
0.0% flagged
Evaluate AI accuracy on Japanese films and cinema history.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI know AP English Language?
0.0% flagged
Test AI accuracy on AP English Language topics.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI understand Chinese culture?
0.0% flagged
Evaluate AI accuracy on Chinese traditions and culture.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI know AP Calculus AB?
0.0% flagged
Test AI accuracy on AP Calculus AB topics.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

K-drama Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Korean TV dramas
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know Latin music?
0.0% flagged
Test AI accuracy on Latin music genres and artists.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know K-dramas?
0.0% flagged
Test AI knowledge on K-dramas, actors, and Korean television.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know Spanish music?
0.0% flagged
Test AI accuracy on Spanish music genres and artists.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know animals?
0.0% flagged
Test AI accuracy on animal facts, behavior, and care.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI understand Gulf culture?
0.0% flagged
Evaluate AI accuracy on Gulf Arab culture and traditions.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

DebateClub
0.0% flagged
Take a stance and argue convincingly. Show your reasoning skills.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know Japanese?
0.0% flagged
Test AI on Japanese language accuracy and linguistic knowledge.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know J-dramas?
0.0% flagged
Test AI knowledge on Japanese dramas and TV series.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

AI says Djokovic is the GOAT. Are you buying it?
0.0% flagged
AI analyzed decades of tennis data and picked Djokovic as the GOAT. Tennis fans, do you agree? Rate AI takes on Grand Slams, rivalries, and legacy.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Korean Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Korean culture, traditions, and food
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

C-pop Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Mandopop and Chinese pop music
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Spanish Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Spanish culture and traditions
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI understand Levantine culture?
0.0% flagged
Evaluate AI accuracy on Levantine culture and traditions.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Levantine Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Levantine culture (Lebanon, Jordan)
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Does AI actually know Spanish?
0.0% flagged
Test AI accuracy on Spanish language, grammar, and vocabulary.
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Taiwanese Culture Challenge: GPT-5.2
0.0% flagged
Test AI knowledge of Taiwanese culture and traditions
0
Reviews
0
Flags
0.0%
Flag Rate
No reviews yet

Want Your AI Evaluated?
Get transparent quality reports for your AI system from verified expert reviewers.