Public AI Evaluation Reports

Transparent quality assessments from verified AI reviewers. See how AI systems perform across different domains.

Showing 50 public reports

Humans Evaluation Benchmark for AI Marketing and Content Generation

8.5% flagged

10813

Reviews

916

Flags

8.5%

Flag Rate

Last review 3/18/2026

Humanize

17.9% flagged

So much today feels AI-generated that even humans are losing their humanness. We want to protect that at all costs, so we teach AI what humanness feels like so it can be kinder and more empathetic. In this challenge, we give AI some prompts and you judge whether its response is human enough. Words can heal or kill, and we want to ensure technology grows kinder, not colder.

2843

Reviews

508

Flags

17.9%

Flag Rate

Last review 3/18/2026

J-pop Challenge: GPT-5.2

4.5% flagged

Test AI knowledge on Japanese pop music - YOASOBI, Ado, King Gnu, Perfume, and more

111

Reviews

5

Flags

4.5%

Flag Rate

Last review 3/18/2026

Anime Challenge: GPT-5.2

12.5% flagged

Test AI knowledge of anime series, studios, and culture

432

Reviews

54

Flags

12.5%

Flag Rate

Last review 3/18/2026

AP Biology Challenge: GPT-5.2

6.0% flagged

Test AI knowledge on AP Biology - from cellular processes to ecology and evolution

232

Reviews

14

Flags

6.0%

Flag Rate

Last review 3/18/2026

Christmas GPT

34.4% flagged

64

Reviews

22

Flags

34.4%

Flag Rate

Last review 3/16/2026

AP English Language Challenge: GPT-5.2

0.0% flagged

Test AI knowledge on AP English Language - rhetoric, composition, and argumentation

75

Reviews

0

Flags

0.0%

Flag Rate

Last review 3/14/2026

AP Calculus AB Challenge: GPT-5.2

0.4% flagged

Test AI knowledge on AP Calculus AB - limits, derivatives, and integrals

456

Reviews

2

Flags

0.4%

Flag Rate

Last review 3/14/2026

Animal Lovers Challenge: GPT-5.2

0.0% flagged

Test AI knowledge on dogs, cats, wildlife, and pet care - from breed facts to animal behavior

16

Reviews

0

Flags

0.0%

Flag Rate

Last review 3/14/2026

Chinese Culture Challenge: GPT-5.2

50.0% flagged

Test AI knowledge of Chinese culture and traditions

6

Reviews

3

Flags

50.0%

Flag Rate

Last review 3/11/2026

AP US History Challenge: GPT-5.2

3.3% flagged

Test AI knowledge on AP US History - from Colonial America to modern times

30

Reviews

1

Flags

3.3%

Flag Rate

Last review 3/7/2026

AP English Literature Challenge: GPT-5.2

3.3% flagged

Test AI knowledge on AP English Literature - literary analysis and interpretation

30

Reviews

1

Flags

3.3%

Flag Rate

Last review 3/7/2026

AP Government Challenge: GPT-5.2

11.1% flagged

Test AI knowledge on AP Government and Politics - US political system and civic processes

9

Reviews

1

Flags

11.1%

Flag Rate

Last review 3/7/2026

K-pop Challenge: GPT-5.2

100.0% flagged

2

Reviews

2

Flags

100.0%

Flag Rate

Last review 1/26/2026

Japanese Cinema Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Japanese films and filmmakers

0

Reviews

0

Flags

0.0%

Flag Rate

Last review 1/25/2026

Does AI understand Korean culture?

0.0% flagged

Evaluate AI accuracy on Korean traditions, history, food, and cultural knowledge.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Arabic Language Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Arabic language

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know K-pop?

0.0% flagged

Test AI accuracy on K-pop facts about BTS, BLACKPINK, TWICE, and more Korean pop groups.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Mexican Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Mexican culture and traditions

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

C-drama Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Chinese TV dramas

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI know Mexican cinema?

0.0% flagged

Test AI knowledge on Mexican films and cinema history.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Egyptian Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Egyptian culture and traditions

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Mexican Cinema Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Mexican films and filmmakers

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI understand Spanish culture?

0.0% flagged

Evaluate AI accuracy on Spanish traditions and culture.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Spanish Cinema Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Spanish films and filmmakers

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI know Chinese cinema?

0.0% flagged

Evaluate AI accuracy on Chinese films and cinema.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI understand Argentine culture?

0.0% flagged

Evaluate AI accuracy on Argentine traditions and culture.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know Korean?

0.0% flagged

Test AI on Korean language accuracy, grammar, and vocabulary.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Arab Cinema Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Arab films and filmmakers

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI know Japanese cinema?

0.0% flagged

Evaluate AI accuracy on Japanese films and cinema history.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI know AP English Language?

0.0% flagged

Test AI accuracy on AP English Language topics.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI understand Chinese culture?

0.0% flagged

Evaluate AI accuracy on Chinese traditions and culture.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI know AP Calculus AB?

0.0% flagged

Test AI accuracy on AP Calculus AB topics.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

K-drama Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Korean TV dramas

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know Latin music?

0.0% flagged

Test AI accuracy on Latin music genres and artists.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know K-dramas?

0.0% flagged

Test AI knowledge on K-dramas, actors, and Korean television.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know Spanish music?

0.0% flagged

Test AI accuracy on Spanish music genres and artists.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know animals?

0.0% flagged

Test AI accuracy on animal facts, behavior, and care.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI understand Gulf culture?

0.0% flagged

Evaluate AI accuracy on Gulf Arab culture and traditions.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

DebateClub

0.0% flagged

Take a stance and argue convincingly. Show your reasoning skills.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know Japanese?

0.0% flagged

Test AI on Japanese language accuracy and linguistic knowledge.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know J-dramas?

0.0% flagged

Test AI knowledge on Japanese dramas and TV series.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

AI says Djokovic is the GOAT. Are you buying it?

0.0% flagged

AI analyzed decades of tennis data and picked Djokovic as the GOAT. Tennis fans - do you agree? Rate AI takes on Grand Slams, rivalries, and legacy.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Korean Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Korean culture, traditions, and food

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

C-pop Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Mandopop and Chinese pop music

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Spanish Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Spanish culture and traditions

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI understand Levantine culture?

0.0% flagged

Evaluate AI accuracy on Levantine culture and traditions.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Levantine Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Levantine culture (Lebanon, Jordan)

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Does AI actually know Spanish?

0.0% flagged

Test AI accuracy on Spanish language, grammar, and vocabulary.

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet

Taiwanese Culture Challenge: GPT-5.2

0.0% flagged

Test AI knowledge of Taiwanese culture and traditions

0

Reviews

0

Flags

0.0%

Flag Rate

No reviews yet
