AI Evaluation Tools
For Domain Experts
Use our Chrome extension and playground to evaluate AI outputs. Vote Pass or Flag. Document AI failures. Build your public expert profile.
AI Response:
"NewJeans debuted in 2023 with their single 'Attention'..."
Your reasoning:
"Incorrect. NewJeans debuted in 2022, not 2023."
Two Ways to Evaluate
Use our playground directly, or evaluate AI anywhere with the Chrome extension.
Direct Call (Playground)
Use our built-in playground to chat with AI and evaluate responses directly on HumanJudge.
- ✓ Chat with AI on your challenge topic
- ✓ Vote Pass or Flag on each response
- ✓ Add your reasoning and documentation
- ✓ Everything saved to your public profile
Visit the App (Extension)
Use the Chrome extension to evaluate AI outputs on any website — ChatGPT, Claude, or third-party apps.
- ✓ Evaluate AI anywhere on the web
- ✓ Capture context from external apps
- ✓ Submit evaluations directly
- ✓ Works with any AI product
How Verdicts Work
Every evaluation is a simple decision: Pass or Flag.
Pass
The AI response is accurate and helpful within your domain of expertise.
- Factually correct information
- Appropriate context and nuance
- No significant errors or omissions
Flag
The AI response has errors, misconceptions, or problems that you can identify.
- Factual errors or hallucinations
- Missing critical context
- Cultural insensitivity or bias
Document Your Reasoning
Every verdict includes your expert reasoning. This builds your public profile and helps AI developers understand exactly what went wrong.
Your reasoning becomes part of your public expert profile.
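For readers who think in code, here is a minimal sketch of what one evaluation could look like as a record. It is only an illustration: the names `Verdict` and `Evaluation` and their fields are hypothetical and do not describe how HumanJudge actually stores data. The sketch captures the two rules described above: a verdict is either Pass or Flag, and it always comes with written reasoning.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    """The two possible decisions (hypothetical names, for illustration)."""
    PASS = "pass"
    FLAG = "flag"


@dataclass(frozen=True)
class Evaluation:
    """One evaluation: the AI output, the verdict, and the expert's reasoning."""
    ai_response: str
    verdict: Verdict
    reasoning: str

    def __post_init__(self) -> None:
        # Assumption: reasoning is mandatory, since every verdict includes it.
        if not self.reasoning.strip():
            raise ValueError("every verdict needs written reasoning")


# The example from the top of this page, expressed as a record.
example = Evaluation(
    ai_response="NewJeans debuted in 2023 with their single 'Attention'...",
    verdict=Verdict.FLAG,
    reasoning="Incorrect. NewJeans debuted in 2022, not 2023.",
)
```

In this sketch, creating an `Evaluation` with blank reasoning raises a `ValueError`, so a verdict cannot exist without documentation.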
Install the Chrome Extension
The GrandJury extension is your evaluation toolkit.
Or install it manually:
Download & Extract
Download the ZIP and extract it to a folder.
Open Extensions
Go to chrome://extensions
Developer Mode
Toggle "Developer mode" on in the top-right corner.
Load Extension
Click "Load unpacked" and select the extracted folder.
Build Your Public Expert Profile
Every evaluation builds your visible track record. Your name. Your expertise. Your judgments.
Your Name
Public profile with your real identity — no anonymous labor.
Verified Expertise
Track record in your domains visible to everyone.
Growing Value
Early contributors get first access to monetization.
Common Questions
What's the difference between the playground and extension?
The playground lets you chat with AI directly on HumanJudge and evaluate responses. The extension lets you evaluate AI outputs anywhere on the web — ChatGPT, Claude, third-party apps, etc.
How do I decide Pass vs Flag?
Use your domain expertise. Pass if the AI response is accurate and helpful. Flag if you spot errors, hallucinations, missing context, or cultural issues. Document your reasoning either way.
Is my evaluation data public?
Your verdicts and reasoning appear on your public profile. This builds your expert reputation. You control which topics you evaluate.
Do I need special qualifications?
No gatekeeping. If you have genuine knowledge in a topic, you can evaluate AI in that domain. Your track record speaks for itself.
How do I get started?
1. Create an account.
2. Browse challenges in your areas of expertise.
3. Start evaluating using the playground or the Chrome extension.
Ready to Evaluate AI?
Browse challenges in your area of expertise and start building your public profile.