CHAI PRIZE
THE LLM COMPETITION
$1 MILLION CASH PRIZE
STARTS JUNE 19TH 2023
The world's first open community challenge with real-user evaluations. Submit your model and compare how you rank against other teams.
Accelerating community AGI.
Partners
How we will be evaluating model performance
Language models are difficult to evaluate, and it is hard to condense model performance into a single metric. This is why we are launching the world’s first community-based evaluation method: user activity, measured by deploying your model directly to millions of users. We believe that by combining online user activity, based on interactions with your model, with a suite of offline evaluation metrics, the community will be able to accelerate the path towards open AGI.
Prize contenders
#  Team          Model           Medal   Score  Entries
1  Stability AI  AlphaChat       Gold    2.78   108
2  Together      INCITE-Chat-3B  Gold    1.34   212
3  Nomic         GPT4ALL         Gold    1.33   82
4  Pygmalion     Pygmalion 6B    Silver  1.02   23
5  Mosaic        MPT-7B-Chat     Silver  0.98   49
6  UC Berkeley   Koala 13B       Silver  0.81   63
7  Lmsys         Vicuna 13B      Silver  0.80   56
8  Meta          LLaMA 7B        Bronze  0.68   14
9  EleutherAI    GPT-J 6B        Bronze  0.67   97
(Leaderboard for illustration purposes only)
Guanaco Competition Format
Chai Reward Model (Small)
We will be open-sourcing Chai’s reward model, a GPT-2 classifier trained directly on 170M user-generated signals to predict whether a conversation is likely to continue given a message completion. You can use this model for offline evaluation or integrate it into your RLHF pipeline.
[Figure: Chai AI Reward Model. Labels: 170M, supervised-target trained, 250M, GPT2 classifier]
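As a rough illustration of using a reward model for offline evaluation, the sketch below ranks candidate replies by a reward score. The names `rank_completions` and `toy_reward` are hypothetical, and `toy_reward` is a stand-in heuristic, not Chai's reward model. In practice you would replace it with a function that calls the released classifier.

```python
from typing import Callable, List, Tuple


def rank_completions(
    context: str,
    candidates: List[str],
    reward_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score each candidate reply with reward_fn and return them best-first."""
    scored = [(reply, reward_fn(context, reply)) for reply in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_reward(context: str, completion: str) -> float:
    """Placeholder scorer (an assumption for this sketch, NOT the Chai model).

    Gives 1.0 to replies that end with a question, plus a small bonus
    for length (capped at 20 words), as a crude proxy for replies that
    invite the conversation to continue.
    """
    question_bonus = 1.0 if completion.strip().endswith("?") else 0.0
    length_bonus = min(len(completion.split()), 20) / 100
    return question_bonus + length_bonus


if __name__ == "__main__":
    ranked = rank_completions(
        "I went hiking yesterday.",
        ["Ok.", "That sounds fun! What did you do next?"],
        toy_reward,
    )
    for reply, score in ranked:
        print(f"{score:.2f}  {reply}")
```

Keeping the scorer behind a plain callable means the same ranking code could be reused for offline evaluation or as a best-of-N filter, whichever reward function you plug in.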
You will be training
Language models are expensive to train. To ensure that the competition is accessible to everyone, we will be experimenting with a range of base models; the 3B model from together.xyz will have the fastest iteration speed.

Model Evaluation
Once your model has been uploaded, we will run an internal AI safety classifier to ensure it is safe to deploy. Depending on the number of submissions, we will then select the top-performing models, ranked by offline evaluation metrics, for A/B testing with real users.
[Figure labels: Chai AI Safety Classifier; Chai AI Network Effect (1M+ active users); Real-user evaluation; Public Leaderboard]
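The selection step described in this section (safety check first, then offline ranking, then real-user A/B testing) could be sketched roughly as follows. The `Submission` fields, the `select_for_ab_test` function, and the `top_k` cutoff are assumptions made for illustration. The actual selection criteria have not been published.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    name: str             # hypothetical model identifier
    passed_safety: bool   # outcome of the safety classifier (assumed boolean)
    offline_score: float  # aggregate offline metric (assumed higher is better)


def select_for_ab_test(submissions: List[Submission], top_k: int) -> List[Submission]:
    """Drop submissions that failed the safety check, then keep the top_k by offline score."""
    safe = [s for s in submissions if s.passed_safety]
    ranked = sorted(safe, key=lambda s: s.offline_score, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    pool = [
        Submission("model-a", True, 0.6),
        Submission("model-b", False, 0.9),  # highest score, but fails safety
        Submission("model-c", True, 0.8),
        Submission("model-d", True, 0.1),
    ]
    for s in select_for_ab_test(pool, top_k=2):
        print(s.name, s.offline_score)
```

Filtering on safety before ranking means an unsafe model can never reach real users, however well it scores offline.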
Competition FAQ

What is the timeline for the competition?

The preliminary start date is June 19th, 2023. Season 1 Episode 1 will last 12 weeks with $250K in prize money. Future episodes will be announced soon.

How is the prize money going to be split?

We will release exact details concerning the prize money closer to the start date of the competition. Expect a flat payout structure: we want to reward as many participants as possible. The prize money will be paid out on a weekly basis, with a significant bonus for the top three performing models.

How will you check for model safety during the competition?

At Chai we conduct extensive model safety checks pre-deployment. We will run our in-house model safety tool on each submitted model before deploying it to real users. Any contested models will be flagged for manual review.

Who retains ownership of my model and dataset?

To submit a model to Chai Research's AI competition, it must be uploaded to HuggingFace with an MIT license. The participant may continue to use their model, dataset, and training script for their own purposes. To receive payment, the participant may be required to provide the dataset and training script that can be used to recreate the submitted model. These terms and conditions may be subject to change.

How exactly will the model submission and leaderboard work?

You will be able to submit your model, most likely via HuggingFace, to our competition web portal, which is due to launch on June 19th, 2023. Your model will be automatically processed by our servers and will then find a home on our leaderboard amongst other community models. The exact details will be released closer to the start date.

© 2023 CHAI RESEARCH CORP. ALL RIGHTS RESERVED