
IEEE BigData 2024 Cup: Predicting Chess Puzzle Difficulty

The aim of the competition is to predict the difficulty of chess puzzles based on the board configuration and the moves that make up each puzzle's solution. The difficulty level is measured as the puzzle's rating on the lichess platform. The top 3 solutions will be awarded prizes. IEEE BigData 2024 Cup: Predicting Chess Puzzle Difficulty is the sixth data science competition organized in association with the IEEE International Conference on Big Data series (IEEE BigData 2024, https://www3.cs.stonybrook.edu/~ieeebigdata2024/index.html).


A chess puzzle is a particular configuration of pieces on a chessboard, where the puzzle taker is instructed to assume the role of one of the players and continue the game from that position. The player has to find one or more moves until she delivers mate or obtains a decisive material advantage.

In the online setting, where puzzles are often solved, the puzzle taker only makes moves for one side, while the puzzle publisher provides the responses for the other side. One such puzzle solving service is Lichess Training.

Solving puzzles is considered one of the primary ways to hone chess skills. However, currently the only way to reliably estimate puzzle difficulty is to present it to a wide variety of chess players and see if they manage to solve it. 

The goal of the contest is to predict how difficult a chess puzzle is just by looking at the board setup and the moves in the solution. Puzzle difficulty is measured by its Glicko-2 rating calibrated on the lichess.org website. In simplified terms, it means that lichess models the difficulty of a puzzle by assuming that every attempt at solving a puzzle is a “match”. If a user solves the puzzle correctly, she gains puzzle rating and the puzzle loses rating. The opposite happens when the user doesn’t find the full solution (partial solutions count as “losses”). Both user and puzzle ratings are initialized at 1500. More information about the Glicko rating can be found here.
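The "match" interpretation above can be illustrated with the standard Glicko-2 expected-score formula. This is only a sketch: the function names are hypothetical, and how lichess configures or updates its ratings internally is not described here. It shows the probability that a user solves a puzzle, given both ratings and the puzzle's rating deviation.

```python
import math

# Conversion factor between the displayed rating scale and
# the internal Glicko-2 scale (from the Glicko-2 definition).
GLICKO2_SCALE = 173.7178


def g(phi):
    """Glicko-2 weighting term that discounts an opponent with an uncertain rating."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_solve_probability(user_rating, puzzle_rating, puzzle_rd):
    """Expected score of the user in a 'match' against the puzzle.

    Hypothetical helper: treats the puzzle as the opponent, as in the
    simplified description above.
    """
    mu_user = (user_rating - 1500.0) / GLICKO2_SCALE
    mu_puzzle = (puzzle_rating - 1500.0) / GLICKO2_SCALE
    phi_puzzle = puzzle_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_puzzle) * (mu_user - mu_puzzle)))
```

With equal ratings the expected score is 0.5; a user rated above the puzzle is expected to solve it more often than not, which is why a higher puzzle rating corresponds to a more difficult puzzle.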

Each chess puzzle is described by the initial position (using Forsyth–Edwards Notation, or FEN) and the moves included in the puzzle solution, starting with one move leading to the puzzle position and then alternating between the moves that the puzzle solver has to find and those made by the simulated “opponent”.
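The following sketch shows one way to separate these parts using only the Python standard library. The function names are hypothetical, and the example values are the ones given in the field listing of the testing dataset. It assumes the moves are space-separated tokens, as in that example.

```python
EXAMPLE_FEN = "q3k1nr/1pp1nQpp/3p4/1P2p3/4P3/B1PP1b2/B5PP/5K2 b k - 0 17"
EXAMPLE_MOVES = "e8d7 a2e6 d7d8 f7f8"


def parse_fen(fen):
    """Split a FEN string into its six space-separated fields."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    return {
        "placement": placement,
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


def solver_color(fen_fields):
    """The side to move in the FEN plays the first listed move, so the solver is the other side."""
    return "white" if fen_fields["side_to_move"] == "b" else "black"


def split_solution(moves):
    """Return (move leading to the puzzle position, solver moves, opponent replies)."""
    tokens = moves.split()
    setup_move = tokens[0]
    solver_moves = tokens[1::2]      # 2nd, 4th, ... tokens
    opponent_replies = tokens[2::2]  # 3rd, 5th, ... tokens
    return setup_move, solver_moves, opponent_replies
```

For the example values, Black is to move in the FEN and plays e8d7, so the solver is White and has to find a2e6 and f7f8, with d7d8 as the simulated reply. The number of solver moves is a simple feature that can be derived this way.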

IEEE Big Data 2024: We will encourage the top 3 winners to submit papers describing their solutions. It is already agreed that the conference will provide the top 3 winners with free registrations. The QED Software team, just like in previous years, intends to organize a workshop devoted to the competition outcomes. In our experience, the opportunity to present workshop papers may be an extra incentive for participants to get actively involved in the competition.

The data are provided as two .csv files, one for the training dataset and one for the testing dataset.

Each row of the testing dataset consists of the following fields:

Field name | Field description | Field type | Example value
 | Unique puzzle ID | |
FEN | Standard notation for describing a particular board position of a chess game. | | q3k1nr/1pp1nQpp/3p4/1P2p3/4P3/B1PP1b2/B5PP/5K2 b k - 0 17
 | Solution to the puzzle in Portable Game Notation (PGN). Includes the last move made before the puzzle position. | | e8d7 a2e6 d7d8 f7f8

Based on the above data, the challenge contestants are expected to predict the Rating field (which will be kept secret).

Field name | Field description | Field type | Example value
 | Puzzle rating | |


The training dataset contains all of the above fields, and also a few additional ones listed below. These fields are sometimes null in the training set and will not be provided for the test set:

RatingDeviation (int): Measure of uncertainty over puzzle’s difficulty.

Popularity (int): Users can “upvote” or “downvote” a puzzle. This value is the difference between the number of upvotes and downvotes.

NbPlays (int): Number of attempts at solving the puzzle.

Themes (str): Lichess allows choosing puzzles to solve based on different themes, such as tactical concepts, solution length or puzzle types (e.g. mates in x moves).

GameUrl (str): Lichess puzzles are generated based on games played on lichess.

OpeningTags (str): Information about the opening from which this puzzle originated.
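Because the additional fields can be null, code that reads the training file should not assume they are always present. The sketch below is a minimal example under stated assumptions: nulls are assumed to appear as empty CSV cells, the helper names are hypothetical, and the two sample rows are made up purely for illustration.

```python
import csv
import io

# Integer-typed training-only fields listed above.
OPTIONAL_INT_FIELDS = ("RatingDeviation", "Popularity", "NbPlays")


def optional_int(value):
    """Convert a CSV cell to int, mapping a missing or empty cell to None."""
    if value is None or value.strip() == "":
        return None
    return int(value)


def read_training_rows(handle):
    """Read rows from an open CSV file, converting the optional integer fields."""
    rows = []
    for row in csv.DictReader(handle):
        for name in OPTIONAL_INT_FIELDS:
            row[name] = optional_int(row.get(name))
        rows.append(row)
    return rows


# Made-up sample data, not taken from the competition files.
sample = io.StringIO(
    "Rating,RatingDeviation,Popularity,NbPlays\n"
    "1500,80,95,1200\n"
    "1720,,,\n"
)
rows = read_training_rows(sample)
```

Since these fields are not provided for the test set, they cannot be used as model inputs at prediction time; they may still be useful during training, for example to weight or filter puzzles by how uncertain their rating is.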

Solution format 

Solutions in this competition should be submitted to the online evaluation system as a text file with exactly 2282 lines containing predictions for test instances. Each line in the submission should contain a single integer that indicates the predicted rating of the chess puzzle. The ordering of predictions should be the same as the ordering of the test set.
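A submission in this format could be produced along the following lines. This is a sketch with a hypothetical function name; it assumes that real-valued model outputs are simply rounded to the nearest integer.

```python
def format_submission(predictions, expected_count=2282):
    """Return the submission text: one integer rating per line, in test-set order."""
    if len(predictions) != expected_count:
        raise ValueError(
            f"expected {expected_count} predictions, got {len(predictions)}"
        )
    # Round real-valued predictions, since each line must hold a single integer.
    return "\n".join(str(int(round(p))) for p in predictions) + "\n"
```

The returned string can be written to a text file as-is. Checking the line count before uploading helps avoid submissions that are rejected or misaligned with the test set.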


The quality of submissions will be evaluated using the mean squared error metric. 
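For local validation, the metric can be computed as in the sketch below (the function name is hypothetical; the evaluation system's own implementation is not shown here). Because errors are squared, a few predictions that are far off contribute much more to the score than many small errors.

```python
def mean_squared_error(true_ratings, predicted_ratings):
    """Average of squared differences between true and predicted ratings."""
    if len(true_ratings) != len(predicted_ratings):
        raise ValueError("both sequences must have the same length")
    if not true_ratings:
        raise ValueError("at least one rating is required")
    squared_errors = [
        (t - p) ** 2 for t, p in zip(true_ratings, predicted_ratings)
    ]
    return sum(squared_errors) / len(squared_errors)
```

For example, predicting 1400 and 1650 for puzzles rated 1500 and 1600 gives squared errors of 10000 and 2500, so the mean squared error is 6250.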

Solutions will be evaluated online, and the preliminary results will be published on the public leaderboard. The public leaderboard will be available starting May 30th. The preliminary score will be computed on a small subset of the test records, fixed for all participants. The final evaluation will be performed after the completion of the competition using the remaining part of the test records. Those results will also be published online. It is important to note that only teams that submit a report describing their approach before the end of the challenge will qualify for the final evaluation.


There are two data files available to download.


In order to download competition files you need to be enrolled.
  • May 8, 2024: start of the competition, datasets become available,
  • May 30, 2024: public leaderboard becomes available,
  • August 31, 2024: deadline for submitting the solutions,
  • September 8, 2024: deadline for sending the reports, end of the competition,
  • September 15, 2024: online publication of the final results, sending invitations for submitting papers to the associated workshop at the IEEE Big Data 2024 conference,
  • October 13, 2024: deadline for submitting invited papers,
  • October 28, 2024: notification of paper acceptance,
  • November 17, 2024: camera-ready of accepted papers due.

QED will sponsor the cash prizes:

  • 1000 USD for the winning solution
  • 500 USD for the 2nd place solution
  • 250 USD for the 3rd place solution

Additionally, the IEEE Big Data 2024 conference will provide the top 3 performers with free full registrations.

  • Jan Zyśko
  • Katarzyna Jagieła
  • Maciej Świechowski
  • Sebastian Stawicki
  • Andrzej Janusz
  • Dominik Ślęzak