IEEE BigData 2023 Cup: Object Recognition with Muon Tomography using Cosmic Rays
IEEE BigData 2023 Cup: Object Recognition with Muon Tomography using Cosmic Rays is the fifth data science competition organized in association with the IEEE International Conference on Big Data series (IEEE BigData 2023, https://bigdataieee.org/BigData2023/index.html). The challenge is related to the general idea of recognizing objects based on spatial data. As one of the approaches derived directly from known tomography methods, the competition task focuses on the non-invasive examination of the structure of x-rayed objects. The challenge is sponsored by QED Software (https://qed.pl/).
More details regarding the task and the description of the challenge datasets will be available in the Task description section below. The competition starts on June 12 and will end on September 30.
Special session at IEEE Big Data 2023: As in previous years, a special session devoted to the competition will be held at the conference. We will invite authors of selected challenge reports to extend them for publication in the conference proceedings (after reviews by Organizing Committee members) and presentation at the conference. The papers will be indexed in the same way as regular conference papers. The invited teams will be chosen based on their final rank, the innovativeness of their approach, and the quality of the submitted report.
The main task of this competition is to recognize objects based on spatial data. We prepare muon tomography experiments, in which particles passing through matter scatter and lose energy. Each experiment has the same configuration and consists of two layers of detectors positioned above and two below the space under study. This faithfully reproduces real tomography experiments. Based on the information from the detectors, we are able to reconstruct the lifetime of a particle passing through the research area. This method makes non-invasive detection of the material and its structure possible. In our simulations we use Geant4 (https://geant4.web.cern.ch), which simulates the passage of particles through matter using Monte Carlo methods, and the Cosmic-ray Shower Library (https://nuclear.llnl.gov/simulation), which generates correlated cosmic-ray particle showers.
In Figure 1 we present an example of an experiment with selected detectors (scintillators), along with an example arrangement of objects in the research space. A scintillator is a material which, through the phenomenon of luminescence, is able to detect charged particles. It allows us to reconstruct the place where the particle passed through and to estimate the energy of the particle quite accurately. Based on the response from the scintillators we are able to reconstruct the path of the particle.
Figure 1: Exemplary layout of the experiment with selected detectors and objects inside the research space.
In our experiment, each detector layer measures 0.6m x 0.6m x 0.1m, and the layers are placed at 0.7m, 0.5m, -0.5m, and -0.7m on the Z axis. The research space is centered at the origin (X: 0.0, Y: 0.0, Z: 0.0) and measures 0.4m x 0.4m x 0.4m. There is also an offset of 0.05m in the X and Y directions, so all objects are placed randomly within an X-Y area of 0.3m x 0.3m. The offset makes the task somewhat easier: the closer to the center, the more particles have complete information from the detectors, so objects closer to the center are easier to reconstruct. For simplicity, each single-material object has a Z size of 0.1m and each nested object has a Z size of 0.08m. These objects are always centered on the Z = 0 plane. Whenever an object contains a nested object, there is only a single level of nesting.
In each experiment, we simulate 1 million cosmic ray particles, but we only register those that have passed through all 4 layers of detectors. Therefore, the number of recorded events does not exceed 50,000. For each event, we provide the recorded X-Y-Z values from each layer of the detector along with the particle energy measured by the detector from the layers Z: 0.5m and Z: -0.5m.
Each experiment is provided in CSV format. Each row describes one particle that passed through all 4 layers of detectors and contains the following 14 attributes:
- scintillator_above_0_x - the x-axis value of the detector
- scintillator_above_0_y - the y-axis value of the detector
- scintillator_above_0_z - the z-axis value of the detector
- scintillator_above_1_x
- scintillator_above_1_y
- scintillator_above_1_z
- scintillator_below_0_x
- scintillator_below_0_y
- scintillator_below_0_z
- scintillator_below_1_x
- scintillator_below_1_y
- scintillator_below_1_z
- scintillator_above_1_E - energy value measured by the detector
- scintillator_below_0_E
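As an illustration of how these attributes can be used, the sketch below computes the angle between the incoming track (from the two upper layers) and the outgoing track (from the two lower layers) of each event. Using the scattering angle as a feature is a common muon tomography technique, not something the organizers prescribe, and the function name `scattering_angles` and the synthetic hit positions are assumptions made for this example. It also assumes that the `above_0` and `below_1` layers are the outer ones (Z: 0.7m and Z: -0.7m), which is inferred from the energy being reported for `above_1` and `below_0`.

```python
import numpy as np

def scattering_angles(above_0, above_1, below_0, below_1):
    """Angle in radians between the incoming and outgoing track of each event.

    Each argument is an (n, 3) array holding the x, y, z hit positions
    recorded in one detector layer.
    """
    incoming = np.asarray(above_1, dtype=float) - np.asarray(above_0, dtype=float)
    outgoing = np.asarray(below_1, dtype=float) - np.asarray(below_0, dtype=float)
    cos_theta = np.sum(incoming * outgoing, axis=1) / (
        np.linalg.norm(incoming, axis=1) * np.linalg.norm(outgoing, axis=1)
    )
    # Clip to guard against rounding that lands slightly outside [-1, 1].
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# Two synthetic events (not competition data): a straight vertical track
# and a track deflected by 45 degrees below the research space.
above_0 = np.array([[0.0, 0.0, 0.7], [0.0, 0.0, 0.7]])
above_1 = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, 0.5]])
below_0 = np.array([[0.0, 0.0, -0.5], [0.0, 0.0, -0.5]])
below_1 = np.array([[0.0, 0.0, -0.7], [0.2, 0.0, -0.7]])

angles = scattering_angles(above_0, above_1, below_0, below_1)
```

With a real experiment file, the four arrays would be built from the corresponding `_x`, `_y`, `_z` columns of the CSV.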
We also provide a table with information about the materials from which the X-rayed objects are constructed. This file contains the following attributes:
- material name
- index of the material
- atomic number Z
- density (g/cm³)
- radiation length (g/cm²)
The entire research space is always filled with air, which in the above-mentioned table is at index 0.
The task is to reconstruct the position of objects in the X-Y plane (at Z = 0) and the material they are made of. The output area (0.4m x 0.4m) is divided into a regular grid of cells of size 0.01m x 0.01m. For each cell, the solution must give the index of its material from the materials table. The output file should cover all 40x40 cells.
Each cell has X and Y coordinates and covers the half-open ranges [x_start, x_end) and [y_start, y_end) of the plane. E.g., the point (X: 0.023m, Y: 0.143m) belongs to the cell [22, 34].
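The mapping from a point to its cell can be sketched as follows. This is a minimal example that assumes the output area spans -0.2m to 0.2m on both axes (it is centered on the origin and 0.4m wide); the name `cell_index` and the constants are introduced here for illustration only.

```python
import math

AREA_MIN = -0.2   # lower edge of the output area on both axes, in metres
CELL_SIZE = 0.01  # edge length of one cell, in metres
N_CELLS = 40      # number of cells along each axis

def cell_index(x, y):
    """Return the (x, y) indices of the cell containing the given point."""
    ix = math.floor((x - AREA_MIN) / CELL_SIZE)
    iy = math.floor((y - AREA_MIN) / CELL_SIZE)
    if not (0 <= ix < N_CELLS and 0 <= iy < N_CELLS):
        raise ValueError(f"point ({x}, {y}) lies outside the output area")
    return ix, iy

example = cell_index(0.023, 0.143)  # the point used in the text above
```

Points that lie exactly on a cell boundary may fall on either side because of floating-point rounding, so a more careful implementation might work with integer millimetres instead.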
In Figure 2 we present an example of the X-Y plane divided into cells, along with objects that need to be reconstructed.
Figure 2: X-Y plane divided by cells along with objects that need to be reconstructed. There are also 0.05m offsets marked. The colors indicate the material index.
The ground truth data can be loaded and visualized, e.g., using the numpy and matplotlib libraries:
import numpy as np
import matplotlib.pyplot as plt
matrix = np.loadtxt('ground/0_ground', usecols=range(40))
plt.matshow(matrix, cmap=plt.cm.Blues, origin='lower')
plt.colorbar()
ax = plt.gca()
ax.set_xticks(np.arange(0, 40, 1))
ax.set_yticks(np.arange(0, 40, 1))
plt.grid()
plt.show()
Submissions must be CSV files with 5000 rows and 1600 columns. Each row corresponds to a single experiment from the test datasets. The columns of a row hold the flattened 40x40 matrix for that experiment, in which the y-axis values are consecutive. A sample output is provided in the dummy_submission.csv file.
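Before uploading, the shape requirements above can be checked with a few lines of code. The sketch below is only an assumption about what a sanity check might look like: the function name `check_submission` is hypothetical, and the reshape uses numpy's default row-major order, which may or may not match the axis order used in dummy_submission.csv.

```python
import numpy as np

N_EXPERIMENTS = 5000  # rows required in the submission
GRID = 40             # cells along each axis of the output area

def check_submission(table):
    """Validate a submission table and return it as (5000, 40, 40) matrices."""
    table = np.asarray(table)
    if table.shape != (N_EXPERIMENTS, GRID * GRID):
        raise ValueError(f"expected shape (5000, 1600), got {table.shape}")
    if not np.issubdtype(table.dtype, np.integer):
        raise ValueError("material indices must be integers")
    if table.min() < 0:
        raise ValueError("material indices must be non-negative")
    # Row-major reshape; which axis is x and which is y is an assumption
    # that should be verified against the sample file.
    return table.reshape(N_EXPERIMENTS, GRID, GRID)

# A submission that predicts air (index 0) everywhere.
all_air = np.zeros((N_EXPERIMENTS, GRID * GRID), dtype=int)
matrices = check_submission(all_air)
```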
Evaluation metric: submitted solutions will be evaluated using the mean Average IoU (mAIoU) metric defined as:
\[mAIoU = \frac{1}{5000} \sum_{i = 1}^{5000} \left(\frac{1}{|L_i|} \sum_{l \in L_i} IoU(pred_l, true_l) \right),\]
where $L_i$ is the set of classes that are present in either predictions or ground truth values of the i-th test experiment, with the exclusion of class 0 that corresponds to AIR. Moreover, $pred_l$ is a binary indicator of the label $l$ in predictions, and $true_l$ is the ground truth binary indicator of the label $l$. $IoU$ is the standard Intersection over Union measure.
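The definition above can be sketched in code as follows. This is an illustrative implementation, not the organizers' evaluation script; the function names are chosen for this example, and returning 1.0 when neither the prediction nor the ground truth contains any non-air class is an assumption, since the formula leaves that case undefined.

```python
import numpy as np

def average_iou(pred, true, air=0):
    """Average IoU over all non-air classes present in pred or true."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    labels = (set(np.unique(pred).tolist()) | set(np.unique(true).tolist())) - {air}
    if not labels:
        # Assumption: an all-air prediction of an all-air experiment is perfect.
        return 1.0
    ious = []
    for label in labels:
        pred_mask = pred == label
        true_mask = true == label
        intersection = np.logical_and(pred_mask, true_mask).sum()
        union = np.logical_or(pred_mask, true_mask).sum()  # never 0 for these labels
        ious.append(intersection / union)
    return float(np.mean(ious))

def mean_average_iou(preds, trues):
    """Mean of the per-experiment average IoU values."""
    return float(np.mean([average_iou(p, t) for p, t in zip(preds, trues)]))

# Tiny 2x2 illustration: class 1 overlaps in one of two cells (IoU 0.5),
# class 2 is missed and class 3 is a false positive (IoU 0 each).
true = np.array([[0, 1], [1, 2]])
pred = np.array([[0, 1], [0, 3]])
single = average_iou(pred, true)
overall = mean_average_iou([pred, true], [true, true])
```

Note that predicting a class that is absent from the ground truth adds a zero term to the average, so false-positive materials lower the score for that experiment.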
Rank | Team Name | Is Report | | Preliminary Score | Final Score | Submissions |
---|---|---|---|---|---|---|
1 | CrazyCrocodile | True | True | 0.3786 | 0.362900 | 28 |
2 | dymitr | True | True | 0.3521 | 0.347200 | 52 |
3 | baseline | True | True | 0.3317 | 0.321600 | 27 |
4 | sp | True | True | 0.2856 | 0.274500 | 30 |
5 | kskrajny | True | True | 0.0896 | 0.093200 | 23 |
6 | Muon | True | True | 0.0081 | 0.007500 | 3 |
7 | Amy | True | True | 0.0053 | 0.004400 | 33 |
8 | DML | True | True | 0.0057 | 0.004000 | 24 |
9 | sk | False | True | 0.0034 | No report file found or report rejected. | 4 |
10 | rs | False | True | 0.0010 | No report file found or report rejected. | 6 |
- May 26, 2023: competition page goes online, solicitation of participants commences,
- June 12, 2023: start of the competition, datasets become available, leaderboard goes live,
- September 30, 2023: deadline for submitting the solutions,
- October 1, 2023: deadline for sending the reports, end of the competition,
- October 9, 2023: online publication of the final results, sending invitations for submitting papers to the associated workshop at the IEEE Big Data 2023 conference,
- October 30, 2023: deadline for submitting invited papers,
- November 7, 2023: notification of paper acceptance,
- November 15, 2023: camera-ready of accepted papers due.
QED Software will cover the costs of three registration fees for the competition participants with the top 3 solutions. QED Software will also sponsor the cash prizes:
- 1000 USD for the winning solution (+ the cost of IEEE Big Data 2023 registration)
- 500 USD for the 2nd place solution (+ the cost of IEEE Big Data 2023 registration)
- 250 USD for the 3rd place solution (+ the cost of IEEE Big Data 2023 registration)
The competition organizing committee includes representatives of QED Software and QED Force:
- Mateusz Wnuk (QED Software & University of Warsaw)
- Andrzej Janusz (QED Force & University of Warsaw)
- Tomasz Tajmajer (QED Force)
- Dominik Ślęzak (QED Software & University of Warsaw)
Discussion | Author | Replies | Last post |
---|---|---|---|
Deadline extension | Dymitr | 1 | by Andrzej Saturday, September 30, 2023, 14:08:42 |
Is the deadline wrong? | ashwini | 1 | by Andrzej Saturday, September 30, 2023, 14:07:15 |
submission system is online | Andrzej | 0 | by Andrzej Sunday, June 18, 2023, 15:39:12 |
The contest has started! | Mateusz | 0 | by Mateusz Monday, June 12, 2023, 20:22:52 |