
IEEE BigData 2019 Cup: Suspicious Network Event Recognition

Suspicious Network Event Recognition is a data mining challenge organized in association with the IEEE BigData 2019 conference. The task is to decide which alerts should be regarded as suspicious, based on information extracted from network traffic logs. The competition is kindly sponsored by Security On-Demand (https://www.securityondemand.com/) and QED Software (http://qed.pl/).

Cyber threat detection and analytics play a pivotal role in securing organizations that provide web services, as well as their users. The importance of this field is continuously growing due to the increasing abundance of Internet services, wireless networks, smart devices, etc. Since the cybersecurity domain is hugely complex, it is also one of the major challenges of the contemporary world.

In this challenge, the task is to distinguish truly suspicious events from false alarms within the set of so-called network traffic alerts that the Security Operations Center (SOC) team members at Security On-Demand (SoD) have to analyze on an everyday basis. An efficient classification model could help the SOC team optimize its operations significantly. It is worth adding that although the competition sponsor is a commercial entity, the knowledge and experience gathered by the competition participants may help improve intelligent cybersecurity modules in many organizations.

More details regarding the task and a description of the challenge data can be found in the Task description section.

Special track at IEEE BigData 2019: A special session devoted to the challenge will be held at the IEEE BigData 2019 conference. We will invite authors of selected challenge reports to extend them for publication in the conference proceedings (after reviews by Organizing Committee members) and for presentation at the conference. The publications will be indexed in the same way as regular conference papers. The invited teams will be chosen based on their final rank, the innovativeness of their approach, and the quality of the submitted report.


The data set available in the challenge consists of alerts investigated by a SOC team at SoD (called ‘investigated alerts’). Each record is described by various statistics selected based on experts’ knowledge, and by a hierarchy of associated IP addresses (anonymized), called assets. For each alert in the ‘investigated alerts’ data tables, there is a history of related log events (a detailed set of network operations acquired by SoD, anonymized to ensure the safety of SoD clients).

In total, the training and test data sets cover the period between October 1, 2018, and March 31, 2019. The columns of the ‘investigated alerts’ data are described in a separate file, column_descriptions.txt. The main data was divided into a training set and a test set based on alert timestamps. Approximately four months are used as the training set (the file cybersecurity_training.csv) and the remaining part is used as the test set (the file cybersecurity_test.csv). Both files have the same format, with columns separated by the '|' sign; however, the target column, called 'notified', is missing from the test data.
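As a rough illustration of the file format described above, the sketch below reads a '|'-separated table with Python's standard csv module and separates the 'notified' target from the remaining columns. The helper name read_alerts and the sample column names alert_id and feature_a are hypothetical; only the '|' separator, the 'notified' column, and the two file names come from the description. It also assumes the files carry a header row, which should be checked against column_descriptions.txt.

```python
import csv
import io


def read_alerts(handle):
    """Read a '|'-separated alerts table into a list of dicts (one per row)."""
    reader = csv.DictReader(handle, delimiter="|")
    return list(reader)


# Tiny in-memory stand-ins for the real files. Column names other than
# 'notified' are hypothetical and used only for illustration.
train_sample = io.StringIO("alert_id|feature_a|notified\na1|3|0\na2|7|1\n")
test_sample = io.StringIO("alert_id|feature_a\nb1|5\n")

train_rows = read_alerts(train_sample)
test_rows = read_alerts(test_sample)

# Separate the target from the training rows; the test rows carry no target.
train_labels = [int(row.pop("notified")) for row in train_rows]
has_target_in_test = "notified" in test_rows[0]
```

With the real data, the same helper could be called on a handle opened with open("cybersecurity_training.csv", newline=""), and likewise for cybersecurity_test.csv.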

There will also be a second data set available to participants of the challenge. Due to its large size, it will be hosted on an external platform. We will provide access to this data on request, to participants who exceed the baseline score on the public leaderboard. This data contains information about individual event logs associated with each of the alerts from the main data (both training and test parts). A more detailed description of this set will be provided at a later stage of the competition.

The task and the format of submissions: the task for participants of this challenge is to predict which of the investigated alerts were considered truly suspicious by the SOC team and led to issuing a notification to SoD’s clients. In the training data, this information is indicated by the column 'notified'. A submission should take the form of scores assigned to every record from the test data, with each score on a separate line of a text file. An example of a correctly formatted submission file is provided in the Data files section.
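A minimal sketch of producing such a file is shown below, assuming that scores are written in the same order as the records in the test file. The function name write_submission, the file name submission.txt, and the six-decimal formatting are assumptions made for illustration; the example file in the Data files section remains the authoritative reference for the format.

```python
import os
import tempfile


def write_submission(scores, path):
    """Write one score per line, in the same order as the test records."""
    with open(path, "w", newline="\n") as out:
        for score in scores:
            # Six decimals is an arbitrary choice, not a stated requirement.
            out.write(f"{float(score):.6f}\n")


# Hypothetical scores for three test records.
example_scores = [0.12, 0.87, 0.5]
submission_path = os.path.join(tempfile.mkdtemp(), "submission.txt")
write_submission(example_scores, submission_path)

# Read the file back to confirm there is exactly one score per line.
with open(submission_path) as check:
    written_lines = check.read().splitlines()
```

Since submissions are evaluated by ranking quality, the scores do not need to be calibrated probabilities; any real-valued score where higher means "more likely notified" fits this shape.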

Evaluation: the quality of submissions will be evaluated using the AUC (area under the ROC curve) measure. Solutions will be evaluated online and the preliminary results will be published on the public leaderboard. The preliminary score will be computed on a small subset of the test records, fixed for all participants. The final evaluation will be performed after completion of the competition using the remaining part of the test records. Those results will also be published online. It is important to note that only teams that submit a report describing their approach before the end of the challenge will qualify for the final evaluation.
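For local validation, AUC can be computed from labels and scores with the rank-sum (Mann-Whitney) formulation, sketched below in plain Python. This is a generic implementation of the measure, not the organizers' evaluation code; the function name auc and the toy data are illustrative only.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    Tied scores receive the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1

    positives = sum(1 for label in labels if label == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    positive_rank_sum = sum(r for r, label in zip(ranks, labels) if label == 1)
    return (positive_rank_sum - positives * (positives + 1) / 2.0) / (positives * negatives)


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 0.75
```

Under this definition, a submission that assigns the same score to every record obtains an AUC of 0.5, which is a useful reference point when reading scores.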

Rank  Team Name          Score   Submission Date
1     lana               0.9394  2019-08-1 04:30:35
2     NSEC SJTU          0.9379  2019-08-13 11:58:37
3     hieuvq             0.9372  2019-08-11 02:08:15
4     Dymitr             0.9322  2019-08-1 14:01:55
5     SleepyCat          0.9321  2019-08-13 09:26:49
6     test_123           0.9321  2019-07-29 04:02:10
7     security           0.9291  2019-08-24 19:36:59
8     DeepIf             0.9272  2019-07-13 10:11:16
9     amy                0.9258  2019-07-30 11:10:39
10    UMF Canada         0.9253  2019-08-23 00:20:18
11    extended baseline  0.9224  2019-07-13 02:52:55
12    maper1             0.9184  2019-08-19 09:10:30
13    sink               0.9180  2019-07-30 09:59:56
14    HSOC               0.9151  2019-08-23 17:47:19
15    iddqd              0.9151  2019-07-13 15:59:58
16    jd                 0.9145  2019-08-17 00:09:10
17    r2                 0.9126  2019-08-8 23:37:56
18    GoGoPowerRangers   0.9095  2019-07-8 18:04:04
19    piero              0.9082  2019-07-15 15:47:08
20    xyz                0.9066  2019-07-18 09:11:33
21    DennisShaw         0.9048  2019-07-31 04:07:49
22    Sinister Three     0.9045  2019-07-2 16:42:48
23    shire              0.8978  2019-07-5 13:05:06
24    hopium             0.8966  2019-07-26 02:01:45
25    IF                 0.8961  2019-08-16 07:33:20
26    Chain              0.8943  2019-07-11 17:30:36
27    maa                0.8935  2019-07-13 16:39:04
28    323_touch_fish     0.8934  2019-07-13 05:51:20
29    michalm            0.8901  2019-07-23 08:52:01
30    baseline solution  0.8899  2019-07-2 18:09:18
31    12213              0.8890  2019-07-31 04:13:05
32    Marshalls          0.8861  2019-07-5 11:24:05
33    Dan                0.8852  2019-08-5 19:04:38
34    MAGIX.AI           0.8836  2019-08-7 09:05:44
35    marcb              0.8760  2019-07-3 16:59:43
36    Rookie             0.8598  2019-07-13 02:06:17
37    ISO_Project        0.8564  2019-07-16 21:43:19
38    hello_there        0.8557  2019-07-3 18:40:53
39    M                  0.8529  2019-07-22 00:38:46
40    nightking          0.8360  2019-07-21 08:55:05
41    InTensorty         0.8354  2019-08-9 22:28:44
42    agui               0.8350  2019-08-6 07:17:02
43    zhaohui            0.8345  2019-08-6 13:18:18
44    阿贵去哪了              0.8293  2019-08-6 14:48:16
45    roger              0.8246  2019-08-6 14:18:41
46    safak              0.8143  2019-07-18 20:45:30
47    Surrey             0.8095  2019-07-10 15:35:11
48    CREDIT             0.8051  2019-08-15 21:22:08
49    j                  0.7925  2019-07-30 16:58:33
50    Mars               0.7774  2019-07-22 07:31:57
51    Sof                0.7679  2019-08-5 15:46:07
52    The artic lab      0.7216  2019-07-6 18:53:18
53    Toy                0.6388  2019-07-12 13:24:48
54    钢棍谢师傅              0.5542  2019-08-17 22:20:22
55    lol                0.5356  2019-07-3 14:50:36
56    Radosne Kurki      0.5356  2019-07-8 14:30:55
57    ---                0.5356  2019-07-8 21:14:42
58    FuzzyMelon         0.5356  2019-07-9 00:03:30
59    Test Team          0.5356  2019-08-1 14:49:35
60    Kefi               0.5356  2019-08-3 09:57:42
61    Pisa               0.5340  2019-08-8 10:49:11
62    CBRL-TOPCODERS     0.5281  2019-07-22 08:03:14
63    Cloud              0.5175  2019-07-18 08:44:12
64    J_theboss          0.5025  2019-07-30 19:15:56
65    panand             0.5000  2019-07-11 11:46:24

  • May 27, 2019: web site of the challenge opens, the task is revealed,
  • July 1, 2019 (originally June 15, 2019): start of the competition, data become available,
  • September 29, 2019 (23:59 GMT): deadline for submitting the solutions,
  • October 2, 2019 (23:59 GMT): deadline for sending the reports, end of the competition,
  • October 7, 2019: online publication of the final results, sending invitations for submitting papers for the special track at the IEEE BigData 2019 conference,
  • October 28, 2019: deadline for submitting invited papers,
  • November 4, 2019: notification of paper acceptance,
  • November 15, 2019: camera-ready of accepted papers due,
  • December 9-12, 2019: the IEEE BigData 2019 conference (special track date TBA).

Authors of the top-ranked solutions (based on the final evaluation scores) will be awarded prizes funded by our sponsors:

  • First Prize: 1500 USD + one free IEEE BigData 2019 conference registration,
  • Second Prize: 1000 USD + one free IEEE BigData 2019 conference registration,
  • Third Prize: 500 USD + one free IEEE BigData 2019 conference registration.

The award ceremony will take place during the special track at the IEEE BigData 2019 conference.

  • Dominik Ślęzak, QED Software & Security On-Demand & University of Warsaw
  • Agnieszka Chądzyńska-Krasowska, Security On-Demand & Polish-Japanese Academy of Information Technology
  • Joel Holland, Security On-Demand
  • Andrzej Janusz, QED Software & University of Warsaw
  • Daniel Kałuża, QED Software
  • Bartek Konarski, Security On-Demand
  • Agnieszka Sochal, QED Software

In case of any questions, please post on the competition forum or send an email to contact {at} knowledgepit.ml

             

This forum is for all users to discuss matters related to the competition. Good manners apply!
Discussion                                              Author   Replies  Last post
the first of additional data sets released              Andrzej  2        by Daniel, Monday, August 19, 2019, 12:00:09
alert_time                                              w        1        by Daniel, Monday, August 19, 2019, 11:31:36
Training set order                                      Scott    5        by Scott, Wednesday, August 14, 2019, 12:42:43
Role of localized_alerts_data and submission score      A        3        by Andrzej, Tuesday, August 06, 2019, 19:47:06
Additional data - event logs                            Andrzej  1        by Daniel, Friday, August 02, 2019, 18:45:25
why I keep getting this error                           jayesh   1        by jayesh, Monday, July 29, 2019, 23:22:23
The data sets were released!                            Andrzej  2        by Andrzej, Thursday, July 04, 2019, 00:32:06
The submission system is online!                        Andrzej  0        by Andrzej, Tuesday, July 02, 2019, 18:18:22
the submission system opens soon                        Andrzej  0        by Andrzej, Tuesday, July 02, 2019, 01:59:22
a delay in disclosure of the competition data           Andrzej  2        by Andrzej, Monday, July 01, 2019, 21:42:55
No option to add team memeber after creating the group  AMIT     1        by Andrzej, Thursday, May 30, 2019, 14:42:57