Evaluation metric

IEEE BigData 2020 Cup: Predicting Escalations in Customer Support

by hieuvq - Tuesday, May 19, 2020, 05:57:06

Dear organizers,

I would like to confirm the evaluation metric of this competition. It looks to me as though it is exactly the same as that of the FedCSIS'20 Challenge, which would mean that ȳ is the mean value of the target variable computed from the training data set. Is that correct?

Thanks,

RE: Evaluation metric

by andrzej - Thursday, May 21, 2020, 11:50:31

Hello,

Thanks for that question! :-)

In this particular competition, the training instances can be defined in various ways (e.g., at various time points), so the mean target value in the training set is a vague concept. For this reason, in the evaluation metric for this challenge, we are planning to use the mean of the true target values of the test instances.
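To illustrate what the choice of mean changes, here is a minimal sketch. It assumes the metric has an R²-style form, 1 - SSE/SST, where SST is measured around the mean ȳ; the exact formula is not stated in this thread, so treat that as an assumption. The function name `r2_style_score`, the `baseline_mean` parameter, and the sample values are hypothetical.

```python
def r2_style_score(y_true, y_pred, baseline_mean=None):
    """Assumed R^2-style score: 1 - SSE / SST.

    If baseline_mean is None, the mean of the true test targets is used,
    which is the choice described for this challenge. Passing a value
    mimics using a mean computed elsewhere (e.g. from training data).
    """
    if baseline_mean is None:
        baseline_mean = sum(y_true) / len(y_true)
    # Squared error of the submitted predictions.
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of the constant baseline prediction.
    sst = sum((t - baseline_mean) ** 2 for t in y_true)
    return 1.0 - sse / sst


# Made-up test targets and predictions, for illustration only.
y_test = [0.0, 1.0, 0.0, 1.0]
y_hat = [0.1, 0.8, 0.2, 0.9]

score_test_mean = r2_style_score(y_test, y_hat)                      # baseline = test mean (0.5)
score_train_mean = r2_style_score(y_test, y_hat, baseline_mean=0.2)  # hypothetical training mean
```

Under this assumed form, the test-set mean is the strictest constant baseline, because the sum of squared deviations is smallest around the sample's own mean. Any other baseline value gives a larger SST and therefore a higher score for the same predictions.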

We realize that defining the evaluation metric this way can reveal some information that would not normally be available (i.e., the mean target value in the preliminary test set). However, we do not expect it to have much influence on the final quality of submissions.

Best regards,
Andrzej Janusz

RE: Evaluation metric

by hieuvq - Friday, May 22, 2020, 05:38:02

Interesting, thanks for your clear explanation.

RE: Evaluation metric

by dymitrruta - Friday, May 22, 2020, 20:57:54

Dear Andrzej,

From your answer and my own observations, I gather this is not the case for the FedCSIS Challenge. I believe that mean(y) there is the mean over the entire training and test set; otherwise the competition would be exposed to the risk of a single flat test series driving the score to negative infinity, which is a very likely possibility for the FedCSIS contest. Could you please confirm?
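A minimal sketch of the flat-series risk, assuming an R²-style score of the form 1 - SSE/SST (the exact formula is not stated in this thread). The function name `r2_style_score`, the series values, and the way the pooled mean is formed are hypothetical and only meant to show the effect.

```python
import math


def r2_style_score(y_true, y_pred, baseline_mean):
    """Assumed R^2-style score: 1 - SSE / SST around a given baseline mean."""
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - baseline_mean) ** 2 for t in y_true)
    if sst == 0.0:
        # Zero variance around the baseline: any nonzero error is
        # penalised without bound.
        return 1.0 if sse == 0.0 else -math.inf
    return 1.0 - sse / sst


# Hypothetical series: the test part is perfectly flat.
train_part = [3.0, 4.0, 6.0, 5.0]
flat_test = [5.0, 5.0, 5.0]
preds = [5.0, 5.1, 4.9]  # nearly perfect predictions

test_mean = sum(flat_test) / len(flat_test)
pooled_mean = sum(train_part + flat_test) / len(train_part + flat_test)

score_test_only = r2_style_score(flat_test, preds, test_mean)  # SST is zero
score_pooled = r2_style_score(flat_test, preds, pooled_mean)   # SST is positive
```

With the mean taken from the flat test values alone, SST is exactly zero and even a tiny prediction error sends the score to negative infinity. With the mean pooled over training and test values, SST stays positive here and the score remains finite.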

Dymitr

RE: Evaluation metric

by andrzej - Saturday, May 23, 2020, 13:37:03

Yes, indeed.

The case of the FedCSIS challenge is different.

Andrzej