I would like to get confirmation on the evaluation metric of this competition. It looks to me like the evaluation metric of this competition is exactly the same as that of the FedCSIS'20 Challenge. That would mean that ȳ is the mean value of the target variable COMPUTED from the training data set. Is that true?
Thanks for that question! :-)
In this particular competition, the training instances can be defined in various ways (e.g., at various time points), so the mean target value in the training set is a vague concept. For this reason, in the evaluation metric for this challenge, we are planning to use the mean of the true target values of the test instances.
We realize that such a definition of the evaluation metric reveals some information that would not normally be available (i.e., the mean target value in the preliminary test set); however, we think it should not have much influence on the final quality of submissions.
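To make the distinction concrete, here is a minimal sketch of an R²-style score where the reference mean ȳ is taken from the true test targets, as described above. This is an illustrative assumption, not the organizers' official scoring code, and the function name is hypothetical:

```python
import numpy as np

def r2_test_mean(y_true, y_pred):
    """R^2-style score where the baseline mean y_bar is computed from
    the TRUE TEST targets (an assumption based on the answer above),
    not from the training set as in a conventional R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_bar = np.mean(y_true)                      # mean of true test targets
    sse = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    sst = np.sum((y_true - y_bar) ** 2)          # total sum of squares
    return 1.0 - sse / sst

# Perfect predictions score 1.0; predicting the test mean scores 0.0.
y = [1.0, 2.0, 3.0, 4.0]
print(r2_test_mean(y, y))                        # 1.0
print(r2_test_mean(y, [2.5, 2.5, 2.5, 2.5]))    # 0.0
```

Note that a perfectly flat test series makes the denominator SST equal to zero, so the score degenerates (division by zero, tending toward minus infinity for any imperfect prediction) — exactly the risk raised in the follow-up question below.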
Interesting, thanks for your clear explanation.
From your answer and my observations, I gather this is not the case for the FedCSIS Challenge. I believe that mean(y) there is a mean over the entire training and testing set; otherwise, you would expose the competition to the risk of a single flat test series hijacking the whole mean and driving the score toward negative infinity, which for the FedCSIS contest is a very likely possibility. Could you please confirm?