Due to the massive deviations between the validation and testing results, I suspect some of us may have been misled or made wrong assumptions. Could you please confirm that the preliminary validation set we worked on is a truly random sample of the full testing set? Deviations this large could only happen if the validation set was a time-disjoint subset of the testing set, i.e. it covered the period before the rest of the testing set. If this is the case, then I was misled when I asked specifically about this earlier in the forum. It would also mean that none of the modelling that assumed similar prior class distributions between the validation and testing sets made any sense, and that many contestants were affected.
Publishing the test labels quickly would clear up all these doubts.