Which implementation of SVR is or will be used for the final evaluation?
Can we expect it to be the scikit-learn SVR? If so, what version of scikit-learn?
Can we expect the exact method to be very similar to the one in the competition starter notebook?
Asking as there is a slight possibility that different implementations react to hyperparameters (or the data) differently and thus it seems safest to optimize for the specific SVR used by the organizers.
The final evaluation (as well as the preliminary one) is based on the same SVR implementation as the one in the R script.
However, it seems that both the R and the Python (scikit-learn) implementations of SVR use the same library under the hood (libsvm), so the results should be exactly the same. We double-checked this with our initial scripts, which should give exactly the same results.
Thx for the swift reply, good to know!
Come to think of it, according to timestamps you've replied before I posted my question - check the site's timezone code ;)
I wonder if it would be beneficial to publish the r scores of all 10 individual solutions from the published randomSolution file, evaluated perhaps on the available validationData set. It would allow the competitors to verify that they use the same SVR implementation and the same parameters. SVR has many more parameters, and other software packages, for instance, represent the gamma parameter differently. Cheers
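To illustrate the gamma point: libsvm writes the RBF kernel as exp(-gamma * ||x - y||^2), while some other packages parameterise the same kernel by a width sigma, as exp(-||x - y||^2 / (2 * sigma^2)). The sketch below only shows how to convert between these two conventions; the function names are made up for illustration, and whether a particular package uses the sigma form (or yet another scaling) has to be checked in its own documentation.

```python
import math


def rbf_gamma_form(x, y, gamma):
    """RBF kernel in the libsvm convention: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def rbf_sigma_form(x, y, sigma):
    """RBF kernel in the width convention: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigma_to_gamma(sigma):
    """Convert a kernel width sigma to the equivalent libsvm-style gamma."""
    return 1.0 / (2.0 * sigma ** 2)


def gamma_to_sigma(gamma):
    """Convert a libsvm-style gamma to the equivalent kernel width sigma."""
    return math.sqrt(1.0 / (2.0 * gamma))


if __name__ == "__main__":
    x, y = [1.0, 2.0], [0.5, -1.0]
    sigma = 1.5  # arbitrary example value
    gamma = sigma_to_gamma(sigma)
    # Both forms should agree once the parameter has been converted.
    print(rbf_gamma_form(x, y, gamma), rbf_sigma_form(x, y, sigma))
```

If a package expects sigma and is handed a gamma value unchanged, the kernel is different even though the underlying solver may be identical, which is one way the same "parameters" can give different scores.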
I can't modify any competition data myself, but is it really necessary?
As I stated above, we use the same SVR implementation as given in the R starter script. So you can check any existing SVR implementation against the one available in the e1071 package. We can provide a more detailed description of our R environment, but on Monday at the earliest - we have a long weekend in Poland :-)
Sure, I understand your position. I have evaluated the Python code and got the results I wanted. The problem is still that I prefer to code in software other than Python/R, for example Matlab, and I can't get the same checkpoint results in Matlab that I get in Python for SVR. There might be some subtle SVR implementation differences that are hard to track down. Now, even though I may have good results in Matlab, I may get a much worse result in your evaluation due to SVR implementation differences or other subtle parameters that are not controllable in Python, which is not an ideal situation. Cheers
Do you still need those individual scores for the subsets from the randomSolution.txt file?
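One way to track down such differences is to export the SVR predictions for the same samples from both environments and compare them element by element. The following is a minimal sketch of that check; the helper names and the tolerance are assumptions chosen for illustration, not anything used by the organizers.

```python
def max_abs_diff(preds_a, preds_b):
    """Largest absolute difference between two equally long prediction lists."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return max((abs(a - b) for a, b in zip(preds_a, preds_b)), default=0.0)


def same_predictions(preds_a, preds_b, tol=1e-6):
    """True if the two prediction lists agree within the given tolerance."""
    return max_abs_diff(preds_a, preds_b) <= tol


if __name__ == "__main__":
    # Hypothetical predictions exported from two different SVR implementations.
    reference = [0.12, 0.50, 0.91]
    candidate = [0.12, 0.50, 0.95]
    print(max_abs_diff(reference, candidate))
    print(same_predictions(reference, candidate))
```

A large difference on a fixed feature subset points to a mismatch in parameters or preprocessing rather than in the feature selection itself.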