We don't want to publish the code before the end of the challenge. Its results still rank quite high on the Leaderboard, and we want to avoid biasing participants toward our way of thinking about this task.
Generally speaking, we did a pretty standard feature extraction from the intermediate data table (the one that describes localized alerts associated with the records from the training and test data), and then ran XGBoost on the resulting features. The 'extended baseline' did not use the log event data at all. It is still an open question whether this data adds any value beyond the localized alerts. I hope that this challenge can give us the answer ;-)
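To give a rough idea of what "standard feature extraction plus XGBoost" can look like, here is a minimal sketch. It is not our actual code: the column names (`record_id`, `alert_type`), the chosen aggregates, the hyperparameters, and the assumption of a binary label are all made up for illustration.

```python
import pandas as pd


def build_alert_features(alerts: pd.DataFrame) -> pd.DataFrame:
    """Aggregate localized alerts into one feature row per record.

    Assumes hypothetical columns `record_id` and `alert_type`.
    """
    # One column per alert type, holding how often it occurs for each record.
    type_counts = pd.crosstab(alerts["record_id"], alerts["alert_type"]).add_prefix("n_")
    # Simple per-record summaries.
    summary = alerts.groupby("record_id").agg(
        n_alerts=("alert_type", "size"),
        n_alert_types=("alert_type", "nunique"),
    )
    return summary.join(type_counts)


def train_model(features: pd.DataFrame, labels):
    """Fit a gradient-boosted classifier; hyperparameters are placeholders."""
    # Imported here so the feature extraction works without xgboost installed.
    import xgboost as xgb

    model = xgb.XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(features, labels)
    return model


# Toy alerts table, invented for illustration only.
alerts = pd.DataFrame({
    "record_id": [1, 1, 1, 2, 3, 3],
    "alert_type": ["scan", "scan", "malware", "scan", "policy", "malware"],
})
features = build_alert_features(alerts)
```

With real data you would build `features` for the training records, pass them to `train_model` together with the labels aligned on `record_id`, and reuse `build_alert_features` on the test records (reindexing the columns so both tables match).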