This is indeed a very good question. There were 8 unique firefighters, and they all had slightly different exercise scenarios.
I must be missing a point here -- why do you think the information about the number of firefighters is important?
I don't think the number of firefighters matters in itself, but it would be really helpful to know the ID of the firefighter assigned to each row in the dataset. The fact that the activities in the test data are performed by different people changes the game completely. It would be good, for example, to run the cross-validation procedure not randomly over the data but over firefighter IDs, so that each test fold "is performed" by different firefighters than the training folds. That should give more reliable error estimates.
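To make the idea concrete, here is a minimal sketch of splitting by firefighter rather than by row. It assumes a per-row list of firefighter IDs were available, which is exactly what is being asked for; the IDs and the helper name `leave_one_group_out` below are made up for illustration.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_id, train_idx, test_idx), one fold per distinct ID.

    `groups` holds one ID per data row. All rows of the held-out ID go to
    the test fold, so no person appears in both train and test.
    """
    rows_by_group = defaultdict(list)
    for row, group in enumerate(groups):
        rows_by_group[group].append(row)

    for held_out in sorted(rows_by_group):
        test_idx = rows_by_group[held_out]
        train_idx = [row for row, group in enumerate(groups) if group != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical firefighter IDs, one per row (not taken from the real data).
firefighter_ids = [1, 1, 2, 2, 3, 3, 3, 4]

folds = list(leave_one_group_out(firefighter_ids))
for held_out, train_idx, test_idx in folds:
    print(f"held out firefighter {held_out}: train={train_idx} test={test_idx}")
```

If scikit-learn is available, its `GroupKFold` and `LeaveOneGroupOut` splitters do the same kind of grouped split when given the IDs through the `groups` argument.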
Those are my thoughts. Does anyone agree?
Hi Jan,
Yes, I agree that the firefighter ID would be very helpful, not only for the validation procedure but for classification as well. From the application point of view, however, a general algorithm that needs to be learned only once is needed more.