We present a toolkit for automatically evaluating sketch recognition algorithms. Few comparative evaluations of sketch recognition algorithms have been published, and those that exist do not provide benchmarking or direct comparisons because standardised data and an evaluation platform are not available. By unifying data collection, labelling and evaluation in a single tool, fair, flexible and comprehensive evaluations become possible. Six existing recognizers are currently integrated into the tool. Our initial evaluations of these recognizers show that the context from which training data is taken affects recognition success rates. These results suggest that an evaluation platform such as this is a powerful adjunct to sketch recognition research.