Data Driven Algorithm Development

Data Driven Algorithm Development is a technique in which an algorithm is run over a library of example inputs, perhaps thousands of them, and its output for each example is compared with labeled ground truth annotation. The resulting comparison statistics, or scores, are fed back to the developer to drive development toward the performance goals.
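The feedback loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the names `Example`, `evaluate`, and `score_fn`, and the choice of mean score plus a list of worst examples as the report, are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence


@dataclass
class Example:
    """One library entry: an input plus its labeled ground truth (hypothetical layout)."""
    input_data: Any
    ground_truth: Any


def evaluate(
    algorithm: Callable[[Any], Any],
    examples: Sequence[Example],
    score_fn: Callable[[Any, Any], float],
    worst_n: int = 5,
) -> Dict[str, Any]:
    """Run the algorithm on every example and summarize the scores.

    score_fn(output, ground_truth) is assumed to return a float where
    higher means closer agreement with the ground truth.
    """
    scores = [score_fn(algorithm(ex.input_data), ex.ground_truth) for ex in examples]
    # Indices of the lowest-scoring examples, so the developer knows where to look next.
    worst = sorted(range(len(scores)), key=lambda i: scores[i])[:worst_n]
    mean_score = sum(scores) / len(scores) if scores else 0.0
    return {"count": len(scores), "mean_score": mean_score, "worst": worst}


if __name__ == "__main__":
    # Toy stand-ins: the "algorithm" doubles a number, the score is exact match.
    library = [Example(1, 2), Example(2, 4), Example(3, 7)]
    report = evaluate(lambda x: 2 * x, library, lambda out, gt: 1.0 if out == gt else 0.0)
    print(report)
```

Rerunning such a report after each change to the algorithm shows whether the overall score moved toward the goal and which examples still fail.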

Consider one of perhaps thousands of example inputs for a mobile app image processing algorithm. The input data is an image together with the tap and swipe coordinates of user gestures over content of interest. The ground truth annotation is the boundary of the document, found by image partitioning, and the boxes around the text associated with the gestures. Annotation may be manual, or it may be algorithm output blessed by a human curator or by a strong verification method.
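One way such an example and its annotation might be represented is sketched below. The record layout (`Gesture`, `Box`, `GroundTruth`) is hypothetical, and intersection over union (IoU) is used here only as a common, assumed way to compare a predicted text box with an annotated one; the text above does not specify the comparison metric.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image pixel coordinates


@dataclass(frozen=True)
class Gesture:
    """A user gesture: a tap has one point, a swipe has several (assumed layout)."""
    kind: str  # "tap" or "swipe"
    points: Tuple[Point, ...]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box around a piece of text."""
    left: float
    top: float
    right: float
    bottom: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as empty.
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)


@dataclass
class GroundTruth:
    """Annotation for one image: document boundary plus one text box per gesture."""
    document_boundary: List[Point]
    text_boxes: List[Box] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes: 1.0 for identical, 0.0 for disjoint."""
    overlap = Box(
        max(a.left, b.left), max(a.top, b.top),
        min(a.right, b.right), min(a.bottom, b.bottom),
    ).area()
    union = a.area() + b.area() - overlap
    return overlap / union if union > 0 else 0.0


if __name__ == "__main__":
    annotated = Box(10, 10, 110, 40)
    predicted = Box(20, 10, 120, 40)
    print(f"IoU = {iou(predicted, annotated):.3f}")
```

A per-gesture score like this can be averaged over the library, or thresholded to count each box as a hit or a miss.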

Example ground truth annotation for a document and three gestures:

Example annotation interface for inspecting and labeling data with ground truth expected results: