Jakubowski, K., Eerola, T., Alborno, P., Volpe, G., Camurri, A., & Clayton, M. (2017). Extracting coarse body movements from video in music performance: A comparison of automated computer vision techniques with motion capture data. Frontiers in Digital Humanities, 4, 9.
Summary: Three computer vision techniques were implemented to measure musicians’ movements from videos and validated against motion capture (MoCap) data from the same performances. Overall, all three computer vision techniques exhibited high correlations with MoCap data (median Pearson’s r values of 0.75–0.94); the best performance was achieved using two-dimensional tracking techniques and when the to-be-tracked region of interest was more narrowly defined (e.g. tracking the head as opposed to the full upper body).
We have developed a set of user-friendly tools for quantifying and tracking movement from videos using EyesWeb (as reported in Jakubowski et al., 2017 above). These have been documented and made freely available for use by other researchers on GitHub at: