Vision-based biomechanical markerless motion classification

Yu Liang Liew
Jeng Feng Chin

Keywords: vision, single camera, markerless, stick model, human motion, motion classification, data mining

This study applied stick-model augmentation to single-camera motion video to build a markerless motion classification model of manual operations. Each video was overlaid with a stick model of keypoints and connecting lines by a program that used the COCO dataset together with the OpenCV and OpenPose modules to estimate the coordinates of the body joints. The stick model data included the initial velocity, cumulative velocity, and acceleration of each body joint. The extracted motion vector data were normalized using three different techniques, and the resulting datasets were evaluated with eight classifiers. The experiment involved four distinct motion sequences performed by eight participants. The random forest classifier achieved the highest accuracy in classifying the recorded data, on its min-max normalized dataset. It scored 81.80% on the dataset before random subsampling and 92.37% on the resampled dataset. Random subsampling thus markedly improved classification accuracy by removing noisy data and replacing them with replicated instances to balance the classes. This research advances methodological and applied knowledge on the capture and classification of human motion using a single camera view.
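As a rough illustration of the feature extraction and min-max normalization steps summarized above, the following Python sketch derives per-joint motion features from a sequence of 2D keypoint coordinates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `joint_kinematics` and `min_max_normalize`, the frame-difference definitions of velocity and acceleration, and the running-sum reading of "cumulative velocity" are all hypothetical choices made for this example.

```python
from math import hypot


def joint_kinematics(track, fps):
    """Derive motion features for one joint.

    track: list of (x, y) pixel coordinates, one entry per video frame.
    fps:   frame rate of the video (frames per second).
    The definitions below are assumptions, not taken from the paper.
    """
    dt = 1.0 / fps
    # Speed between consecutive frames, in pixels per second.
    velocity = [
        hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    # Running sum of the per-frame speeds (assumed "cumulative velocity").
    cumulative = []
    total = 0.0
    for v in velocity:
        total += v
        cumulative.append(total)
    # Change in speed between consecutive intervals, per second.
    acceleration = [(v1 - v0) / dt for v0, v1 in zip(velocity, velocity[1:])]
    return velocity, cumulative, acceleration


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example: a hypothetical wrist track over four frames at 1 frame per second.
wrist = [(0, 0), (3, 4), (3, 4), (9, 12)]
velocity, cumulative, acceleration = joint_kinematics(wrist, fps=1)
normalized_velocity = min_max_normalize(velocity)
```

In a full pipeline, features like these would be computed for every joint of the stick model and assembled into one row per frame before being passed to a classifier such as a random forest.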

How to Cite
Liew, Y. L., & Chin, J. F. (2023). Vision-based biomechanical markerless motion classification. Machine Graphics and Vision, 32(1), 3–24.
