A Fully Automated Approach for Baby Cry Sounds Segmentation in a Realistic Clinical Environment and Boundary Detection of Corresponding Expiratory and Inspiratory Episodes

L. Abou-Abbas, C. Tadj, H. F. Alaie

Detection of cry sounds is an important pre-processing step for applications involving cry analysis, such as diagnostic systems, electronic monitoring systems, emotion detection, and robotics for baby caregivers. Given its complexity, automatic cry segmentation is a challenging problem. A new framework for automatic cry sound segmentation, intended for use in a cry-based diagnostic system, is proposed. We studied the contribution of various additional time-frequency domain features to increasing the robustness of a GMM/HMM-based cry segmentation system in noisy environments. We introduced a fully automated segmentation algorithm that extracts the cry sound components, namely audible expiration and inspiration, based on two approaches: statistical analysis based on GMM/HMM classifiers, and a post-processing method based on intensity, zero crossing rate, and fundamental frequency feature extraction. The main focus of this paper is to extend the systems developed in our previous works with a post-processing stage comprising a set of corrective and enhancing tools that improve classification performance. The full approach allows us to precisely determine the start and end points of the expiratory and inspiratory components of a cry signal, EXP and INSV respectively, in any given sound signal. Experimental results indicate the effectiveness of the proposed solution: EXP and INSV detection rates of approximately 94.29% and 92.16%, respectively, were achieved by applying a 10-fold cross-validation technique to avoid over-fitting.
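To make two of the post-processing features named above concrete, the following is a minimal sketch of frame-level intensity and zero crossing rate computation. It is an illustration only and not the implementation used in this work: the function names, the 25 ms frame length, the 10 ms hop, the 16 kHz sampling rate, and the synthetic test signal are all assumptions chosen for the example. Fundamental frequency estimation is not covered here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def intensity_db(frames, eps=1e-12):
    """Short-time intensity: mean squared amplitude of each frame, in dB."""
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + eps)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Hypothetical test signal: 0.5 s of low-level noise followed by a 0.5 s, 400 Hz tone.
sr = 16000                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
t = np.arange(sr // 2) / sr
background = 1e-3 * rng.standard_normal(sr // 2)
tone = 0.5 * np.sin(2 * np.pi * 400 * t)
x = np.concatenate([background, tone])

frames = frame_signal(x, frame_len=400, hop=160)   # assumed 25 ms frames, 10 ms hop
energy = intensity_db(frames)
zcr = zero_crossing_rate(frames)
```

In this synthetic example, frames inside the tone show high intensity and a low zero crossing rate, while the noise-only frames show low intensity and a high zero crossing rate. Contrasts of this kind are what make the two features useful for refining segment boundaries, although the thresholds and decision rules would have to be tuned on real recordings.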