On the use of long-term features in a newborn cry diagnostic system

F. Salehian Matikolaie, C. Tadj

This study proposes a novel combination of short-term and long-term features, drawn from different timescales, for an automatic newborn cry diagnostic system that differentiates the cry audio signals (CASs) of healthy infants from those of infants with respiratory distress syndrome (RDS). We hypothesized that the differences between these groups may occur on several timescales. Mel-frequency cepstral coefficients (MFCCs) were used as the short-term features, while melody and rhythm features obtained from longer timescales were used as the long-term features. A support vector machine model was used to generate the final classification. Among other findings, the best results were obtained from the combination of all three feature sets (the MFCCs and the rhythm and melody features) in the expiration episode, whereas the combination of MFCCs and tilt features improved the classifier performance in the inspiration episode. In terms of the F-score, the tilt features alone were the strongest features in the inspiration experiment for differentiating infants with RDS from healthy infants. The results indicate that combining short-term and long-term features provides a better classification method for differentiating the CASs of healthy infants from those of infants with RDS. Moreover, the results confirm the importance of long-term features in expiration and inspiration episodes as diagnostic markers between the two groups.

Keywords: Long-term features, Melody, Rhythm, Short-term features, Mel-frequency cepstral coefficient, Support vector machine, Newborn infant cry, Expiration and inspiration cry.
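
As an illustration of the feature combination and the F-score measure mentioned in the abstract, the following minimal Python sketch may be helpful. It is not the authors' implementation. It assumes that frame-level short-term features (such as MFCCs) are summarised by their per-coefficient mean and standard deviation and then concatenated with the long-term features of the same cry episode. The function names summarise_frames, fuse_features, and f_score, and the toy values, are hypothetical.

```python
from statistics import mean, pstdev


def summarise_frames(frames):
    """Collapse frame-level vectors into one fixed-length vector.

    Assumption: per-coefficient mean and population standard deviation
    are used as the summary; the paper may summarise differently.
    """
    columns = list(zip(*frames))  # one tuple per coefficient
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def fuse_features(short_term_frames, long_term_features):
    """Concatenate summarised short-term features with long-term features.

    Assumption: the combination is a plain concatenation per cry episode.
    """
    return summarise_frames(short_term_frames) + list(long_term_features)


def f_score(y_true, y_pred, positive=1):
    """F-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two frames of two coefficients plus one long-term value.
toy_frames = [[1.0, 2.0], [3.0, 4.0]]
toy_long_term = [0.5]
fused = fuse_features(toy_frames, toy_long_term)
```

Fused vectors of this kind could then be passed to any support vector machine implementation, with f_score applied to the resulting predictions for the RDS class.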