SIAST: A Slot Imbalance-Aware Self-Training Scheme for Semi-Supervised Slot Filling

Abstract

Slot filling with scarce labelled data can leverage recent advances in self-training methods. However, existing self-training models ignore the imbalanced slot distributions prevalent in many slot filling datasets. These methods can exacerbate label imbalance over successive training iterations, degrading performance on minority slot classes, which are crucial in many dialogue system applications. To address this, we propose a novel self-training scheme for imbalanced slot filling that learns unbiased margins between slot classes while mitigating potential slot confusion, and adaptively samples pseudo-labelled data to balance the slot distribution of the training set. Experimental results show that our method achieves significant improvements on minority slots while also setting a new state of the art for semi-supervised slot filling.
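
To give a rough sense of the adaptive sampling idea mentioned above, the sketch below rebalances a pool of pseudo-labelled utterances by preferring those whose slot labels are under-represented in the current labelled set. This is a simplified illustration only, not the method from the paper; the helper names (`slot_counts`, `balanced_pseudo_sample`) and the BIO-style toy data are assumptions introduced purely for exposition.

```python
# Illustrative sketch (NOT the paper's implementation): class-balanced selection
# of pseudo-labelled utterances, favouring utterances that contain slot labels
# which are rare in the labelled training set so far.
from collections import Counter


def slot_counts(dataset):
    """Count occurrences of each slot label (ignoring 'O') in a labelled set."""
    counts = Counter()
    for _, labels in dataset:
        counts.update(l for l in labels if l != "O")
    return counts


def balanced_pseudo_sample(labelled, pseudo_labelled, budget):
    """Pick up to `budget` pseudo-labelled utterances, scoring each one by how
    rare its slot labels are relative to the current labelled set."""
    counts = slot_counts(labelled)
    max_count = max(counts.values(), default=1)

    def rarity(labels):
        slots = [l for l in labels if l != "O"]
        if not slots:
            return 0.0
        # Higher score when the utterance carries slots seen few times so far.
        return sum(max_count / (counts.get(l, 0) + 1) for l in slots) / len(slots)

    ranked = sorted(pseudo_labelled, key=lambda ex: rarity(ex[1]), reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    labelled = [
        (["book", "a", "flight", "to", "paris"], ["O", "O", "O", "O", "B-city"]),
        (["fly", "to", "rome"], ["O", "O", "B-city"]),
    ]
    pseudo = [
        (["leave", "at", "noon"], ["O", "O", "B-time"]),    # rare slot: B-time
        (["go", "to", "berlin"], ["O", "O", "B-city"]),     # majority slot: B-city
    ]
    picked = balanced_pseudo_sample(labelled, pseudo, budget=1)
    print(picked)  # the B-time utterance is preferred, since it is under-represented
```

In this toy setup, the utterance carrying the under-represented B-time slot is selected before another B-city example, which is the general balancing behaviour the abstract describes.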

Publication
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Yuehuan He
Manager, AI and Machine Learning

My research interests include applied machine learning, natural language processing, operations research, and optimization.