
Bibliographic record - detail view

 
Authors: Shimmei, Machi; Matsuda, Noboru
Title: Can't Inflate Data? Let the Models Unite and Vote: Data-Agnostic Method to Avoid Overfit with Small Data
[Conference paper] Paper presented at the International Conference on Educational Data Mining (EDM) (16th, Bengaluru, India, Jul 11-14, 2023).
Source: (2023), (10 pages)
Availability: PDF full text (free file)
Language: English
Document type: printed; online; monograph
Keywords: Artificial Intelligence; Training; Natural Language Processing; Educational Research; Data; Classification; Accuracy; Models; Questioning Techniques; Program Effectiveness
Abstract: We propose VELR (Voting-based Ensemble Learning with Rejection), an innovative, effective, and data-agnostic method for training a deep neural network model with an extremely small training dataset. In educational research and practice, providing valid labels for a sufficient amount of data for supervised learning can be very costly and is often impractical. The shortage of training data often results in deep neural networks overfitting. There are many methods to avoid overfitting, such as data augmentation and regularization. However, data augmentation is considerably data dependent and does not usually work well for natural language processing tasks, and regularization is often quite task specific and costly. To address this issue, we propose an ensemble of overfitting models with uncertainty-based rejection. We hypothesize that misclassifications can be identified by estimating the distribution of the class-posterior probability P(y|x) as a random variable. The proposed VELR method is data independent and requires neither changes to the model structure nor re-training of the model. Empirical studies demonstrated that VELR achieved a classification accuracy of 0.7 with only 200 samples per class on the CIFAR-10 dataset, although 75% of input samples were rejected. VELR was also applied to a question-generation task using a BERT language model with only 350 training data points, which resulted in generating questions that are indistinguishable from human-generated questions. The paper concludes that VELR has potential applications to a broad range of real-world problems where misclassification is very costly, which is quite common in the educational domain. [For the complete proceedings, see ED630829.] (As Provided).
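As a rough illustration of the voting-with-rejection idea summarized in the abstract (an ensemble of independently trained, possibly overfit models whose prediction is accepted only when enough members agree), here is a minimal Python sketch. It is not the authors' implementation: it replaces the paper's estimate of the class-posterior distribution P(y|x) with plain majority voting, and the function name, the agreement_threshold parameter, and the dummy models are illustrative assumptions.

from collections import Counter

def vote_with_rejection(models, x, agreement_threshold=0.8):
    # models: iterable of callables, each mapping a sample x to a class label.
    # Returns the majority-vote label, or None (reject) when agreement is too low.
    votes = [m(x) for m in models]
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return label if agreement >= agreement_threshold else None

# Toy usage: five dummy "models" that disagree on the sample.
dummy_models = [lambda x: 0, lambda x: 0, lambda x: 0, lambda x: 1, lambda x: 1]
print(vote_with_rejection(dummy_models, x=None))                            # None (rejected: only 60% agree)
print(vote_with_rejection(dummy_models, x=None, agreement_threshold=0.5))   # 0

Raising the agreement threshold trades coverage for accuracy, which mirrors the trade-off reported in the abstract (0.7 accuracy on CIFAR-10 with 75% of inputs rejected).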
Notes: International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Indexed by: ERIC (Education Resources Information Center), Washington, DC
Last updated: 2024/1/01
