Early marriage remains a pressing issue among adolescents in Lombok, Indonesia, driven by cultural norms, educational barriers, and economic challenges. This study develops an emotion classification and reason identification framework for a virtual counseling chatbot to support prevention efforts. Five functional emotion categories (‘Enthusiastic’, ‘Gentle’, ‘Analytical’, ‘Inspirational’, and ‘Cautionary’) were defined to capture counseling tones. The system fine-tunes IndoBERT in two phases. Phase 1 used a balanced dataset of 2,000 samples and achieved a macro F1-score of 0.95; Phase 2 refined the model on 10,000 imbalanced pseudo-labeled samples, yielding a macro F1-score of 0.89 and improved sensitivity to minority classes. In addition, a semantic similarity-based reason identification module classifies user inputs into Education, Economy, Religion, or Culture categories, enhancing context awareness beyond simple keyword matching. Performance was evaluated with accuracy, precision, recall, and F1-score, supported by confusion matrices and training plots for generalization analysis. A descriptive emotion-to-gesture mapping was also designed to link each emotion category with static body pose visualizations, providing a conceptual basis for future multimodal applications.
Keywords: early marriage, emotion classification, gesture mapping, IndoBERT, NLP, virtual chatbot
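
Because both phases are reported in terms of macro F1-score, a brief illustration of that metric may be useful. The sketch below is not the evaluation code used in this study; it is a minimal pure-Python version of the standard definition (the unweighted mean of per-class F1 scores), and the function names `macro_f1` and `accuracy` as well as the toy labels are hypothetical, chosen only to show why macro F1 is sensitive to minority classes on imbalanced data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (standard macro F1)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data (not from the study): 8 majority and 2 minority samples,
# with a classifier that always predicts the majority class.
y_true = ["Gentle"] * 8 + ["Cautionary"] * 2
y_pred = ["Gentle"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred, ['Gentle', 'Cautionary']):.3f}")
```

In this toy case accuracy remains high while macro F1 drops sharply, since the ignored minority class contributes an F1 of zero with the same weight as the majority class. This is the sense in which a macro-averaged score reflects minority-class sensitivity.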