Enhancing Chatbot Performance through Synthetic Data Augmentation
Date
2023-05
Publisher
The Ohio State University
Abstract
Large language models such as GPT-3 have shown a remarkable ability to generate human-like text. However, they are slow and costly to run, which limits their applicability in some domains, such as specialized chatbot systems. In this paper, we investigate the use of GPT-3 for synthetic data augmentation to improve the performance of a recurrent neural network (RNN) classifier used in a chatbot for COSI (Center of Science and Industry), a museum in Columbus. The goal is to generate user questions that help train the model to classify user inputs under specific response labels, so that users receive the best available answer from a predefined answer dataset. We experimented with several prompting strategies, including providing GPT-3 with answers, real-world context, and few-shot examples. Our findings indicate that augmenting an existing dataset with GPT-3-generated data can increase the classifier's accuracy, even without few-shot examples. Furthermore, GPT-3 can effectively create datasets from scratch, although traditional user data still yields better results. This demonstrates the potential to transfer knowledge from a large model like GPT-3 to a smaller, faster model for practical applications, such as the museum's virtual guide. Additionally, we explored validation strategies that retain only the relevant generated data, which further improved the results. Overall, this paper highlights the promise of using large language models like GPT-3 for synthetic data augmentation to improve lightweight models in real-world applications.
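The prompting and validation steps described above could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `build_prompt` and `keep_consistent`, the prompt wording, and the rule of keeping a generated question only when an existing classifier predicts its intended label are all assumptions made for the example. The abstract does not specify which validation strategies were used, and no GPT-3 call is shown.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def build_prompt(answer: str,
                 context: Optional[str] = None,
                 examples: Sequence[str] = (),
                 n_questions: int = 5) -> str:
    """Assemble a question-generation prompt for one response label.

    Hypothetical helper: the answer is always included, while real-world
    context and few-shot example questions are optional, mirroring the
    three prompting ingredients named in the abstract.
    """
    parts: List[str] = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Answer: {answer}")
    if examples:
        parts.append("Example visitor questions:")
        parts.extend(f"- {q}" for q in examples)
    parts.append(
        f"Write {n_questions} different questions a museum visitor "
        "might ask that this answer would address."
    )
    return "\n".join(parts)


def keep_consistent(candidates: Iterable[Tuple[str, str]],
                    predict: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Keep (question, label) pairs whose intended label matches `predict`.

    Assumed validation rule for illustration; `predict` stands in for any
    classifier that maps a question to a response label.
    """
    return [(q, label) for q, label in candidates if predict(q) == label]
```

In a pipeline of this shape, the string returned by `build_prompt` would be sent to the large model, and the generated questions, paired with the label of the answer they were generated from, would pass through `keep_consistent` before being added to the classifier's training set.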
Keywords
GPT-3, chatbot, COSI, data augmentation