Enhancing Chatbot Performance through Synthetic Data Augmentation

Date

2023-05

Publisher

The Ohio State University

Abstract

Large language models such as GPT-3 generate remarkably human-like text, but they are slow and costly to run, which limits their use in some domains, such as specialized chatbot systems. In this paper, we investigate using GPT-3 for synthetic data augmentation to improve a recurrent neural network (RNN) classifier used in a chatbot for the Columbus museum COSI (Center of Science and Industry). The goal is to generate user questions for training the model to assign user inputs to specific response labels, so that users receive the best available answer from a predefined answer dataset. We experimented with several prompting strategies, including providing GPT-3 with answers, real-world context, and few-shot examples. Our findings indicate that augmenting an existing dataset with GPT-3-generated data can increase the classifier's accuracy, even without a few-shot strategy. GPT-3 can also create usable datasets from scratch, although data collected from real users still yields better results. This demonstrates the potential to transfer knowledge from a large model like GPT-3 to a smaller, faster model for practical applications, such as the museum's virtual guide. We also explored validation strategies that retain only the relevant generated data, which further improved the results. Overall, the paper shows that large language models like GPT-3 are a promising source of synthetic data for improving lightweight models in real-world applications.
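
The augmentation loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names build_prompt, parse_questions, and augment are hypothetical, the prompt wording is invented for illustration, generate stands in for whatever call returns a GPT-3 completion, and keep stands in for a validation strategy whose details the abstract does not give.

```python
import re
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, str]  # (user question, response label)


def build_prompt(answer: str, context: str = "",
                 few_shot: Sequence[str] = (), n: int = 5) -> str:
    """Assemble a prompt from an answer plus optional context and few-shot examples."""
    parts: List[str] = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Answer: {answer}")
    if few_shot:
        parts.append("Example questions:")
        parts.extend(f"- {q}" for q in few_shot)
    # Hypothetical instruction wording, not taken from the paper.
    parts.append(f"Write {n} different questions a museum visitor might ask "
                 "that this answer responds to, one per line.")
    return "\n".join(parts)


def parse_questions(completion: str) -> List[str]:
    """Split a completion into questions, dropping list markers, blanks, and duplicates."""
    seen = set()
    questions: List[str] = []
    for line in completion.splitlines():
        # Remove a leading "-", "*", "1." or "1)" marker if present.
        q = re.sub(r"^\s*(?:[-*]|\d+[.)])\s*", "", line).strip()
        if q and q.lower() not in seen:
            seen.add(q.lower())
            questions.append(q)
    return questions


def keep_all(question: str, label: str) -> bool:
    """Default validation step: accept every generated question."""
    return True


def augment(real: Sequence[Example],
            answers: Dict[str, str],
            generate: Callable[[str], str],
            keep: Callable[[str, str], bool] = keep_all) -> List[Example]:
    """Return the real examples followed by validated synthetic ones.

    `generate` is an assumed stand-in for the large-model call;
    `keep` is an assumed stand-in for a validation strategy.
    """
    synthetic: List[Example] = []
    for label, answer in answers.items():
        completion = generate(build_prompt(answer))
        for question in parse_questions(completion):
            if keep(question, label):
                synthetic.append((question, label))
    return list(real) + synthetic
```

The combined list would then be used to train the lightweight classifier. Passing the generator and the validation filter as plain callables keeps the sketch independent of any particular API and makes it easy to compare runs with and without validation.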

Keywords

GPT-3, chatbot, COSI, data augmentation
