Improving GPT-2 Throughput for Lossless Text Compression

Date

2024-05

Publisher

The Ohio State University

Abstract

Compression helps with handling the enormous amount of data (hundreds of millions of terabytes) generated daily. Data with redundancies can be compressed at higher rates when we have better models of the data. Given large language models' (LLMs') impressive performance at modeling text, they seem well suited for lossless text compression. We implement a lossless text compressor that uses arithmetic coding with GPT-2. A naive GPT-2-based compressor is slow: it compresses 200-500 bytes per second on a GPU, compared to almost two million bytes per second for 7-Zip (a general-purpose compressor). Though such a compressor's outputs are about 30% smaller than 7-Zip's, the poor compression speed limits its practicality. Hence, we investigate how various LLM optimizations affect our compressor's compression performance and speed. We achieve a 1.5x speedup without significant degradation in compression performance, and a further speedup (over 2x) if we accept some loss in compression performance. Additionally, we find that increasing model size improves compression performance more than increasing context size, and that distilled models compress better than pruned models.
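
The sketch below is not the thesis's implementation; it is a minimal illustration, under stated assumptions, of the idea the abstract describes: GPT-2's next-token distribution drives an arithmetic coder, so tokens the model predicts well cost very few bits. Names such as PRECISION, token_frequencies, encode, and decode are illustrative choices, and a practical coder would use an integer range coder with renormalization and emit a bit stream rather than a single exact fraction.

# Minimal sketch of LM-driven arithmetic coding (assumed names and parameters,
# not the thesis code). Requires torch and transformers.
from fractions import Fraction

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

PRECISION = 1 << 16  # assumed quantization level for integer frequencies


def token_frequencies(prefix_ids):
    """Quantized next-token frequencies given the tokens seen so far."""
    with torch.no_grad():
        logits = model(torch.tensor([prefix_ids])).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    freqs = (probs * PRECISION).long() + 1  # +1 keeps every token encodable
    return freqs.tolist()


def encode(text):
    """Narrow the interval [low, low + width) once per token; return a point in it."""
    ids = tokenizer.encode(text)
    low, width = Fraction(0), Fraction(1)
    prefix = [tokenizer.bos_token_id]
    for tok in ids:
        freqs = token_frequencies(prefix)
        total, cum_lo = sum(freqs), sum(freqs[:tok])
        low += width * Fraction(cum_lo, total)   # shift to the token's sub-interval
        width *= Fraction(freqs[tok], total)     # shrink by the token's probability
        prefix.append(tok)
    return low + width / 2, len(ids)  # any point inside the final interval works


def decode(code, n_tokens):
    """Replay the same model distributions to recover the token sequence."""
    low, width = Fraction(0), Fraction(1)
    prefix, out = [tokenizer.bos_token_id], []
    for _ in range(n_tokens):
        freqs = token_frequencies(prefix)
        total = sum(freqs)
        target = (code - low) / width * total  # position inside the current interval
        cum = 0
        for tok, f in enumerate(freqs):
            if cum + f > target:  # sub-interval containing the code point
                low += width * Fraction(cum, total)
                width *= Fraction(f, total)
                out.append(tok)
                prefix.append(tok)
                break
            cum += f
    return tokenizer.decode(out)


text = "hello world"
code, n = encode(text)
assert decode(code, n) == text

Because decoding replays the exact same GPT-2 forward passes as encoding, the model must run deterministically on both sides, and the cost of one forward pass per token is why a naive compressor of this kind only reaches a few hundred bytes per second.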

Keywords

large language models, lossless text compression, transformer acceleration
