Transformer-based models have dominated the field of natural language processing (NLP) since their introduction in 2017. A transformer tokenizes its input text into words, morphemes, punctuation, and so on. However, since the model must attend to every token in the input, the context window must grow to handle long tasks such as book summarization, where the number of input tokens can easily exceed one hundred thousand. To handle inputs of arbitrary length, a group of researchers at Carnegie Mellon University proposes a general strategy for improving model performance by integrating pretrained encoder-decoder transformers with an external datastore.
Unlimiformer is a new retrieval-based approach that extends the input-length tolerance of pretrained language models at test time. Any existing encoder-decoder transformer can be upgraded with Unlimiformer to accept inputs of unbounded length. Given a long input sequence, Unlimiformer builds a datastore over the hidden states of all input tokens. The decoder then uses its standard cross-attention to query the datastore and attend to the top-k input tokens. The datastore supports sublinear search and can be kept in GPU or CPU memory. Unlimiformer can be applied to a trained model's checkpoint without any further training, and its effectiveness can be further improved by fine-tuning.
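The core idea of indexing all input-token hidden states and retrieving the top-k matches can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses brute-force inner-product search for clarity, whereas a real datastore would rely on an approximate-nearest-neighbor library to achieve the sublinear search mentioned above. All function names here are illustrative.

```python
import numpy as np

def build_datastore(encoder_hidden_states):
    # Stack the hidden states of all input tokens into one (num_tokens, dim)
    # index. A real system would build an ANN index (e.g. with FAISS) here;
    # a plain array with brute-force search is shown for clarity.
    return np.asarray(encoder_hidden_states, dtype=np.float32)

def knn_search(datastore, query, k):
    # Score every stored token state against the query vector and
    # return the indices and scores of the k best matches.
    scores = datastore @ query
    top_k = np.argsort(-scores)[:k]
    return top_k, scores[top_k]

# Toy example: 6 input-token states of dimension 4, unit-normalised so
# that inner product equals cosine similarity.
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 4))
data /= np.linalg.norm(data, axis=1, keepdims=True)
store = build_datastore(data)
# Querying with token 2's own state must return token 2 as the best match.
idx, vals = knn_search(store, store[2], k=2)
```

Because the search touches only the encoder's stored states, the decoder itself never needs a context window as long as the input.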
The maximum input length of a transformer is limited by the size of the encoder's context window. During decoding, however, different pieces of information can matter at different steps, and different attention heads may focus on different aspects of the data. A fixed context window can therefore be inefficient, as it forces every head to attend over the same tokens regardless of what that head needs to prioritize. Unlimiformer instead lets each head, at each decoding step, select its own context window from the entire input. To formalize this, the researchers inject an Unlimiformer lookup into the decoder before cross-attention is applied: the model performs a k-nearest-neighbor (kNN) search in the external datastore, selecting a set of tokens to attend to for each decoder layer and attention head.
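The injected lookup can be sketched as ordinary scaled dot-product cross-attention restricted to the retrieved subset of tokens. The snippet below is a simplified single-head, single-query illustration under assumed shapes, not the paper's actual code; in the real model the keys and values are projections of the encoder hidden states rather than raw arrays.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def retrieval_cross_attention(query, keys, values, k):
    # Step 1: kNN lookup - pick the k keys with the highest dot-product
    # score, instead of attending over every encoded input token.
    scores = keys @ query
    top = np.argsort(-scores)[:k]
    # Step 2: standard scaled dot-product attention, but computed only
    # over the retrieved subset, so cost is O(k) rather than O(input length).
    weights = softmax(scores[top] / np.sqrt(query.shape[0]))
    return weights @ values[top]

rng = np.random.default_rng(1)
keys = rng.normal(size=(10, 8))    # states of 10 input tokens, dim 8
values = rng.normal(size=(10, 8))
query = keys[4]                    # a decoder query, taken from token 4's state
out = retrieval_cross_attention(query, keys, values, k=3)
```

Because each head issues its own query, each head effectively selects its own k-token context window from the full input, which is exactly the flexibility described above.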
To further increase Unlimiformer's effectiveness, the researchers also examine training approaches. As a first step, they consider alternative training methods that require less processing power than the conventional fine-tuning regimen. They also investigate the computationally more expensive option of training Unlimiformer directly.
The code and trained models are available on GitHub.
Empirically, the team tested Unlimiformer on long-document and multi-document summarization tasks, showing that it could summarize documents of up to 350,000 tokens without truncating the inputs. Existing pretrained models were also fine-tuned with Unlimiformer, allowing them to handle unlimited-length inputs without any newly learned weights or changes to the source code. Retrieval-augmented large language models have shown encouraging results on downstream sequence-to-sequence generation tasks, and the researchers believe future work could further improve performance, for example by adding structure to the datastore or by retrieving embeddings in chunks. The information-retrieval community has also developed a broad range of techniques for improving retrieval, which could further enhance retrieval-augmented LLMs on challenging downstream tasks. To make adoption easy, the researchers have released a script that injects Unlimiformer into any HuggingFace Transformers model with a single click.
Check out the Paper and GitHub link. Don’t forget to join our 20k+ ML SubReddit, Discord channel, and Email newsletter, where we share the latest news on AI research, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, please do not hesitate to email us at Asif@marktechpost.com
Check out 100s of AI Tools in the AI Tools Club.
Dhanshree Shenwai is a software engineer with solid experience in FinTech companies covering finance, cards & payments, and banking, and a keen interest in AI applications. She is enthusiastic about exploring new technologies and advancements in today’s changing world, making everyone’s life easier.