Instruction Tuning: EMNLP 2023 Overview Part 1

Ayush Kumar
2 min read · Jan 7, 2024


What is Instruction Tuning?

Instruction Tuning, in the context of Large Language Models (LLMs), refers to a specialized training method designed to enhance the model’s ability to understand and respond to user instructions or commands more accurately and effectively. This process involves fine-tuning the model on a dataset that comprises various instruction-input pairs and their corresponding desired outputs or actions.
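To make the data format concrete, here is a minimal sketch of a single instruction-tuning record and how it might be flattened into the prompt/response text a model is fine-tuned on. The field names and prompt template are illustrative assumptions (Alpaca-style datasets use a similar schema), not a specific dataset's exact format.

```python
# A single instruction-tuning record: instruction + optional input + desired output.
instruction_example = {
    "instruction": "Summarize the following article in one sentence.",
    "input": "The EMNLP 2023 conference featured several papers on instruction tuning.",
    "output": "EMNLP 2023 highlighted new methods for instruction tuning.",
}

def to_training_text(record):
    """Flatten a record into the (prompt, target) pair used for supervised fine-tuning.
    The template wording here is a hypothetical example."""
    prompt = (
        f"Instruction: {record['instruction']}\n"
        f"Input: {record['input']}\n"
        "Response:"
    )
    target = " " + record["output"]
    return prompt, target

prompt, target = to_training_text(instruction_example)
```

During fine-tuning, the loss is typically computed only on the target tokens, so the model learns to produce the desired output conditioned on the instruction and input.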

At the recent EMNLP 2023 conference, a major theme that captured significant attention was improving instruction-tuned Large Language Models (LLMs) via: 1) enhancing the quality and quantity of instruction data, and 2) strategically selecting and augmenting training data.

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

Problem Statement: Instruction tuning typically trains on whatever tasks are available, without asking which tasks are worth training on. This paper asks how to identify the most informative tasks so that tuning on them improves generalization to unseen tasks.

Approach Idea:

  • Focuses on the strategic selection of tasks for instruction tuning based on prompt uncertainty: how much the model's predictions change when a task's prompt is perturbed.
  • Tasks with high prompt uncertainty are treated as the most informative and are prioritized for the next round of tuning.
  • The process is iterative: tune the model on the selected tasks, re-measure prompt uncertainty on the remaining pool, and select the next batch.
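
The selection loop above can be sketched as follows. This is a simplified stand-in for the paper's method: `model` is any callable from prompt to answer, and the perturbation used here (random word dropout on the instruction) is a hypothetical proxy for the paper's prompt-perturbation scheme.

```python
import random

def prompt_uncertainty(model, task_prompt, example, n_perturbations=8):
    """Estimate a task's prompt sensitivity: perturb the instruction and count
    how often the model's prediction flips away from the unperturbed answer."""
    base = model(f"{task_prompt}\n{example}")
    words = task_prompt.split()
    flips = 0
    for _ in range(n_perturbations):
        kept = [w for w in words if random.random() > 0.1]  # drop ~10% of words
        perturbed = " ".join(kept) or task_prompt  # guard against dropping everything
        if model(f"{perturbed}\n{example}") != base:
            flips += 1
    return flips / n_perturbations  # fraction of disagreements in [0, 1]

def select_tasks(model, tasks, budget):
    """Pick the `budget` most prompt-sensitive tasks for the next tuning round."""
    scored = [(prompt_uncertainty(model, t["prompt"], t["example"]), t) for t in tasks]
    scored.sort(key=lambda pair: -pair[0])  # highest uncertainty first
    return [t for _, t in scored[:budget]]
```

In an active-tuning loop, `select_tasks` would be called between training rounds, so the task pool adapts to where the current model is least robust.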

Key Result: Actively selecting prompt-uncertain tasks consistently improves cross-task generalization over baseline selection strategies such as random sampling, at the same training budget.

DYNOSAUR: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

Problem Statement: This paper addresses the issue of efficiently curating instruction-tuning data for LLMs. The specific problem is the manual and costly effort required in traditional data curation methods and the need for a more dynamic and automated approach.

Approach Idea:

  • Dynosaur automates the curation of instruction-tuning data for large language models (LLMs).

  • It leverages the metadata from existing datasets to generate relevant data fields and corresponding instructions.
  • The process involves using LLMs (like GPT-3.5-turbo) to synthesize instructions based on dataset descriptions, data fields (like title, text, author), and annotations.
  • It employs a dual approach: description-aware generation (considering dataset descriptions) and description-unaware generation (focusing solely on data fields and annotations), to generate a variety of instruction-tuning tasks.
  • Post-processing includes filtering invalid tasks, organizing instruction data, and adding label spaces for classification tasks.
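
The pipeline above can be sketched in a few functions. The prompt wording and field names here are a hypothetical reconstruction of the idea, not the paper's exact templates, and the LLM call itself is left abstract.

```python
def build_generation_prompt(metadata, description_aware=True):
    """Compose a prompt asking an LLM (e.g. GPT-3.5-turbo) to propose
    instruction-tuning tasks from dataset metadata. With description_aware=False,
    the prompt relies only on data fields and annotations."""
    lines = []
    if description_aware and metadata.get("description"):
        lines.append(f"Dataset description: {metadata['description']}")
    lines.append(f"Available data fields: {', '.join(metadata['fields'])}")
    lines.append(
        "Propose a task: choose input fields, an output field, "
        "and write an instruction relating them."
    )
    return "\n".join(lines)

def postprocess(generated_tasks, valid_fields):
    """Filter invalid generations: keep only tasks whose input and output
    fields actually exist in the dataset."""
    kept = []
    for task in generated_tasks:
        if task["output_field"] in valid_fields and all(
            f in valid_fields for f in task["input_fields"]
        ):
            kept.append(task)
    return kept
```

Because each dataset's metadata yields many field combinations, a single source dataset can produce several distinct instruction-tuning tasks, which is what makes the approach cheap to scale.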

Key Result:

  • Cost Efficiency: Dynosaur significantly reduced the cost of generating instruction-tuning data. For example, generating 800K instruction-tuning samples cost less than $12 using GPT-3.5-turbo, compared to around $500 for other methods like ALPACA and INSTRUCTION GPT-4 on a smaller dataset of 52K instances.
  • Performance Improvement: Models trained with Dynosaur data showed substantial gains. For instance, training T5-3B with Dynosaur yielded a 2.5–22 ROUGE-L improvement and a 2.8–12.8 METEOR improvement over other datasets on benchmarks such as SUPER-NI and LONGFORM.

You can connect with me on LinkedIn. To read the next part, please go through Part 2.
