Text Summarization with Gemma

Overview

The goal of this competition was to create a notebook that demonstrates how to use the Gemma LLM to accomplish one of the following data-science-oriented tasks:

Explain or teach basic data science concepts.
Answer common questions about the Python programming language.
Summarize Kaggle solution write-ups.
Explain or teach concepts from Kaggle competition solution write-ups.
Answer common questions about the Kaggle platform.

I chose text summarization with Gemma and LangChain. The key aspects I discuss are:

Establishing a text summarization pipeline using Gemma and LangChain
Giving an overview of the key parameters and methods to keep in mind when working with an LLM
Exploring summarization techniques, such as Stuffing, MapReduce and Refine
Fine-tuning Gemma using Parameter Efficient Fine-Tuning (PEFT)
Future considerations and next steps
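
As a rough illustration of how the three summarization techniques listed above differ, here is a minimal plain-Python sketch. It is not the notebook's LangChain code: `summarize` is a hypothetical stand-in for any call that sends text to Gemma and returns a summary, and the function names and prompt wording are my own assumptions.

```python
from typing import Callable, List

# Hypothetical stand-in type: any function that maps text to a shorter summary,
# e.g. a wrapper around a Gemma generation call.
Summarizer = Callable[[str], str]


def stuff_summary(chunks: List[str], summarize: Summarizer) -> str:
    """Stuffing: concatenate everything and summarize in a single call."""
    return summarize("\n\n".join(chunks))


def map_reduce_summary(chunks: List[str], summarize: Summarizer) -> str:
    """MapReduce: summarize each chunk independently, then summarize the summaries."""
    partial_summaries = [summarize(chunk) for chunk in chunks]  # map step
    return summarize("\n\n".join(partial_summaries))  # reduce step


def refine_summary(chunks: List[str], summarize: Summarizer) -> str:
    """Refine: build a running summary, updating it with one chunk at a time."""
    if not chunks:
        raise ValueError("refine_summary needs at least one chunk")
    summary = summarize(chunks[0])
    for chunk in chunks[1:]:
        # Illustrative prompt wording only (an assumption, not a library default).
        prompt = f"Existing summary:\n{summary}\n\nNew text:\n{chunk}"
        summary = summarize(prompt)
    return summary
```

Stuffing needs one model call but requires the whole document to fit in the context window. MapReduce makes one call per chunk plus a final call, and its per-chunk calls are independent of each other. Refine makes one call per chunk in sequence, so each step can see what has already been summarized.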

This work aims to build a thorough understanding of the task and to develop a pipeline that can serve as a good starting point for anyone approaching summarization with open models like Gemma.

You can find the Kaggle notebook I published here.