LLMs — Parameter Efficient Fine-tuning
In the preceding articles, I walked through the fundamentals of language models, covering the main model architectures and their training objectives. I also explored the importance of crafting effective prompts and configuring the models accordingly, and looked at techniques for fine-tuning models on single-task and multi-task objectives, along with the metrics used to evaluate their performance.
Fine-tuning LLMs for specific tasks is not without hurdles. A primary challenge is “catastrophic forgetting,” where adapting the model to a new task degrades what it learned for earlier ones. Full fine-tuning is also computationally demanding and memory-intensive: every weight in the model is updated, and the gradients and optimizer states must be held in memory alongside the weights themselves. These costs multiply when each task requires its own fully fine-tuned copy of the model, which quickly becomes a storage and deployment burden. On top of this, large-scale pre-trained language models (PLMs) carry a prohibitive price tag for fine-tuning, restricting accessibility. Finally, fine-tuning relies on task-specific datasets; if these are small or biased, the fine-tuned model’s performance may be compromised and fail to generalize to new, unseen data.
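To make the memory point concrete, here is a back-of-the-envelope sketch. The 7B parameter count and the fp32 Adam training setup are illustrative assumptions, not figures from any specific model:

```python
# Rough memory estimate for full fine-tuning with the Adam optimizer in fp32.
# The 7B parameter count is an illustrative assumption, not a specific model.
num_params = 7e9

bytes_per_param = (
    4    # model weights (fp32)
    + 4  # gradients (fp32)
    + 8  # Adam optimizer states: first and second moments (fp32 each)
)

total_gb = num_params * bytes_per_param / 1e9
print(f"~{total_gb:.0f} GB for weights, gradients, and optimizer states")
# -> ~112 GB, before counting activations or a second task-specific copy
```

And that figure covers a single task; serving several fine-tuned tasks means multiplying it by the number of full model copies you keep around.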
However, the upcoming article takes a deeper dive into a potential solution, Parameter Efficient Fine-Tuning (PEFT), and aims to…
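As a preview of what that looks like in practice, below is a minimal sketch using the Hugging Face peft library with LoRA, one common PEFT method. The model name and hyperparameters here are illustrative assumptions, not a recommendation:

```python
# A minimal LoRA sketch with the Hugging Face peft library.
# The model choice and hyperparameters below are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # sequence-to-sequence fine-tuning
    r=8,                # rank of the low-rank update matrices
    lora_alpha=32,      # scaling factor applied to the low-rank update
    lora_dropout=0.05,  # dropout on the LoRA layers during training
)

# Freezes the base model and injects small trainable adapter matrices.
model = get_peft_model(base_model, lora_config)

# Prints the count of trainable vs. total parameters; with LoRA only a
# small fraction of the weights are trained, the rest stay frozen.
model.print_trainable_parameters()
```

Because the base weights stay frozen, each additional task only needs its own small adapter checkpoint rather than a full copy of the model, which speaks directly to the storage and deployment burden described above.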