Gradient Surgery for Multi-Task Learning

Link
Abstract

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge. Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks to enable more efficient learning. However, the multi-task setting presents a number of optimization challenges, making it difficult to realize large efficiency gains compared to learning tasks independently. The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood. In this work, we identify a set of three conditions of the multi-task optimization landscape that cause detrimental gradient interference, and develop a simple yet general approach for avoiding such interference between task gradients. We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient. On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance. Further, it is model-agnostic and can be combined with previously-proposed multi-task architectures for enhanced performance.

Synth

Problem:: In multi-task learning, conflicts between task gradients (gradient interference) degrade both efficiency and performance.

Solution:: Use a gradient projection technique (PCGrad): a conflicting gradient is projected onto the normal plane of the other task's gradient, minimizing interference.

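Novelty:: Defines the causes of optimization difficulty in multi-task learning / shows mathematically that PCGrad can resolve those causes / first to consider both the direction and the magnitude of gradients (earlier work considered each separately)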

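Note:: Rather than directly showing how gradient conflict evolves over training, the paper reports the ratio of positive cosine similarities. Even with PCGrad applied, this ratio oscillated around 0.5.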

Summary

Motivation

The tragic triad of multi-task learning

file-20250329022338251.png

A 2D optimization problem designed to produce the loss landscape shown in (a), comparing Adam with the proposed method.

Method

PCGrad(Projecting Conflicting Gradients)

file-20250329023013687.png

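(a) When the gradients conflict, each is projected onto the normal plane of the other, as in (b) and (c). When there is no conflict, as in (d), the gradients are used unchanged.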
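The projection step described above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. It assumes each task gradient has already been flattened into a vector, and it treats a negative dot product as the conflict criterion. The function name `pcgrad` and its arguments are hypothetical. For a conflicting pair, the update is g_i ← g_i − (g_i · g_j / ‖g_j‖²) g_j.

```python
import numpy as np


def pcgrad(grads, rng=None):
    """Combine per-task gradients using PCGrad-style projection (sketch).

    grads: array-like of shape (num_tasks, num_params), one flattened
           gradient per task (an assumption of this sketch).
    Returns the sum of the de-conflicted gradients, shape (num_params,).
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(grads, dtype=float)
    projected = grads.copy()
    num_tasks = grads.shape[0]

    for i in range(num_tasks):
        # Visit the other tasks in random order.
        for j in rng.permutation(num_tasks):
            if j == i:
                continue
            dot = projected[i] @ grads[j]
            # A negative dot product means the two gradients conflict.
            if dot < 0.0:
                # Remove the component of g_i that points against g_j,
                # leaving g_i on the normal plane of g_j.
                projected[i] -= dot / (grads[j] @ grads[j]) * grads[j]

    return projected.sum(axis=0)


# Conflicting pair: each gradient is projected before summing.
update = pcgrad([[1.0, 0.0], [-1.0, 1.0]])
```

With only two tasks the result does not depend on the random order. With more tasks the order can change the outcome, so passing a seeded `rng` makes runs reproducible.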

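Theoretical proof

Method validation

Multi-task supervised learning experiments

Multi-task reinforcement learning experiments

Empirical analysis of the tragic triad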

file-20250329025411117.png
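The Note in the Synth section refers to the ratio of positive cosine similarities between task gradients. One way to compute such a statistic for a single training step is sketched below. The helper names `cosine_similarity` and `positive_cosine_ratio` are hypothetical, and the paper's exact measurement protocol may differ.

```python
from itertools import combinations

import numpy as np


def cosine_similarity(g_i, g_j, eps=1e-12):
    """Cosine of the angle between two flattened gradients."""
    denom = np.linalg.norm(g_i) * np.linalg.norm(g_j) + eps
    return float(g_i @ g_j / denom)


def positive_cosine_ratio(grads):
    """Fraction of task pairs whose gradients have positive cosine similarity."""
    grads = np.asarray(grads, dtype=float)
    pairs = list(combinations(range(len(grads)), 2))
    positives = sum(cosine_similarity(grads[i], grads[j]) > 0 for i, j in pairs)
    return positives / len(pairs)


# Three toy task gradients: only the first pair points the same way.
ratio = positive_cosine_ratio([[1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]])
```

Logging this value at every step gives a curve comparable in spirit to the one mentioned in the Note. A value near 0.5 means about half of the task pairs agree in direction.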