Measuring the Intrinsic Dimension of Objective Landscapes

Link
Abstract

Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. We slowly increase the dimension of this subspace, note at which dimension solutions first appear, and define this to be the intrinsic dimension of the objective landscape. The approach is simple to implement, computationally tractable, and produces several suggestive conclusions. Many problems have smaller intrinsic dimensions than one might suspect, and the intrinsic dimension for a given dataset varies little across a family of models with vastly different sizes. This latter result has the profound implication that once a parameter space is large enough to solve a problem, extra parameters serve directly to increase the dimensionality of the solution manifold. Intrinsic dimension allows some quantitative comparison of problem difficulty across supervised, reinforcement, and other types of learning where we conclude, for example, that solving the inverted pendulum problem is 100 times easier than classifying digits from MNIST, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10. In addition to providing new cartography of the objective landscapes wandered by parameterized models, the method is a simple technique for constructively obtaining an upper bound on the minimum description length of a solution. A byproduct of this construction is a simple approach for compressing networks, in some cases by more than 100 times.

Synth

Problem:: Neural networks use huge numbers of parameters, but how many are actually needed is unclear / no method exists for measuring the intrinsic complexity of the objective landscape / a quantitative metric is needed for comparing parameter efficiency and problem difficulty

Solution:: Train the network in a random subspace (R^d) instead of the native parameter space (R^D) / define the smallest dimension at which solutions first appear as the "Intrinsic Dimension" / gradually increase the subspace dimension and measure the point where performance reaches 90% of the baseline (d_int90)

Novelty:: A simple method for quantitatively measuring the intrinsic complexity of diverse problems and models / discovery that the Intrinsic Dimension stays roughly constant even as model size grows / enables difficulty comparisons across learning types (supervised, reinforcement) / provides an efficient approach to network compression

Note:: First to introduce the concept of Intrinsic Dimension / on random(-label) MNIST a CNN's Intrinsic Dimension grows while an FC network's does not → Intrinsic Dimension can serve as a definition of problem difficulty
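The subspace-training procedure behind the method can be sketched in a few lines. The toy quadratic objective, the dimensions, and the loss-threshold stand-in for the 90%-of-baseline criterion below are all illustrative assumptions for this sketch, not the paper's actual setup (which trains real networks and compares validation accuracy against a directly trained baseline):

```python
import numpy as np

rng = np.random.default_rng(0)
D, k = 100, 5   # native dimension D; the toy loss uses only the first k coordinates

# Toy stand-in for a training objective (an assumption for this sketch):
# its intrinsic dimension is at most k.
def loss(theta):
    return float(np.sum(theta[:k] ** 2))

def grad(theta):
    g = np.zeros_like(theta)
    g[:k] = 2 * theta[:k]
    return g

def train_in_subspace(d, steps=3000, lr=0.5):
    """Optimize theta = theta0 + P @ theta_d over the d subspace parameters only."""
    theta0 = rng.normal(size=D)               # random init, frozen during training
    P = rng.normal(size=(D, d)) / np.sqrt(D)  # random projection, ~unit-norm columns
    theta_d = np.zeros(d)                     # training starts exactly at theta0
    for _ in range(steps):
        theta = theta0 + P @ theta_d
        theta_d -= lr * (P.T @ grad(theta))   # chain rule: dL/dtheta_d = P^T dL/dtheta
    return loss(theta0 + P @ theta_d)

# Sweep d upward and record the first dimension that solves the problem.  The
# paper's d_int90 uses 90% of a directly trained baseline's performance; with a
# loss whose optimum is 0, a small loss threshold plays the same role here.
dint90 = next(d for d in range(1, D + 1) if train_in_subspace(d) < 0.1)
```

Only `theta_d` is trained; `theta0` and `P` stay fixed, so the optimizer wanders a random d-dimensional slice of the D-dimensional landscape.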

Summary

Motivation

Method

file-20250404014901614.png

Full parameter space of dimension 3 (left); optimization path when the Intrinsic Dimension is 2 (right)
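A minimal numeric version of this picture, under illustrative assumptions (not the paper's experiment): in a native space of dimension D = 3 with a loss that depends on two coordinates, the solution set is a line and the intrinsic dimension is 2, so a random 1-D subspace generically misses the solutions while a random 2-D subspace generically intersects them.

```python
import numpy as np

rng = np.random.default_rng(2)

def min_loss_in_subspace(d):
    """Exact minimum of loss(theta) = theta_0^2 + theta_1^2 over a random
    d-dimensional affine subspace theta = theta0 + P @ theta_d of R^3."""
    theta0 = rng.normal(size=3)   # random offset (the initialization)
    P = rng.normal(size=(3, d))   # random subspace directions
    A, b = P[:2, :], theta0[:2]   # the loss only sees the first two coordinates
    theta_d, *_ = np.linalg.lstsq(A, -b, rcond=None)  # least-squares minimizer
    return float(np.sum((b + A @ theta_d) ** 2))

loss_1d = min_loss_in_subspace(1)  # generically > 0: the solution line is missed
loss_2d = min_loss_in_subspace(2)  # ~0: a 2-D subspace reaches the solutions
```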

Efficient projection methods
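A dense D × d projection matrix costs O(Dd) memory and compute, which becomes prohibitive when D reaches the millions; the paper therefore also uses sparse random projections and the Fastfood transform. A minimal sketch of the sparse variant with SciPy, where the dimensions and density choice are illustrative assumptions:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
D, d = 100_000, 100          # illustrative sizes; real networks push D far higher
density = 1.0 / np.sqrt(D)   # common density choice for sparse random projections

# Sparse P with +-1 entries, scaled so each column has roughly unit norm
# (each column holds about D * density nonzeros).
signs = lambda n: rng.choice([-1.0, 1.0], size=n)
P = sparse.random(D, d, density=density, random_state=0, data_rvs=signs)
P = P.tocsr() / np.sqrt(D * density)

theta_d = rng.normal(size=d)
offset = P @ theta_d         # subspace -> native space in O(nnz) instead of O(D * d)
```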

Validating the method

MNIST experiment results

CIFAR-10 experiment results

Reinforcement learning (RL) experiment results