[Computational Statistics] Fisher Information

Bright_Ocean 2021. 8. 20. 00:09

This post covers Fisher information, which is needed when discussing the efficiency of an estimator.

1. Motivation

When solving a parameter estimation problem, we rely on information extracted from sample data.

A natural question follows: how much information can the sample data provide about the unknown parameter?

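For a random variable $X\sim f(x|\theta)$, if $\theta$ is the true value, the first derivative of the log-likelihood with respect to the parameter is close to 0 on average.

(This is the basic idea behind maximum likelihood estimation, which looks for the $\theta$ at which this derivative vanishes.)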

Let $l(x|\theta) = \log f(x|\theta)$. Then its first derivative with respect to $\theta$ is as follows, where $f'$ denotes $\partial f/\partial\theta$:

$$l'(x|\theta) = \frac{\partial}{\partial \theta}\log f(x|\theta) = \frac{f'(x|\theta)}{f(x|\theta)}$$

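Here, $l'(X|\theta)$ being close to 0 is exactly what is expected, so such an observation provides little information about $\theta$. Conversely, a value far from 0 in either direction is informative.

The motivation, therefore, is to use $[l'(X|\theta)]^2$ as a measure of the information that the random variable $X$ provides.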

2. Fisher Information

Fisher information is defined as follows.

$$I(\theta) = E_{\theta}[l'(X|\theta)^2] = \int l'(x|\theta)^2 f(x|\theta) dx$$
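As a concrete illustration (an example that is not in the original derivation), assume $X\sim N(\theta, \sigma^2)$ with known variance $\sigma^2$ and unknown mean $\theta$. Under that assumption the definition gives:

$$\begin{align} l(x|\theta) &= -\frac{1}{2}\log(2\pi\sigma^2) - \frac{(x-\theta)^2}{2\sigma^2}\\ l'(x|\theta) &= \frac{x-\theta}{\sigma^2}\\ I(\theta) &= E_{\theta}\left[\frac{(X-\theta)^2}{\sigma^4}\right] = \frac{\sigma^2}{\sigma^4} = \frac{1}{\sigma^2} \end{align}$$

So in this model, the smaller the noise variance, the more information a single observation carries about the mean.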

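Since $\int f(x|\theta) dx = 1$ for every $\theta$, and assuming differentiation and integration can be interchanged, we have $\int f'(x|\theta) dx = \frac{\partial}{\partial \theta}\int f(x|\theta) dx = 0$. It follows that

$E_{\theta}[l'(X|\theta)] = \int l'(x|\theta)f(x|\theta)dx = \int \frac{f'(x|\theta)}{f(x|\theta)}f(x|\theta)dx = 0$.

Because this expectation is 0, the second moment of $l'(X|\theta)$ equals its variance, so the Fisher information can be written as follows.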

$$I(\theta) = Var_{\theta}[l'(X|\theta)]$$
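The zero-mean and variance properties can be checked numerically. The following is a minimal Monte Carlo sketch, assuming a normal model $N(\theta, \sigma^2)$ with known $\sigma$ (an illustrative choice), for which the score is $(x-\theta)/\sigma^2$ and the Fisher information is $1/\sigma^2$. The function names, parameter values, and seed are all hypothetical.

```python
import random


def normal_score(x, theta, sigma):
    """Score of N(theta, sigma^2) with respect to the mean: (x - theta) / sigma^2."""
    return (x - theta) / sigma ** 2


def simulate_scores(theta=1.0, sigma=2.0, n=100_000, seed=0):
    """Draw n samples at the true theta and evaluate the score at each one."""
    rng = random.Random(seed)
    return [normal_score(rng.gauss(theta, sigma), theta, sigma) for _ in range(n)]


scores = simulate_scores()

# Sample mean of the score: should be near 0.
score_mean = sum(scores) / len(scores)

# Sample variance of the score: should be near 1 / sigma^2 = 0.25.
score_var = sum((s - score_mean) ** 2 for s in scores) / len(scores)

print(f"mean of score     = {score_mean:.4f}")
print(f"variance of score = {score_var:.4f}")
```

With these assumed settings, the printed mean should be close to 0 and the variance close to 0.25, up to Monte Carlo error.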

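However, the most commonly used form does not involve this variance. To derive it, differentiate $l'(x|\theta)$ once more with respect to $\theta$.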

$$\begin{align} l''(x|\theta) &= \frac{\partial}{\partial \theta}\left[ \frac{f'(x|\theta)}{f(x|\theta)} \right]\\ &=\frac{f''(x|\theta)f(x|\theta)-[f'(x|\theta)]^2}{[f(x|\theta)]^2}\\ &=\frac{f''(x|\theta)}{f(x|\theta)}-[l'(x|\theta)]^2 \end{align}$$

Now consider the expectation of $l''(X|\theta)$.

$$\begin{align} E_{\theta}[l''(X|\theta)] &= \int \left[ \frac{f''(x|\theta)}{f(x|\theta)}-[l'(x|\theta)]^2 \right] f(x|\theta) dx\\ &=\int f''(x|\theta)dx-E_{\theta}\left[[l'(X|\theta)]^2\right]\\ &= 0-E_{\theta}\left[[l'(X|\theta)]^2\right]\\ &=-I(\theta) \end{align}$$

The derivation above is easy to follow once we recall that $\int f''(x|\theta)dx = \frac{\partial^2}{\partial \theta^2}\int f(x|\theta)dx = 0$.

In other words, the Fisher information equals the negative of the expected second derivative of the log-likelihood with respect to $\theta$.

$$I(\theta) = -E_{\theta}[l''(X|\theta)] = -\int\frac{\partial^2}{\partial \theta^2}[\log f(x|\theta)]f(x|\theta)dx$$
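To see the squared-score form and the second-derivative form agree numerically, here is a rough sketch under an assumed model: $X\sim N(0, \theta)$ with the variance $\theta$ as the unknown parameter, for which $I(\theta) = 1/(2\theta^2)$. The derivatives are approximated by central finite differences, so the step size `h`, the sample size, and all function names are illustrative assumptions rather than part of the derivation.

```python
import math
import random


def log_density(x, theta):
    """Log-density of N(0, theta), where theta is the variance."""
    return -0.5 * math.log(2 * math.pi * theta) - x ** 2 / (2 * theta)


def score(x, theta, h=1e-4):
    """Central-difference approximation of l'(x|theta)."""
    return (log_density(x, theta + h) - log_density(x, theta - h)) / (2 * h)


def second_derivative(x, theta, h=1e-4):
    """Central-difference approximation of l''(x|theta)."""
    return (
        log_density(x, theta + h)
        - 2 * log_density(x, theta)
        + log_density(x, theta - h)
    ) / h ** 2


def fisher_estimates(theta=2.0, n=200_000, seed=0):
    """Monte Carlo estimates of E[l'^2] and -E[l''] at the true theta."""
    rng = random.Random(seed)
    sd = math.sqrt(theta)
    sum_sq_score = 0.0
    sum_second = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sd)
        sum_sq_score += score(x, theta) ** 2
        sum_second += second_derivative(x, theta)
    return sum_sq_score / n, -sum_second / n


squared_score_form, curvature_form = fisher_estimates()
exact = 1 / (2 * 2.0 ** 2)  # 1 / (2 * theta^2) with theta = 2

print(f"E[l'^2] estimate : {squared_score_form:.4f}")
print(f"-E[l''] estimate : {curvature_form:.4f}")
print(f"exact value      : {exact:.4f}")
```

Both estimates should land near the exact value 0.125, up to Monte Carlo and finite-difference error. In this model the second-derivative estimate has the smaller variance, which may be one practical reason this form is often preferred.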
