<Minimum Variance Unbiased Estimator>
Let T(X) be an unbiased estimator of g(θ).
Among the available unbiased estimators, which satisfy E(T(X))=g(θ), you can find the most favorable one by comparing their Mean Square Error.
Mean square error, usually denoted MSE, is the expected value of the squared error, which is one of the standard loss functions.
The reason for taking the expected value of the loss function is that T(X) is a function of the random variable X.
So you can find the most favorable estimator of g(θ) by finding the unbiased estimator T(X) that gives the smallest MSE.
MSE = E(T(X)-g(θ))^2 is equal to MSE = var(T(X)) + bias^2, as I proved above.
However, the bias is always 0 for an unbiased estimator because E(T(X)) equals g(θ).
So the T(X) that minimizes the MSE is the T(X) with the smallest variance.
That is, among unbiased estimators, the one with the smaller variance is the better estimator of the function of the parameter.
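Spelled out, the decomposition works as follows: adding and subtracting E(T(X)) inside the square and noting that the cross term has expectation zero gives

$$
E\big[(T(X)-g(\theta))^2\big] = \mathrm{Var}\big(T(X)\big) + \big(E[T(X)]-g(\theta)\big)^2 = \mathrm{Var}\big(T(X)\big) + \mathrm{bias}^2 .
$$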
So this post introduces one of the ways to find the minimum variance unbiased estimator.
There are two ways to find the MVUE (Minimum Variance Unbiased Estimator).
The first is the Cramer-Rao lower bound, and the second is the complete sufficient statistic.
Here I will introduce the Cramer-Rao lower bound, which gives a lower bound on the variance of an unbiased estimator.
<Cramer-Rao lower bound - Information Inequality>
The Cramer-Rao lower bound method is used to find the smallest variance that an unbiased estimator of the function of the parameter can have.
So the smallest variance that such an estimator (and in particular the MVUE) can have is given below.
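In symbols, for an unbiased estimator T(X) of g(θ) based on n i.i.d. observations from a density f(x;θ) that satisfies the usual regularity conditions, the bound reads

$$
\mathrm{Var}\big(T(X)\big) \;\ge\; \frac{\big(g'(\theta)\big)^{2}}{n\,I(\theta)} ,
$$

where I(θ) is the Fisher information of a single observation; when g(θ) = θ the numerator is simply 1.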
<I(θ): Fisher's Information>
You may notice I(θ) in the Information Inequality.
I(θ) is called Fisher's Information.
The bigger I(θ) is, the more information about the parameter θ the observations x carry.
That is because a bigger I(θ) means the likelihood of the observed data reacts more strongly to changes in θ, so the data distinguish nearby values of the parameter more clearly.
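For a single observation with density (or probability mass function) f(x;θ), Fisher's Information is defined as

$$
I(\theta) \;=\; E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X;\theta)\right)^{2}\right] \;=\; -\,E\!\left[\frac{\partial^{2}}{\partial\theta^{2}}\log f(X;\theta)\right],
$$

where the second equality holds under the usual regularity conditions. A large I(θ) means the log-likelihood is, on average, sharply curved around the true θ, so the data pin the parameter down more tightly.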
<Fisher's Information with some famous distributions>
Normal distribution with parameter μ
Exponential distribution with parameter λ
Poisson distribution with parameter λ
Uniform distribution with parameter θ
Bernoulli distribution with parameter p
Binomial distribution with parameter p
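For reference, the standard single-observation Fisher information values in these cases (taking any other parameters as known, and writing the exponential in its rate parametrization) are

$$
\begin{aligned}
&\text{Normal}(\mu,\sigma^{2}),\ \sigma^{2}\ \text{known}: && I(\mu)=1/\sigma^{2} \\
&\text{Exponential, rate } \lambda: && I(\lambda)=1/\lambda^{2} \\
&\text{Poisson}(\lambda): && I(\lambda)=1/\lambda \\
&\text{Bernoulli}(p): && I(p)=\frac{1}{p(1-p)} \\
&\text{Binomial}(n,p): && I(p)=\frac{n}{p(1-p)}
\end{aligned}
$$

The uniform distribution on (0, θ) is the odd one out: its support depends on θ, so the regularity conditions behind Fisher's Information and the Information Inequality do not hold there, which is exactly the situation discussed in the limitation section below.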
<Properties of Fisher's Information>
1. Normal Approximation - Relationship between Maximum likelihood and Fisher's Information
The difference between the estimator derived by maximum likelihood and the true parameter converges in distribution to a normal distribution.
That is, as n gets larger, the error of the MLE is approximately distributed as N(0, 1/(n*I(θ))), so larger Fisher information means a more concentrated estimator (see the simulation sketch after this list).
2.
3. Additivity of Fisher's Information over independent observations
Ix(θ) is the Fisher information about θ contained in a single observation, so it is called the unit Fisher information.
The Fisher information in the whole sample X^n equals n*Ix(θ), where n is the number of independent observations.
You can easily see this by comparing the Fisher information of the parameter p in the Bernoulli distribution and the Binomial distribution.
The Fisher information of p in the Bernoulli distribution is the unit Fisher information because it deals with only a single trial.
And the Binomial distribution is the sum of n independent Bernoulli trials, so it deals with n trials at once.
So you can understand this property through the relationship between the two distributions: the Fisher information n/(p(1-p)) of Binomial(n, p) is exactly n times the unit information 1/(p(1-p)) of a single Bernoulli trial, as the sketch below also checks numerically.
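As a quick numerical check of properties 1 and 3, here is a minimal simulation sketch (assuming NumPy is available; the values of p, n, and the number of repetitions are arbitrary). It draws repeated Bernoulli samples, computes the maximum likelihood estimate p̂ (the sample mean), and compares the empirical variance of the error p̂ - p with the normal-approximation variance 1/(n*I(p)), where n*I(p) = n/(p(1-p)) is the Fisher information of the whole sample.

```python
import numpy as np

rng = np.random.default_rng(0)

p, n, reps = 0.3, 500, 20000      # true parameter, sample size, number of repeated experiments

unit_info = 1.0 / (p * (1 - p))   # unit Fisher information of one Bernoulli trial, I(p) = 1/(p(1-p))
sample_info = n * unit_info       # Fisher information of the whole sample: n * I(p)

# Each row is one experiment with n Bernoulli(p) trials; the MLE of p is the sample mean.
samples = rng.binomial(1, p, size=(reps, n))
p_hat = samples.mean(axis=1)
errors = p_hat - p                # estimation error of the MLE in each experiment

print("empirical variance of the MLE error:", errors.var())
print("normal approximation 1/(n*I(p))    :", 1.0 / sample_info)
```

The two printed numbers should be close, reflecting both the normal approximation of the MLE error and the fact that n independent trials carry n times the unit Fisher information.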
<Information Inequality - Cramer-Rao lower bound>
<Example of using Information Inequality for finding MVUE>
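One standard illustration is estimating λ from X_1, ..., X_n i.i.d. Poisson(λ) with the sample mean:

$$
E(\bar{X})=\lambda,\qquad \mathrm{Var}(\bar{X})=\frac{\lambda}{n},\qquad \frac{1}{n\,I(\lambda)}=\frac{1}{n\cdot(1/\lambda)}=\frac{\lambda}{n} .
$$

Since the unbiased estimator X̄ attains the Cramer-Rao lower bound exactly, the Information Inequality certifies that X̄ is the MVUE of λ.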
<Limitation of Information Inequality>
Even if an estimator's variance does not attain the lower bound given by the Information Inequality, that estimator can still be the minimum variance unbiased estimator.
Also, if the assumptions of the Information Inequality are violated, an unbiased estimator whose variance is smaller than the bound given by the Information Inequality can exist.
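A classic example of the second point is the uniform distribution on (0, θ), where the support depends on θ and the regularity conditions fail. The unbiased estimator based on the sample maximum,

$$
T(X)=\frac{n+1}{n}\,X_{(n)},\qquad \mathrm{Var}\big(T(X)\big)=\frac{\theta^{2}}{n(n+2)},
$$

has variance shrinking at the rate 1/n², faster than the 1/n rate a Cramer-Rao type bound would allow, so in this case the Information Inequality cannot be used to identify the MVUE.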