<Bayesian Statistics>
Bayesian Statistics is a way to update what we know about a parameter by combining prior information with the observed data.
That is, it allows us to form the posterior distribution, which contains all of the information about the parameter, by using conditional probability.
The conditional probability here may be familiar to you if you have studied Bayes' Theorem before.
The key point is that in Bayesian Statistics the parameter is not regarded as a fixed constant. We regard it as a random variable, which is exactly what lets us use prior information about the parameter.
The reason Bayesian Statistics is important for estimation is that we can find estimators of parameters from the posterior distribution.
<Bayes Theorem>
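For reference, Bayes' Theorem states that for events A and B with P(B) > 0,

P(A | B) = P(B | A)·P(A) / P(B).

In the parameter-and-data form used throughout this post, with π(θ) the prior pdf and f(x | θ) the pdf of the data given θ,

π(θ | x) = f(x | θ)·π(θ) / ∫ f(x | θ)·π(θ) dθ.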
<The way to find the Posterior distribution of a parameter>
We can find the posterior distribution of a parameter, which contains all of the information about the parameter, by combining the prior information with the sample observations. First we set the prior pdf of the parameter, denoted π(θ), by considering whatever prior information we have about it. For example, if the random variable is 'getting a head in a coin toss', we may use the prior information that the parameter p is close to 1/2.
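In general, if f(x1, ..., xn | θ) is the joint pdf of the sample given θ, the posterior pdf is

π(θ | x1, ..., xn) = f(x1, ..., xn | θ)·π(θ) / ∫ f(x1, ..., xn | θ)·π(θ) dθ ∝ f(x1, ..., xn | θ)·π(θ).

Since the denominator does not depend on θ, we usually work with the proportional form and recognize the posterior from its kernel.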
<the Posterior distribution of Ber(p) when Xi indicates getting a head in a coin toss>
The random variable here is 'getting a head in a coin toss', and we use the prior information that p is close to 1/2. To set the prior pdf, we can consider several distributions, and the Beta(2,2) distribution turns out to be appropriate. I will show the graph of the Beta(2,2) distribution using R below. By using Bayesian Statistics, we then obtain a posterior distribution which is also a Beta distribution. In general, if the prior distribution and the posterior distribution belong to the same distribution family, the prior distribution is called a "Conjugate Prior Distribution." We can show that the Beta family of distributions is a conjugate family of prior distributions for this model.

We should also consider the situation where we know nothing about the parameter before observing the random sample. The uniform distribution is usually used for that case. In this example Uni(0,1) is used, because p, the probability of getting a success, is between 0 and 1. Note that Uni(0,1) is the same as Beta(1,1), so even this "know nothing" prior stays inside the Beta family.
This is the Beta distribution with parameters 2, 2 that I drew using R. You can see that the density is highest at x = 0.5. So if we have prior information that the parameter is close to 1/2, then we can use Beta(2,2) as the prior distribution.
> curve(dbeta(x, 2, 2), from = 0, to = 1)   # density of Beta(2,2) on [0,1]
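To see where the Beta posterior comes from, here is a sketch of the usual calculation. Let X1, ..., Xn be iid Ber(p) and let y = Σxi be the number of heads. With the Beta(2,2) prior, π(p) ∝ p(1−p), we get

π(p | x1, ..., xn) ∝ p^y (1−p)^(n−y) · p(1−p) = p^(y+1) (1−p)^(n−y+1),

which is the kernel of a Beta(y + 2, n − y + 2) distribution. So the posterior is again a Beta distribution.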
<the Posterior distribution of Ber(p)>
This is a more general case than the previous example. Even when we do not have much prior information and simply set Beta(α, β) as the prior distribution, the calculation below shows that the Beta distribution is the conjugate prior distribution of the Bernoulli distribution.
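Here is a sketch of the general calculation. With y = Σxi out of n Bernoulli observations and a Beta(α, β) prior, π(p) ∝ p^(α−1) (1−p)^(β−1), we get

π(p | x1, ..., xn) ∝ p^y (1−p)^(n−y) · p^(α−1) (1−p)^(β−1) = p^(y+α−1) (1−p)^(n−y+β−1),

which is the kernel of Beta(α + y, β + n − y). The prior and the posterior are both Beta distributions, which is exactly what conjugacy means. (The Beta(2,2) case above is α = β = 2.)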
<the Posterior distribution of Poi(θ)>
As the calculation below shows, the Gamma distribution is the conjugate prior distribution of the Poisson distribution.
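Here is a sketch of that calculation. I assume the Gamma(α, β) prior is written in the rate parametrization, π(θ) ∝ θ^(α−1) e^(−βθ); with the scale parametrization the update looks slightly different. For X1, ..., Xn iid Poi(θ) with y = Σxi,

π(θ | x1, ..., xn) ∝ θ^y e^(−nθ) · θ^(α−1) e^(−βθ) = θ^(y+α−1) e^(−(β+n)θ),

which is the kernel of Gamma(α + y, β + n). So the posterior is again a Gamma distribution.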
<the Posterior distribution of Bin(n,p)>
The Beta distribution is also the conjugate prior distribution of the Binomial distribution; the calculation is essentially the same as in the Bernoulli case, as the sketch below shows.
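A sketch for a single observation X ~ Bin(m, p) with a Beta(α, β) prior; I write m for the number of trials so it is not confused with the sample size n used elsewhere:

π(p | x) ∝ p^x (1−p)^(m−x) · p^(α−1) (1−p)^(β−1) = p^(x+α−1) (1−p)^(m−x+β−1),

so the posterior is Beta(α + x, β + m − x). With several Binomial observations the successes and failures simply accumulate in the two parameters.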
<the Posterior distribution of N(μ, σ^2)>
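Here is a sketch of the standard result for this case, assuming σ^2 is known and the prior on the mean is μ ~ N(μ0, τ^2), where μ0 and τ^2 are my notation for the prior mean and prior variance. For X1, ..., Xn iid N(μ, σ^2),

π(μ | x1, ..., xn) ∝ exp(−Σ(xi − μ)^2 / (2σ^2)) · exp(−(μ − μ0)^2 / (2τ^2)),

and completing the square in μ shows that the posterior is again normal:

μ | x1, ..., xn ~ N( ((n/σ^2)·x̄ + (1/τ^2)·μ0) / (n/σ^2 + 1/τ^2), 1 / (n/σ^2 + 1/τ^2) ),

where x̄ denotes the sample mean. So the normal family is the conjugate prior family for the mean of a normal distribution with known variance.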
<Find Estimator using Bayesian Statistics>
When it comes to finding an estimator, you should take account of the loss function, which measures the distance between the parameter and the estimate. We want an estimate that makes the loss small. However, since θ is a realized value of the random variable Θ, the loss itself is random and cannot be minimized directly; instead we choose the estimator that minimizes the posterior mean of the loss function. When we use squared error as the loss function, the estimator becomes the mean of the posterior distribution, and when we use absolute error, the estimator becomes the median of the posterior distribution.
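In symbols, with squared error loss L(θ, a) = (θ − a)^2, the Bayes estimate is the value a that minimizes the posterior expected loss E[(Θ − a)^2 | x1, ..., xn]. Differentiating with respect to a,

d/da E[(Θ − a)^2 | x1, ..., xn] = −2E[Θ | x1, ..., xn] + 2a = 0, so a = E[Θ | x1, ..., xn],

the posterior mean. A similar argument with L(θ, a) = |θ − a| gives the posterior median.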
<Estimator of p in Ber(p) using Bayesian Statistics>
We use Beta(2,2) as the prior distribution when the parameter is assumed to be close to 1/2.
By using Bayesian Statistics, the posterior is Beta(Σxi + 2, n − Σxi + 2), and the estimator (the posterior mean) is (n(sample mean) + 2)/(n + 4). Compared with the sample mean n(sample mean)/n, which is the result of the Method of Moments, MLE, and MVUE, this adds 2 to the numerator and 4 to the denominator. The estimator calculated from Bayesian Statistics is therefore pulled closer to 1/2 than the estimators from those other methods.
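As a quick check, here is a minimal sketch in R comparing the sample mean with the Beta(2,2) Bayes estimator on simulated data (the seed, the sample size, and the true p = 0.5 are illustrative choices of mine):

> set.seed(1)                            # arbitrary seed, for reproducibility
> n <- 10
> x <- rbinom(n, size = 1, prob = 0.5)   # n simulated Ber(0.5) tosses
> mean(x)                                # sample mean: Method of Moments / MLE / MVUE
> (sum(x) + 2) / (n + 4)                 # Bayes estimator, pulled toward 1/2

For small n the two estimates can differ noticeably; as n grows they converge to each other.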
<Estimator of p in Ber(p) using Bayesian Statistics - general case>
From this method, the estimator comes out as a weighted average of the sample mean and the mean of the prior distribution. This makes sense: when n gets larger, the sample mean deserves more credibility, so it gets more weight. In contrast, when n is small, the influence of the sample mean should be smaller, so more weight goes to the mean of the prior distribution. The decomposition below makes the weights explicit.
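Writing y = Σxi, the posterior mean of Beta(α + y, β + n − y) splits into exactly this weighted average:

(α + y)/(α + β + n) = [n/(α + β + n)]·x̄ + [(α + β)/(α + β + n)]·[α/(α + β)],

where x̄ = y/n is the sample mean and α/(α + β) is the mean of the Beta(α, β) prior. The two weights always sum to 1, and the weight on x̄, n/(α + β + n), goes to 1 as n → ∞.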
<Estimator of μ in N(μ, σ^2) using Bayesian Statistics>
From Bayesian Statistics, we again obtain the estimator as a weighted average of the sample mean and the mean of the prior distribution. As before, when n gets larger the sample mean gets more credibility, and when n is small more weight goes to the mean of the prior distribution.

We also have to consider one more thing here: the variance of the prior distribution. If the variance of the prior distribution is large, the mean of the prior distribution loses credibility, because the prior is not concentrated around its mean. So when the prior variance is large, the sample mean, not the mean of the prior distribution, gains more weight.

Combining the two facts: if the sample size is very large or the variance of the prior distribution is very large, the estimator is approximately just the sample mean. In contrast, if the sample size is small and the variance of the prior distribution is very small, the estimator is approximately just the mean of the prior distribution.
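Explicitly, with the N(μ0, τ^2) prior from the posterior section above (σ^2 known), the posterior mean is

E[μ | x1, ..., xn] = [(n/σ^2)/(n/σ^2 + 1/τ^2)]·x̄ + [(1/τ^2)/(n/σ^2 + 1/τ^2)]·μ0.

As n → ∞ or τ^2 → ∞ the weight on the sample mean x̄ tends to 1, and as τ^2 → 0 (with n fixed) the weight on the prior mean μ0 tends to 1, which is exactly the behavior described above.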