TLDR: If you have an estimate for \(Z\), you can't just take \(e^{\text{estimate}}\) to estimate \(e^Z\).
A bias correction factor of \(e^{\hat\sigma^2/2}\) has to be applied to the "common sense" estimator \(e^{\hat{E(Z)}}\) to correctly estimate \(Y=e^Z\). The right estimate is \(Y=e^Z\ \hat =\ e^{\hat \sigma^2/2}e^{\hat{E(Z)}}\).
Suppose we model \(Z=\ln(Y)\) instead of \(Y\), so that \(Y=e^Z\). We estimate \(E(Z)=\mu\ \hat=\ \hat \mu= f(X)\) based on independent variables \(X\). (Read the symbol \(\hat =\) as "estimated as".)
Given that \(\hat \mu\) estimates \(E(Z)\), a common-sense option to estimate \(E(Y)\) might seem to be \(e^{\hat \mu}\), since \(Y=e^Z\).
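But this will not give the best results - simply because \(E(Y)=E(e^Z)\ne e^{E(Z)}\). In fact, by Jensen's inequality \(e^{E(Z)}<E(e^Z)\) whenever \(Z\) is not constant, so the common-sense estimator systematically underestimates \(E(Y)\).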
When the error \(Z-\hat Z\) is normally distributed with variance \(\sigma^2\), \(E(Y)=e^{\mu+\sigma^2/2}\) - and hence a good estimate of \(E(Y)\) would be \(E(Y)\ \hat=\ e^{\hat \mu+\hat \sigma^2/2}\).
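To see the size of the bias, here is a small simulation sketch. The values of `mu` and `sigma`, the sample size and the seed are arbitrary choices made for illustration, and \(Z\) is assumed to be exactly normal.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
mu, sigma = 1.0, 0.8             # illustrative "true" parameters of Z = ln(Y)

z = rng.normal(mu, sigma, size=200_000)
y = np.exp(z)

mu_hat = z.mean()                # estimate of E(Z)
sigma2_hat = z.var(ddof=1)       # estimate of sigma^2

naive = np.exp(mu_hat)                        # "common sense" e^{mu_hat}
corrected = np.exp(mu_hat + sigma2_hat / 2)   # bias-corrected estimate
sample_mean_y = y.mean()                      # direct average of Y, for reference
true_mean_y = np.exp(mu + sigma**2 / 2)       # lognormal mean

print(f"naive      : {naive:.4f}")
print(f"corrected  : {corrected:.4f}")
print(f"mean of Y  : {sample_mean_y:.4f}")
print(f"true E(Y)  : {true_mean_y:.4f}")
```

With these settings the naive estimate should land well below the other three numbers, which should roughly agree with each other.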
Estimating \(\sigma^2\)
We are used to estimating \(E(Z)\ \hat=\ \hat \mu\), which is just the familiar regression estimate \(\sum \hat \beta_i X_i\). We will now need to estimate \(\sigma^2\) too, to get an accurate point estimate of \(Y=e^Z\).
OLS
If you are running an Ordinary Least Squares regression, an unbiased estimate for \(\sigma^2\) is \(\frac{SSE}{n-k}\), where \(n\) is the number of observations and \(k\) is the number of parameters in the model.
Most statistical packages report this - and if not, you can calculate it as \(\sum (Z-\hat Z)^2/(n-k)\). SAS reports all of these if you use PROC REG. In fact, in SAS \(\hat \sigma\) is already reported as "Root MSE", and you can directly take \(\text{Root MSE}^2\) as an estimate of \(\sigma^2\).
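As a rough sketch of the OLS recipe outside SAS, the snippet below fits a regression with NumPy's `np.linalg.lstsq`, computes \(SSE/(n-k)\) and applies the correction. The data are simulated; the coefficients `0.3` and `1.2`, the error standard deviation `0.5` and the single predictor `x` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
sigma = 0.5                                    # illustrative true error s.d.
x = rng.uniform(0, 2, size=n)
z = 0.3 + 1.2 * x + rng.normal(0, sigma, n)    # Z = ln(Y), made-up coefficients
y = np.exp(z)

X = np.column_stack([np.ones(n), x])           # design matrix with intercept
k = X.shape[1]                                 # number of parameters
beta_hat, *_ = np.linalg.lstsq(X, z, rcond=None)

z_hat = X @ beta_hat
sse = np.sum((z - z_hat) ** 2)
sigma2_hat = sse / (n - k)                     # same quantity as Root MSE squared

y_hat_naive = np.exp(z_hat)                    # uncorrected back-transform
y_hat = np.exp(z_hat + sigma2_hat / 2)         # bias-corrected back-transform

print(f"sigma2_hat = {sigma2_hat:.4f}")
print(f"mean(Y) = {y.mean():.4f}, mean(corrected) = {y_hat.mean():.4f}, "
      f"mean(naive) = {y_hat_naive.mean():.4f}")
```

On average the corrected predictions should track the observed \(Y\) values, while the naive ones should sit lower by a factor of about \(e^{\hat\sigma^2/2}\).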
Other Regression Frameworks (Machine Learning - RandomForest, NN, KNN, etc.)
A generic way of estimating \(\sigma^2\) is to borrow the assumption of homoscedasticity from OLS - i.e. that \(\sigma^2\) does not vary from observation to observation.
Under this assumption, the law of large numbers can be used to show that \(\sum (Z-\hat Z)^2/n\) will converge in probability to \(\sigma^2\), and hence remains a good estimator - even if it may not be unbiased for small \(n\).
If the number of parameters in the model is known, then it is recommended to use \(\sum (Z-\hat Z)^2/(n-k)\), mimicking the OLS estimator - it will correct for the bias to some extent, although for large \(n\) the difference between \(1/(n-k)\) and \(1/n\) will be small.
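The two estimators above can be wrapped in a pair of small helpers that work with the predictions of any model. The function names `estimate_sigma2` and `predict_y` are hypothetical, not from any library. The sketch assumes homoscedastic errors and that `z_hat` comes from data representative of new observations (for very flexible models, held-out predictions are a safer choice, since in-sample residuals can understate \(\sigma^2\)).

```python
import numpy as np

def estimate_sigma2(z, z_hat, k=0):
    """Return sum((z - z_hat)^2) / (n - k); k=0 gives the plain 1/n version."""
    z = np.asarray(z, dtype=float)
    z_hat = np.asarray(z_hat, dtype=float)
    n = z.size
    if n - k <= 0:
        raise ValueError("need more observations than parameters")
    return float(np.sum((z - z_hat) ** 2) / (n - k))

def predict_y(z_hat_new, sigma2_hat):
    """Back-transform log-scale predictions with the e^{sigma^2/2} correction."""
    return np.exp(np.asarray(z_hat_new, dtype=float) + sigma2_hat / 2)

# Tiny made-up example: observed ln(Y) and some model's predictions of it.
z_obs = [1.0, 2.0, 3.0, 4.0]
z_pred = [1.1, 1.9, 3.2, 3.8]

s2_n = estimate_sigma2(z_obs, z_pred)          # divides by n
s2_nk = estimate_sigma2(z_obs, z_pred, k=2)    # divides by n - k, k assumed known
print(s2_n, s2_nk)
print(predict_y(z_pred, s2_n))
```

Here the squared residuals sum to about 0.10, so the two estimates are about 0.025 and 0.05; the gap shrinks as \(n\) grows relative to \(k\).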
Proof
If \(Z=\ln(Y)\sim N(\mu,\sigma^2)\), then \(E(Y)=E(e^Z)=e^{\mu+\sigma^2/2}\). Citing the mean of the lognormal distribution on Wikipedia may work as "proof" in most cases. Just for completeness, a full mathematical proof is also given below.
\[E(e^Z)=\int_{-\infty}^{\infty}{e^z\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(z-\mu)^2}{2\sigma^2}}}\,dz=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{\color{#3366FF}{(z-\mu)^2-2\sigma^2 z}}{2\sigma^2}}}\,dz\]
\[\begin{array}{rcl}
\color{#3366FF}{(z-\mu)^2-2\sigma^2 z}&=&z^2-2\mu z + \mu^2-2\sigma^2z\\
&=&z^2-2(\mu+\sigma^2) z + \mu^2\\
&=&\left(z-(\mu+\sigma^2)\right)^2 + \mu^2-(\mu+\sigma^2)^2\\
&=&\left(z-(\mu+\sigma^2)\right)^2 - 2\mu\sigma^2-\sigma^4\\
&=&\color{green}{\left(z-(\mu+\sigma^2)\right)^2} - \color{red}{2\sigma^2(\mu+\sigma^2/2)}\\
\end{array}\]
\[\begin{array}{rcl}
E(e^Z)&=&\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{\color{green}{\left(z-(\mu+\sigma^2)\right)^2} - \color{red}{2\sigma^2(\mu+\sigma^2/2)}}{2\sigma^2}}}\,dz\\
&=&\color{red}{e^{\mu+\sigma^2/2}}\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{\left(z-(\mu+\sigma^2)\right)^2}{2\sigma^2}}}\,dz\\
&=&\color{red}{e^{\mu+\sigma^2/2}}\\
\end{array}\]
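As a sanity check on the algebra, the integral can also be evaluated numerically and compared with \(e^{\mu+\sigma^2/2}\). The sketch below uses a plain midpoint rule; the function name, the integration range of 12 standard deviations around \(\mu+\sigma^2\) and the step count are arbitrary choices for the illustration.

```python
import math

def lognormal_mean_numeric(mu, sigma, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of the integral of e^z * N(z; mu, sigma^2)."""
    center = mu + sigma**2            # the integrand peaks here (see the proof)
    lo = center - half_width * sigma
    hi = center + half_width * sigma
    dz = (hi - lo) / steps
    norm = 1.0 / (math.sqrt(2 * math.pi) * sigma)
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz       # midpoint of the i-th slice
        density = norm * math.exp(-((z - mu) ** 2) / (2 * sigma**2))
        total += math.exp(z) * density * dz
    return total

mu, sigma = 1.0, 0.8                  # illustrative values
numeric = lognormal_mean_numeric(mu, sigma)
closed_form = math.exp(mu + sigma**2 / 2)
print(numeric, closed_form)
```

The two printed values should agree to many decimal places.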