Now, from above, if we take the limit as T becomes infinite, our standard deviation goes to zero. So we know that our measurement will converge if we take enough samples. But does it converge to the actual correct answer, or to something that is off by a bit? Luckily for us (for the Welch method at least), it does converge to the correct answer given infinite samples. This property is sometimes called ``asymptotically unbiased''. Of course, this means that for a finite number of samples there is a bias. Let's quantify this. Most authors that I've seen put this in the form of our old friend spectral leakage, but I found Wirsching et al.'s treatment [5] more enlightening. They expand Welch's PSD estimate (for a rectangular window) in a Taylor series:
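As a rough sketch of the shape of that expansion, using assumed notation ($\hat{G}$ for the estimate, $G$ for the true PSD, $\Delta f$ for the spacing between FFT lines) and an unspecified positive constant $c$ standing in for the numerical factor given in [5], the leading terms look like:

\[
E\left[\hat{G}(f)\right] \approx G(f) + c\,\Delta f^{2}\,G''(f) + \cdots , \qquad c > 0 .
\]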
That is, the expected (mean) value of the power spectral density estimate at a given frequency is equal to the actual power spectral density plus some error terms. Two items of note. First, the error terms drop off as $\Delta f^2$, where $\Delta f$ is the spacing between FFT lines. This is pretty favorable. If you think you have a bias problem, take, say, 4 times as many samples and use 4 times as many FFT lines. If there is a bias, it should decrease by a factor of 16. Second, the leading error term goes as the second derivative of the PSD. When you think about spectral leakage, this should seem reasonable. If the spectrum is flat, there is nothing to leak. Only when there are peaks or troughs in the spectrum will energy leak. As Wirsching et al. point out, the error term is negative for a peak and positive for a trough, so the PSD estimate looks flatter than the actual PSD.
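One way to look for a bias problem in practice is to try the check on a signal whose PSD is known. The following is only a minimal sketch, assuming NumPy and SciPy are available; the test signal (white noise through a two-pole resonator), the helper name `rel_bias_at_f0` and all parameter values are made up for illustration. It measures the relative error of a rectangular-window Welch estimate on the FFT line at the resonance, for two segment lengths. How fast the error shrinks depends on the spectrum and the window, so the sketch just prints the measured values.

```python
import numpy as np
from scipy import signal

def rel_bias_at_f0(x, a, nperseg, f0=0.125, sigma2=1.0):
    """Relative error of a rectangular-window Welch estimate at frequency f0.

    x is white noise of variance sigma2 filtered by 1/A(z), with fs = 1.
    f0 * nperseg must be an integer so that an FFT line sits exactly on f0.
    """
    f, pxx = signal.welch(x, fs=1.0, window="boxcar", nperseg=nperseg,
                          noverlap=0, detrend=False, scaling="density")
    k = int(round(f0 * nperseg))               # index of the FFT line at f0
    z = np.exp(-2j * np.pi * f0)               # z^-1 evaluated on the unit circle
    h = 1.0 / (a[0] + a[1] * z + a[2] * z**2)  # filter response at f0
    true_psd = 2.0 * sigma2 * abs(h) ** 2      # one-sided density for fs = 1
    return (pxx[k] - true_psd) / true_psd

# Hypothetical test signal: a resonance near f = 0.125 (pole radius 0.95).
rng = np.random.default_rng(0)
r, theta = 0.95, 2 * np.pi * 0.125
a = [1.0, -2 * r * np.cos(theta), r**2]
noise = rng.standard_normal(2**20 + 1000)
x = signal.lfilter([1.0], a, noise)[1000:]     # drop the start-up transient

rel_bias = {n: rel_bias_at_f0(x, a, n) for n in (64, 256)}
for n, b in rel_bias.items():
    print(f"nperseg = {n:4d}: relative error at the peak = {b:+.3f}")
```

With this many samples the random scatter of each estimate is small compared with the bias, so a negative error that shrinks as the segments get longer is the flattening of a peak described above.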
The spectral leakage formulation, for what it's worth, is [4]:
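In its usual textbook form, sketched here with assumed notation ($\hat{G}$ for the estimate, $G$ for the true PSD) and assuming the window is normalized so that $\int |W(f)|^{2}\,df = 1$, the expected value of the estimate is the true PSD smeared by the squared magnitude of the window's transform:

\[
E\left[\hat{G}(f)\right] = \int_{-\infty}^{\infty} G(f')\,\left|W(f - f')\right|^{2}\,df'
\]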
where W is the Fourier transform of the window function.