Description
See the mailing list "[pydata] Covariance matrix not positive semi-definite."
Currently, the covariance matrix is computed from pairwise-available observations, i.e., if a row has missing data in some column but not in the two columns being paired, that row is still used for that pair's covariance. Each entry can therefore be estimated from a different subset of rows. The result is not a true covariance matrix and may fail to be positive semi-definite.
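A small made-up example (the data below are invented for illustration, not taken from the mailing list thread) shows how this can happen. Each pair of columns overlaps on a different block of rows, so the three off-diagonal entries are mutually inconsistent and the matrix gets a negative eigenvalue:

```python
import numpy as np
import pandas as pd

nan = np.nan

# Hypothetical data: every pair of columns is jointly observed on a
# different block of three rows, and no row is fully observed.
df = pd.DataFrame({
    "x": [1, 2, 3, 1, 2, 3, nan, nan, nan],
    "y": [1, 2, 3, nan, nan, nan, 1, 2, 3],
    "z": [nan, nan, nan, 1, 2, 3, 3, 2, 1],
})

# DataFrame.cov() uses pairwise-complete observations for each entry.
pairwise_cov = df.cov()

# A valid covariance matrix has no negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(pairwise_cov.to_numpy())
min_eigenvalue = eigenvalues.min()

print(pairwise_cov)
print(eigenvalues)
```

Here x and y move together, x and z move together, but y and z move in opposite directions, which no single joint distribution could produce at these magnitudes.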
What to do in this case? 1) Warn? 2) Raise an error? 3) Only use observations for which all variables are available?
Option 3 is tempting because the resulting matrix will be a true covariance matrix, but it's an inconsistent estimator of the covariance.
My vote is for 2, so that the user is forced to think about what they actually want to compute. Ideally, the error message will point to estimators that are appropriate for this situation, but these are not available yet (from statsmodels).
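As a rough sketch of what option 2 could look like from the user's side, assuming a hypothetical wrapper named `cov_or_raise` (this is not an existing pandas function, and the error text is only a placeholder):

```python
import pandas as pd


def cov_or_raise(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: refuse to compute a covariance with missing data."""
    if frame.isna().any().any():
        # Placeholder message; a real one could point to suitable estimators.
        raise ValueError(
            "Input contains missing values; a pairwise covariance may not be "
            "positive semi-definite. Decide explicitly how to handle them."
        )
    # With no missing values, every entry uses the same rows.
    return frame.cov()


complete = pd.DataFrame({"x": [1.0, 2.0, 4.0], "y": [2.0, 1.0, 5.0]})
print(cov_or_raise(complete))
```

The point of raising rather than warning is that the caller has to make the choice (drop rows, impute, or use a dedicated estimator) before getting a result.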