python - Summary not working for OLS estimation
I am having an issue with my statsmodels OLS estimation. The model runs without any issues, but when I try to call summary() so that I can see the actual results, I get a TypeError about the axis needing to be specified when the shapes of a and weights differ.
My code looks like this:
    from __future__ import print_function, division
    import xlrd as xl
    import numpy as np
    import scipy as sp
    import pandas as pd
    import statsmodels.formula.api as smf
    import statsmodels.api as sm

    # Read the first sheet of the workbook into a list of rows
    file_loc = "/users/niklaslindeke/python/dataset_3.xlsx"
    workbook = xl.open_workbook(file_loc)
    sheet = workbook.sheet_by_index(0)
    tot = sheet.nrows
    data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)]
            for r in range(sheet.nrows)]

    rv1 = []
    rv5 = []
    rv22 = []
    rv1fcast = []
    T = []
    price = []
    time = []
    retnor = []
    model = []

    # Skip the header row and collect each column into its own list
    for i in range(1, tot):
        t = data[i][0]
        ret = data[i][1]
        ret5 = data[i][2]
        ret22 = data[i][3]
        ret1_1 = data[i][4]
        retn = data[i][5]
        t = xl.xldate_as_tuple(t, 0)
        rv1.append(ret)
        rv5.append(ret5)
        rv22.append(ret22)
        rv1fcast.append(ret1_1)
        retnor.append(retn)
        T.append(t)

    df = pd.DataFrame({'rvfcast': rv1fcast, 'rv1': rv1, 'rv5': rv5, 'rv22': rv22})
    df = df[df.rvfcast != ""]  # drop rows where the forecast cell is empty
    model = smf.ols(formula='rvfcast ~ rv1 + rv5 + rv22', data=df).fit()
    print model.summary()
In other words, it doesn't work.
The traceback is the following:
    print model.summary()
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-394-ea8ea5139fd4> in <module>()
    ----> 1 print model.summary()

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.6-x86_64.egg/statsmodels/regression/linear_model.pyc in summary(self, yname, xname, title, alpha)
       1948         top_left.append(('Covariance Type:', [self.cov_type]))
       1949
    -> 1950         top_right = [('R-squared:', ["%#8.3f" % self.rsquared]),
       1951                      ('Adj. R-squared:', ["%#8.3f" % self.rsquared_adj]),
       1952                      ('F-statistic:', ["%#8.4g" % self.fvalue] ),

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.6-x86_64.egg/statsmodels/tools/decorators.pyc in __get__(self, obj, type)
         92         if _cachedval is None:
         93             # call "fget" function
    ---> 94             _cachedval = self.fget(obj)
         95             # set attribute in obj
         96             # print("setting %s in cache %s" % (name, _cachedval))

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.6-x86_64.egg/statsmodels/regression/linear_model.pyc in rsquared(self)
       1179     def rsquared(self):
       1180         if self.k_constant:
    -> 1181             return 1 - self.ssr/self.centered_tss
       1182         else:
       1183             return 1 - self.ssr/self.uncentered_tss

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.6-x86_64.egg/statsmodels/tools/decorators.pyc in __get__(self, obj, type)
         92         if _cachedval is None:
         93             # call "fget" function
    ---> 94             _cachedval = self.fget(obj)
         95             # set attribute in obj
         96             # print("setting %s in cache %s" % (name, _cachedval))

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.6-x86_64.egg/statsmodels/regression/linear_model.pyc in centered_tss(self)
       1159         if weights is not None:
       1160             return np.sum(weights*(model.endog - np.average(model.endog,
    -> 1161                                                         weights=weights))**2)
       1162         else:  # broken gls
       1163             centered_endog = model.wendog - model.wendog.mean()

    /users/niklaslindeke/library/enthought/canopy_64bit/user/lib/python2.7/site-packages/numpy/lib/function_base.pyc in average(a, axis, weights, returned)
        522             if axis is None:
        523                 raise TypeError(
    --> 524                     "Axis must be specified when shapes of a and weights "
        525                     "differ.")
        526             if wgt.ndim != 1:

    TypeError: Axis must be specified when shapes of a and weights differ.
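For readers following the traceback: the last frame is `numpy.average`, and the same message can be reproduced in isolation. The sketch below is only an illustration. It assumes the `endog` array ended up two-dimensional while the weights stayed one-dimensional, which is one plausible reading of the traceback and not something it confirms. The array shapes and values are made up.

```python
import numpy as np

# Hypothetical shapes: a 2-D "endog" (4 rows, 3 columns) and 1-D weights.
endog_2d = np.ones((4, 3))
weights = np.ones(4)

try:
    np.average(endog_2d, weights=weights)  # no axis given, shapes differ
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# When endog is 1-D and matches the weights, the same call succeeds.
endog_1d = np.array([1.0, 2.0, 3.0, 4.0])
ok = np.average(endog_1d, weights=weights)
print(ok)
```

If the response variable really is a plain numeric column, `endog` should be one-dimensional and this branch should not raise.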
I am sorry, but I have no idea where to go from there. Also, after this I wish to perform a correction for auto-correlation using the Newey-West method, and I saw that this can be done with the following line:
    mdl = model.get_robustcov_results(cov_type='HAC', maxlags=1)
But when I try to run that, the model returns the error:
    ValueError: operands could not be broadcast together with shapes (256,766) (256,1,256)
I realize that statsmodels.formula may not be compatible with the get_robustcov function. If so, how do I test for auto-correlation then?
But my most pressing issue is the fact that I cannot produce a summary for my OLS.
As requested, here are the first thirty rows of the dataset in df.
    print df
                rv1         rv22          rv5      rvfcast
    0    0.01553801   0.01309511   0.01081393  0.008421236
    1   0.008881671   0.01301336   0.01134905   0.01553801
    2    0.01042178   0.01326669   0.01189979  0.008881671
    3   0.009809431   0.01334593   0.01170942   0.01042178
    4   0.009418737   0.01358808   0.01152253  0.009809431
    5    0.01821364   0.01362502   0.01269661  0.009418737
    6    0.01163536   0.01331585   0.01147541   0.01821364
    7   0.009469907   0.01329509   0.01172988   0.01163536
    8   0.008875018   0.01361841   0.01202432  0.009469907
    9    0.01528914   0.01430873   0.01233219  0.008875018
    10   0.01210761   0.01412724   0.01238776   0.01528914
    11   0.01290773    0.0144439   0.01432174   0.01210761
    12   0.01094212   0.01425895   0.01493865   0.01290773
    13   0.01041433   0.01430177    0.0156763   0.01094212
    14   0.01556703    0.0142857   0.01986616   0.01041433
    15    0.0217775   0.01430253   0.01864532   0.01556703
    16   0.01599228   0.01390088   0.01579069    0.0217775
    17   0.01463037   0.01384096   0.01416622   0.01599228
    18   0.03136361   0.01395866   0.01398807   0.01463037
    19  0.009462822   0.01295695    0.0106063   0.03136361
    20  0.007504367   0.01295204   0.01114677  0.009462822
    21  0.007869922   0.01300863   0.01267322  0.007504367
    22   0.01373964    0.0129547   0.01314553  0.007869922
    23   0.01445476   0.01271198      0.01268   0.01373964
    24   0.01216517   0.01249902   0.01202476   0.01445476
    25    0.0151366   0.01266783    0.0129083   0.01216517
    26   0.01023149   0.01258627    0.0146934    0.0151366
    27   0.01141199   0.01284094   0.01490637   0.01023149
    28   0.01117856   0.01321258   0.01643881   0.01141199
    29   0.01658287   0.01340074   0.01597086   0.01117856
I thank user333800 for the help!
For future reference, in case anyone comes across the same issue.
The following code:
    df = pd.DataFrame({'rvfcast': rv1fcast, 'rv1': rv1, 'rv5': rv5, 'rv22': rv22})
    df = df[df.rvfcast != ""]  # drop rows where the forecast cell is empty
    df = df.astype(float)      # make every column numeric before fitting
    model = smf.ols(formula='rvfcast ~ rv1 + rv5 + rv22', data=df).fit()
    mdl = model.get_robustcov_results(cov_type='HAC', maxlags=1)
gave me:
    print mdl.summary()
                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                rvfcast   R-squared:                       0.681
    Model:                            OLS   Adj. R-squared:                  0.677
    Method:                 Least Squares   F-statistic:                     120.9
    Date:                Wed, 22 Apr 2015   Prob (F-statistic):           1.60e-48
    Time:                        17:19:19   Log-Likelihood:                 1159.8
    No. Observations:                 256   AIC:                            -2312.
    Df Residuals:                     252   BIC:                            -2297.
    Df Model:                           3
    Covariance Type:                  HAC
    ==============================================================================
                     coef    std err          t      P>|t|      [95.0% Conf. Int.]
    ------------------------------------------------------------------------------
    Intercept      0.0005      0.000      2.285      0.023      7.24e-05     0.001
    rv1            0.2823      0.104      2.710      0.007         0.077     0.487
    rv5           -0.0486      0.193     -0.252      0.802        -0.429     0.332
    rv22           0.7450      0.232      3.212      0.001         0.288     1.202
    ==============================================================================
    Omnibus:                      174.186   Durbin-Watson:                   2.045
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):             2152.634
    Skew:                           2.546   Prob(JB):                         0.00
    Kurtosis:                      16.262   Cond. No.                     1.19e+03
    ==============================================================================
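The step that matters in the working version is the conversion of the frame to float. As a rough sketch of why, the snippet below builds a tiny frame with made-up values, where an empty string stands in for a blank spreadsheet cell (an assumption about how the blanks arrived). It shows that filtering out the empty strings leaves the column with dtype object, and that only the explicit cast makes it numeric.

```python
import pandas as pd

# Made-up values; "" stands in for a blank spreadsheet cell.
df = pd.DataFrame({'rvfcast': ["", 0.0155, 0.0089],
                   'rv1': [0.0155, 0.0089, 0.0104]})

dtype_before = df['rvfcast'].dtype        # object: floats mixed with a string

df = df[df.rvfcast != ""]                 # removes the row, dtype is unchanged
dtype_after_filter = df['rvfcast'].dtype

df = df.astype(float)                     # cast every column to float64
dtype_after_cast = df['rvfcast'].dtype

print(dtype_before, dtype_after_filter, dtype_after_cast)
```

Checking `df.dtypes` before fitting is a cheap way to catch this kind of problem early.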
And now I can continue on my paper :)
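On the question of testing for auto-correlation: a common first check is the Durbin-Watson statistic, which the statsmodels summary reports for the residuals. Values near 2 suggest little first-order auto-correlation, and values well below 2 suggest positive auto-correlation. The sketch below computes the statistic by hand with NumPy on synthetic series (the data and the helper name `durbin_watson` here are illustrative only). statsmodels also ships its own version in `statsmodels.stats.stattools.durbin_watson`.

```python
import numpy as np

def durbin_watson(resid):
    """Sum of squared first differences divided by the sum of squares."""
    resid = np.asarray(resid, dtype=float)
    diff = np.diff(resid)
    return np.sum(diff ** 2) / np.sum(resid ** 2)

rng = np.random.RandomState(0)
n = 2000

# Synthetic residuals without serial correlation.
white = rng.normal(size=n)

# Synthetic AR(1) residuals with coefficient 0.8 (strong positive correlation).
ar1 = np.empty(n)
ar1[0] = white[0]
for i in range(1, n):
    ar1[i] = 0.8 * ar1[i - 1] + white[i]

dw_white = durbin_watson(white)
dw_ar1 = durbin_watson(ar1)
print(dw_white, dw_ar1)
```

With a fitted model, the same function could be applied to the model's residuals, for example `durbin_watson(model.resid)`.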