Sorry, it's not a great title. Here is a simple example, though:
(pandas version 0.16.1)
    df = pd.DataFrame({ 'x':range(1,5), 'y':[1,1,1,9] })

This works fine:

    df.apply( lambda x: x > x.mean() )

           x      y
    0  False  False
    1  False  False
    2   True  False
    3   True   True

Shouldn't this work the same?
    df.apply( lambda x: x.mean() < x )
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-467-6f32d50055ea> in <module>()
    ----> 1 df.apply( lambda x: x.mean() < x )

    c:\users\ei\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
       3707                     if reduce is None:
       3708                         reduce = True
    -> 3709                     return self._apply_standard(f, axis, reduce=reduce)
       3710             else:
       3711                 return self._apply_broadcast(f, axis)

    c:\users\ei\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce)
       3797             try:
       3798                 for i, v in enumerate(series_gen):
    -> 3799                     results[i] = func(v)
       3800                     keys.append(v.name)
       3801             except Exception as e:

    <ipython-input-467-6f32d50055ea> in <lambda>(x)
    ----> 1 df.apply( lambda x: x.mean() < x )

    c:\users\ei\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\ops.pyc in wrapper(self, other, axis)
        586             return NotImplemented
        587         elif isinstance(other, (np.ndarray, pd.Index)):
    --> 588             if len(self) != len(other):
        589                 raise ValueError('Lengths must match to compare')
        590             return self._constructor(na_op(self.values, np.asarray(other)),

    TypeError: ('len() of unsized object', u'occurred at index x')

As a counter-example, these both work:

    df.mean() < df
    df > df.mean()
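As a quick sanity check, the whole-frame comparison can be compared against the column-wise `apply` form. This is only a minimal sketch, assuming a reasonably recent pandas install; the names `via_apply` and `via_frame` are illustrative and not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'x': range(1, 5), 'y': [1, 1, 1, 9]})

# Column-wise comparison through apply (the form that works in the question)
via_apply = df.apply(lambda col: col > col.mean())

# Whole-frame comparison; df.mean() is aligned against the column labels
via_frame = df > df.mean()

# Check whether both approaches produce the same boolean frame
print(via_apply.equals(via_frame))
```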
Edit:

Finally found the bug: issue 9369.

As indicated in the issue:

"left = 0 > s works (e.g. a Python scalar). So I think this is being treated as a 0-dim array (it's an np.int64) (and not as a scalar when called). I'll mark it as a bug. Feel free to dig in."

The issue occurs when a NumPy datatype (like np.int64 or np.float64, etc.) is on the left side of the comparison operator. A simple fix may be, as @santon noted in their answer, to convert the number to a Python scalar rather than using a NumPy scalar.
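A minimal sketch of that workaround, assuming the same `df` as in the question: `float()` turns the NumPy scalar returned by `mean()` into a plain Python float before it goes on the left side of the comparison. The variable names `m` and `result` are illustrative. On pandas versions where the bug is fixed, the conversion is harmless but unnecessary.

```python
import pandas as pd

df = pd.DataFrame({'x': range(1, 5), 'y': [1, 1, 1, 9]})

m = df['x'].mean()
# Series.mean() typically returns a NumPy scalar (numpy.float64)
print(type(m))

# Workaround: convert to a Python scalar before using it as the left operand
result = df.apply(lambda col: float(col.mean()) < col)
print(result)
```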
Old answer:

I tried this in pandas 0.16.2.

I did the following on the original df:
    In [22]: df['z'] = df['x'].mean() < df['x']

    In [23]: df
    Out[23]:
       x  y      z
    0  1  1  False
    1  2  1  False
    2  3  1   True
    3  4  9   True

    In [27]: df['z'].mean() < df['z']
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-27-afc8a7b869b4> in <module>()
    ----> 1 df['z'].mean() < df['z']

    c:\anaconda3\lib\site-packages\pandas\core\ops.py in wrapper(self, other, axis)
        586             return NotImplemented
        587         elif isinstance(other, (np.ndarray, pd.Index)):
    --> 588             if len(self) != len(other):
        589                 raise ValueError('Lengths must match to compare')
        590             return self._constructor(na_op(self.values, np.asarray(other)),

    TypeError: len() of unsized object

This seems like a bug to me. I can compare a boolean column with an int column or with the mean of an int column, and vice versa, just fine; the issue only comes up when comparing the mean of a boolean column with a boolean column (though I do not think it makes sense to take the mean() of a boolean):
    In [24]: df['z'] < df['x']
    Out[24]:
    0    True
    1    True
    2    True
    3    True
    dtype: bool

    In [25]: df['z'] < df['x'].mean()
    Out[25]:
    0    True
    1    True
    2    True
    3    True
    Name: z, dtype: bool

    In [26]: df['x'].mean() < df['z']
    Out[26]:
    0    False
    1    False
    2    False
    3    False
    Name: z, dtype: bool

I also tried pandas 0.16.1 and reproduced the issue there. It can be reproduced using:
    In [10]: df['x'].mean() < df['x']
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-10-4e5dab1545af> in <module>()
    ----> 1 df['x'].mean() < df['x']

    /opt/anaconda/envs/np18py27-1.9/lib/python2.7/site-packages/pandas/core/ops.pyc in wrapper(self, other, axis)
        586             return NotImplemented
        587         elif isinstance(other, (np.ndarray, pd.Index)):
    --> 588             if len(self) != len(other):
        589                 raise ValueError('Lengths must match to compare')
        590             return self._constructor(na_op(self.values, np.asarray(other)),

    TypeError: len() of unsized object

    In [11]: df['x'] < df['x'].mean()
    Out[11]:
    0     True
    1     True
    2    False
    3    False
    Name: x, dtype: bool

It seems the bug has been fixed in pandas version 0.16.2 (except when mixing booleans with integers). I would suggest upgrading your pandas version using:
pip install pandas --upgrade