Question

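I am trying to slice a dataframe according to conditions defined beforehand in a separate array. While looping over that array to find the relevant slices of the dataframe, I ran into a problem. The first iteration works fine, but the loop breaks during the second iteration, throwing TypeError: len() of unsized object.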

Here is a sample of the dataframe:

    std     sterr   Z       smooth
0   5.1     2.28    0       7.640484
1   5.13    2.29    0.1     7.532409
2   5.15    2.3     0.21    7.406423
3   5.17    2.31    0.31    7.267842
4   5.19    2.32    0.42    7.121988
5   5.21    2.33    0.52    6.974179
6   5.23    2.34    0.62    6.829734
7   5.25    2.35    0.73    6.693973
8   5.27    2.36    0.83    6.584009
9   5.29    2.37    0.94    6.49429
10  5.31    2.38    1.04    6.427032

This is the code for the loop:

turnz = df.ix[np.array(turn_iloc), 'Z']
c = 0.
print "turn points", np.array(turnz)
for i, zi in enumerate(np.array(turnz)):
    z0 = c
    print z0, zi, type(z0), type(zi)
    x = df.loc[((z0<=df['Z'])& (df['Z']<=zi)), 'Z']
    y = df.loc[((z0<=df['Z'])& (df['Z']<=zi)), 'smooth']
    print len(x), len(y)
    print type(x), type(y)
    c = zi

These are the printouts:

turn points [ 1.04  2.19  2.5   4.06]
0.0 1.04 <type 'float'> <type 'numpy.float64'>
11 11
<class 'pandas.core.series.Series'> <class 'pandas.core.series.Series'>
1.04 2.19 <type 'numpy.float64'> <type 'numpy.float64'>

After that, it throws the error. However, if I slice the dataframe outside the loop using those same printed values, it works fine.

print "IS IT",df.loc[((1.04<=df['Z'])& (df['Z']<=2.19)), 'Z']

prints

IS IT 10    1.04
11    1.14
12    1.25
13    1.35
14    1.46
15    1.56
16    1.67
17    1.77
18    1.87
19    1.98
20    2.08
21    2.19
Name: Z, dtype: float64

What am I missing?

In case it helps, here is the full traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-18-b6d427f5dae7> in <module>()
      9     z0 = c
     10     print z0, zi, type(z0), type(zi)
---> 11     x = df.loc[((z0<=df['Z'])& (df['Z']<=zi)), 'Z']
     12     y = df.loc[((z0<=df['Z'])& (df['Z']<=zi)), 'smooth']
     13     print len(x), len(y)

C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\ops.pyc in wrapper(self, other, axis)
    739             return NotImplemented
    740         elif isinstance(other, (np.ndarray, pd.Index)):
--> 741             if len(self) != len(other):
    742                 raise ValueError('Lengths must match to compare')
    743             return self._constructor(na_op(self.values, np.asarray(other)),

TypeError: len() of unsized object

Observation

It turns out my dataframe had trouble with numpy floats. Converting z0 and zi to float solves the problem!
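The workaround described above can be sketched as follows. This is a minimal illustration in Python 3, not the asker's original script: the dataframe contents, the `smooth` placeholder values, and the helper name `slice_between_turns` are all assumptions made up for the example. Only the column names, the turn points, and the float conversion come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: Z runs from 0.00 to 4.50.
df = pd.DataFrame({"Z": np.arange(451) / 100.0})
df["smooth"] = np.cos(df["Z"])  # placeholder values, not the real data

# Turn points as printed in the question.
turnz = np.array([1.04, 2.19, 2.5, 4.06])


def slice_between_turns(frame, turn_points, start=0.0):
    """Return one sub-frame per interval [previous turn, current turn]."""
    segments = []
    z0 = start
    for zi in turn_points:
        # Convert numpy scalars to plain Python floats before comparing,
        # which is the fix reported in the observation above.
        z0, zi = float(z0), float(zi)
        mask = (frame["Z"] >= z0) & (frame["Z"] <= zi)
        segments.append(frame.loc[mask, ["Z", "smooth"]])
        z0 = zi
    return segments


segments = slice_between_turns(df, turnz)
for seg in segments:
    print(seg["Z"].iloc[0], seg["Z"].iloc[-1], len(seg))
```

Both bounds are inclusive, as in the question, so neighbouring segments share their boundary row. Putting the Series on the left of each comparison is a second precaution, so that the comparison is dispatched by pandas rather than by the numpy scalar.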

Answer 1

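The key to the problem is in the printout from the first iteration:

0.0 1.04 <type 'float'> <type 'numpy.float64'>

In the first iteration z0 is a plain Python float, and it works. The remaining values, however, are elements of a numpy array and are therefore numpy floats (numpy.float64). This the dataframe does not accept. Converting both bounds with

z0 = float(z0)
zi = float(zi)

does the trick.

Now, the question is... why? If we print turnz.dtype and df['Z'].dtype, both are float64, so they seem to be the same. But Python handles them differently, as explained in this answer.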
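The difference between a matching dtype and a matching Python type can be checked directly. The sketch below uses a small made-up Series and array (the values are illustrative only). Its last step reflects one plausible reading of the traceback, which the source does not confirm: if the numpy scalar reaches the pandas comparison as a zero-dimensional array, calling len() on it produces exactly the reported message.

```python
import numpy as np
import pandas as pd

# Illustrative data only; names mirror the question.
z = pd.Series([0.0, 0.52, 1.04], name="Z")
turnz = np.array([1.04, 2.19])
zi = turnz[0]

# The dtypes agree: both are float64.
same_dtype = z.dtype == turnz.dtype

# The Python types do not: an array element is a numpy scalar.
elem_type = type(zi)          # numpy.float64
plain_type = type(float(zi))  # built-in float

# A zero-dimensional array has no length, so len() raises TypeError.
try:
    len(np.asarray(zi))
    message = None
except TypeError as exc:
    message = str(exc)

print(same_dtype, elem_type, plain_type, message)
```

Recent pandas versions handle numpy scalars in comparisons, so the original failure is not expected to reproduce there; the snippet only shows why the two kinds of float are not interchangeable at the Python level.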
