Suppose I have a DataFrame d in which one column contains Python lists as its values.
>>> d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a','b'])
>>> print d
     a      b
0  foo  [bar]
1  biz     []
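Now I want to filter out the rows whose list is empty.
I have tried several variations, with no luck so far:
Trying to test the column as a "truthy" value: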
>>> d[d['b']]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2682, in __getitem__
return self._getitem_array(key)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2726, in _getitem_array
indexer = self.loc._convert_to_indexer(key, axis=1)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1314, in _convert_to_indexer
indexer = check = labels.get_indexer(objarr)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3259, in get_indexer
indexer = self._engine.get_indexer(target._ndarray_values)
File "pandas/_libs/index.pyx", line 301, in pandas._libs.index.IndexEngine.get_indexer
File "pandas/_libs/hashtable_class_helper.pxi", line 1544, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'list'
Trying an explicit length check. It seems len() is applied to the Series as a whole, not to the value in each row.
>>> d[ len(d['b']) > 0 ]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: True
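Comparing directly against an empty list, the same way we could compare against an empty string (which, incidentally, does work if the column holds strings instead of lists).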
>>> d[ d['b'] == [] ]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
res = na_op(values, other)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1143, in na_op
result = _comp_method_OBJECT_ARRAY(op, x, y)
File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1120, in _comp_method_OBJECT_ARRAY
result = libops.vec_compare(x, y, op)
File "pandas/_libs/ops.pyx", line 128, in pandas._libs.ops.vec_compare
ValueError: Arrays were different lengths: 2 vs 0
Answer 0 (score: 3)
Use the string accessor, .str, to check the length of each list in the pandas Series:
d[d.b.str.len() > 0]
Output:
     a      b
0  foo  [bar]
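As a self-contained sketch of this approach, the snippet below rebuilds the frame from the question and filters it. The names lengths, mask, and filtered are introduced here only for illustration.

```python
import pandas as pd

# Same frame as in the question: column b holds Python lists.
d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# .str.len() computes the length of each element; it also accepts lists.
lengths = d['b'].str.len()

# Boolean mask: True where the list is non-empty.
mask = lengths > 0

filtered = d[mask]
print(filtered)
```

The key point is that the mask is computed per row, which is what len(d['b']) in the question fails to do.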
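Answer 1 (score: 2)
An empty list evaluates to False under all. This will not work if a row contains other falsy values (unless you want to drop those rows as well).
d[d.all(axis=1)]
     a      b
0  foo  [bar]
If you only want to filter on column b, you can use astype:
d[d.b.astype(bool)]
     a      b
0  foo  [bar]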
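A minimal sketch of the column-only variant, assuming the same frame as in the question. Casting to bool relies on Python truthiness, so an empty list becomes False. The names mask, non_empty, and empty are illustrative only.

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# Cast each list to bool: [] is falsy, any non-empty list is truthy.
mask = d['b'].astype(bool)

# Rows with a non-empty list in column b.
non_empty = d[mask]

# Inverting the mask selects the rows with an empty list instead.
empty = d[~mask]

print(non_empty)
print(empty)
```

Because only column b is inspected, falsy values in other columns (such as an empty string in a) do not affect the result.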
Answer 2 (score: 0)
Other approaches are better, but for the record, another option is to store tuples instead of lists and then check directly for the empty tuple.
With tuples in column b, the filter becomes:
d[d['b'] != ()]
This does not work with lists; see the last error in the original question.
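The answer does not show the tuple-based frame itself, so here is a hedged sketch of what it might look like. The frame definition is an assumption modeled on the question's data. The sketch compares element by element with map rather than using the direct d['b'] != () form, on the assumption that the direct comparison may behave differently across pandas versions.

```python
import pandas as pd

# Assumed variant of the question's frame: column b holds tuples, not lists.
d = pd.DataFrame([['foo', ('bar',)], ['biz', ()]], columns=['a', 'b'])

# Compare each value against the empty tuple, one row at a time.
mask = d['b'].map(lambda t: t != ())

non_empty = d[mask]
print(non_empty)
```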