"TypeError: unhashable type: 'list'" error

Time: 2016-10-28 08:12:41

Tags: python pandas for-loop dictionary

I get an error from the following code:

def cleaning(CURRENT,STRING,NEXT):
    data.ix[data[NEXT].str.contains(STRING,na=False),CURRENT] =...
    data[NEXT][data[NEXT].str.contains(STRING,na=False)]
d = ['lower','Less']
c = a[5:]
for x,y in zip(range(len(c)),d):
    cleaning(c[x],d,c[x+1])
    cleaning(c[x],d,c[x+2])

Here, data is a pandas DataFrame. Yet the same function raises no error when I call it from the following code:

a = ['UBC','LBC', 'HC', 'FC', 'P:C/F','P', 'A', 'Sex']
b = ['upper','lower','hair','footwear']
for x,y in zip(range(len(a)),b):
    cleaning(a[x],y,a[x+1])
    cleaning(a[x],y,a[x+2])

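I know this happens because a list can't be used as a dictionary key, but I don't see where that is happening here, or why it works in one loop and not the other.

1 answer:

Answer 0: (score: 1)

You are passing the whole d list as the STRING argument: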

d = ['lower','Less']
# ...
    cleaning(c[x],d,c[x+1])
    #             ^

Your second example works because you pass in y, which is a single element of the b list:

b = ['upper','lower','hair','footwear']
for x,y in zip(range(len(a)),b):
    # ^ one element from b   ^
    cleaning(a[x],y,a[x+1])
    #             ^
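
To see why y is a single string there, it can help to print what zip actually yields. This is only a small sketch using the a and b lists from the question; the name pairs is introduced here for illustration:

```python
a = ['UBC', 'LBC', 'HC', 'FC', 'P:C/F', 'P', 'A', 'Sex']
b = ['upper', 'lower', 'hair', 'footwear']

# zip stops at the shorter input, so only four pairs come out.
pairs = list(zip(range(len(a)), b))
print(pairs)
# [(0, 'upper'), (1, 'lower'), (2, 'hair'), (3, 'footwear')]

# Each y is one str taken from b, never the list b itself.
for x, y in pairs:
    print(x, type(y).__name__)
```

Every y printed here is a str, which is hashable, so it is safe to hand to a regex-based method.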

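By default, the pandas.Series.str.contains method treats its pattern as a regular expression and hands it to re.compile, which caches compiled patterns in a dictionary keyed on the pattern. A list is not hashable, so it cannot be used as a dictionary key, and the cache lookup fails when you pass in a list: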

>>> pandas.Series(['aa', 'bb', 'cc']).str.contains(['a'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/site-packages/pandas/core/strings.py", line 1458, in contains
    regex=regex)
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/site-packages/pandas/core/strings.py", line 222, in str_contains
    regex = re.compile(pat, flags=flags)
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 194, in compile
    return _compile(pattern, flags)
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 237, in _compile
    p, loc = _cache[cachekey]
TypeError: unhashable type: 'list'
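
The same failure can be reproduced without pandas. The following is a minimal sketch using only the standard library; the variable names are made up for illustration, and the exact wording of the error message may differ between Python versions:

```python
import re

pattern_list = ['a']
dict_error = None
compile_error = None

# A list is mutable and has no hash, so it cannot be a dictionary key.
try:
    {pattern_list: 'compiled'}
except TypeError as exc:
    dict_error = exc
    print('dict key:', exc)

# re.compile rejects the list too.
try:
    re.compile(pattern_list)
except TypeError as exc:
    compile_error = exc
    print('re.compile:', exc)

# A single string from the list compiles and matches normally.
compiled = re.compile(pattern_list[0])
print(bool(compiled.search('aa')))  # True
```

Both attempts raise TypeError, while passing one element of the list works.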

The fix is to pass in y rather than d:

for x, y in zip(range(len(c)), d):
    cleaning(c[x], y, c[x + 1])
    cleaning(c[x], y, c[x + 2])

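You may want to come up with better variable names; single-letter names are hard to tell apart and easily lead to mistakes like this one.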
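
As a rough illustration of what more descriptive names could look like, here is a sketch of the first loop. It assumes a is the list shown in the question's second snippet, so that a[5:] is ['P', 'A', 'Sex']. The cleaning function below is only a stand-in that records its arguments, not the real pandas code, and the bounds check is an addition of mine: with three columns, c[x + 2] on the second pass would be index 3, which is past the end of the list.

```python
columns = ['P', 'A', 'Sex']    # a[5:] from the question (assumed)
keywords = ['lower', 'Less']   # d from the question

calls = []

def cleaning(current, keyword, next_column):
    # Stand-in for the question's function: it only records what it was given.
    calls.append((current, keyword, next_column))

# enumerate pairs each keyword with its position, like zip(range(len(c)), d).
for index, keyword in enumerate(keywords):
    current = columns[index]
    # Look one and two columns ahead, skipping positions past the end.
    for offset in (1, 2):
        if index + offset < len(columns):
            cleaning(current, keyword, columns[index + offset])

print(calls)
# [('P', 'lower', 'A'), ('P', 'lower', 'Sex'), ('A', 'Less', 'Sex')]
```

With names like keyword and keywords, passing the whole list where a single string is expected is much easier to spot.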