Question

I have a pandas DataFrame:

import pandas as pd

df=pd.DataFrame({'Location': [ 'NY', 'SF', 'NY', 'NY', 'SF', 'SF', 'TX', 'TX', 'TX', 'DC'],
                 'Class': ['H','L','H','L','L','H', 'H','L','L','M'],
                 'Address': ['12 Silver','10 Fak','12 Silver','1 North','10 Fak','2 Fake', '1 Red','1 Dog','2 Fake','1 White'],
                 'Score':['4','5','3','2','1','5','4','3','2','1',]})

I want to add two tags whose rules are stored in dictionaries. Note that the second dictionary does not contain the key 'A'.

df['Tag1'] =''
df['Tag2'] =''

tagset1 = {'A':['NY|SF'],
          'B':['DC'],
          'C':['TX'],
          }
for key in tagset1:
    df.loc[df.Location.str.contains(tagset1[key][0]) & (df.Tag1 == ''),'Tag1'] = key


tagset2= {'B':['H|M'],
          'C':['L'],
          }
for key in tagset2:
    df.loc[df.Class.str.contains(tagset2[key][0]) & (df.Tag2 == ''),'Tag2'] = key

print (df)

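If I want to combine the two dictionaries to make the code more readable and efficient, should I fill A's slot at newtagset['A'][1] with '', or is there another way to make the loop ignore or skip newtagset['A'][1] while iterating over that list position?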

newtagset = {'A':['NY|SF', '',],
          'B':['DC','H|M',],
          'C':['TX','L',],
          }


for key in newtagset:
    df.loc[df.Location.str.contains(newtagset[key][0]) & (df.Tag1 == ''),'Tag1'] = key

for key in newtagset:
    df.loc[df.Class.str.contains(newtagset[key][1]) & (df.Tag2 == ''),'Tag2'] = key

print (df)

Most of the solutions I found use itertools (for example, "Skip multiple iterations in loop python"). Is that the only way?

Answer 1

There is nothing wrong with a plain continue.

for key, value in newtagset.items():    # I found dict.items cleaner
    if not value[1]:
        continue
    df.loc...
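As a rough, self-contained sketch of how that might look end to end: the body of the loop below is an assumption, reusing the question's own `df.loc` assignments in place of the elided line, and the names `loc_pattern` and `class_pattern` are made up for illustration. The skip matters because an empty pattern matches every string, so `str.contains('')` would be True for all rows and 'A' would be written into Tag2 everywhere.

```python
import pandas as pd

# Trimmed copy of the question's data (only the columns the tags depend on).
df = pd.DataFrame({
    'Location': ['NY', 'SF', 'NY', 'NY', 'SF', 'SF', 'TX', 'TX', 'TX', 'DC'],
    'Class': ['H', 'L', 'H', 'L', 'L', 'H', 'H', 'L', 'L', 'M'],
})
df['Tag1'] = ''
df['Tag2'] = ''

newtagset = {
    'A': ['NY|SF', ''],   # '' marks "no rule for Tag2"
    'B': ['DC', 'H|M'],
    'C': ['TX', 'L'],
}

for key, (loc_pattern, class_pattern) in newtagset.items():
    # Tag1 rule: every key has a Location pattern in this example.
    df.loc[df.Location.str.contains(loc_pattern) & (df.Tag1 == ''), 'Tag1'] = key

    # Tag2 rule: skip keys whose Class pattern is the empty placeholder.
    if not class_pattern:
        continue
    df.loc[df.Class.str.contains(class_pattern) & (df.Tag2 == ''), 'Tag2'] = key

print(df)
```

Doing both assignments in one pass is a design choice, not a requirement; two separate loops with a `continue` in the second one behave the same way.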

Slightly off topic:

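& (df.Tag1 == '') is redundant. It only matters when two patterns can match the same value, and in that case the result depends on the order in which the keys are visited, which a dict does not guarantee before Python 3.7.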
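To make that concrete, here is a small sketch comparing the two variants. The `tag` helper and the `overlapping` dictionary are hypothetical, written only for this illustration, and the comparison assumes Python 3.7+ so that the dict is visited in insertion order.

```python
import pandas as pd

def tag(series, patterns, guard):
    """Assign the key of each matching pattern; optionally keep the first match."""
    tags = pd.Series('', index=series.index)
    for key, pattern in patterns.items():
        mask = series.str.contains(pattern)
        if guard:
            # Only touch rows that have not been tagged yet.
            mask = mask & (tags == '')
        tags[mask] = key
    return tags

locations = pd.Series(['NY', 'SF', 'TX', 'DC'])

# Disjoint patterns, as in the question: the guard changes nothing.
disjoint = {'A': 'NY|SF', 'B': 'DC', 'C': 'TX'}
same = tag(locations, disjoint, guard=True).equals(
    tag(locations, disjoint, guard=False))

# Overlapping patterns: 'NY' matches both keys.
overlapping = {'A': 'NY', 'B': 'NY|DC'}
first_wins = tag(locations, overlapping, guard=True)   # 'NY' keeps 'A'
last_wins = tag(locations, overlapping, guard=False)   # 'NY' ends up 'B'

print(same, first_wins.tolist(), last_wins.tolist())
```

With disjoint patterns the guard can be dropped safely; with overlapping ones it decides whether the first or the last matching key wins.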

Efficiently iterate over dictionary list values by skipping missing values (Python 3)
