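I have a DataFrame in pandas with two columns, where each row holds a comma-separated list of strings. How can I check, row by row, whether any word appears in both columns? (The flag column is the desired output.)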
A             B            flag
hello,hi,bye  bye, also    1
but, as well  see, pandas  0
I tried:
df['A'].str.contains(df['B'])
but it raises an error.
Answer 0 (score: 2)
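You can convert each value into a set of separate words with split and set, and check the intersection with &. Then convert the result to a boolean (an empty set converts to False) and finally to int, so that False becomes 0 and True becomes 1.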
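# pair up the values of A and B row by row
zipped = zip(df['A'], df['B'])
# 1 if the two comma-separated word sets share at least one word, else 0
df['flag'] = [int(bool(set(a.split(',')) & set(b.split(',')))) for a, b in zipped]
print(df)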
              A           B  flag
0  hello,hi,bye    bye,also     1
1   but,as well  see,pandas     0
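To make the chain of conversions easier to follow, here is a minimal sketch that traces each step on plain strings taken from the sample rows, without pandas. The variable names are only illustrative.

```python
a = "hello,hi,bye"
b = "bye, also"

words_a = set(a.split(","))   # {'hello', 'hi', 'bye'}
words_b = set(b.split(","))   # {'bye', ' also'} (note the leading space)
common = words_a & words_b    # set intersection: {'bye'}
flag = int(bool(common))      # non-empty set -> True -> 1

# Second sample row: no shared word, so the intersection is empty
common2 = set("but, as well".split(",")) & set("see, pandas".split(","))
flag2 = int(bool(common2))    # empty set -> False -> 0

print(flag, flag2)
```

Because split(',') keeps surrounding spaces, ' also' and 'also' would count as different words here.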
A similar solution:
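# assumes numpy is imported as np; zip is rebuilt here because a zip iterator can only be consumed once
df['flag'] = np.array([set(a.split(',')) & set(b.split(',')) for a, b in zip(df['A'], df['B'])]).astype(bool).astype(int)
print(df)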
              A            B  flag
0  hello,hi,bye    bye, also     1
1   but,as well  see, pandas     0
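One detail worth keeping in mind when running both variants in the same session: in Python 3, zip returns an iterator that can be consumed only once, so a variable such as zipped has to be rebuilt before it is looped over a second time. A small sketch with made-up strings shows the effect:

```python
pairs = zip(["a,b"], ["b,c"])

# First pass consumes the iterator
first = [set(a.split(",")) & set(b.split(",")) for a, b in pairs]

# Second pass finds nothing left to iterate over
second = [set(a.split(",")) & set(b.split(",")) for a, b in pairs]

print(first)   # one intersection, containing 'b'
print(second)  # empty list
```

Assigning an empty result like second to a DataFrame column with rows would fail with a length mismatch, so calling zip(df['A'], df['B']) again for each solution avoids the problem.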
EDIT: There may be some whitespace around the commas, so add map with str.strip, and use filter to remove empty strings:
import pandas as pd

df = pd.DataFrame({'A': ['hello,hi,bye', 'but,,,as well'],
                   'B': ['bye ,,, also', 'see,,,pandas']})
print(df)
               A             B
0   hello,hi,bye  bye ,,, also
1  but,,,as well  see,,,pandas
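def setify(x):
    # strip whitespace first, then drop the empty strings left by repeated commas
    return set(filter(None, map(str.strip, x.split(','))))

# 1 if the cleaned word sets of A and B overlap, else 0
df['flag'] = [int(bool(setify(a) & setify(b))) for a, b in zip(df['A'], df['B'])]
print(df)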
               A             B  flag
0   hello,hi,bye  bye ,,, also     1
1  but,,,as well  see,,,pandas     0
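For reuse, the approach above can be wrapped in a small helper. This is only a sketch: the names word_set and add_overlap_flag, their parameters, and the third sample row are hypothetical and not part of the original answer. It strips whitespace before dropping empty tokens, so a whitespace-only token such as the one in 'x, ,y' is discarded as well.

```python
import pandas as pd


def word_set(text, sep=","):
    """Split on sep, strip whitespace, and drop empty tokens."""
    return {w for w in (t.strip() for t in text.split(sep)) if w}


def add_overlap_flag(df, left="A", right="B", out="flag"):
    """Return a copy of df with a 0/1 column marking rows whose
    left and right word sets share at least one word."""
    result = df.copy()
    result[out] = [
        int(bool(word_set(a) & word_set(b)))
        for a, b in zip(result[left], result[right])
    ]
    return result


df = pd.DataFrame({"A": ["hello,hi,bye", "but,,,as well", "x, ,y"],
                   "B": ["bye ,,, also", "see,,,pandas", "z, ,w"]})
flagged = add_overlap_flag(df)
print(flagged)
```

Returning a copy leaves the caller's DataFrame unchanged, which makes the helper easier to test. Assigning the column in place would work equally well if mutation is acceptable.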