Lookup from multiple columns in pandas

Time: 2018-03-20 09:44:57

Tags: python python-3.x pandas dataframe

I have two DataFrames, df1 and df2, as follows:

df1:

a
T11552
T11559
T11566
T11567
T11569
T11594
T11604
T11625

df2:

a       b
T11552  T11555
T11560  T11559
T11566  T11562
T11568  T11565
T11569  T11560
T11590  T11594
T11604  T11610
T11621  T11625
T11633  T11631
T11635  T11634
T13149  T13140

I want a new DataFrame df3 in which each value of df1 is looked up in df2. If the value exists anywhere in df2, the new column added to df1 should return True, otherwise False, as shown below.

df3:

a       v
T11552  True
T11559  True
T11566  True
T11567  False
T11569  True
T11594  True
T11604  True
T11625  True

2 Answers:

Answer 0 (score: 3)

Use assign to create the new DataFrame, and isin to test against all values of df2 flattened into a one-dimensional array with ravel. For better performance, you can also check only the unique values:

import numpy as np

df3 = df1.assign(v = lambda x: x['a'].isin(np.unique(df2.values.ravel())))
#alternative solution
#df3 = df1.assign(v = lambda x: np.in1d(x['a'], np.unique(df2[['a','b']].values.ravel())))

#if need specify columns in df2 for check
df3 = df1.assign(v = lambda x: x['a'].isin(np.unique(df2[['a','b']].values.ravel())))

print (df3)
        a      v
0  T11552   True
1  T11559   True
2  T11566   True
3  T11567  False
4  T11569   True
5  T11594   True
6  T11604   True
7  T11625   True
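For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frames from the question and applies the same assign + isin idea. The variable names `lookup` and `v_numpy` are my own, and the `np.isin` cross-check is an addition rather than part of the answer (`np.in1d`, used in the commented alternative, is deprecated as of NumPy 2.0 in favour of `np.isin`).

```python
import numpy as np
import pandas as pd

# Sample data as given in the question.
df1 = pd.DataFrame({"a": ["T11552", "T11559", "T11566", "T11567",
                          "T11569", "T11594", "T11604", "T11625"]})
df2 = pd.DataFrame({
    "a": ["T11552", "T11560", "T11566", "T11568", "T11569", "T11590",
          "T11604", "T11621", "T11633", "T11635", "T13149"],
    "b": ["T11555", "T11559", "T11562", "T11565", "T11560", "T11594",
          "T11610", "T11625", "T11631", "T11634", "T13140"],
})

# Flatten both columns of df2 into one array and keep only unique values.
lookup = np.unique(df2[["a", "b"]].to_numpy().ravel())

# Add the Boolean column without modifying df1 itself.
df3 = df1.assign(v=lambda x: x["a"].isin(lookup))

# Cross-check with plain NumPy (illustrative only).
v_numpy = np.isin(df1["a"].to_numpy(), lookup)

print(df3)
```

Only T11567 comes out False, since it appears in neither column of df2.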

Answer 1 (score: 0)

Try this:

df3 = df1[['a']].copy()
df3['v'] = df3['a'].isin(set(df2.values.ravel()))

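The code above will:

  1. Create a new DataFrame from column 'a' of df1.
  2. Create a Boolean column 'v' that tests whether each value of column 'a' exists among the values of df2, which are flattened with numpy.ravel and collected into a set.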
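The two steps above can be sketched as a small reusable function. `flag_membership` and its parameters are hypothetical names chosen for illustration, and the frames below use only a few rows from the question to keep the example short.

```python
import pandas as pd

def flag_membership(left, right, column="a", flag="v"):
    """Hypothetical helper: copy `column` from `left` and add a Boolean
    `flag` column that is True where the value occurs anywhere in `right`."""
    out = left[[column]].copy()                # step 1: new frame from one column
    values = set(right.to_numpy().ravel())     # flatten all of `right` into a set
    out[flag] = out[column].isin(values)       # step 2: membership test per row
    return out

# A few rows taken from the question's data.
df1 = pd.DataFrame({"a": ["T11552", "T11567", "T11625"]})
df2 = pd.DataFrame({"a": ["T11552", "T11621"], "b": ["T11555", "T11625"]})

df3 = flag_membership(df1, df2)
print(df3)
```

Building the set removes duplicates before the check, much like `np.unique` in the other answer.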