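How to filter a pandas DataFrame based on entries in a column from a different DataFrame (no join)

Time: 2020-04-27 12:38:15

Tags: python pandas data-analysis

As an example, I have a DataFrame (df_1) in which one column contains some text data. A second DataFrame (df_2) contains some numbers. How can I check whether the text contains any of the numbers from the second DataFrame?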

df_1

                       Note
0  The code to this is 1003
1  The code to this is 1004

df_2

   Code_Number
0         1006
1         1003

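So I want to check whether each entry in [Note] in df_1 contains any of the entries in [Code_Number] in df_2.

I tried the following code: df_1[df_1['Note'].str.contains(df_2['Code_Number'])], and I know I cannot use a join because I have no key to join on.

After applying the filter, the final result I am looking for is: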

   Note              
0  The code to this is 1003    

3 answers:

Answer 0: (score: 1)

Do this:

df_1.loc[df_1['Note'].apply(lambda x: any(str(number) in x for number in df_2['Code_Number']))]
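The one-liner above can be tried end to end on the sample data from the question. The sketch below rebuilds the two frames, applies the same `any(...)` test, and shows one possible alternative: a single regular expression passed to `str.contains`. The word-boundary pattern is an assumption added here, not part of the answer; it keeps a code such as 1003 from matching inside a longer number like 21003, which the plain substring test would accept.

```python
import re

import pandas as pd

# Sample data copied from the question.
df_1 = pd.DataFrame({"Note": ["The code to this is 1003",
                              "The code to this is 1004"]})
df_2 = pd.DataFrame({"Code_Number": [1006, 1003]})

# Convert the codes to strings once instead of on every row.
codes = df_2["Code_Number"].astype(str).tolist()

# Same logic as the answer: plain substring test per note.
mask_apply = df_1["Note"].apply(lambda note: any(code in note for code in codes))

# Alternative (an assumption, not from the answer): one regex with word
# boundaries, so "1003" does not match inside "21003".
pattern = r"\b(?:" + "|".join(re.escape(code) for code in codes) + r")\b"
mask_regex = df_1["Note"].str.contains(pattern, regex=True)

result = df_1[mask_regex]
print(result)
```

Both masks select the same row for this data; they differ only when a code appears as part of a longer digit run.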

Answer 1: (score: 1)

Try this and see whether it fits your use case: use itertools' product to build the Cartesian product of the two columns, then filter on a condition.
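A minimal sketch of the Cartesian-product idea, assuming the sample frames from the question. This is not the answer author's code; the names `pairs`, `matched`, and `filtered` are chosen here for illustration only.

```python
from itertools import product

import pandas as pd

# Sample data copied from the question.
df_1 = pd.DataFrame({"Note": ["The code to this is 1003",
                              "The code to this is 1004"]})
df_2 = pd.DataFrame({"Code_Number": [1006, 1003]})

# Every (row label, note) pair combined with every code, as strings.
pairs = product(df_1["Note"].items(), df_2["Code_Number"].astype(str))

# Keep the row labels whose note contains at least one code.
matched = {idx for (idx, note), code in pairs if code in note}

# Filter df_1 by those labels; the original row order is preserved.
filtered = df_1.loc[df_1.index.isin(matched)]
print(filtered)
```

The product has len(df_1) * len(df_2) pairs, so this approach is probably best suited to small frames.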

Answer 2: (score: 0)

First, create a column in df_1 that holds the list of numbers found in each note. Then compare that list with the list of numbers in df_2 (both should be in list format).



# Extract the numbers from a note (whitespace-separated, all-digit words only)
a_string = "0abcadda1 11 def 23 10007"

numbers = [int(word) for word in a_string.split() if word.isdigit()]

print(numbers)


list_test = "103,23"

# Find the elements common to both lists
L1 = [2,3,4]
L2 = [1,2]
print([i for i in L1 if i in L2])


S1 = set(L1)
S2 = set(L2)
print(S1.intersection(S2))

# Check whether two lists share at least one common element

def common_data(list1, list2):
    result = False

    # traverse in the 1st list 
    for x in list1:

        # traverse in the 2nd list 
        for y in list2:

            # if one common 
            if x == y:
                result = True
                return result

    return result


# driver code 

a = [1, 2, 3, 4, 5]
b = [5, 6, 7, 8, 9]
print(common_data(a, b))

a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9]
print(common_data(a, b))
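The pieces above can be combined and applied to the frames from the question. The sketch below is one possible way to do that, not the answer author's code: `extract_numbers` is a hypothetical helper wrapping the split/isdigit line, and a set disjointness check stands in for `common_data`.

```python
import pandas as pd

# Sample data copied from the question.
df_1 = pd.DataFrame({"Note": ["The code to this is 1003",
                              "The code to this is 1004"]})
df_2 = pd.DataFrame({"Code_Number": [1006, 1003]})


def extract_numbers(text):
    """Return the whitespace-separated, all-digit words in text as ints."""
    return [int(word) for word in text.split() if word.isdigit()]


# Plain Python ints in a set, for fast membership tests.
codes = set(df_2["Code_Number"].tolist())

# A note is kept when its numbers and the codes are not disjoint.
mask = df_1["Note"].apply(lambda note: not codes.isdisjoint(extract_numbers(note)))

filtered = df_1[mask]
print(filtered)
```

Because the extraction relies on `split()` and `isdigit()`, a number followed directly by punctuation (for example "1003.") would not be picked up; a regex-based extraction may be more robust in that case.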