Question

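The problem is a bit complex, so please look at the snapshots for a better understanding. I have a DataFrame with 2 columns, "Col-A" and "Col-B" [https://i.stack.imgur.com/bw1hx.jpg][1]. I also have CSV file data with multiple columns. [https://i.stack.imgur.com/v72mM.jpg][1]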

The "Col-B" values of my DataFrame match the CSV file headers. For example, the first-row entry of "Col-B" is "Password", so I will use the column named "Password" in the CSV file. [https://i.stack.imgur.com/hTCZa.jpg][1]

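What my code should do: if "Col-B" of my DataFrame is "Password", it should search "Col-A" for the entries in the "Password" column of my CSV file, and the first string found is my output. Below is the code I tried.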

import pandas as pd
import numpy as np

data = pd.read_excel("C:/Users/606736.CTS/Desktop/Keyword.xlsx", 
sheet_name='Sheet2')
CSV_file = pd.read_excel("C:/Users/606736.CTS/Desktop/Keyword.xlsx",
sheet_name='Sub-Cat') 

data['Col-C']= np.nan # for adding a new column

# Below code works perfectly fine for searching any one of the column 
# in the CSV-file, in the below code I am searching on "Password" Col, 
# but I want the code to take the column dynamically based on the 'Col-B' 
# of my dataframe.
# if col-B of my dataframe is "CPU", then 'CPU' column of the CSV-file 
# should be searched.
for i in data['Col-B']:
    for Key1 in CSV_file[i]:
        data.loc[(data['Col-A'].apply(lambda x: Key1 in x.split(' ')) &
                  data['Col-C'].isna()), 'Col-C'] = Key1
data.head(3)

Answer 1

If your DataFrame is large, this will take a long time to run.

patterns.txt

Answer 2

This works fine for me:

for i in data['Col-B']:
    for Key1 in CSV_file[i]:
        data.loc[(data['Col-A'].apply(lambda x: Key1 in x.split(' ')) &
                  (data['Col-B'] == i)), 'Col-C'] = Key1
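
For anyone who wants to try the idea without the original Excel file, here is a minimal self-contained sketch. The two small DataFrames are made-up stand-ins for the sheets in the question, and `first_keyword` is a hypothetical helper name, not something from the original post. It assumes "first found" means the first keyword in the order listed in the CSV column, and it matches whole space-separated words, as the loop above does. Unlike the loop above, which lets a later keyword overwrite an earlier match, it stops at the first hit.

```python
import pandas as pd

# Hypothetical stand-ins for the 'Sheet2' and 'Sub-Cat' sheets in the question.
data = pd.DataFrame({
    "Col-A": [
        "please reset my password today",
        "cpu usage is high on server",
        "forgot login password",
    ],
    "Col-B": ["Password", "CPU", "Password"],
})
CSV_file = pd.DataFrame({
    "Password": ["reset", "password", "login"],
    "CPU": ["usage", "high", None],  # shorter column padded with a missing value
})


def first_keyword(row):
    """Return the first keyword from the column named in Col-B
    that appears as a whole word in Col-A, or None if nothing matches."""
    words = row["Col-A"].split(" ")
    # dropna() skips the padding in columns that are shorter than the others
    for key in CSV_file[row["Col-B"]].dropna():
        if key in words:
            return key
    return None


# Each row picks its own keyword column dynamically from Col-B.
data["Col-C"] = data.apply(first_keyword, axis=1)
print(data)
```

Because each row is visited once and the search stops at the first hit, this should also scale better than re-scanning all of "Col-A" for every keyword, though I have not benchmarked it on large data.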

Pandas: search for a string based on a condition
