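The problem is a bit involved, so please look at the screenshots for a better view and understanding. I have a dataframe with two columns, "Col-A" and "Col-B". [https://i.stack.imgur.com/bw1hx.jpg][1] I also have a CSV file containing several columns of data. [https://i.stack.imgur.com/v72mM.jpg][1]
The "Col-B" values of my dataframe match the headers of the CSV file. For example, the first-row entry of "Col-B" is "Password", so I will use the column named "Password" in the CSV file. [https://i.stack.imgur.com/hTCZa.jpg][1]
What my code should do is this: if "Col-B" of my dataframe is "Password", it should search "Col-A" for the entries in the "Password" column of my CSV file, and the first string found is my output. Below is the code I tried.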
import pandas as pd
import numpy as np
data = pd.read_excel("C:/Users/606736.CTS/Desktop/Keyword.xlsx",
                     sheet_name='Sheet2')
CSV_file = pd.read_excel("C:/Users/606736.CTS/Desktop/Keyword.xlsx",
                         sheet_name='Sub-Cat')
data['Col-C'] = np.nan  # add a new, empty column
# The loop below works fine when searching a single, fixed column of the
# CSV file (for example "Password"), but I want the column to be chosen
# dynamically from 'Col-B' of my dataframe: if 'Col-B' is "CPU", then the
# 'CPU' column of the CSV file should be searched.
for i in data['Col-B']:
    for Key1 in CSV_file[i]:
        data.loc[(data['Col-A'].apply(lambda x: Key1 in x.split(' ')) &
                  (data['Col-C'].isna()), 'Col-C')] = Key1
data.head(3)
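Since the spreadsheet itself is only shown in screenshots, here is a small reproduction with made-up data that may make the issue easier to see. The two frames and their contents are assumptions for illustration only, not the real sheets. The loop is the same as above, except that the new column is created with object dtype so that strings can be stored in it without dtype complaints from recent pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for 'Sheet2': free text in Col-A, a category in Col-B.
data = pd.DataFrame({
    "Col-A": ["please reset my password", "cpu usage high after reset"],
    "Col-B": ["Password", "CPU"],
})

# Hypothetical stand-in for 'Sub-Cat': one keyword column per category.
CSV_file = pd.DataFrame({
    "Password": ["reset", "locked"],
    "CPU": ["usage", "high"],
})

# Empty object column (all missing) to receive the matched keyword.
data["Col-C"] = pd.Series([None] * len(data), dtype=object, index=data.index)

for i in data["Col-B"]:
    for Key1 in CSV_file[i]:
        mask = (data["Col-A"].apply(lambda x: Key1 in x.split(" "))
                & data["Col-C"].isna())
        data.loc[mask, "Col-C"] = Key1

print(data)
```

With this sample, the second row has Col-B equal to "CPU", yet it is tagged "reset" from the "Password" column, because the mask never checks which category the row belongs to. The desired value for that row would be "usage".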
Answer 0 (score: 0)
If your dataframe is large, this will take a long time to run.
patterns.txt
Answer 1 (score: 0)
This works fine for me:
for i in data['Col-B']:
    for Key1 in CSV_file[i]:
        data.loc[(data['Col-A'].apply(lambda x: Key1 in x.split(' ')) &
                  (data['Col-B']==i), 'Col-C')] = Key1
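Note that the loop above has no `isna()` check, so a later keyword in the same column can overwrite an earlier match, and it rescans the whole frame once per row of Col-B. One possible alternative is to handle each row once and stop at the first hit. The following is only a sketch under stated assumptions: "first found" is taken to mean the first keyword in the column's order, matching is case-sensitive on whitespace-separated words, and the helper names `first_keyword` and `tag_rows` as well as the sample frames are hypothetical.

```python
import pandas as pd


def first_keyword(text, keywords):
    """Return the first keyword (in list order) that is a whole word of text."""
    words = set(str(text).split())
    for key in keywords:
        if key in words:
            return key
    return None


def tag_rows(data, keyword_table):
    """Add 'Col-C': the first keyword from the column named by 'Col-B'."""
    # Build each category's keyword list once, dropping empty cells.
    keyword_lists = {
        col: keyword_table[col].dropna().astype(str).tolist()
        for col in keyword_table.columns
    }
    out = data.copy()
    out["Col-C"] = [
        first_keyword(text, keyword_lists.get(cat, []))
        for text, cat in zip(out["Col-A"], out["Col-B"])
    ]
    return out


# Made-up sample data, not the real spreadsheet.
data = pd.DataFrame({
    "Col-A": ["please reset my password today",
              "cpu usage is high on server",
              "user account is locked"],
    "Col-B": ["Password", "CPU", "Password"],
})
keyword_table = pd.DataFrame({
    "Password": ["reset", "locked"],
    "CPU": ["usage", "high"],
})

result = tag_rows(data, keyword_table)
print(result)
```

Each row is only compared against the keywords of its own category, so the work grows with the number of rows rather than with rows times rows. A Col-B value that has no matching column simply yields no keyword instead of raising a `KeyError`.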