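I want to search a DataFrame for specific words. If a word is present, I need to export the matching subset of the DataFrame to Excel. The problem is that the column names are written again for every subset, even though all the subsets have the same column names.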
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a letter', 'Sending', 'Pasting', 'Sending'],
    'Task2': ['Sending', 'Paking', 'Sending', 'Pasting'],
    'Task3': ['Packing', 'Letter Drafting', 'Paking', 'Letter Drafting']
})
writer = pd.ExcelWriter("C:..\\pp.xlsx", engine='xlsxwriter')
row = 0
b = ['Sending', 'paking']
for var in b:  # 'b' holds the search keywords
    lower_df = df.apply(lambda x: x.astype(str).str.lower())
    margin = df[lower_df.iloc[:, 3:5].astype(str).apply(lambda x: x.str.contains(var.lower())).any(axis=1)]
    margin['search_term'] = var  # add a column holding the search keyword
    if len(margin) > 0:  # skip keywords that match no rows
        margin.to_excel(writer, startrow=row)  # writes the header again on every call
        row = row + len(margin.index) + 1
writer.close()  # writer.save() was removed in pandas 2.0
If I use header=False, it removes all the column names, but I want to keep the header once at the top of the data.
Answer 0 (score: 1)
You can change the logic: append each DataFrame to a list dfs, then concat them into a single DataFrame at the end and write that once:
writer = pd.ExcelWriter("pp.xlsx", engine='xlsxwriter')
b = ['Sending', 'paking']
dfs = []
for var in b:  # 'b' holds the search keywords
    lower_df = df.apply(lambda x: x.astype(str).str.lower())
    mask = (lower_df.iloc[:, 3:5]
                    .astype(str)
                    .apply(lambda x: x.str.contains(var.lower()))
                    .any(axis=1))
    margin = df[mask].copy()  # copy so the new column does not touch df
    margin['search_term'] = var  # add a column holding the search keyword
    # print(margin)
    dfs.append(margin)
# Concatenate once so the header is written a single time
pd.concat(dfs).to_excel(writer, index=False)
writer.close()  # writer.save() was removed in pandas 2.0
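The same idea can be wrapped in a small helper, which may be easier to check before anything is written to Excel. This is only a sketch under some assumptions: the function name collect_matches and its parameters are made up for illustration, the searched columns are passed by name (Task2 and Task3, which is what iloc[:, 3:5] selects in the sample data), and the keywords are treated as literal substrings rather than regular expressions.

```python
import pandas as pd


def collect_matches(df, keywords, columns):
    """Hypothetical helper: rows matching each keyword, tagged with that keyword."""
    # Lower-case the searched columns once, outside the loop.
    lowered = df[columns].astype(str).apply(lambda col: col.str.lower())
    parts = []
    for word in keywords:
        # regex=False treats the keyword as a literal substring (an assumption).
        mask = lowered.apply(
            lambda col: col.str.contains(word.lower(), regex=False)
        ).any(axis=1)
        if mask.any():
            # assign() returns a new frame, so df itself is left unchanged.
            parts.append(df[mask].assign(search_term=word))
    if not parts:
        # Nothing matched: return an empty frame with the expected columns.
        return df.iloc[0:0].assign(search_term=pd.Series(dtype="object"))
    return pd.concat(parts, ignore_index=True)


df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a letter', 'Sending', 'Pasting', 'Sending'],
    'Task2': ['Sending', 'Paking', 'Sending', 'Pasting'],
    'Task3': ['Packing', 'Letter Drafting', 'Paking', 'Letter Drafting']
})

result = collect_matches(df, ['Sending', 'paking'], ['Task2', 'Task3'])
print(result)
```

Calling result.to_excel("pp.xlsx", index=False) afterwards should write the header only once, since there is a single DataFrame to export.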