How to efficiently look up values in a DataFrame and put them into another DataFrame

Time: 2016-01-17 14:33:05

Tags: python csv pandas dataframe

I have two CSV files:

File1

id  text_feature  value
1   feature2      20
1   feature3      5
2   feature2      20
...

File2

id  feature2  feature3
1   1         1
2   1         0
...

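From these files I want to produce the following file (that is, replace the 1's and 0's with the corresponding values, leaving 0 where there is no value):

File3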

id  feature2  feature3
1   20        5
2   20        0
...

This is how I tried to solve the task, but it takes a very long time (my CSV files have about 20,000 entries):

import pandas as pd

def find_value(df_data, df_row, column_name):
    value = 0
    for index, row in df_data.iterrows():
        # Column names follow the File1 header shown above.
        f = row['text_feature'].replace(' ','')
        if row['id'] == df_row['id'] and f == column_name:
            value = row['value']
            break
    return value

df_data = pd.read_csv("File1.csv")
df_textfeatures = pd.read_csv("File2.csv")

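# Every cell triggers a full scan of File1, which is why this is slow.
for index, row in df_textfeatures.iterrows():
    for column_name in df_textfeatures.columns.drop('id'):
        # Write to the DataFrame itself; assigning to `row` only changes a copy.
        df_textfeatures.at[index, column_name] = find_value(df_data, row, column_name)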

1 answer:

Answer 0: (score: 2)

You can pivot the DataFrame loaded from File1 (called file1 here) directly:

d = file1.pivot_table(index='id',columns='text_feature',values='value')

which returns:

text_feature  feature2  feature3
id                              
1                   20         5
2                   20       NaN
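One thing worth knowing about this step: pivot_table aggregates duplicate (id, text_feature) pairs, and its default aggregation is the mean. The sketch below uses made-up data with one duplicated pair (the question's sample has none) to show the difference between the default and an explicit aggfunc.

```python
import pandas as pd

# Hypothetical data: the (1, feature2) pair appears twice.
file1 = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "text_feature": ["feature2", "feature2", "feature3", "feature2"],
    "value": [20, 10, 5, 20],
})

# Default aggregation is the mean, so (1, feature2) becomes 15.
d_mean = file1.pivot_table(index="id", columns="text_feature", values="value")

# An explicit aggfunc makes the choice visible, here summing duplicates to 30.
d_sum = file1.pivot_table(index="id", columns="text_feature", values="value",
                          aggfunc="sum")

print(d_mean)
print(d_sum)
```

If File1 is guaranteed to have at most one row per (id, text_feature) pair, the default gives the original values back unchanged.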

To get what you want, you can fill the NaN values with 0:

d.fillna(0)

which returns:

text_feature  feature2  feature3
id                              
1                   20         5
2                   20         0
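As an alternative to a separate fillna call, pivot_table also accepts a fill_value argument. The following is a minimal sketch on the question's sample rows, built inline rather than read from File1.csv; the final astype(int) is an extra step added here, not part of the answer, to keep integer output.

```python
import pandas as pd

# Sample rows from the question, built inline instead of reading File1.csv.
file1 = pd.DataFrame({
    "id": [1, 1, 2],
    "text_feature": ["feature2", "feature3", "feature2"],
    "value": [20, 5, 20],
})

# fill_value replaces the missing cells of the pivoted table.
d = file1.pivot_table(index="id", columns="text_feature", values="value",
                      fill_value=0)

# Extra step (an assumption about the desired output): force integer columns.
d = d.astype(int)
print(d)
```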

Edit:

You then have to reset the index so that id becomes a regular column:

d.reset_index()

which returns:

text_feature  id  feature2  feature3
0              1        20         5
1              2        20         0
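Putting the steps together, here is a hedged end-to-end sketch. It reads inline strings standing in for File1.csv and File2.csv, and it adds one step the answer does not have: reindexing against File2, on the assumption that ids or features present only in File2 should still appear in File3 with 0. The names file2, feature_cols and file3 are illustrative.

```python
import io
import pandas as pd

# Inline stand-ins for File1.csv and File2.csv (sample rows from the question).
file1 = pd.read_csv(io.StringIO(
    "id,text_feature,value\n"
    "1,feature2,20\n"
    "1,feature3,5\n"
    "2,feature2,20\n"
))
file2 = pd.read_csv(io.StringIO(
    "id,feature2,feature3\n"
    "1,1,1\n"
    "2,1,0\n"
))

feature_cols = [c for c in file2.columns if c != "id"]

file3 = (
    file1.pivot_table(index="id", columns="text_feature", values="value")
    # Align to File2's ids and feature columns; labels absent from File1 get 0.
    .reindex(index=file2["id"], columns=feature_cols, fill_value=0)
    .fillna(0)                  # cells that were already NaN after the pivot
    .astype(int)
    .rename_axis(columns=None)  # drop the leftover "text_feature" label
    .reset_index()
)

# Print as CSV; pass a path to to_csv instead to write File3 to disk.
print(file3.to_csv(index=False))
```

Because the whole table is built with a few vectorized calls instead of one scan of File1 per cell, this should scale comfortably to the roughly 20,000 entries mentioned in the question.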