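Read CSVs with selected column headers into a single CSV file in Python (concatenating row-wise)

Time: 2019-10-07 12:33:12

Tags: python pandas

I have a problem. I want to iterate over the files whose names contain, for example, "usr666", load only selected columns from each of them into a pandas DataFrame, and then combine them into a single file, as in the following example: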

BT_usr666.csv: 
number|size|person|car    |
---------------------------
31     |2   |Ringo |Tesla  |
82     |3   |Paul  |Audi   |
93     |2   |John  |BMW    |
74     |3   |George|MG     |


RS_usr666.csv:

number|color|person|doors|car    |
---------------------------------
33    |black|Mick  |2    |Porsche|
12    |red  |Keith |4    |Saab   |
55    |blue |Ron   |6    |Volvo  |

into FINAL_usr666.csv

person|car    |
---------------
Ringo |Tesla  |
Paul  |Audi   |
John  |BMW    |
George|MG     |
Mick  |Porsche|
Keith |Saab   |
Ron   |Volvo  |

Any ideas?

2 Answers:

Answer 0: (score: 1)

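This should do it.

It searches "." (the current directory) for files whose names end with usr666.csv, keeps only the person and car columns from each one, and writes the combined rows to file1.csv. The sample files are named BT_usr666.csv and RS_usr666.csv, so the match is on the end of the name rather than the start.

import os
import pandas as pd

frames = []  # one DataFrame per matching file
for filename in sorted(os.listdir(".")):
    if filename.endswith("usr666.csv"):
        y = pd.read_csv(filename)
        frames.append(y[["person", "car"]])  # keep only the selected columns

# DataFrame.append was removed in pandas 2.0, so concatenate once after the loop.
# pd.concat raises ValueError if no file matched.
x = pd.concat(frames, ignore_index=True)
x.to_csv('file1.csv', index=False)  # index=False keeps the row index out of the output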
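
To try the approach without touching real data, here is a minimal sketch that builds small stand-in files in a temporary directory and merges them. The helper name merge_selected_columns, the temporary directory, and the comma-separated sample contents are all assumptions made for illustration; they are not from the question.

```python
import tempfile
from pathlib import Path

import pandas as pd


def merge_selected_columns(directory, token, columns):
    """Concatenate `columns` from every CSV in `directory` whose name contains `token`."""
    # Hypothetical helper: sorted() makes the row order deterministic.
    paths = sorted(p for p in Path(directory).glob("*.csv") if token in p.name)
    frames = [pd.read_csv(p, usecols=columns) for p in paths]
    return pd.concat(frames, ignore_index=True)


# Comma-separated stand-ins for the files in the question (assumed format).
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "BT_usr666.csv").write_text(
    "number,size,person,car\n31,2,Ringo,Tesla\n82,3,Paul,Audi\n"
)
(demo_dir / "RS_usr666.csv").write_text(
    "number,color,person,doors,car\n33,black,Mick,2,Porsche\n"
)
# This file is ignored because its name does not contain the token.
(demo_dir / "other.csv").write_text("person,car\nNobody,Nothing\n")

merged = merge_selected_columns(demo_dir, "usr666", ["person", "car"])
merged.to_csv(demo_dir / "FINAL.csv", index=False)
print(merged)
```

The output name here deliberately omits "usr666" so that a second run does not pick up the merged file as an input.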

Answer 1: (score: 1)

You can try the following script.

Code

import glob
import os

import pandas as pd

def get_final_df(files):
    your_columns = ['person', 'car']

    # Read only the wanted columns from each file, then concatenate once.
    # (DataFrame.append was removed in pandas 2.0.)
    frames = [pd.read_csv(file, usecols=your_columns) for file in files]
    return pd.concat(frames, ignore_index=True)

if __name__ == '__main__':
    wd = os.getcwd() # I've set this as working dir, you can change the path to your files.
    # Match on the file name only, so a 'usr666' elsewhere in the path does not select every file.
    files = sorted(file for file in glob.glob(os.path.join(wd, '*.csv'))
                   if 'usr666' in os.path.basename(file))
    final_df = get_final_df(files)
    final_df.to_csv('final_df.csv', index=False) # Write to file
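
Both answers assume ordinary comma-separated files. If the files really look like the tables in the question (pipe-delimited, padded with spaces, with a dashed rule under the header), which is only a guess since the question may just be showing a rendered view, a sketch along these lines might work. The function name read_pipe_table and the inline sample text are hypothetical.

```python
import io

import pandas as pd

# Inline samples that mimic the layout shown in the question (assumed to be literal).
BT_TEXT = """number|size|person|car    |
---------------------------
31     |2   |Ringo |Tesla  |
82     |3   |Paul  |Audi   |
"""
RS_TEXT = """number|color|person|doors|car    |
---------------------------------
33    |black|Mick  |2    |Porsche|
12    |red  |Keith |4    |Saab   |
"""


def read_pipe_table(source, columns):
    """Read one pipe-delimited table and return only `columns`, whitespace-stripped."""
    # skiprows=[1] drops the dashed rule; dtype=str keeps the padded cells as text.
    df = pd.read_csv(source, sep="|", skiprows=[1], dtype=str)
    df.columns = df.columns.str.strip()  # "car    " -> "car"
    out = df[list(columns)].copy()
    for col in out.columns:
        out[col] = out[col].str.strip()  # "Ringo " -> "Ringo"
    return out


final_df = pd.concat(
    [read_pipe_table(io.StringIO(text), ["person", "car"]) for text in (BT_TEXT, RS_TEXT)],
    ignore_index=True,
)
print(final_df)
```

With real files, a path can be passed to read_pipe_table in place of the io.StringIO object.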