Question

Trying to output only certain columns of a .txt file to a .csv

The pandas documentation and this answer got me this far:

import pandas as pd

read_file = pd.read_csv (r'death.txt')

header = ['County', 'Crude Rate']

read_file.to_csv (r'death.csv', columns=header, index=None)

But I get an error:

KeyError: "None of [Index(['County', 'Crude Rate'], dtype='object')] are in the [columns]"

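This is confusing, because the .txt file I am working with consists of several hundred lines (from a government database) like the following: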

"Notes" "County"    "County Code"   Deaths  Population  Crude Rate
    "Autauga County, AL"    "01001" 7893    918492  859.3
    "Baldwin County, AL"    "01003" 30292   3102984 976.2
    "Barbour County, AL"    "01005" 5197    499262  1040.9

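I noticed that the first three column headers are wrapped in quotes, while the last three are not. I have tried adding quotes to the column names in my header list (e.g. '"County"'), but no luck. Based on the error, I realize there is some discrepancy between the column headers as I type them and how they are read by the script.

Any help in understanding this discrepancy would be appreciated.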

Answer 1

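You are reading the file with the default options, which means pandas splits on commas. Your file is tab-separated and the header line contains no commas, so the whole header line is read as a single column name and 'County' is never found: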

read_file = pd.read_csv (r'death.txt')

Change it to:

read_file = pd.read_csv (r'death.txt', sep="\t")

Check the columns:

read_file.columns
Index(['Notes', 'County', 'County Code', 'Deaths', 'Population', 'Crude Rate'], dtype='object')
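
To see the difference without the real file, here is a minimal sketch that reads a small in-memory sample with both settings. The sample string is an assumption: it is made up to mimic the tab-separated layout shown in the question and is not the actual government file.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the question's layout (tab-separated,
# first three headers quoted, empty "Notes" field on each data row).
sample = (
    '"Notes"\t"County"\t"County Code"\tDeaths\tPopulation\tCrude Rate\n'
    '\t"Autauga County, AL"\t"01001"\t7893\t918492\t859.3\n'
    '\t"Baldwin County, AL"\t"01003"\t30292\t3102984\t976.2\n'
)

# Default separator is a comma, so the header is not split on tabs.
default_df = pd.read_csv(io.StringIO(sample))

# Explicit tab separator, as suggested in this answer.
tab_df = pd.read_csv(io.StringIO(sample), sep="\t")

print(list(default_df.columns))  # no standalone 'County' entry here
print(list(tab_df.columns))      # six separate column names
```

Printing the column list right after reading is a quick way to confirm that the names you pass later actually exist.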

Also, you should filter the columns first and then save.
Now, if your columns are fine:

read_file[['County', 'Crude Rate']].to_csv (r'death.csv', index=None)
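
As an alternative sketch, the column selection can also happen at read time with the `usecols` parameter of `pd.read_csv`, so only the two needed columns are loaded. The sample string below is again a made-up stand-in for death.txt, and `to_csv` is called without a path so that it returns the CSV text instead of writing a file.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the question's tab-separated layout.
sample = (
    '"Notes"\t"County"\t"County Code"\tDeaths\tPopulation\tCrude Rate\n'
    '\t"Autauga County, AL"\t"01001"\t7893\t918492\t859.3\n'
    '\t"Baldwin County, AL"\t"01003"\t30292\t3102984\t976.2\n'
)

# Read only the two columns of interest.
subset = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    usecols=["County", "Crude Rate"],
)

# With no path argument, to_csv returns the CSV content as a string.
csv_text = subset.to_csv(index=False)
print(csv_text)
```

With the real file, the same call would take r'death.txt' in place of the StringIO object, and to_csv would take r'death.csv' as its first argument.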
