Question

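I'm reading a CSV, but when I look closely at the column names, there is a strange symbol next to the name of the first column. Can someone help me get rid of it?

How the column names look now (I'm not sure what the symbol next to 'year' means):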

['ï»¿year', 'sch', 'city', 'prop_id']

How I want the column names to look:

['year', 'sch', 'city', 'prop_id']

My code so far:

import pandas as pd

path = ('file_path')

cameron_county = pd.read_table(path + '/2016_GCC_prelim_appraisal_info_20160630.txt',
                             encoding = 'latin1',error_bad_lines = False)

print(cameron_county.head(1))
print(cameron_county.columns)

Thanks in advance.

Answer 1

This looks like the Unicode BOM (byte order mark):

ï»¿

See: https://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding

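ï»¿ is how the UTF-8 BOM (hex bytes EF BB BF) appears when it is decoded as CP1252. Reading the file with encoding = 'latin1', as in the question, produces the same three characters.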
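A minimal sketch of one way to avoid the symbol at read time, assuming the file is really UTF-8 with a BOM and is tab-delimited. The in-memory `raw` bytes are a made-up stand-in for the real file, and `utf-8-sig` is Python's standard codec that drops a leading BOM if one is present:

```python
import codecs
import io

import pandas as pd

# Hypothetical stand-in for the real file: UTF-8 text that starts with a BOM.
raw = codecs.BOM_UTF8 + b"year\tsch\tcity\tprop_id\n2016\tA\tB\t1\n"

# The three BOM bytes EF BB BF, read as CP1252, give the stray characters.
print(repr(raw[:3].decode("cp1252")))  # 'ï»¿'

# 'utf-8-sig' decodes UTF-8 and removes a leading BOM before parsing.
df = pd.read_csv(io.BytesIO(raw), sep="\t", encoding="utf-8-sig")
print(list(df.columns))  # ['year', 'sch', 'city', 'prop_id']
```

With the code from the question, this would amount to replacing encoding = 'latin1' with encoding = 'utf-8-sig', provided the rest of the file is valid UTF-8.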

Answer 2

After importing, a solution could look like this:

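import pandas as pd
columns = pd.Index(['ï»¿year', 'sch', 'city', 'prop_id'])
# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default
columns.str.replace(r'[^a-zA-Z0-9_-]', '', regex=True)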

Index([u'year', u'sch', u'city', u'prop_id'], dtype='object')
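The regular expression above deletes every character outside `[a-zA-Z0-9_-]`, so it would also strip spaces or accented letters from other column names. A narrower sketch that removes only a leading BOM is shown here; `strip_bom` is a hypothetical helper, not part of pandas:

```python
def strip_bom(name: str) -> str:
    """Remove a leading BOM from one column name, if present."""
    # "\ufeff" is the correctly decoded BOM; "ï»¿" is its CP1252/latin1 rendering.
    for marker in ("\ufeff", "ï»¿"):
        if name.startswith(marker):
            return name[len(marker):]
    return name


columns = ["ï»¿year", "sch", "city", "prop_id"]
cleaned = [strip_bom(c) for c in columns]
print(cleaned)  # ['year', 'sch', 'city', 'prop_id']
```

On a DataFrame this could be applied with `df.rename(columns=strip_bom)`, where `df` stands for whatever frame was read.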

Why is there a symbol next to the column name in my DataFrame?
