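My file contains columns such as company, record ID, and sales. When I first checked the data types after loading the file into a pandas DataFrame, several columns came back as float/int, so I changed them to strings, as shown below: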
data = pd.read_csv(filepath)
print(data.dtypes)
Company Code object
SiteCode int64
Product Name object
RECORD ID int64
Tank ID int64
Date int64
Sale Volume float64
Deliveries Volume int64
Dip Volume float64
Then, before writing the output to a file, I changed their types to string:
> data['RECORD ID'] = data['RECORD ID'].astype(str)
> data['Tank ID'] = data['Tank ID'].astype(str)
I got:
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'RECORD ID'
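If I comment out the RECORD ID line and let only the Tank ID conversion run, I get the same KeyError for Tank ID. Why?
Here is a sample CSV; the SIRA RECORD ID and TANK ID columns are the ones causing the problem: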
Company Code,SiteCode,Product Name,SIRA RECORD ID,Tank ID,Date,Sale Volume,Deliveries Volume,Dip Volume
XXX,20995,27PMAXDSL,3535352,4,20191004,4383.49,12902,16000
XXX,20995,02ULP,3535351,3,20191004,8221.573,15996,9987.32
XXX,20995,02ULP,3535350,2,20191004,7303.1,8201,11200
Answer 0 (score: 1)
If you need all columns as strings, it is better to use the dtype parameter of read_csv:
data = pd.read_csv(filepath, dtype=str)
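A minimal, self-contained sketch of this approach follows. It assumes the sample rows from the question are representative and inlines them with io.StringIO in place of filepath, purely so the snippet runs without a file on disk.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the snippet runs without a file.
csv_text = """Company Code,SiteCode,Product Name,SIRA RECORD ID,Tank ID,Date,Sale Volume,Deliveries Volume,Dip Volume
XXX,20995,27PMAXDSL,3535352,4,20191004,4383.49,12902,16000
XXX,20995,02ULP,3535351,3,20191004,8221.573,15996,9987.32
XXX,20995,02ULP,3535350,2,20191004,7303.1,8201,11200
"""

# dtype=str asks pandas to keep every column as text instead of inferring int/float.
data = pd.read_csv(io.StringIO(csv_text), dtype=str)

print(data.dtypes)
print(data.loc[0, "Dip Volume"])
```

One practical difference from converting afterwards: reading with dtype=str keeps each value exactly as it was written in the file, whereas calling astype(str) on a column that was already parsed as float64 turns a value such as 16000 into the text 16000.0.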
Answer 1 (score: 0)
Try the following code. The list of indexes is passed directly to pandas.
SET,FW.O,AS,num:+18700000,num:+12355,#(5th field matched with FWmatch num:+123)
SET,IT,AS,num:+22211111,num:+12355,#(4th field matched with ITmatch num:+222)
SET,FW.O,AS,num:+177232,num:+12355,#(5th field matched with FWmatch num:+123)
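Answer 2 (score: 0)
Instead of typing the column names by hand, try the following as a test:
for col in data.columns:
    data[col] = data[col].astype(str)
Converting the whole DataFrame in one call should also work: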
data = data.astype('str')
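A KeyError like the one in the question means the label was not found among the DataFrame's column labels, so it can help to print the labels exactly as pandas stored them before converting. The sketch below is only an illustration: the padded header is hypothetical, and the post does not confirm that the asker's file has this problem. A prefix mismatch, such as SIRA RECORD ID versus RECORD ID, would show up in the same listing.

```python
import io

import pandas as pd

# Hypothetical header with stray spaces around two labels, for illustration only.
csv_text = """Company Code, RECORD ID ,Tank ID ,Sale Volume
XXX,3535352,4,4383.49
XXX,3535351,3,8221.573
"""

data = pd.read_csv(io.StringIO(csv_text))

# Listing the labels shows them quoted, so leading/trailing spaces become visible.
print(data.columns.tolist())

# Strip whitespace from the labels, then convert every column to strings in one call.
data.columns = data.columns.str.strip()
data = data.astype(str)

print(data["RECORD ID"].tolist())
```

Looping over data.columns, as in the answer above, avoids the KeyError because it uses whatever labels pandas actually stored; stripping the labels first makes later lookups by the expected name work as well.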