Pandas DataFrame: unable to convert column dtype from object to string for further operations

Time: 2017-08-18 03:05:53

Tags: python pandas type-conversion

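Here is my working code, which downloads an Excel file from a website. It takes about 40 seconds.

After running this code, you will notice that the Key1, Key2 and Key3 columns have object dtype. I cleaned the DataFrame so that Key1 and Key2 contain only alphanumeric values, yet pandas still treats them as object dtype. I need to concatenate (as in MS Excel) Key1 and Key2 to create a separate column called deviceid. I realized I can't join the two columns because they are object dtype. How can I convert them to strings so that I can create my new column?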

import pandas as pd
import urllib.request
import time

start = time.time()
url = "https://www.misoenergy.org/Library/Repository/Market%20Reports/20170816_da_bcsf.xls"
# Download the workbook and parse Sheet1, skipping the first 3 rows
cnstsfxls = urllib.request.urlopen(url)
xlsf = pd.ExcelFile(cnstsfxls)
dfsf = xlsf.parse("Sheet1", skiprows=3)
# Drop the last row of the sheet
dfsf.drop(dfsf.index[len(dfsf)-1], inplace=True)
# Drop rows whose Device Type is 'UN' or 'UNKNOWN'
dfsf.drop(dfsf[dfsf['Device Type'] == 'UN'].index, inplace=True)
dfsf.drop(dfsf[dfsf['Device Type'] == 'UNKNOWN'].index, inplace=True)
# Drop the columns that are not needed
dfsf.drop(['Constraint Name', 'Contingency Name', 'Constraint Type', 'Flowgate Name'], axis=1, inplace=True)
end = time.time()
print("The entire process took - ", end-start, " seconds.")

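1 answer:

Answer 0 (score: 0)

I may be missing the point here. But if what you want is to build a column such as deviceid = RCH417 where Key1 = RCH and Key2 = 417, then dfsf['deviceid'] = dfsf['Key1'] + dfsf['Key2'] works even though both columns have object dtype.

Try this: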

# Check value types
dfsf.dtypes

# Add your desired column
dfsf['deviceid'] = dfsf['Key1']  + dfsf['Key2']

# Inspect columns of interest
keep = ['Key1', 'Key2', 'deviceid']
df_keys = dfsf[keep]
print(df_keys.dtypes)


print(df_keys.head())
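
If the plain `+` raises a TypeError, one likely cause is that a key column holds a mix of strings and numbers (for example, a Key2 value that Excel stored as the integer 417). Below is a minimal sketch of casting both columns to `str` before concatenating. The sample values are made up for illustration and are not taken from the MISO file; only the column names Key1, Key2 and deviceid come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the downloaded sheet.
df = pd.DataFrame({
    "Key1": ["RCH", "ABC", "XYZ"],
    "Key2": [417, "12A", 305],  # mixed ints and strings, so this column is object dtype
})

# Inspect the dtypes; a column of Python strings being reported as object
# is not by itself a problem for concatenation.
print(df.dtypes)

# "RCH" + 417 would fail on the mixed column, so cast each column
# to str first and then concatenate element-wise.
df["deviceid"] = df["Key1"].astype(str) + df["Key2"].astype(str)
print(df)
```

Note that `astype(str)` also turns missing values into the literal text "nan", so it may be worth dropping or filling empty keys before building deviceid.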
