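I would like to know whether there is a way to split a column by a delimiter and then drop the extra column produced by the split. This is what I am currently trying, but it does not work the way I want.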
import pandas as pd
df = {'ID': [3009, 129, 119, 120, 121],
      'temp': ['75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0'],
      'Prob': [1, 1, 0.8, 0.8056, 0.9]}
df = pd.DataFrame(df)
ID Prob temp
0 3009 1.0000 75.0~54.0
1 129 1.0000 75.0~54.0
2 119 0.8000 75.0~54.0
3 120 0.8056 75.0~54.0
4 121 0.9000 75.0~54.0
5 122 0.8050 75.0~54.0
df['temp','temp2'] = df['temp'].str.split('~', expand=True)
My goal is to split the column by the delimiter and add the resulting columns to the existing DataFrame (df):
ID Prob temp temp2
0 3009 1.0000 75.0 54.0
1 129 1.0000 75.0 54.0
2 119 0.8000 75.0 54.0
3 120 0.8056 75.0 54.0
4 121 0.9000 75.0 54.0
5 122 0.8050 75.0 54.0
so that I can then drop the temp2 column.
Answer 0 (score: 2)
You can index the result of the split and keep only the first piece, so you never have to deal with a temp2 column:
df['temp'] = df['temp'].str.split('~', expand=True)[0]
print(df)
Prints:
ID temp Prob
0 3009 75.0 1.0000
1 129 75.0 1.0000
2 119 75.0 0.8000
3 120 75.0 0.8056
4 121 75.0 0.9000
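For the approach described in the question (split into two columns, then drop the second one), a minimal sketch might look like the following. It assumes every value contains exactly one '~'; the intermediate name parts is only for illustration. The key difference from the attempt in the question is the double brackets, which pass a list of column names instead of a single tuple key.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3009, 129, 119, 120, 121],
    'temp': ['75.0~54.0'] * 5,
    'Prob': [1, 1, 0.8, 0.8056, 0.9],
})

# expand=True returns a DataFrame with one column per piece (labelled 0 and 1).
# "parts" is an illustrative name, not something from the question.
parts = df['temp'].str.split('~', expand=True)

# A list of column names (double brackets) assigns both pieces at once.
df[['temp', 'temp2']] = parts

# Drop the second piece once it is no longer needed.
df = df.drop(columns=['temp2'])

print(df)
```

Note that the values left in temp are still strings at this point.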
Answer 1 (score: 1)
If you want to drop a column from the DataFrame, you can try using str.split() and then .drop():
import pandas as pd
data = {'ID': [3009, 129, 119, 120, 121],
        'temp': ['75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0'],
        'Prob': [1, 1, 0.8, 0.8056, 0.9]}
df = pd.DataFrame(data)
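# Helper column holding the list of pieces for each row
df['temp~'] = df['temp'].str.split('~')
# Take the first piece of each list
df['temp_1'] = df['temp~'].str.get(0)
# Remove the helper column; the original temp column is kept
df = df.drop(columns=['temp~'])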
print(df)
Output:
ID temp Prob temp_1
0 3009 75.0~54.0 1.0000 75.0
1 129 75.0~54.0 1.0000 75.0
2 119 75.0~54.0 0.8000 75.0
3 120 75.0~54.0 0.8056 75.0
4 121 75.0~54.0 0.9000 75.0
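If the original temp values are not needed afterwards, one possible variation is to overwrite temp directly and convert it to a numeric dtype in the same step, which avoids the helper column altogether. This is only a sketch: it assumes the part before the first '~' always parses as a float, and the name first is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3009, 129, 119, 120, 121],
    'temp': ['75.0~54.0'] * 5,
    'Prob': [1, 1, 0.8, 0.8056, 0.9],
})

# n=1 stops splitting after the first delimiter; .str[0] picks the first piece.
first = df['temp'].str.split('~', n=1).str[0]

# Overwrite the column and convert the strings to floats.
df['temp'] = first.astype(float)

print(df)
print(df.dtypes)
```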