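I have a table with many rows, and in every row I need to change the first space " " to "h" and the second space to "m", and append the character "s" at the end.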
import pandas as pd
d = {
    'RA': ["11 50 10.4747", "11 50 10.2641", "11 50 10.0534", "11 50 09.8428"],
    'DEC': ["+26 01 09.559", "+26 01 10.770", "+26 01 11.980", "+26 01 13.191"],
}
df = pd.DataFrame(d)
# Row-by-row loop: replace the first space, then the next one, then append "s".
for i in range(len(df)):
    RA = df['RA'][i]
    RA = str.replace(RA, " ", "h", 1)
    RA = str.replace(RA, " ", "m", 1)
    RA += "s"
    df.loc[i, 'RA'] = RA  # .loc avoids chained assignment
    DEC = df['DEC'][i]
    DEC = str.replace(DEC, " ", "d", 1)
    DEC = str.replace(DEC, " ", "m", 1)
    DEC += "s"
    df.loc[i, 'DEC'] = DEC
I wrote this code, but it is very slow on my data. With it, I changed the DataFrame from this:
DEC RA
+26 01 09.559 11 50 10.4747
+26 01 10.770 11 50 10.2641
+26 01 11.980 11 50 10.0534
+26 01 13.191 11 50 09.8428
to this:
DEC RA
0 +26d01m09.559s 11h50m10.4747s
1 +26d01m10.770s 11h50m10.2641s
2 +26d01m11.980s 11h50m10.0534s
3 +26d01m13.191s 11h50m09.8428s
Is there a function I can use for this? I tried df.replace, but it replaces every " " in the table...
Thanks in advance.
Answer 0 (score: 4)
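Define a function that handles the splitting and recombining. You can use str.split to break each value into its parts, then join the parts back together with some very simple string concatenation.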
def split_combine(v, letters=list('dms')):
    v = v.str.split(expand=True)
    return (
        v[0] + letters[0]
        + v[1] + letters[1]
        + v[2] + letters[2]
    )
Now call it with the appropriate arguments.
df['DEC'] = split_combine(df.DEC, list('dms'))
df['RA'] = split_combine(df.RA, list('hms'))
df
DEC RA
0 +26d01m09.559s 11h50m10.4747s
1 +26d01m10.770s 11h50m10.2641s
2 +26d01m11.980s 11h50m10.0534s
3 +26d01m13.191s 11h50m09.8428s
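To make the intermediate step visible, here is a minimal sketch of what str.split(expand=True) produces before the pieces are concatenated. The two-row Series `ra` is made up for illustration from the question's sample values.

```python
import pandas as pd

# Illustrative sample taken from the question's RA column.
ra = pd.Series(["11 50 10.4747", "11 50 10.2641"])

# With no separator given, str.split splits on runs of whitespace;
# expand=True returns a DataFrame with one column per piece.
parts = ra.str.split(expand=True)
print(parts)

# The columns are labelled 0, 1, 2, so parts[0] holds the hours field.
result = parts[0] + "h" + parts[1] + "m" + parts[2] + "s"
print(result.tolist())
```

Because each column is an ordinary Series of strings, the `+` operator concatenates element-wise, which is why no explicit loop is needed.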
Answer 1 (score: 2)
Try:
df['DEC'] = df['DEC'].str.replace(' ', 'd', n=1).str.replace(' ', 'm', n=1) + 's'
df['RA'] = df['RA'].str.replace(' ', 'h', n=1).str.replace(' ', 'm', n=1) + 's'
Or define a function:
def repl(series, replace_letters):
    first, second, last = replace_letters
    replaced = series.str.replace(' ', first, n=1).str.replace(' ', second, n=1)
    return replaced + last
and call it on both columns:
df['DEC'] = repl(df['DEC'], 'dms')
df['RA'] = repl(df['RA'], 'hms')
Both give the same result:
DEC RA
0 +26d01m09.559s 11h50m10.4747s
1 +26d01m10.770s 11h50m10.2641s
2 +26d01m11.980s 11h50m10.0534s
3 +26d01m13.191s 11h50m09.8428s
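As a possible variation on the chained replacements (not something the answers above propose), a single regular expression with capture groups can rewrite each value in one pass. The name `PATTERN` and the assumption that every value consists of exactly three fields separated by single spaces are mine; rows that do not match the pattern are left unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    "RA": ["11 50 10.4747", "11 50 09.8428"],
    "DEC": ["+26 01 09.559", "+26 01 13.191"],
})

# Assumed layout: three whitespace-free fields separated by single spaces.
PATTERN = r"^(\S+) (\S+) (\S+)$"

# \1, \2, \3 refer back to the three captured fields.
df["RA"] = df["RA"].str.replace(PATTERN, r"\1h\2m\3s", regex=True)
df["DEC"] = df["DEC"].str.replace(PATTERN, r"\1d\2m\3s", regex=True)
print(df)
```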
Answer 2 (score: 1)
You can use:
df['RA'] = df['RA'].str.split(' ').str[0] + 'h' + df['RA'].str.split(' ').str[1] + 'm' + df['RA'].str.split(' ').str[2]+ 's'
df['DEC'] = df['DEC'].str.split(' ').str[0] + 'd' + df['DEC'].str.split(' ').str[1] + 'm' + df['DEC'].str.split(' ').str[2]+ 's'
Output:
DEC RA
0 +26d01m09.559s 11h50m10.4747s
1 +26d01m10.770s 11h50m10.2641s
2 +26d01m11.980s 11h50m10.0534s
3 +26d01m13.191s 11h50m09.8428s
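Since the question is about speed, a rough way to compare a row-by-row loop with a vectorized version is sketched below. The helper names `loop_version` and `vectorized_version`, the 2000-row test size, and the repeat count are arbitrary choices for illustration. Timings depend on the machine and pandas version, so none are quoted here.

```python
import timeit
import pandas as pd

def loop_version(df):
    # Row-by-row approach, similar in spirit to the question's loop.
    out = df.copy()
    for i in range(len(out)):
        ra = out.loc[i, "RA"]
        out.loc[i, "RA"] = ra.replace(" ", "h", 1).replace(" ", "m", 1) + "s"
    return out

def vectorized_version(df):
    # Split once, then concatenate whole columns at a time.
    out = df.copy()
    parts = out["RA"].str.split(" ", expand=True)
    out["RA"] = parts[0] + "h" + parts[1] + "m" + parts[2] + "s"
    return out

base = ["11 50 10.4747", "11 50 10.2641", "11 50 10.0534", "11 50 09.8428"]
df = pd.DataFrame({"RA": base * 500})  # 2000 rows, an arbitrary size

# Check that both approaches agree before timing them.
same = loop_version(df)["RA"].tolist() == vectorized_version(df)["RA"].tolist()
print("same result:", same)
print("loop      :", timeit.timeit(lambda: loop_version(df), number=3))
print("vectorized:", timeit.timeit(lambda: vectorized_version(df), number=3))
```

Splitting once and reusing the pieces also avoids calling str.split three times per column.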