Insert characters at multiple positions in a DataFrame column

Time: 2018-03-26 19:40:04

Tags: python string pandas dataframe replace

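I have a table with many rows. In every value I need to replace the first space with "h", the second space with "m", and append the character "s" at the end. (For the DEC column, the first space becomes "d" instead.)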

import pandas as pd

d = {'RA' : ['11 50 10.4747', "11 50 10.2641","11 50 10.0534", "11 50 09.8428"],'DEC':["+26 01 09.559","+26 01 10.770", "+26 01 11.980","+26 01 13.191"]}
df=pd.DataFrame(d)
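for i in range(len(df)):
    RA = df.loc[i, 'RA']
    RA = RA.replace(" ", "h", 1)    # first space -> "h"
    RA = RA.replace(" ", "m", 1)    # next remaining space -> "m"
    RA += "s"
    df.loc[i, 'RA'] = RA            # .loc avoids chained assignment
    DEC = df.loc[i, 'DEC']
    DEC = DEC.replace(" ", "d", 1)  # first space -> "d"
    DEC = DEC.replace(" ", "m", 1)  # next remaining space -> "m"
    DEC += "s"
    df.loc[i, 'DEC'] = DEC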

I wrote this code, but it is very slow on my data. It changes the DataFrame from this:

     DEC             RA
+26 01 09.559  11 50 10.4747 
+26 01 10.770  11 50 10.2641  
+26 01 11.980  11 50 10.0534 
+26 01 13.191  11 50 09.8428

to this:

         DEC              RA
0  +26d01m09.559s  11h50m10.4747s
1  +26d01m10.770s  11h50m10.2641s
2  +26d01m11.980s  11h50m10.0534s
3  +26d01m13.191s  11h50m09.8428s

Is there a function that does this? I tried df.replace, but it replaces every " " in the table...

Thanks in advance.

3 answers:

Answer 0 (score: 4)

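Define a function that handles the splitting and recombining. You can use str.split with expand=True to get one column per field, then join the pieces back together with plain string concatenation.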

def split_combine(v, letters=list('dms')):
    v = v.str.split(expand=True)
    return (
       v[0] + letters[0] 
     + v[1] + letters[1] 
     + v[2] + letters[2]
    )

Now call it with the appropriate arguments.

df['DEC'] = split_combine(df.DEC, list('dms'))
df['RA'] = split_combine(df.RA, list('hms'))

df
              DEC              RA
0  +26d01m09.559s  11h50m10.4747s
1  +26d01m10.770s  11h50m10.2641s
2  +26d01m11.980s  11h50m10.0534s
3  +26d01m13.191s  11h50m09.8428s
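
The function above relies on every value having exactly three whitespace-separated fields. As a minimal sketch (not part of the original answer; the name split_combine_checked is hypothetical), the same idea can be written with an explicit check, so that a malformed row raises an error instead of silently producing NaN:

```python
import pandas as pd

def split_combine_checked(series, letters="dms"):
    # Split on whitespace; expand=True returns one column per field.
    parts = series.str.split(expand=True)
    # Assumption: each value has exactly len(letters) fields.
    if parts.shape[1] != len(letters) or parts.isna().any().any():
        raise ValueError(f"expected exactly {len(letters)} fields per value")
    # Append the matching unit letter after each field.
    out = parts[0] + letters[0]
    for i in range(1, len(letters)):
        out = out + parts[i] + letters[i]
    return out

df = pd.DataFrame({
    "RA": ["11 50 10.4747", "11 50 10.2641"],
    "DEC": ["+26 01 09.559", "+26 01 10.770"],
})
df["RA"] = split_combine_checked(df["RA"], "hms")
df["DEC"] = split_combine_checked(df["DEC"], "dms")
```

Whether to raise or to leave bad rows untouched depends on the data; raising is just one choice.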

Answer 1 (score: 2)

Try:

df['DEC'] = df['DEC'].str.replace(' ', 'd', n=1).str.replace(' ', 'm', n=1) + 's'
df['RA'] = df['RA'].str.replace(' ', 'h', n=1).str.replace(' ', 'm', n=1) + 's'

Or define a function:

def repl(series, replace_letters):
    # Replace the first space, then the next remaining one, then append the last letter.
    return series.str.replace(' ', replace_letters[0], n=1).str.replace(' ', replace_letters[1], n=1) + replace_letters[2]

and call the function on both columns:

df['DEC'] = repl(df['DEC'], 'dms')
df['RA'] = repl(df['RA'], 'hms')

Both give:

              DEC              RA
0  +26d01m09.559s  11h50m10.4747s
1  +26d01m10.770s  11h50m10.2641s
2  +26d01m11.980s  11h50m10.0534s
3  +26d01m13.191s  11h50m09.8428s
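
The two chained replacements can also be collapsed into a single regex substitution with capture groups. This is only a sketch of an alternative, not something from the answer above: the helper name label_fields is hypothetical, and it assumes the fields are separated by exactly one space and that the unit letters are not digits (so that "\1h" is read as group 1 followed by "h").

```python
import pandas as pd

def label_fields(series, letters):
    # Three space-separated fields, captured as groups 1 to 3.
    pattern = r"^(\S+) (\S+) (\S+)$"
    # Builds e.g. r"\1h\2m\3s" for letters == "hms".
    repl = r"\1{}\2{}\3{}".format(*letters)
    # Values that do not match the pattern are returned unchanged.
    return series.str.replace(pattern, repl, regex=True)

df = pd.DataFrame({
    "RA": ["11 50 10.4747", "11 50 09.8428"],
    "DEC": ["+26 01 09.559", "+26 01 13.191"],
})
df["RA"] = label_fields(df["RA"], "hms")
df["DEC"] = label_fields(df["DEC"], "dms")
```

Unlike the chained version, a value with the wrong number of fields is left as it was rather than partially rewritten.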

Answer 2 (score: 1)

You can use:

df['RA'] = df['RA'].str.split(' ').str[0] + 'h' +  df['RA'].str.split(' ').str[1] + 'm' + df['RA'].str.split(' ').str[2]+ 's'
df['DEC'] = df['DEC'].str.split(' ').str[0] + 'd' +  df['DEC'].str.split(' ').str[1] + 'm' + df['DEC'].str.split(' ').str[2]+ 's'

Output:

              DEC              RA
0  +26d01m09.559s  11h50m10.4747s
1  +26d01m10.770s  11h50m10.2641s
2  +26d01m11.980s  11h50m10.0534s
3  +26d01m13.191s  11h50m09.8428s
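
The one-liners above call str.split three times per column. A possible variation, sketched here under the same assumption of three space-separated fields per value, splits each column once and reuses the result; the units mapping is a hypothetical name introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "RA": ["11 50 10.4747", "11 50 10.2641"],
    "DEC": ["+26 01 09.559", "+26 01 10.770"],
})

# Hypothetical mapping: column name -> letters appended after each field.
units = {"RA": "hms", "DEC": "dms"}

for col, (first, second, third) in units.items():
    # Split once; each element becomes a list of three strings.
    parts = df[col].str.split(" ")
    df[col] = (parts.str[0] + first
               + parts.str[1] + second
               + parts.str[2] + third)
```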