Pandas: fill placeholders in a string column

Date: 2018-08-08 14:29:29

Tags: python pandas string-formatting

I am working with a pandas DataFrame that looks like this:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['There are # people', '3', np.nan], ['# out of # people are there', 'Five', 'eight'],
     ['Only # are here', '2', np.nan], ['The rest is at home', np.nan, np.nan]])

which gives:

    0                            1     2
0   There are # people           3     NaN
1   # out of # people are there  Five  eight
2   Only # are here              2     NaN
3   The rest is at home          NaN   NaN

I want to replace the # placeholders with the varying strings from columns 1 and 2, so that the result is:

0   There are 3 people
1   Five out of eight people are there
2   Only 2 are here
3   The rest is at home

How can I achieve this?

3 answers:

Answer 0: (score: 2)

Use string formatting:

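# Turn every '#' into a '%s' slot, then mark missing values with the string 'NaN'
df = df.replace({'#': '%s'}, regex=True).fillna('NaN')

l = []

for x, y in df.iterrows():
    if y[2] == 'NaN' and y[1] == 'NaN':
        l.append(y[0])                   # no values: keep the text unchanged
    elif y[2] == 'NaN':
        l.append(y[0] % (y[1],))         # one value
    else:
        l.append(y[0] % (y[1], y[2]))    # two values
l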
Out[339]: 
['There are 3 people',
 'Five out of eight people are there',
 'Only 2 are here',
 'The rest is at home']
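
The same string-formatting idea can be sketched without one branch per number of values, assuming every column after the first holds replacement values and that the number of non-missing values in a row matches the number of # placeholders (otherwise `%` raises a `TypeError`). The names `fill_row` and `filled` are illustrative only and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['There are # people', '3', np.nan], ['# out of # people are there', 'Five', 'eight'],
     ['Only # are here', '2', np.nan], ['The rest is at home', np.nan, np.nan]])


def fill_row(row):
    # Hypothetical helper: collect the non-missing values from every column after the first.
    values = tuple(row.iloc[1:].dropna())
    # Escape literal percent signs, then turn each '#' into a '%s' slot.
    template = row.iloc[0].replace('%', '%%').replace('#', '%s')
    return template % values


filled = df.apply(fill_row, axis=1).tolist()
print(filled)
```

Escaping `%` first keeps text such as "50% of # people" from being misread as a format specifier.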

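Answer 1: (score: 0)

A generic replace function, in case you need to add more values. It replaces successive occurrences of a given character in the string with values from a list (two in your case, but it can handle more).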

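def replace_hastag(text, values, replace_char='#'):
    # Replace successive occurrences of replace_char with the given values, in order
    for v in values:
        if pd.isna(v):
            # Stop at the first missing value; any remaining placeholders are left as is
            return text
        else:
            text = text.replace(replace_char, str(v), 1)
    return text


df['text'] = df.apply(lambda r: replace_hastag(r[0], values=[r[1], r[2]]), axis=1)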

Result:

In [79]: df.text
Out[79]:
0                    There are 3 people
1    Five out of eight people are there
2                       Only 2 are here
3                   The rest is at home
Name: text, dtype: object
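
A note on the missing-value check: comparing against `np.nan` by identity or equality is fragile, because NaN never compares equal to anything and a NaN value is not guaranteed to be the very same object as `np.nan`. `pd.isna` is the more reliable test. A minimal sketch, with variable names chosen only for illustration:

```python
import numpy as np
import pandas as pd

value = float('nan')            # a NaN that is not the np.nan object itself

same_object = value is np.nan   # False: identity only matches the np.nan singleton
equal = value == np.nan         # False: NaN never compares equal
not_equal = value != np.nan     # True for every value, missing or not
detected = pd.isna(value)       # True: the reliable check

print(same_object, equal, not_equal, detected)
```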

Answer 2: (score: 0)

A more concise approach:

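cols = df.columns
# Fill the first '#' from column 1; rows with a missing value keep their text unchanged
df[cols[0]] = df.apply(lambda x: x[cols[0]].replace('#', str(x[cols[1]]), 1) if pd.notna(x[cols[1]]) else x[cols[0]], axis=1)
# Fill the next '#' from column 2 in the same way
print(df.apply(lambda x: x[cols[0]].replace('#', str(x[cols[2]]), 1) if pd.notna(x[cols[2]]) else x[cols[0]], axis=1))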

If you need to do this for more columns, apply the same step to each additional column in turn.

Out[12]:
0                    There are 3 people
1    Five out of eight people are there
2                       Only 2 are here
3                   The rest is at home
Name: 0, dtype: object
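
For an arbitrary number of value columns, one possible generalization is to hand the non-missing values of each row to `re.sub` through an iterator, so each # consumes the next value. This is a sketch rather than the answer's own method: the helper name `fill_placeholders`, the extra fourth column, and the third data row are made up for illustration.

```python
import re

import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['There are # people', '3', np.nan, np.nan],
     ['# out of # people are there', 'Five', 'eight', np.nan],
     ['# and # and # came', 'Ann', 'Bob', 'Cy'],  # hypothetical row with three placeholders
     ['The rest is at home', np.nan, np.nan, np.nan]])


def fill_placeholders(row, placeholder='#'):
    # Hypothetical helper: iterate over the non-missing values after the first column.
    values = iter(row.iloc[1:].dropna().astype(str))
    # Each match takes the next value; if the values run out, the placeholder is kept.
    return re.sub(re.escape(placeholder), lambda m: next(values, m.group(0)), row.iloc[0])


result = df.apply(fill_placeholders, axis=1)
print(result.tolist())
```

Leaving surplus placeholders untouched is a design choice in this sketch; raising an error instead may be preferable when a mismatch signals bad data.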