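I want to split a bunch of strings on a phrase and take the second element. However, when a string cannot be split, I want to keep the first element. Here is an example of my current approach, which always extracts the second element: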
import pandas as pd
df = pd.DataFrame({"a": ["this is a (test), it is", "yet another"]})
df["a"].str.split(r"\(test\)", n=1).str[1]
As you can see, this (incorrectly) gives me
0    , it is
1        NaN
Name: a, dtype: object
The output I want is
0        , it is
1    yet another
Name: a, dtype: object
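Answer 0 (score: 3)
Add Series.fillna with the original column a, so that rows where the split produces NaN fall back to the original value: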
df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].fillna(df["a"])
# alternative
# df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].combine_first(df["a"])
print(df)
                         a            b
0  this is a (test), it is      , it is
1              yet another  yet another
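As a self-contained sketch of the same idea, the snippet below runs the fillna approach next to a possible alternative that is not part of the answer above: because the split is limited to one cut (n=1), each row holds either one or two pieces, so taking the last piece with .str[-1] should give the same result without a fallback step. The variable names parts, with_fillna and with_last are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": ["this is a (test), it is", "yet another"]})

# Split at most once on the literal text "(test)". The parentheses are
# escaped because a multi-character pattern is treated as a regex.
parts = df["a"].str.split(r"\(test\)", n=1)

# Approach from the answer: take element 1, fall back to the original value.
with_fillna = parts.str[1].fillna(df["a"])

# Alternative (an assumption, not from the answer): take the last piece.
# With n=1 a row has one or two pieces, so the last piece is the second
# piece when the split succeeded and the whole string otherwise.
with_last = parts.str[-1]

print(with_fillna.tolist())
print(with_last.tolist())
```

The .str[-1] variant only matches the fillna version as long as n=1 is kept; with an unlimited split it would return the text after the last occurrence of the phrase instead.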