I have a DataFrame:
import pandas as pd
string_pos = {'string': ['aabb', 'ddcc'],
              'position_1': [0, 1],
              'position_2': [3, 4]}
df = pd.DataFrame(string_pos)
Output:
  string  position_1  position_2
0   aabb           0           3
1   ddcc           1           4
Then I tried to add a new column holding a substring of the 'string' column, using the two position columns as the slice bounds:
df['short_string'] = df.string.str[df['position_1'], df['position_2']]
But it returns:
  string  position_1  position_2  short_string
0   aabb           0           3           NaN
1   ddcc           1           4           NaN
I am trying to get:
  string  position_1  position_2  short_string
0   aabb           0           3           aab
1   ddcc           1           4           dcc
Answer 0 (score: 1)
I think you need to process the rows one at a time with DataFrame.apply and a lambda function:
df['short_string'] = df.apply(lambda x: x['string'][x['position_1']:x['position_2']], axis=1)
Or use a list comprehension with zip:
zipped = zip(df['string'], df['position_1'], df['position_2'])
df['short_string'] = [a[b:c] for a, b, c in zipped]
print(df)
  string  position_1  position_2  short_string
0   aabb           0           3           aab
1   ddcc           1           4           dcc
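For reference, here is a minimal self-contained sketch that runs both approaches on the sample data from the question and checks that they agree. The names by_apply and by_zip are only illustrative and do not appear in the original posts. As far as I understand, indexing through the .str accessor applies the same scalar positions to every row, which is why the per-row bounds have to be handled by explicit row-wise slicing instead.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'string': ['aabb', 'ddcc'],
                   'position_1': [0, 1],
                   'position_2': [3, 4]})

# Approach 1: slice each row's string with that row's own bounds via apply
by_apply = df.apply(
    lambda row: row['string'][row['position_1']:row['position_2']],
    axis=1,
)

# Approach 2: iterate over the three columns in parallel with zip
by_zip = [s[start:stop]
          for s, start, stop in zip(df['string'], df['position_1'], df['position_2'])]

# Both approaches should give the same substrings
assert by_apply.tolist() == by_zip

df['short_string'] = by_zip
print(df)
```

Both versions loop over the rows in Python. The zip version avoids building a Series for each row, so it is probably the faster of the two on large frames, though that would need measuring on real data.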