Question

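I have a DataFrame full of French words, their endings, and new endings. I want to create a fourth column containing each word with its ending replaced: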

word   |ending|new ending|what i want|
--------------------------------------
placer |cer   |ceras     |placeras   |
placer |cer   |cerait    |placerait  |
placer |cer   |ceront    |placeront  |
finir  |ir    |iras      |finiras    |

So it basically takes the word in column 1 and replaces the part that matches column 2 with the contents of column 3.

Any ideas?

Answer 1

Here is one way, using the .loc accessor:

import pandas as pd

df = pd.DataFrame({'word': ['placer', 'placer', 'placer'],
                   'ending': ['cer', 'cer', 'cer'],
                   'new_ending': ['ceras', 'cerait', 'ceront']})

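# Start from the original word; rows that do not match keep it unchanged.
df['result'] = df['word']
# Length of each ending, used to slice the word.
df['lens'] = df['ending'].map(len)

# Where the last `lens` characters of the word equal the ending,
# strip them and append the new ending.
df.loc[pd.Series([i[-j:] for i, j in zip(df['word'], df['lens'])]) == df['ending'], 'result'] = \
pd.Series([i[:-j] for i, j in zip(df['word'], df['lens'])]) + df['new_ending']

# Drop the helper column.
df = df[['word', 'ending', 'new_ending', 'result']]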

#      word ending new_ending     result
# 0  placer    cer      ceras   placeras
# 1  placer    cer     cerait  placerait
# 2  placer    cer     ceront  placeront

Answer 2

Using apply():

df['new_word'] = df.apply(
    lambda row: row['word'].replace(row['ending'], row['new ending']),
    axis=1
)
#     word ending new ending   new_word
#0  placer    cer      ceras   placeras
#1  placer    cer     cerait  placerait
#2  placer    cer     ceront  placeront
#3   finir     ir       iras    finiras

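As @jpp pointed out, one caveat of this approach is that it will not work properly if the ending also appears in the middle of the string, because str.replace replaces every occurrence, not just the last one.

In that case, see this post on how to replace only at the end of a string.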
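To make that caveat concrete, here is a minimal sketch that only touches the end of the word. The helper name replace_ending and the extra row 'percer' are my own illustration, not part of the original question.

```python
import pandas as pd


def replace_ending(word, ending, new_ending):
    """Swap `ending` for `new_ending` only if `word` actually ends with it."""
    if ending and word.endswith(ending):
        return word[:-len(ending)] + new_ending
    # No match at the end: leave the word untouched.
    return word


# Illustrative data; 'percer' contains 'er' both in the middle and at the end.
df = pd.DataFrame({
    'word': ['placer', 'finir', 'percer'],
    'ending': ['cer', 'ir', 'er'],
    'new ending': ['ceras', 'iras', 'eras'],
})

# Row-wise replacement restricted to the end of each word.
df['new_word'] = [
    replace_ending(w, e, n)
    for w, e, n in zip(df['word'], df['ending'], df['new ending'])
]
```

With a plain str.replace, 'percer' would come out as 'perasceras', since 'er' also occurs in the middle of the word; the sketch above should give 'perceras' instead.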

Answer 3

Here is another solution:

df.word.replace(df.ending, '', regex=True).str.cat(df["new ending"].astype(str))

And the output:

0     placeras
1    placerait
2    placeront
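Since regex=True treats each ending as a pattern, it may be worth anchoring the pattern to the end of the word. The following is only a sketch of that idea, done row by row with Python's re module rather than with Series.replace; the DataFrame is rebuilt from the question's table for illustration.

```python
import re

import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    'word': ['placer', 'placer', 'placer', 'finir'],
    'ending': ['cer', 'cer', 'cer', 'ir'],
    'new ending': ['ceras', 'cerait', 'ceront', 'iras'],
})

# re.escape guards against regex metacharacters in the ending,
# and \Z anchors the match to the very end of the word.
# The lambda returns the new ending literally (no backreference parsing).
df['new_word'] = [
    re.sub(re.escape(e) + r'\Z', lambda _m, n=n: n, w)
    for w, e, n in zip(df['word'], df['ending'], df['new ending'])
]
```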

Replace the ending of words with a new ending in a Python DataFrame
