Question

I have a dataset in which the Dialogue column holds a list of IDs:

DataFrame name: movie_conversations

A   B   Movie   Dialogue
u0  u2  m0      ['L985','L984','L925']

And a second dataset that maps those IDs to the dialogue text, like this:

DataFrame name: conversation_mapping

Dialogue_No A   Movie   Name    Dialogue_Str
L985        u0  m0     BIANCA   I hope so.
L984        u2  m0     CAMERON  She okay?
L925        u0  m0     BIANCA   Let's go.

I want to join all three strings into a single string and add it to the first DataFrame as a new column.

So the result should look like this:

A   B   Movie   Dialogue               Dialogue_
u0  u2  m0      ['L985','L984','L925'] I hope so.<t>She okay?<t>Let's go.

So I thought I would write a lambda function:

movie_conversation.Dialogue_Str = movie_conversation.Dialogue.apply(lambda x : word = list() for index in x word.append(conversations_mapping.loc[conversations_mapping.Dialogue_No == index_,'Dialogue_Str'].iloc[0]))

The code above does not work.

Basically, I want to implement something like this:

index = ['L985','L984','L925']
a = ""
count = 0
for index_ in range(len(index)):
    # Look up the dialogue string for the current line ID.
    line = str(conversation_mapping.loc[conversation_mapping.Dialogue_No == index[index_],'Dialogue_Str'].iloc[0])
    if len(index) == count + 1:
        a += line            # last item: no separator
    else:
        a += line + '<t>'    # otherwise append the '<t>' separator
    count += 1

Should I use a lambda function, or is there some other way to achieve this?

Answer 1

No lambda is needed, nor is apply, for that matter. First, build a mapping from dialogue number to dialogue string.

dialogue_mapper = dict(
    conversation_mapping[['Dialogue_No', 'Dialogue_Str']].values
)
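To make the mapping step concrete, here is a minimal sketch that rebuilds conversation_mapping from the sample rows in the question and constructs the same dictionary. It uses set_index(...).to_dict() as an alternative spelling of the dict(...values) call above; that substitution is mine, not part of the original answer, and both should give the same result on this data.

```python
import pandas as pd

# Sample rows copied from the question.
conversation_mapping = pd.DataFrame({
    'Dialogue_No': ['L985', 'L984', 'L925'],
    'A': ['u0', 'u2', 'u0'],
    'Movie': ['m0', 'm0', 'm0'],
    'Name': ['BIANCA', 'CAMERON', 'BIANCA'],
    'Dialogue_Str': ['I hope so.', 'She okay?', "Let's go."],
})

# Index by the ID column, then turn the text column into a plain dict.
dialogue_mapper = (
    conversation_mapping.set_index('Dialogue_No')['Dialogue_Str'].to_dict()
)

print(dialogue_mapper)
# {'L985': 'I hope so.', 'L984': 'She okay?', 'L925': "Let's go."}
```

If Dialogue_No contains duplicates, the last occurrence wins in the resulting dict.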

Now use a list comprehension with dict.get to replace each list of dialogue IDs with the joined dialogue strings.

movie_conversations['Dialogue'] = [
    '<t>'.join([dialogue_mapper.get(k) for k in v]) 
    for v in movie_conversations.Dialogue
]

movie_conversations

    A   B Movie                            Dialogue
0  u0  u2    m0  I hope so.<t>She okay?<t>Let's go.
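Note that the snippet above overwrites the Dialogue column, whereas the question asked for a new column. The following self-contained sketch keeps the ID lists and writes the joined text to a separate column instead. The column name Dialogue_Str is borrowed from the asker's attempt, and the empty-string fallback for unknown IDs is an assumption of mine: without a default, dict.get returns None for a missing ID and str.join raises a TypeError.

```python
import pandas as pd

# Sample data copied from the question.
movie_conversations = pd.DataFrame({
    'A': ['u0'],
    'B': ['u2'],
    'Movie': ['m0'],
    'Dialogue': [['L985', 'L984', 'L925']],
})
conversation_mapping = pd.DataFrame({
    'Dialogue_No': ['L985', 'L984', 'L925'],
    'Dialogue_Str': ['I hope so.', 'She okay?', "Let's go."],
})

# ID -> text lookup table.
dialogue_mapper = dict(
    zip(conversation_mapping['Dialogue_No'], conversation_mapping['Dialogue_Str'])
)

# Assumption: IDs absent from the mapping contribute an empty string.
movie_conversations['Dialogue_Str'] = [
    '<t>'.join(dialogue_mapper.get(k, '') for k in ids)
    for ids in movie_conversations['Dialogue']
]

print(movie_conversations['Dialogue_Str'].iloc[0])
# I hope so.<t>She okay?<t>Let's go.
```

This assumes each Dialogue cell is a real Python list. If the column was read from a file and holds the string "['L985','L984','L925']" instead, it would need to be parsed into a list first.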
