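How do I replace strings in lists of strings in a DataFrame (Python)?

Time: 2019-12-26 01:37:50

Tags: python pandas dataframe

I have a DataFrame in which two separate columns each hold a list of lists (one list of strings per row).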

import pandas as pd
data = pd.DataFrame()
data["Website"] = [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]]
data["App"] = [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]]

That is what the DataFrame looks like.

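I need to replace certain strings in the first column (here: "no website") with the corresponding string from the second column (here: "Generic Device"). The replacement string has the same index in its list as the string that needs to be replaced.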
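To make the rule concrete, here is a minimal sketch on plain Python lists, without pandas. The function name `replace_by_position` and the `placeholders` parameter are hypothetical names chosen for illustration, and the sketch assumes both lists in a row have the same length.

```python
def replace_by_position(websites, apps, placeholders=("no website",)):
    """Return a new list in which every placeholder in `websites`
    is swapped for the entry at the same position in `apps`.

    Assumption: both lists have the same length, as in the question's data.
    """
    return [app if site in placeholders else site
            for site, app in zip(websites, apps)]


# Third row of the question's data
row_websites = ["aol.com", "no website"]
row_apps = ["AOL App", "Generic Device"]
print(replace_by_position(row_websites, row_apps))
# ['aol.com', 'Generic Device']
```

Because the function builds a new list, the input lists are left unchanged.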

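What has not worked so far: I tried several forms of str.replace(x, y) on both the lists and the DataFrame, but nothing worked. A simple replace(x, y) does not work because I need to replace several different strings. I think I can't get my head around the indexing. I have been googling and searching Stack Overflow for two hours and have not found a solution.

Many thanks! Sorry for bad English or newbie mistakes, I am still learning.

-Max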

4 answers:

Answer 0: (score: 2)

Try this: you can define the values to be replaced in a list and then apply the function to each row.

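def f(x, items):
    # x is one row; x.Website and x.App are the lists stored in that row
    for rep in items:
        if rep in x.Website:
            idx = x.Website.index(rep)   # position of the first match
            x.Website[idx] = x.App[idx]  # replace in place with the App entry
    return x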

items = ["no website"]
data = data.apply(lambda x: f(x,items),axis=1)

Output:

                     Website                        App
0   [google.com, amazon.com]         [Ok Google, Alexa]
1               [google.com]                [Ok Google]
2  [aol.com, Generic Device]  [AOL App, Generic Device]

Answer 1: (score: 2)

Define a replacement function and use apply to run it on every row.

def replacements(websites, apps):
    " Substitute items in list replace_items that's found in websites "
    replace_items = ["no website", ] # can add to this list of keys 
                                     # that trigger replacement

    for i, k in enumerate(websites):
        # Check each item in website for replacement
        if k in replace_items:
            # This is an item to be replaced
            websites[i] = apps[i]  # replace with corresponding item in apps

    return websites

# Create Dataframe
websites = [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]]
app = [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]]
data = list(zip(websites, app))
df = pd.DataFrame(data, columns = ['Websites', 'App'])

# Perform replacement
df['Websites'] = df.apply(lambda row: replacements(row['Websites'], row['App']), axis=1)
print(df)

Output

                    Websites                        App
0   [google.com, amazon.com]         [Ok Google, Alexa]
1               [google.com]                [Ok Google]
2  [aol.com, Generic Device]  [AOL App, Generic Device]

Answer 2: (score: 1)

First of all, happy holidays!

I'm not sure what your expected output is, or what you have already tried, but I think this might work:

data["Website"] = data["Website"].replace("no website", "Generic Device")

I really hope this helps!
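One caveat: Series.replace compares whole cell values, so when each cell holds a list it will most likely leave the strings nested inside those lists untouched. A rough sketch of an element-wise alternative follows. It assumes a single fixed replacement string, as in this answer, rather than the per-row lookup in the App column that the question asks for.

```python
import pandas as pd

data = pd.DataFrame({
    "Website": [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]],
})

# Assumption: every "no website" entry becomes the same fixed string.
# Series.apply calls the lambda once per cell, i.e. once per list.
data["Website"] = data["Website"].apply(
    lambda sites: ["Generic Device" if s == "no website" else s for s in sites]
)

print(data["Website"].tolist())
# [['google.com', 'amazon.com'], ['google.com'], ['aol.com', 'Generic Device']]
```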

Answer 3: (score: 1)

You can create a function like this:

def f(replaced_value, col1, col2):
    def r(s):
        while replaced_value in s[col1]:
            s[col1][s[col1].index(replaced_value)] = s[col2][s[col1].index(replaced_value)]
        return s
    return r

and use apply:

# df is assumed to be the DataFrame from the question (named `data` there)
df = df.apply(f("no website", "Website", "App"), axis=1)
print(df)
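The while loop replaces every occurrence of the value, but it would never terminate if the entry at the matching position in the second column were itself the value being replaced. A single-pass variant avoids that edge case. This is only a sketch, and `make_replacer` is a hypothetical name. It is shown on a plain dict standing in for a row so that it runs without pandas.

```python
def make_replacer(replaced_value, col1, col2):
    """Hypothetical single-pass variant of f: each position is visited once."""
    def r(s):
        targets, sources = s[col1], s[col2]
        for i, item in enumerate(targets):
            if item == replaced_value:
                targets[i] = sources[i]  # in-place replacement, as in the answer
        return s
    return r


# A dict stands in for one row; the first App entry equals the placeholder.
row = {
    "Website": ["no website", "aol.com", "no website"],
    "App": ["no website", "AOL App", "Generic Device"],
}
result = make_replacer("no website", "Website", "App")(row)
print(result["Website"])
# ['no website', 'aol.com', 'Generic Device']
```

The returned function only indexes the row by column name, so it can be passed to apply with axis=1 in the same way as f above.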