I have a DataFrame that contains lists of strings in two separate columns.
import pandas as pd
data = pd.DataFrame()
data["Website"] = [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]]
data["App"] = [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]]
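That's what the DataFrame looks like.
I need to replace certain strings in the first column (here: "no website") with the corresponding string from the second column (here: "Generic Device"). The replacement string sits at the same index in its list as the string that needs to be replaced.
What hasn't worked so far: I tried several forms of str.replace(x, y) on the lists and on the DataFrame, but nothing had any effect. A plain replace(x, y) doesn't work because I need to replace several different strings. I think I can't wrap my head around the indexing. I've been googling and searching Stack Overflow for two hours and haven't found a solution.
Thanks a lot! Sorry for any bad English or newbie mistakes, I'm still learning.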
- Max
Answer 0 (score: 2)
Try this. You can define the values to be replaced in a list and then apply the function.
def f(x, items):
    # x is one row; items lists the Website strings to replace
    for rep in items:
        if rep in x.Website:
            idx = x.Website.index(rep)
            # take the App entry at the same position
            x.Website[idx] = x.App[idx]
    return x

items = ["no website"]
data = data.apply(lambda x: f(x, items), axis=1)
Output:
Website App
0 [google.com, amazon.com] [Ok Google, Alexa]
1 [google.com] [Ok Google]
2 [aol.com, Generic Device] [AOL App, Generic Device]
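For what it's worth, the same index-matched replacement can also be sketched without mutating the row's list in place, by building a new list per row. This is only a sketch under the assumption that both lists in a row have the same length; the names swap_placeholders and placeholders are made up for illustration and are not part of the answer above.

```python
import pandas as pd

data = pd.DataFrame({
    "Website": [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]],
    "App": [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]],
})

# Hypothetical set of strings that should be swapped out
placeholders = {"no website"}

def swap_placeholders(websites, apps, placeholders):
    # Walk both lists in lockstep; keep the site unless it is a placeholder,
    # in which case take the app at the same position.
    return [app if site in placeholders else site
            for site, app in zip(websites, apps)]

# Build a fresh list for every row; the original lists are left untouched
data["Website"] = pd.Series(
    [swap_placeholders(w, a, placeholders)
     for w, a in zip(data["Website"], data["App"])],
    index=data.index,
)
print(data)
```

Since zip stops at the shorter list, rows with lists of unequal length would be silently truncated, so it may be worth checking the lengths first if that can happen in your data.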
Answer 1 (score: 2)
Define a replacement function and use apply to run it on each row.
def replacements(websites, apps):
    "Replace items in websites that appear in replace_items"
    replace_items = ["no website"]  # add more keys that trigger replacement here
    for i, k in enumerate(websites):
        # Check each item in websites for replacement
        if k in replace_items:
            # Replace with the corresponding item in apps
            websites[i] = apps[i]
    return websites
# Create Dataframe
websites = [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]]
app = [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]]
data = list(zip(websites, app))
df = pd.DataFrame(data, columns = ['Websites', 'App'])
# Perform replacement
df['Websites'] = df.apply(lambda row: replacements(row['Websites'], row['App']), axis=1)
print(df)
Output:
Websites App
0 [google.com, amazon.com] [Ok Google, Alexa]
1 [google.com] [Ok Google]
2 [aol.com, Generic Device] [AOL App, Generic Device]
Answer 2 (score: 1)
First of all, happy holidays!
I'm not sure what your expected output is, or what you have tried before, but I think this might work:
data["Website"] = data["Website"].replace("no website", "Generic Device")
I really hope this helps!
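One caveat, as far as I can tell: Series.replace compares whole cell values, so a string sitting inside a list would probably not be matched, and a hard-coded "Generic Device" would not follow the per-row index either. A possible workaround, sketched here under the assumption that both lists in a row have equal length and no row holds an empty list, is to flatten both columns with explode, swap element-wise, and regroup. The variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Website": [["google.com", "amazon.com"], ["google.com"], ["aol.com", "no website"]],
    "App": [["Ok Google", "Alexa"], ["Ok Google"], ["AOL App", "Generic Device"]],
})

# One element per row; the original row label is repeated in the index
sites = data["Website"].explode()
apps = data["App"].explode()

# True wherever the site is one of the strings to swap out
is_placeholder = sites.isin(["no website"])

# Positional, element-wise choice: app where flagged, site otherwise
fixed = np.where(is_placeholder, apps, sites)

# Collect the elements back into one list per original row label
data["Website"] = pd.Series(fixed, index=sites.index).groupby(level=0).agg(list)
print(data)
```

The swap is done with np.where on purpose, because it works by position and so sidesteps index alignment on the repeated row labels that explode produces.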
Answer 3 (score: 1)
You can create a function like this:
def f(replaced_value, col1, col2):
    def r(s):
        # loop so that every occurrence in the row's list is replaced
        while replaced_value in s[col1]:
            idx = s[col1].index(replaced_value)
            s[col1][idx] = s[col2][idx]
        return s
    return r
and use apply:
df=df.apply(f("no website","Website","App"), axis=1)
print(df)
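To illustrate why the loop is a while rather than an if, here is the same idea on plain lists, with the placeholder appearing twice. This is just a sketch; replace_all and the sample values are invented for the example.

```python
def replace_all(sites, apps, placeholder):
    # list.index only finds the first match, so keep going
    # until no placeholder is left in the list
    while placeholder in sites:
        i = sites.index(placeholder)
        sites[i] = apps[i]
    return sites

sites = ["no website", "aol.com", "no website"]
apps = ["Device A", "AOL App", "Device C"]
result = replace_all(sites, apps, "no website")
print(result)  # ['Device A', 'aol.com', 'Device C']
```

Note that the loop assumes the replacement taken from the second list is never itself the placeholder; if it were, the condition would stay true and the loop would not terminate.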