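I want to iterate over each row of a DataFrame and, whenever the value in a column matches a string in a list, add an element to a new column. In this example I want to add a new column that categorizes the products: if a row's product matches an entry in one of the lists, the category should be "drinks" or "Food"; if nothing matches, the category should be "Other".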
list_drinks = {'Water', 'Juice', 'Tea'}
list_food = {'Apple', 'Orange'}
data = {'Price': ['1', '5', '3'], 'Product': ['Juice', 'book', 'Pen']}
The expected output is:
Price Product Category
1 Juice drinks
5 book Other
3 Pen Other
My main problem is that I don't know how to match patterns between the lists and the rows. I also tried:
for (i, j) in itertools.zip_longest(list_drinks, list_food):
    for index in data.index:
        if (j in data.loc[index, 'product']):
            data["Category"] = "Food"
        elif (i in data.loc[index, 'product']):
            data["Category"] = "drinks"
        else:
            data["Category"] = "Other"
but it didn't work.
Answer 0 (score: 1)
No loop is needed. You can combine .isin() with np.select() to return a result based on conditions. See the code below:
import pandas as pd
import numpy as np
list_drinks=['Water','Juice','Tea']
list_food=['Apple','Orange']
data = {'Price': ['1', '5','3'],
'Product': ['Juice','book','Pen']}
df = pd.DataFrame(data)
df['Category'] = np.select([(df['Product'].isin(list_drinks)),
(df['Product'].isin(list_food))],
['drinks',
'food'], 'Other')
df
Out[1]:
Price Product Category
0 1 Juice drinks
1 5 book Other
2 3 Pen Other
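One detail of np.select() that matters here: when several conditions are true for the same row, the first condition in the list decides the result, and rows that match nothing get the default. The sketch below illustrates this under an assumption that is not in the question: 'Tea' is deliberately placed in both lists to force an overlap.

```python
import numpy as np
import pandas as pd

# Assumption for illustration only: 'Tea' appears in BOTH lists to create an overlap
list_drinks = ['Water', 'Juice', 'Tea']
list_food = ['Apple', 'Orange', 'Tea']

df = pd.DataFrame({'Product': ['Tea', 'Apple', 'Pen']})

conditions = [df['Product'].isin(list_drinks),
              df['Product'].isin(list_food)]
choices = ['drinks', 'food']

# Conditions are evaluated in order; the first True one picks the choice.
# Rows where no condition is True receive the default.
df['Category'] = np.select(conditions, choices, default='Other')
print(df)
```

So if a product could belong to more than one list, put the condition that should take priority first.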
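Below I break the code down in more detail so you can see how it works. Based on your comment, I also made a small change: I use a list comprehension with in to check whether any value from the list is a substring of the value in the DataFrame. To improve the match rate, I also convert both sides to lowercase with .lower() before comparing: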
import pandas as pd
import numpy as np
list_drinks=['Water','Juice','Tea']
list_food=['Apple','Orange']
data = {'Price': ['1', '5','3'],
'Product': ['green Juice','book','oRange you gonna say banana']}
df = pd.DataFrame(data)
c1 = (df['Product'].apply(lambda x: len([y for y in list_drinks if y.lower() in x.lower()]) > 0))
c2 = (df['Product'].apply(lambda x: len([y for y in list_food if y.lower() in x.lower()]) > 0))
r1 = 'drinks'
r2 = 'food'
conditions = [c1,c2]
results= [r1,r2]
df['Category'] = np.select(conditions, results, 'Other')
df
Out[1]:
Price Product Category
0 1 green Juice drinks
1 5 book Other
2 3 oRange you gonna say banana food
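The same case-insensitive substring check can probably also be written with pandas' own Series.str.contains(), which avoids the apply/lambda. This is only a sketch of an alternative, not the code from the answer above; the helper any_of is a hypothetical name introduced here for illustration.

```python
import re

import numpy as np
import pandas as pd

list_drinks = ['Water', 'Juice', 'Tea']
list_food = ['Apple', 'Orange']

df = pd.DataFrame({'Price': ['1', '5', '3'],
                   'Product': ['green Juice', 'book', 'oRange you gonna say banana']})

def any_of(words):
    # Hypothetical helper: builds a pattern such as "Water|Juice|Tea",
    # escaping any regex metacharacters in the words
    return '|'.join(re.escape(w) for w in words)

# case=False makes the substring search case-insensitive
c1 = df['Product'].str.contains(any_of(list_drinks), case=False, regex=True)
c2 = df['Product'].str.contains(any_of(list_food), case=False, regex=True)

df['Category'] = np.select([c1, c2], ['drinks', 'food'], default='Other')
print(df)
```

Note that plain substring matching, in either version, also matches inside longer words (for example 'Tea' inside 'Steak'), so a stricter pattern may be needed for real data.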
Answer 1 (score: 1)
Here is another option:
import itertools
import pandas as pd
list_drinks={'Water','Juice','Tea'}
list_food={'Apple','Orange'}
data = pd.DataFrame({'Price': ['1', '5','3'], 'Product': ['Juice','book', 'Pen']})
category = list()
for prod in data['Product']:
    # Exact match against each set; anything not found is "Other"
    if prod in list_food:
        category.append("Food")
    elif prod in list_drinks:
        category.append("drinks")
    else:
        category.append("Other")
data['Category']= category
print(data)
Output:
Price Product Category
1 Juice drinks
5 book Other
3 Pen Other
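For exact matches like these, the explicit loop could also be replaced by a dictionary lookup with Series.map(). This is a sketch of a variant rather than part of the answer above; the name lookup is hypothetical.

```python
import pandas as pd

list_drinks = {'Water', 'Juice', 'Tea'}
list_food = {'Apple', 'Orange'}

# Hypothetical lookup table: product name -> category
lookup = {name: 'drinks' for name in list_drinks}
lookup.update({name: 'Food' for name in list_food})

data = pd.DataFrame({'Price': ['1', '5', '3'],
                     'Product': ['Juice', 'book', 'Pen']})

# map() yields NaN for products missing from the lookup;
# fillna() then labels those rows 'Other'
data['Category'] = data['Product'].map(lookup).fillna('Other')
print(data)
```

This keeps the category assignment in one place, which may be easier to extend if more lists are added later.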