Question

I came across a similar question (Get last "column" after .str.split() operation on column in pandas DataFrame) and used some of its code. However, the output is not what I want.

import pandas as pd

raw_data = {
    'category': ['sweet beverage, cola,sugared', 'healthy,salty snacks', 'juice,beverage,sweet', 'fruit juice,beverage', 'appetizer,salty crackers'],
    'product_name': ['coca-cola', 'salted pistachios', 'fruit juice', 'lemon tea', 'roasted peanuts']}
df = pd.DataFrame(raw_data)

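The goal is to extract the categories from each row and create a new column from only the last two of them. I have this code, which works and gives me the categories of interest as a new column.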

df['my_col'] = df.category.apply(lambda s: s.split(',')[-2:])

output
my_col 
[cola,sugared]
[healthy,salty snacks]
[beverage,sweet]
...

However, the result is displayed as a list. How can I keep it from being displayed as a list? Is that possible? Thanks, everyone!

Answer 1

I think you need str.split, then select the last two items of each list, and finally str.join:

df['my_col'] = df.category.str.split(',').str[-2:].str.join(',')
print(df)
                       category       product_name                    my_col
0  sweet beverage, cola,sugared          coca-cola              cola,sugared
1          healthy,salty snacks  salted pistachios      healthy,salty snacks
2          juice,beverage,sweet        fruit juice            beverage,sweet
3          fruit juice,beverage          lemon tea      fruit juice,beverage
4      appetizer,salty crackers    roasted peanuts  appetizer,salty crackers

Edit:

In my opinion, the pandas str text functions are preferable to apply with pure Python string functions, because they also work with None and NaN.

With apply, a missing value such as NaN (which is a float) raises an error instead:

AttributeError: 'float' object has no attribute 'split'
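The difference described above can be reproduced with a small sketch. The data here is an assumption for illustration only (two of the original rows plus a NaN and a None entry), and the variable name apply_error is made up for the example:

```python
import numpy as np
import pandas as pd

# Illustrative data (assumption): two original rows plus two missing values.
df = pd.DataFrame({
    'category': ['juice,beverage,sweet', 'healthy,salty snacks', np.nan, None],
})

# The .str accessor passes missing values through instead of raising.
df['my_col'] = df['category'].str.split(',').str[-2:].str.join(',')

# The plain-Python lambda fails on the NaN row, because NaN is a float
# and floats have no split method.
try:
    df['category'].apply(lambda s: ','.join(s.split(',')[-2:]))
    apply_error = None
except AttributeError as exc:
    apply_error = str(exc)

print(df)
print(apply_error)
```

The rows with missing categories simply stay missing in my_col, so no separate filtering step is needed before the split.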

Answer 2

You can also call join on the result of split inside the lambda:

df['my_col'] = df.category.apply(lambda s: ','.join(s.split(',')[-2:]))
df

Result:

                       category       product_name                    my_col
0  sweet beverage, cola,sugared          coca-cola              cola,sugared
1          healthy,salty snacks  salted pistachios      healthy,salty snacks
2          juice,beverage,sweet        fruit juice            beverage,sweet
3          fruit juice,beverage          lemon tea      fruit juice,beverage
4      appetizer,salty crackers    roasted peanuts  appetizer,salty crackers
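One detail the right-aligned display hides: the first row has a space after a comma ('sweet beverage, cola,sugared'), so splitting on ',' alone yields ' cola' with a leading space. The sketch below assumes trimmed labels are wanted, which the question does not state. The helper name last_two is hypothetical, and it also skips non-string values so that NaN does not raise:

```python
import pandas as pd

def last_two(value):
    """Hypothetical helper: join the last two comma-separated labels, trimmed."""
    if not isinstance(value, str):
        return value  # leave NaN/None untouched instead of raising
    parts = [part.strip() for part in value.split(',')]
    return ','.join(parts[-2:])

df = pd.DataFrame({
    'category': ['sweet beverage, cola,sugared', 'healthy,salty snacks',
                 'juice,beverage,sweet', 'fruit juice,beverage',
                 'appetizer,salty crackers'],
})
df['my_col'] = df['category'].apply(last_two)
print(df['my_col'].tolist())
```

Keeping the logic in a named function rather than a lambda makes the missing-value check and the trimming easy to adjust.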

Create a new column with the last 2 values after a str.split operation
