Question

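I have a dataframe in which I create a new column and fill in its values. Depending on a condition, if a row is matched again, more values need to be appended to that row's new column.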

For example, given this dataframe:

df

id   Stores                  is_open
1   'Walmart', 'Target'      true
2   'Best Buy'               false
3   'Target'                 true
4   'Home Depot'             true

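Now I want to add a new column, Ticker, holding the stock tickers for the comma-separated stores in each row, either as a comma-separated string or as a list (whichever is preferable and easier).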

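For example, Walmart's ticker is wmt and Target's is tgt. I get the wmt and tgt data from another dataframe by matching on a key, so I tried the following. However, not all of the values are assigned, even though they exist: the Tickers column ends up with only a single value followed by a comma instead of several: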

df['Tickers'] = '' 
for _, row in df.iterrows():
        stores = row['Stores']
        list_stores = stores.split(',')
        if len(list_stores) > 1:
            for store in list_stores:
                tmp_df = second_df[second_df['store_id'] == store]

                ticker = tmp_df['Ticker'].values[0] if len(tmp_df['Ticker'].values) > 0 else None

                if ticker:
                    df.loc[
                      df['Stores'].astype(str).str.contains(store), 'Ticker'] += '{},'.format(ticker)

Expected output:

id   Stores                  is_open      Ticker
1   'Walmart', 'Target'      true         wmt, tgt
2   'Best Buy'               false        bby
3   'Target'                 true         tgt
4   'Home Depot'             true         nan

I would be very grateful if someone could help me.

Answer 1

You can use the apply method with axis=1 to pass each row to a function and do the computation there. See the code below:

import pandas as pd
mydict = {'id':[1,2],'Store':["'Walmart','Target'","'Best Buy'"], 'is_open':['true', 'false']}
df = pd.DataFrame(mydict, index=[0,1])
df.set_index('id',drop=True, inplace=True)

The df so far:

                 Store is_open
id                            
1   'Walmart','Target'    true
2           'Best Buy'   false

The lookup dataframe:

df2 = pd.DataFrame({'Store':['Walmart', 'Target','Best Buy'], 'Ticker':['wmt','tgt','bby']})

      Store Ticker
0   Walmart    wmt
1    Target    tgt
2  Best Buy    bby

Here is the code that adds the column:

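def add_column(row):
    # Split the comma-separated store names for this row
    items = row['Store'].split(',')

    tkr_list = []

    for string in items:
        # Strip surrounding whitespace and the single quotes around each name
        mystr = string.strip().replace("'", "")

        # Look up the ticker; skip stores missing from df2 to avoid an IndexError
        matches = df2.loc[df2['Store'] == mystr, 'Ticker'].values
        if len(matches) > 0:
            tkr_list.append(matches[0])

    return tkr_list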


df['Ticker']=df.apply(add_column, axis=1)

Here is the resulting df:

                 Store is_open      Ticker
id                                        
1   'Walmart','Target'    true  [wmt, tgt]
2           'Best Buy'   false       [bby]
