Question

Hi, I have a DataFrame like the one below. It is a sales table whose row index holds product brands and whose columns are Price, Week, and Timestamp.

import pandas as pd

# Three consecutive days starting 2019-04-15
timeperiod = pd.date_range(start='4/15/2019', periods=3, freq='D')
df = pd.DataFrame({'Price':[[2000,2000,2000],[1000,1000,1000]],'Week':[[0,0,1],[0,0,1]],
                   'Timestamp': [timeperiod,timeperiod]},index = ['Apple','Huawei'])

The code above outputs:

         Price              Timestamp                                         Week
Apple   [2000, 2000, 2000]  DatetimeIndex(['2019-04-15', '2019-04-16', '20...   [0, 0, 1]
Huawei  [1000, 1000, 1000]  DatetimeIndex(['2019-04-15', '2019-04-16', '20...   [0, 0, 1]

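Now I want to flatten the DataFrame into three columns [Price, Timestamp, Week] with a numeric index [0, 1, 2] (since each list has 3 elements), and store the result in two DataFrames named after the original index labels, Apple and Huawei.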

So the result should be:

Apple = pd.DataFrame({'Price':[2000,2000,2000],'Week':[0,0,1],
                   'Timestamp': timeperiod})
Huawei = pd.DataFrame({'Price':[1000,1000,1000],'Week':[0,0,1],
                   'Timestamp': timeperiod})

Apple:
   Price  Timestamp  Week
0   2000 2019-04-15     0
1   2000 2019-04-16     0
2   2000 2019-04-17     1

Huawei:
   Price  Timestamp  Week
0   1000 2019-04-15     0
1   1000 2019-04-16     0
2   1000 2019-04-17     1

Answer 1

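Using this function from another answer (explode_list, shown at the end of this answer), we can unnest your columns one at a time and then concatenate them back together: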

df = pd.concat([explode_list(df, col)[col] for col in df.columns], axis=1)

Output:

        Price  Week  Timestamp
Apple    2000     0 2019-04-15
Apple    2000     0 2019-04-16
Apple    2000     1 2019-04-17
Huawei   1000     0 2019-04-15
Huawei   1000     0 2019-04-16
Huawei   1000     1 2019-04-17

Finally, if you want a separate DataFrame for each unique index value, we can use groupby:

dfs = [d for _, d in df.groupby(df.index)]

print(dfs[0])
print()
print(dfs[1])

Output:

       Price  Week  Timestamp
Apple   2000     0 2019-04-15
Apple   2000     0 2019-04-16
Apple   2000     1 2019-04-17

        Price  Week  Timestamp
Huawei   1000     0 2019-04-15
Huawei   1000     0 2019-04-16
Huawei   1000     1 2019-04-17
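The question also asked for a plain 0, 1, 2 index and for each frame to be reachable by its brand name. A minimal sketch of one way to get there, assuming a dict keyed by brand is acceptable in place of separate variables; the names flat and frames are illustrative only, and flat is rebuilt by hand here so the snippet runs on its own:

```python
import pandas as pd

timeperiod = pd.date_range(start='4/15/2019', periods=3, freq='D')

# Stand-in for the flattened frame shown above (brand labels repeated in the index)
flat = pd.DataFrame(
    {'Price': [2000] * 3 + [1000] * 3,
     'Week': [0, 0, 1] * 2,
     'Timestamp': list(timeperiod) * 2},
    index=['Apple'] * 3 + ['Huawei'] * 3,
)

# One frame per brand, with the repeated brand label replaced by 0..n-1
frames = {brand: group.reset_index(drop=True)
          for brand, group in flat.groupby(level=0)}

print(frames['Apple'])
print(frames['Huawei'])
```

Looking frames up by key (frames['Apple']) avoids creating variables dynamically and still works if more brands are added later.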

The function used in the linked answer:

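import numpy as np

def explode_list(df, col):
    s = df[col]
    # Repeat each row position once per element of its nested list
    i = np.arange(len(s)).repeat(s.str.len())
    # Replace the nested column with the flattened values
    return df.iloc[i].assign(**{col: np.concatenate(s)})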
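On newer pandas versions the built-in DataFrame.explode may serve in place of the helper above. The following is a sketch of that swapped-in technique, not the method from the linked answer. It assumes pandas 1.3 or later (where explode accepts a list of columns) and, to keep the example simple, stores the timestamps as plain lists rather than DatetimeIndex objects:

```python
import pandas as pd

timeperiod = pd.date_range(start='4/15/2019', periods=3, freq='D')

# Same shape as the question's data, with timestamps held in plain lists (assumption)
df = pd.DataFrame(
    {'Price': [[2000, 2000, 2000], [1000, 1000, 1000]],
     'Week': [[0, 0, 1], [0, 0, 1]],
     'Timestamp': [list(timeperiod), list(timeperiod)]},
    index=['Apple', 'Huawei'],
)

# Explode all three columns together; the lists in a row must have equal lengths
flat = df.explode(['Price', 'Week', 'Timestamp'])

# explode leaves object dtype behind, so restore concrete dtypes explicitly
flat = flat.astype({'Price': 'int64', 'Week': 'int64',
                    'Timestamp': 'datetime64[ns]'})

print(flat)
```

The explicit astype step is there because the exploded columns come back as generic objects, which would otherwise make later numeric or date operations awkward.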

Answer 2

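def explode(series):
    # Series.iteritems() was removed in pandas 2.0; items() is the equivalent
    return pd.DataFrame(dict(series.items()))

# Each row holds one brand's lists; turn it into its own DataFrame
for index, row in df.iterrows():
    print(index)
    print(explode(row))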
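The loop above only prints each frame. If the frames need to be kept, as the question asks, the same idea can be sketched with a dict comprehension; the name by_brand is hypothetical, and the data is rebuilt here so the snippet is self-contained:

```python
import pandas as pd

timeperiod = pd.date_range(start='4/15/2019', periods=3, freq='D')
df = pd.DataFrame(
    {'Price': [[2000, 2000, 2000], [1000, 1000, 1000]],
     'Week': [[0, 0, 1], [0, 0, 1]],
     'Timestamp': [timeperiod, timeperiod]},
    index=['Apple', 'Huawei'],
)

def explode(series):
    # Each entry of the row is a list-like; the row labels become column names
    return pd.DataFrame(dict(series.items()))

# Keep one DataFrame per brand instead of printing it
by_brand = {brand: explode(row) for brand, row in df.iterrows()}

print(by_brand['Apple'])
```

Because each frame is built fresh from the row's lists, it gets a default 0, 1, 2 index without any extra reset step.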

How to flatten lists in a Pandas DataFrame
