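Take multiple columns and put them under the same index using pandas

Time: 2018-04-19 14:21:52

Tags: python database pandas dataframe database-normalization

I have a bunch of ugly customer data that I am trying to normalize. It basically looks like this: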

Customer   Order1   Order2   Order3 ... OrderN

  John      This     That     The   ...  Other
 Shelly    Thing1   Thing2   Thing3 ... ThingN
   .         .        .        .          .
   .         .        .        .          .

So I want to change it to:

Customer   Order

 John      This
 John      That
 John      The
 Shelly    Thing1
 Shelly    Thing2

And so on.

I don't know how to do this.

Any help would be great!

4 answers:

Answer 0 (score: 3)

It takes exactly one stack call and two reset_index calls.

df
  Customer  Order1  Order2  Order3  OrderN
0     John    This    That     The   Other
1   Shelly  Thing1  Thing2  Thing3  ThingN

(df.set_index('Customer')
   .stack()
   .reset_index(level=1, drop=True)
   .reset_index(name='Order')
)

  Customer   Order
0     John    This
1     John    That
2     John     The
3     John   Other
4   Shelly  Thing1
5   Shelly  Thing2
6   Shelly  Thing3
7   Shelly  ThingN
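
To make the chain above easier to follow, here is a minimal step-by-step sketch. It assumes the sample frame is built by hand from the values shown in this answer; the intermediate names (indexed, stacked, by_customer, result) are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the values shown in the answer (an assumption
# about how df was created; the original post does not show this step).
df = pd.DataFrame({
    'Customer': ['John', 'Shelly'],
    'Order1': ['This', 'Thing1'],
    'Order2': ['That', 'Thing2'],
    'Order3': ['The', 'Thing3'],
    'OrderN': ['Other', 'ThingN'],
})

# Step 1: make Customer the index so only the order columns remain.
indexed = df.set_index('Customer')

# Step 2: stack() moves the column labels into an inner index level,
# giving a Series indexed by (Customer, original column name).
stacked = indexed.stack()

# Step 3: drop the inner level that holds the old column names.
by_customer = stacked.reset_index(level=1, drop=True)

# Step 4: turn Customer back into a column and name the values 'Order'.
result = by_customer.reset_index(name='Order')
print(result)
```

Each customer's orders stay together in their original left-to-right order, which is the layout the question asks for.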

Answer 1 (score: 2)

pd.melt is what you are looking for:

# Assuming all the other columns are orders except for the Customer column
value_list = [col for col in df.columns if col != 'Customer']

pd.melt(df, id_vars=['Customer'], value_vars=value_list,
        value_name='Order').drop('variable', axis=1)

  Customer   Order
0     John    This
1   Shelly  Thing1
2     John    That
3   Shelly  Thing2
4     John     The
5   Shelly  Thing3
6     John   Other
7   Shelly  ThingN
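
Note that melt returns the rows column by column, so each customer's orders are interleaved rather than grouped. One possible way to group them, sketched below, is to keep the original row index during the melt and then sort on it with a stable sort. This assumes pandas 1.1 or later (for the ignore_index parameter) and a hand-built copy of the sample frame; the name long_form is only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the values in the thread (assumed construction).
df = pd.DataFrame({
    'Customer': ['John', 'Shelly'],
    'Order1': ['This', 'Thing1'],
    'Order2': ['That', 'Thing2'],
    'Order3': ['The', 'Thing3'],
    'OrderN': ['Other', 'ThingN'],
})

long_form = (
    # ignore_index=False keeps the original row labels (0, 1, 0, 1, ...).
    df.melt(id_vars='Customer', value_name='Order', ignore_index=False)
      .drop(columns='variable')
      # A stable sort on the row labels groups each customer's orders
      # while preserving the Order1..OrderN sequence within a customer.
      .sort_index(kind='mergesort')
      .reset_index(drop=True)
)
print(long_form)
```

Unlike stack, melt keeps rows whose order value is missing, so a dropna on the Order column may be needed if customers have different numbers of orders.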

Answer 2 (score: 2)

I think using stack would be slightly better:

df.set_index('Customer').stack().reset_index(level=0)
Out[1219]: 
       Customer       0
Order1     John    This
Order2     John    That
Order3     John     The
OrderN     John   Other
Order1   Shelly  Thing1
Order2   Shelly  Thing2
Order3   Shelly  Thing3
OrderN   Shelly  ThingN
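
The result above still has the old column labels as its index and a value column named 0. A small sketch of one way to tidy it into the Customer/Order layout from the question follows; the frame construction and the name tidy are assumptions made for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the values in the thread (assumed construction).
df = pd.DataFrame({
    'Customer': ['John', 'Shelly'],
    'Order1': ['This', 'Thing1'],
    'Order2': ['That', 'Thing2'],
    'Order3': ['The', 'Thing3'],
    'OrderN': ['Other', 'ThingN'],
})

tidy = (
    df.set_index('Customer')
      .stack()
      .reset_index(level=0)          # Customer becomes a column again
      .rename(columns={0: 'Order'})  # the unnamed value column is labelled 0
      .reset_index(drop=True)        # discard the Order1..OrderN labels
)
print(tidy)
```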

Answer 3 (score: 2)

Using a comprehension

...
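
The code for this answer is not preserved here. As a rough idea of what a comprehension-based approach could look like (not necessarily what the answerer wrote), the sketch below iterates over the rows as plain tuples and emits one (customer, order) pair per order value. It assumes Customer is the first column and that the sample frame is built by hand; rows and flat are illustrative names.

```python
import pandas as pd

# Sample frame rebuilt from the values in the thread (assumed construction).
df = pd.DataFrame({
    'Customer': ['John', 'Shelly'],
    'Order1': ['This', 'Thing1'],
    'Order2': ['That', 'Thing2'],
    'Order3': ['The', 'Thing3'],
    'OrderN': ['Other', 'ThingN'],
})

rows = [
    (customer, order)
    # Each row arrives as a plain tuple: first the customer, then the orders.
    for customer, *orders in df.itertuples(index=False, name=None)
    for order in orders
    if pd.notna(order)  # skip empty order slots
]
flat = pd.DataFrame(rows, columns=['Customer', 'Order'])
print(flat)
```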