Pandas:透视数据框,列到行

时间:2017-03-28 13:39:29

标签: python pandas

我有一个像这样定义的DataFrame:

from collections import OrderedDict
from pandas import DataFrame
import pandas as pd
import numpy as np

table = OrderedDict((
    ('year', [1900, 1900, 1900, 1900, 1901, 1901, 1901, 1901]),
    ('variable',['PRCP', 'PRCP', 'TAVG', 'TAVG', 'PRCP', 'PRCP', 'TAVG', 'TAVG']),
    ('month',  [1, 2, 1, 2, 1, 2, 1, 2]),
    ('first_day',  [5, 8, 7, 3, 9, 2, 4, 1]),
    ('second_day',  [5, 8, 7, 3, 9, 2, 5, 8]),
    ('third_day',  [1, 7, 5, 7, 3, 5, 8, 9])
))
df = DataFrame(table)

DataFrame现在是这样的:

   year variable  month  first_day  second_day  third_day
0  1900     PRCP      1          5           5          1
1  1900     PRCP      2          8           8          7
2  1900     TAVG      1          7           7          5
3  1900     TAVG      2          3           3          7
4  1901     PRCP      1          9           9          3
5  1901     PRCP      2          2           2          5
6  1901     TAVG      1          4           5          8
7  1901     TAVG      2          1           8          9

我想转动DataFrame,使它看起来像这样:

   variable  year  month  day  value
0      PRCP  1900      1    1      5
1      PRCP  1900      1    2      5
2      PRCP  1900      1    3      1
3      PRCP  1900      2    1      8
4      PRCP  1900      2    2      8
5      PRCP  1900      2    3      7
6      PRCP  1901      1    1      7
7      PRCP  1901      1    2      7
8      PRCP  1901      1    3      5
9      PRCP  1901      2    1      3
10     PRCP  1901      2    2      3
11     PRCP  1901      2    3      7
12     TAVG  1900      1    1      9
13     TAVG  1900      1    2      9
14     TAVG  1900      1    3      3
15     TAVG  1900      2    1      2
16     TAVG  1900      2    2      2
17     TAVG  1900      2    3      5
18     TAVG  1901      1    1      4
19     TAVG  1901      1    2      5
20     TAVG  1901      1    3      8
21     TAVG  1901      2    1      1
22     TAVG  1901      2    2      8
23     TAVG  1901      2    3      9 

我想我想通过旋转来实现这一点,但我还没有使用pivot()pivot_table()函数来解决这个问题。任何人都可以提出一种方法来做到这一点,无论是否使用枢轴?提前感谢任何想法。

1 个答案:

答案 0 :(得分:2)

您可以rename使用melt,但dict列只有d = {'first_day':1,'second_day':2,'third_day':3} df = pd.melt(df.rename(columns=d), id_vars=['variable','year','month'], var_name='day') df = df.sort_values(['variable','year','month', 'day']).reset_index(drop=True) print (df) variable year month day value 0 PRCP 1900 1 1 5 1 PRCP 1900 1 2 5 2 PRCP 1900 1 3 1 3 PRCP 1900 2 1 8 4 PRCP 1900 2 2 8 5 PRCP 1900 2 3 7 6 PRCP 1901 1 1 9 7 PRCP 1901 1 2 9 8 PRCP 1901 1 3 3 9 PRCP 1901 2 1 2 10 PRCP 1901 2 2 2 11 PRCP 1901 2 3 5 12 TAVG 1900 1 1 7 13 TAVG 1900 1 2 7 14 TAVG 1900 1 3 5 15 TAVG 1900 2 1 3 16 TAVG 1900 2 2 3 17 TAVG 1900 2 3 7 18 TAVG 1901 1 1 4 19 TAVG 1901 1 2 5 20 TAVG 1901 1 3 8 21 TAVG 1901 2 1 1 22 TAVG 1901 2 2 8 23 TAVG 1901 2 3 9

day

dict d = {'first_day':1,'second_day':2,'third_day':3} df = pd.melt(df, id_vars=['variable','year','month'], var_name='day') df.day = df.day.map(d) df = df.sort_values(['variable','year','month', 'day']).reset_index(drop=True) print (df) variable year month day value 0 PRCP 1900 1 1 5 1 PRCP 1900 1 2 5 2 PRCP 1900 1 3 1 3 PRCP 1900 2 1 8 4 PRCP 1900 2 2 8 5 PRCP 1900 2 3 7 6 PRCP 1901 1 1 9 7 PRCP 1901 1 2 9 8 PRCP 1901 1 3 3 9 PRCP 1901 2 1 2 10 PRCP 1901 2 2 2 11 PRCP 1901 2 3 5 12 TAVG 1900 1 1 7 13 TAVG 1900 1 2 7 14 TAVG 1900 1 3 5 15 TAVG 1900 2 1 3 16 TAVG 1900 2 2 3 17 TAVG 1900 2 3 7 18 TAVG 1901 1 1 4 19 TAVG 1901 1 2 5 20 TAVG 1901 1 3 8 21 TAVG 1901 2 1 1 22 TAVG 1901 2 2 8 23 TAVG 1901 2 3 9 d = {'first_day':1,'second_day':2,'third_day':3} df = df.rename(columns=d).set_index(['variable','year','month']) .stack() .reset_index(name='value') .rename(columns={'level_3':'day'}) print (df) variable year month day value 0 PRCP 1900 1 1 5 1 PRCP 1900 1 2 5 2 PRCP 1900 1 3 1 3 PRCP 1900 2 1 8 4 PRCP 1900 2 2 8 5 PRCP 1900 2 3 7 6 TAVG 1900 1 1 7 7 TAVG 1900 1 2 7 8 TAVG 1900 1 3 5 9 TAVG 1900 2 1 3 10 TAVG 1900 2 2 3 11 TAVG 1900 2 3 7 12 PRCP 1901 1 1 9 13 PRCP 1901 1 2 9 14 PRCP 1901 1 3 3 15 PRCP 1901 2 1 2 16 PRCP 1901 2 2 2 17 PRCP 1901 2 3 5 18 TAVG 1901 1 1 4 19 TAVG 1901 1 2 5 20 TAVG 1901 1 3 8 21 TAVG 1901 2 1 1 22 TAVG 1901 2 2 8 23 TAVG 1901 2 3 9 .syntastic_cpp_config

-I/usr/lib/openmpi/include

map的另一个解决方案:

 size = 0
 bb_strings = re.findall(r'var model = ({.*})', ad)
 bp = {}
 if bb_strings:
 bp = json.loads(bb_strings[0])
     for bl in bp['AVAILABLE_SIZES']:
     size +=1
     footlocker.append(('size%s' %size, bl))