pandas - convert Panel into DataFrame using lookup table for column headings

Time: 2016-08-31 17:29:05

Tags: python-3.x pandas dataframe panel

Is there a neat way to do this, or would I be best off making a loop that creates a new DataFrame, looking into the Panel when constructing each column?

I have a 3D array of data that I have put into a Panel, and I want to reorganise it into a DataFrame using a 2D lookup table. For each label in the table, I want to find the nearest values on two of the Panel's axes (latitude and longitude) and use the label as the column heading. A kind of double VLOOKUP.

The main thing I am trying to achieve is to be able to quickly locate a time series of data based on the label. If there is a better way, please let me know!

My data is in a Panel that looks like this, with latitude on the items axis and longitude on the minor axis:

    data
    Out[920]: 
    <class 'pandas.core.panel.Panel'>
    Dimensions: 53 (items) x 29224 (major_axis) x 119 (minor_axis)
    Items axis: 42.0 to 68.0
    Major_axis axis: 2000-01-01 00:00:00 to 2009-12-31 21:00:00
    Minor_axis axis: -28.0 to 31.0

and my lookup table is like this:

    label_coords
    Out[921]: 
                 lat       lon
    label                     
    2449   63.250122 -5.250000
    2368   62.750122 -5.750000
    2369   62.750122 -5.250000
    2370   62.750122 -4.750000

I'm kind of at a loss. I'm quite new to Python in general and only really started using pandas yesterday.

Many thanks in advance! Sorry if this is a duplicate, I couldn't find anything that was about the same type of question.

Andy

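1 answer:

Answer 0 (score: 0)

I came up with a loop-based solution and thought I would post it in case anyone else has this type of problem.

I changed the way the label coordinates DataFrame is read in so that label is a column, and then used the pivot function: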

    label_coord = label_coord.pivot(index='lat', columns='lon', values='label')

This produces a DataFrame where the labels are the values and lat/lon are the index/columns.
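As a rough illustration of what the pivot step produces, here is a small sketch built from the four lookup rows shown in the question. The name `grid` is only for this example, and the keyword form of `pivot` is used because newer pandas versions no longer accept positional arguments there.

```python
import pandas as pd

# The four lookup rows from the question, with label as an ordinary column.
label_coord = pd.DataFrame({
    "label": [2449, 2368, 2369, 2370],
    "lat": [63.250122, 62.750122, 62.750122, 62.750122],
    "lon": [-5.25, -5.75, -5.25, -4.75],
})

# Rows become latitudes, columns become longitudes, cells hold the label.
# lat/lon pairs that have no label come out as NaN.
grid = label_coord.pivot(index="lat", columns="lon", values="label")
print(grid)
```

Because the missing lat/lon pairs are filled with NaN, the label values are stored as floats in the pivoted table.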

Then I used this loop, where data is the Panel from the question:

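    import pandas as pd

    data_labelled = pd.DataFrame()
    for i in label_coord.columns:  # longitude
        for j in label_coord.index:  # latitude
            lbl = label_coord[i][j]
            if pd.notnull(lbl):  # the pivot leaves NaN where a lat/lon pair has no label
                # data[j] is the latitude slice of the Panel; [i] picks the longitude column
                data_labelled['%s' % lbl] = data[j][i]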
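The loop above needs the lat/lon values in the lookup table to match the Panel axes exactly, and Panel itself no longer exists in recent pandas releases. A possible alternative, sketched here under assumptions, keeps the data as a plain NumPy array and picks the nearest grid cell for each label. The axes, the random `values` array, the short time range, and the helper `nearest_index` are all made up for illustration; only the lookup rows come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Panel: a (lat, time, lon) array and its axes.
lats = np.arange(42.0, 68.5, 0.5)    # 53 latitudes, 42.0 to 68.0
lons = np.arange(-28.0, 31.5, 0.5)   # 119 longitudes, -28.0 to 31.0
times = pd.date_range("2000-01-01", periods=8, freq=pd.Timedelta(hours=3))
values = np.random.default_rng(0).normal(size=(lats.size, times.size, lons.size))

# Lookup table from the question, indexed by label.
label_coords = pd.DataFrame(
    {"lat": [63.250122, 62.750122, 62.750122, 62.750122],
     "lon": [-5.25, -5.75, -5.25, -4.75]},
    index=pd.Index([2449, 2368, 2369, 2370], name="label"),
)

def nearest_index(axis, targets):
    """Position of the nearest axis value for each target (first one wins on ties)."""
    targets = np.asarray(targets, dtype=float)
    return np.abs(axis[None, :] - targets[:, None]).argmin(axis=1)

lat_idx = nearest_index(lats, label_coords["lat"])
lon_idx = nearest_index(lons, label_coords["lon"])

# Pick one (lat, lon) cell per label and keep every time step.
# values[lat_idx, :, lon_idx] has shape (n_labels, n_times), hence the transpose.
data_labelled = pd.DataFrame(
    values[lat_idx, :, lon_idx].T, index=times, columns=label_coords.index
)
```

With this layout, `data_labelled[2449]` returns the time series for label 2449 directly, which is the quick lookup by label the question asks for.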