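I have converted a pandas Panel to xarray, but I can't add new items or new major/minor axis entries as easily as I could with the pandas Panel. The code is below: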
import numpy as np
import pandas as pd
import xarray as xr
# Note: pd.Panel requires pandas < 0.25 (it was removed in 0.25.0)
panel = pd.Panel(np.random.randn(3, 4, 5), items=['one', 'two', 'three'],
                 major_axis=pd.date_range('1/1/2000', periods=4),
                 minor_axis=['a', 'b', 'c', 'd', 'e'])
If I want to add a new item, I can do:
panel.four=pd.DataFrame(np.ones((4,5)),index=pd.date_range('1/1/2000', periods=4), columns=['a', 'b', 'c', 'd','e'])
panel.four
a b c d e
2000-01-01 1.0 1.0 1.0 1.0 1.0
2000-01-02 1.0 1.0 1.0 1.0 1.0
2000-01-03 1.0 1.0 1.0 1.0 1.0
2000-01-04 1.0 1.0 1.0 1.0 1.0
I am having trouble adding items, or extending the major/minor axis, in xarray:
px=panel.to_xarray()
#px gives me
<xarray.DataArray (items: 3, major_axis: 5, minor_axis: 4)>
array([[[-0.440081, -0.888226, 0.158702, 2.107577],
[ 0.917835, -0.174557, 0.501626, 0.116761],
[ 0.406988, 1.95184 , -1.345948, 2.960774],
[-1.905529, 0.25793 , 0.076162, 1.954012],
[ 0.499675, 1.87567 , -1.698771, -1.143766]],
[[ 0.070269, -1.151737, -0.344155, -0.506383],
[-2.199357, -0.040909, 0.491984, -0.333431],
[-0.113155, -0.668475, 2.366683, -0.421863],
[-0.567336, -0.302224, 1.638386, -0.038545],
[ 0.55067 , -0.409266, -0.27916 , -0.942144]],
[[ 1.269171, -0.151471, -0.664072, 0.269168],
[-0.486492, 0.59632 , -0.191977, 0.22537 ],
[ 0.069231, -0.345793, -0.450797, -2.982 ],
[-0.42338 , -0.849736, 0.965738, -0.544596],
[-1.455378, -0.256441, -1.204572, -0.347749]]])
Coordinates:
* items (items) object 'one' 'two' 'three'
* major_axis (major_axis) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
* minor_axis (minor_axis) object 'a' 'b' 'c' 'd'
# How should I add a fourth item, or add/delete entries along the major and minor axes?
Answer 0 (score: 0)
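Assignment in xarray is not as elegant as with a pandas Panel. Suppose we want to add a fourth item to the data array above. A DataArray cannot be extended in place, so the new item is combined with px into a new array, called pxc here, typically with xr.concat along the items dimension.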
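As a minimal sketch of adding a fourth item with xr.concat: the px below is a stand-in built directly as a DataArray (an assumption, since pd.Panel is not available in current pandas), and the names four and pxc are only illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for px = panel.to_xarray(); built directly so the example
# does not depend on pd.Panel.
dates = pd.date_range('1/1/2000', periods=4)
minor = ['a', 'b', 'c', 'd', 'e']
dims = ['items', 'major_axis', 'minor_axis']

px = xr.DataArray(
    np.random.randn(3, 4, 5),
    coords={'items': ['one', 'two', 'three'],
            'major_axis': dates,
            'minor_axis': minor},
    dims=dims,
)

# The new item is a length-1 slice along 'items' with matching axes.
four = xr.DataArray(
    np.ones((1, 4, 5)),
    coords={'items': ['four'], 'major_axis': dates, 'minor_axis': minor},
    dims=dims,
)

# concat returns a new, larger array; px itself is left unchanged.
pxc = xr.concat([px, four], dim='items')
```

Because concat allocates a new array each time, it is better to collect several new items in a list and concatenate once than to call it in a loop.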
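Whether the operation is on items or on the major/minor axis, similar logic applies. To delete, use drop (pxc is the enlarged array that already contains 'four'):
pxc.drop(['four'], dim='items')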
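The same pattern applied to the other axes might look like the sketch below. It again assumes a stand-in px built directly as a DataArray; extra_day, longer and trimmed are illustrative names. For deletion it uses drop_sel, the label-based method in recent xarray releases, rather than the drop call shown in the answer.

```python
import numpy as np
import pandas as pd
import xarray as xr

dates = pd.date_range('1/1/2000', periods=4)
minor = ['a', 'b', 'c', 'd', 'e']
items = ['one', 'two', 'three']
dims = ['items', 'major_axis', 'minor_axis']

# Stand-in for px = panel.to_xarray()
px = xr.DataArray(
    np.random.randn(3, 4, 5),
    coords={'items': items, 'major_axis': dates, 'minor_axis': minor},
    dims=dims,
)

# Extend the major axis by one date: a length-1 slice along 'major_axis'.
extra_day = xr.DataArray(
    np.zeros((3, 1, 5)),
    coords={'items': items,
            'major_axis': pd.date_range('1/5/2000', periods=1),
            'minor_axis': minor},
    dims=dims,
)
longer = xr.concat([px, extra_day], dim='major_axis')

# Delete a label from the minor axis; this also returns a new array.
trimmed = longer.drop_sel(minor_axis=['e'])
```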
Answer 1 (score: 0)
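xarray.DataArray is based internally on a single NumPy array, so it cannot be efficiently resized or appended to. Your best option is to create a new, larger DataArray with xarray.concat.
If you want to add items the way you would with a pd.Panel, the data structure you are probably looking for is xarray.Dataset. These are easiest to construct from the equivalent DataFrame with a MultiIndex: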
# First, make a DataFrame with a MultiIndex
>>> df = panel.to_frame()
>>> df.head()
one two three
major minor
2000-01-01 a 0.278958 0.676034 -1.544726
b -0.918150 -2.707339 -0.552987
c 0.023479 0.175528 -0.817556
d 1.798001 -0.142016 1.390834
e 0.256575 0.265369 -1.829766
# Now, convert the DataFrame with a MultiIndex to xarray
>>> ds = df.to_xarray()
>>> ds
<xarray.Dataset>
Dimensions: (major: 4, minor: 5)
Coordinates:
* major (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
* minor (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
one (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
two (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
three (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
# You can assign a DataFrame if it has the right column/index names
>>> ds['four'] = pd.DataFrame(np.ones((4,5)),
... index=pd.date_range('1/1/2000', periods=4, name='major'),
... columns=pd.Index(['a', 'b', 'c', 'd', 'e'], name='minor'))
# or just pass a tuple directly:
>>> ds['five'] = (('major', 'minor'), np.zeros((4, 5)))
>>> ds
<xarray.Dataset>
Dimensions: (major: 4, minor: 5)
Coordinates:
* major (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
* minor (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
one (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
two (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
three (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
four (major, minor) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
five (major, minor) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
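Since pd.Panel and panel.to_frame() are not available in current pandas, here is a rough sketch of the same idea with the Dataset built directly. The constructor arguments mirror the example above; dropping 'two' and converting back with to_array are additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd
import xarray as xr

major = pd.date_range('1/1/2000', periods=4)
minor = ['a', 'b', 'c', 'd', 'e']

# One data variable per former Panel item, all sharing (major, minor).
ds = xr.Dataset(
    {name: (('major', 'minor'), np.random.randn(4, 5))
     for name in ['one', 'two', 'three']},
    coords={'major': major, 'minor': minor},
)

# Adding an item is plain assignment, as in the answer above.
ds['four'] = (('major', 'minor'), np.ones((4, 5)))

# Removing an item returns a new Dataset without that variable.
ds = ds.drop_vars('two')

# If a single 3-D array is needed again, stack the variables
# along a new 'items' dimension.
da = ds.to_array(dim='items')
```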
For more on transitioning from pandas.Panel to xarray, read this section of the xarray documentation: http://xarray.pydata.org/en/stable/pandas.html#transitioning-from-pandas-panel-to-xarray