Question

Let's say I have a very simple DataFrame:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.full((6), 1))

Now I define a function that generates a numpy array of random length and appends the given value to the end:

import numpy as np
def func(row):
    l = np.full((np.random.random_integer(5)), 1)
    return np.hstack(l, row)

When I try to apply this function to df to get a 2-D array:

df.apply(func, axis=1)

I get this error:

ValueError: Shape of passed values is (6, 2), indices imply (6, 1)

Do you know what the problem is and how to fix it? Thanks in advance!

Answer 1

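First, you need np.random.random_integers (not random_integer). Second, hstack expects a tuple of arrays as its single argument, so pass a tuple. Third, you need to return something pandas can align, which in this case is a Series: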

In [213]:
df = pd.DataFrame(np.full((6), 1))
def func(row):
    l = np.full((np.random.random_integers(5)), 1)
    return pd.Series(np.hstack((l, row)))

In [214]:    
df.apply(func, axis=1)

Out[214]:
     0    1    2    3    4    5
0  1.0  1.0  1.0  NaN  NaN  NaN
1  1.0  1.0  NaN  NaN  NaN  NaN
2  1.0  1.0  NaN  NaN  NaN  NaN
3  1.0  1.0  1.0  NaN  NaN  NaN
4  1.0  1.0  1.0  1.0  1.0  NaN
5  1.0  1.0  1.0  1.0  1.0  1.0
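
For readers on a newer NumPy, where np.random.random_integers is deprecated in favor of np.random.randint, the same three fixes might be sketched as below. This is only an illustration, not the code that produced the output above: the seed and the explicit float fill value are assumptions, and randint(1, 6) is used on the assumption that it draws from 1 to 5 inclusive, as random_integers(5) does.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: seeded only so the sketch is repeatable

df = pd.DataFrame(np.full(6, 1.0))

def func(row):
    # randint's upper bound is exclusive, so (1, 6) draws from 1..5
    n = np.random.randint(1, 6)
    l = np.full(n, 1.0)
    # hstack takes one tuple of arrays; returning a Series lets pandas
    # align rows of different lengths and pad the gaps with NaN
    return pd.Series(np.hstack((l, row)))

result = df.apply(func, axis=1)
print(result)
```

Each row holds between 2 and 6 real values (1 to 5 generated ones plus the original), so the frame has at most 6 columns.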

Note that I got a lot of warnings when running the above:

C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\numpy\core\numeric.py:301: FutureWarning: in the future, full(3, 1) will return an array of dtype('int32')
  format(shape, fill_value, array(fill_value).dtype), FutureWarning)
C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\numpy\core\numeric.py:301: FutureWarning: in the future, full(2, 1) will return an array of dtype('int32')
  format(shape, fill_value, array(fill_value).dtype), FutureWarning)
C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\numpy\core\numeric.py:301: FutureWarning: in the future, full(1, 1) will return an array of dtype('int32')
  format(shape, fill_value, array(fill_value).dtype), FutureWarning)
C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\numpy\core\numeric.py:301: FutureWarning: in the future, full(4, 1) will return an array of dtype('int32')
  format(shape, fill_value, array(fill_value).dtype), FutureWarning)
C:\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\numpy\core\numeric.py:301: FutureWarning: in the future, full(5, 1) will return an array of dtype('int32')
  format(shape, fill_value, array(fill_value).dtype), FutureWarning)
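
These warnings say that np.full will, in a future release, take its dtype from the fill value. Passing dtype explicitly should sidestep the ambiguity. A minimal sketch, assuming the float output shown above is what is wanted (the variable names are illustrative):

```python
import numpy as np

# An explicit dtype means the result no longer depends on how a given
# NumPy version infers the type from the fill value
ones_float = np.full(3, 1, dtype=float)
ones_int = np.full(3, 1, dtype=int)

print(ones_float.dtype, ones_int.dtype)
```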

To get a numpy array, use the values attribute of the resulting DataFrame:

df.apply(func, axis=1).values
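
The array obtained this way is rectangular, so shorter rows stay padded with NaN, as in the output above. A small sketch of what that looks like, using a hand-built frame as a stand-in for the result of df.apply(func, axis=1) (the names padded, arr and lengths are illustrative):

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the ragged result of df.apply(func, axis=1)
padded = pd.DataFrame([[1.0, 1.0, np.nan],
                       [1.0, 1.0, 1.0]])

arr = padded.values                      # 2-D float array, NaN where a row was shorter
lengths = (~np.isnan(arr)).sum(axis=1)   # count of real (non-NaN) entries per row

print(arr.shape, lengths)
```

If the original row lengths matter downstream, a count of non-NaN entries like lengths is one way to recover them.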
