How to multiply a pandas DataFrame by a NumPy array with broadcasting

Time: 2015-08-12 16:07:31

Tags: python numpy pandas numpy-broadcasting

I have a DataFrame of shape (4, 3), as follows:

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: x = pd.DataFrame(np.random.randn(4, 3), index=np.arange(4))

In [4]: x
Out[4]: 
          0         1         2
0  0.959322  0.099360  1.116337
1 -0.211405 -2.563658 -0.561851
2  0.616312 -1.643927 -0.483673
3  0.235971  0.023823  1.146727

I want to multiply each column of the DataFrame element-wise by a NumPy array of shape (4,):

In [9]: y = np.random.randn(4)

In [10]: y
Out[10]: array([-0.34125522,  1.21567883, -0.12909408,  0.64727577])

In NumPy, the following broadcasting trick works:

In [12]: x.values * y[:, None]
Out[12]: 
array([[-0.32737369, -0.03390716, -0.38095588],
       [-0.25700028, -3.11658448, -0.68303043],
       [-0.07956223,  0.21222123,  0.06243928],
       [ 0.15273815,  0.01541983,  0.74224861]])
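
The shapes involved explain why the trick works. Below is a minimal sketch using a small stand-in array (the names `a` and `col` are illustrative and not part of the session above): `y[:, None]` turns the (4,) array into a (4, 1) column, which NumPy can stretch across the three columns of a (4, 3) array, whereas the plain (4,) array cannot be matched against the trailing dimension of size 3.

```python
import numpy as np

a = np.arange(12.0).reshape(4, 3)    # stand-in for x.values, shape (4, 3)
y = np.array([1.0, 2.0, 3.0, 4.0])   # shape (4,)

col = y[:, None]                     # adds a trailing axis: shape (4, 1)
result = a * col                     # (4, 3) * (4, 1) broadcasts to (4, 3)

# Without the extra axis the trailing dimensions (3 and 4) do not match.
try:
    a * y
    mismatch = False
except ValueError:
    mismatch = True
```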

However, the same expression does not work on the pandas DataFrame itself; I get the following error:

In [13]: x * y[:, None]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-21d033742c49> in <module>()
----> 1 x * y[:, None]
...
ValueError: Shape of passed values is (1, 4), indices imply (3, 4)

Any suggestions?

Thanks!

2 answers:

Answer 0 (score: 6)

I found another way to multiply a pandas DataFrame by a NumPy array.
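
One commonly used option for this kind of multiplication is `DataFrame.mul` with `axis=0`; that this is the approach meant here is an assumption. The sketch below rebuilds `x` and `y` with a seeded generator (so the values differ from the question) and checks the result against the NumPy broadcasting version.

```python
import numpy as np
import pandas as pd

# Seeded stand-ins for the x and y in the question (values will differ).
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.standard_normal((4, 3)), index=np.arange(4))
y = rng.standard_normal(4)

# axis=0 matches y against the row index, so row i of every column
# is multiplied by y[i].
scaled = x.mul(y, axis=0)

# Reference result from the plain NumPy broadcasting trick.
expected = x.values * y[:, None]
same = np.allclose(scaled.values, expected)
```

The result stays a DataFrame with the original index and column labels.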

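Answer 1 (score: 4)

I think you are better off using the df.apply() method. By default it passes each column to the function in turn, so in your case:

x.apply(lambda col: col * y)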
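
As a quick sanity check, here is a sketch that compares the `apply` result with the NumPy broadcasting result wrapped back into a DataFrame. The seeded data and the names `by_apply` and `by_numpy` are illustrative, not taken from the question.

```python
import numpy as np
import pandas as pd

# Seeded stand-ins for the x and y in the question (values will differ).
rng = np.random.default_rng(1)
x = pd.DataFrame(rng.standard_normal((4, 3)), index=np.arange(4))
y = rng.standard_normal(4)

# Column by column: each column is a length-4 Series multiplied by y.
by_apply = x.apply(lambda col: col * y)

# Broadcast on the raw values, then restore the original labels.
by_numpy = pd.DataFrame(x.values * y[:, None], index=x.index, columns=x.columns)

same = np.allclose(by_apply.values, by_numpy.values)
```

Both routes give the same numbers; `apply` calls the lambda once per column, while the broadcasting version does the whole multiplication in a single array operation.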