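Pandas: random integer between the values in two columns

Time: 2018-04-08 18:34:54

Tags: python-3.x pandas numpy random

How can I create a new column that holds, for each row, a random integer between the values of two other columns in that row?

Example df: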

import pandas as pd
import numpy as np

data = pd.DataFrame({'start': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     'end': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})
data = data.iloc[:, [1, 0]]

Result:

[image: the resulting DataFrame with the columns start and end]

Right now I am trying something like this:

data['rand_between'] = data.apply(lambda x: np.random.randint(data.start, data.end))

data['rand_between'] = np.random.randint(data.start, data.end)

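But of course it does not work, because data.start is a Series and not a number. How can I use numpy.random with the data from the columns as a vectorized operation?

2 answers:

Answer 0: (score: 3)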

You are close; you need to specify axis=1 to process the data row by row, and change data.start/end to x.start/end so that randint works with scalars:

data['rand_between'] = data.apply(lambda x: np.random.randint(x.start, x.end), axis=1)

Another possible solution:
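
One way to do this without apply, sketched here as an assumption rather than the answerer's exact code, is a list comprehension over the two columns zipped together. It still calls np.random.randint once per row, so it is a loop and not a vectorized operation, but it avoids building a row object for every call:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'start': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     'end': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})

# Draw one integer per row. np.random.randint excludes the upper bound,
# so every value satisfies start <= value < end.
data['rand_between'] = [np.random.randint(s, e)
                        for s, e in zip(data['start'], data['end'])]
```

If end itself should be a possible result, pass e + 1 as the upper bound.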

Answer 1: (score: 1)

If you really want to vectorize it, you can generate random numbers between 0 and 1 and scale them with your min/max values:

(
    data['start'] + np.random.rand(len(data)) * (data['end'] - data['start'] + 1)
).astype('int')

Out: 
0     1
1    18
2    18
3    35
4    22
5    27
6    35
7    23
8    33
9    81
dtype: int64
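
Note that the scaling approach above can return end itself because of the + 1, whereas np.random.randint excludes its upper bound. As a hedged alternative, assuming NumPy 1.17 or later, Generator.integers accepts array-like bounds and broadcasts them element-wise, so the draw can be done in a single call. The name rng is only illustrative:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'start': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     'end': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})

# Assumption: NumPy >= 1.17, which provides default_rng and Generator.integers.
rng = np.random.default_rng()  # pass a seed here for reproducible draws

# low and high are matched element by element, one draw per row.
# endpoint=True makes the upper bound inclusive: start <= value <= end.
data['rand_between'] = rng.integers(data['start'].to_numpy(),
                                    data['end'].to_numpy(),
                                    endpoint=True)
```

Leaving endpoint at its default of False gives the same half-open range as np.random.randint.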