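I wrote the code below. First I generate a 3x5 array of uniform random variables. Then I use each element of that 2D array as a mean and generate a new list of Poisson samples from it. What I want to do is build 10 new 2D arrays, each with the same (3, 5) shape, by taking one element from each of those lists. For example: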
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
mean_route1 = pd.DataFrame(np.random.uniform(0, 10, size=(3,5)))
print(mean_route1)
N=10
for m in np.nditer(mean_route1):
    m3 = np.random.poisson(lam = m, size=N)
    print(m3)
The output is:
0 1 2 3 4
0 7.740569 5.435856 6.682996 5.213202 2.100649
1 6.174332 0.059057 2.951913 1.341994 2.734486
2 7.780503 7.277458 7.406986 8.498494 0.070157
[ 5 5 7 7 9 5 9 12 7 5]
[ 4 4 3 4 12 3 9 6 6 1]
[8 8 1 9 3 5 8 7 4 6]
[5 6 9 6 4 4 9 7 4 5]
[2 3 3 3 0 2 4 1 4 1]
[4 6 9 3 8 4 3 7 8 5]
[0 0 0 0 0 0 0 0 0 0]
[2 1 3 4 2 2 0 1 3 3]
[2 1 2 2 1 0 1 0 1 1]
[2 1 3 5 5 3 5 4 1 3]
[ 5 5 7 6 6 6 10 10 5 7]
[ 7 6 7 9 4 14 6 7 8 9]
[ 8 10 1 9 10 7 9 9 9 13]
[14 4 8 10 6 3 10 7 12 4]
[0 0 0 0 1 0 0 0 0 0]
Next, I want to make 10 arrays like this one, where column (:, 0) of the output above goes into the first new array:
0 1 2 3 4
0 5 4 8 5 2
1 4 0 2 2 2
2 5 7 8 14 0
Column (:, 1) goes into the second new array, ..., and column (:, 9) into the 10th new array.
How can I do this? I am new to Python and to Stack Overflow, so I apologize for any mistakes.
Answer 0 (score: 2)
Forgetting about dataframes for the moment and working with numpy only, we can do:
In [87]: mean_route1 = np.random.uniform(0,10,size=15)
In [88]: alist = []
In [89]: for m in mean_route1:
    ...:     alist.append(np.random.poisson(lam=m, size=10))
    ...:
In [90]: arr = np.array(alist)
In [91]: arr
Out[91]:
array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 4, 2, 3, 2, 6, 7, 3, 7, 7, 5],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 7, 9, 8, 1, 6, 5, 6, 11, 6, 1],
[16, 7, 9, 6, 6, 11, 11, 16, 9, 12],
[ 3, 5, 2, 0, 2, 6, 4, 5, 3, 3],
[ 5, 5, 8, 7, 9, 10, 5, 10, 7, 8],
[ 5, 5, 4, 4, 2, 5, 1, 2, 1, 2],
[ 4, 2, 6, 7, 2, 6, 5, 0, 1, 4],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 5, 5, 5, 4, 3, 2, 5, 7, 4, 5],
[ 1, 1, 1, 1, 2, 0, 2, 0, 1, 3],
[ 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[ 0, 0, 6, 1, 3, 2, 0, 1, 1, 2],
[ 9, 10, 10, 8, 9, 9, 9, 6, 12, 9]])
This is a (15,10) shaped array, with 10 samples for each of the 15 lam values. If you like, we can reshape it to (3,5,10), though that does not change the values.
In [92]: arr.reshape(3,5,10)
Out[92]:
array([[[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 4, 2, 3, 2, 6, 7, 3, 7, 7, 5],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 7, 9, 8, 1, 6, 5, 6, 11, 6, 1],
[16, 7, 9, 6, 6, 11, 11, 16, 9, 12]],
[[ 3, 5, 2, 0, 2, 6, 4, 5, 3, 3],
[ 5, 5, 8, 7, 9, 10, 5, 10, 7, 8],
[ 5, 5, 4, 4, 2, 5, 1, 2, 1, 2],
[ 4, 2, 6, 7, 2, 6, 5, 0, 1, 4],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]],
[[ 5, 5, 5, 4, 3, 2, 5, 7, 4, 5],
[ 1, 1, 1, 1, 2, 0, 2, 0, 1, 3],
[ 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[ 0, 0, 6, 1, 3, 2, 0, 1, 1, 2],
[ 9, 10, 10, 8, 9, 9, 9, 6, 12, 9]]])
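By starting with a (15,) array rather than a (3,5) one, I can use a simple iteration without the hassle of nditer. (I discourage using nditer unless you really need one of its special features. It is not fast.)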
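If the means already live in a (3,5) array, the same plain iteration still works over the flattened array. The following is a minimal sketch of that idea, assuming the newer `np.random.default_rng` generator and an arbitrary seed (neither is used in the session above); the names `means` and `samples` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption, chosen only for reproducibility
means = rng.uniform(0, 10, size=(3, 5))

# ravel() flattens in row-major order, so a plain for-loop visits all 15 means.
samples = np.array([rng.poisson(lam=m, size=10) for m in means.ravel()])

print(samples.shape)                    # (15, 10)
print(samples.reshape(3, 5, 10).shape)  # (3, 5, 10)
```

Because the flattening is row-major, `samples.reshape(3, 5, 10)[i, j]` holds the 10 draws for `means[i, j]`.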
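With a loop like this I can build the dataframes from the (3,5,10) array (only the first 3 of the 10 are shown here; use range(10) to get all of them):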
In [94]: import pandas as pd
In [95]: for i in range(3):
    ...:     print(pd.DataFrame(_92[:,:,i]))   # Out[92] array
    ...:
0 1 2 3 4 # 1st column
0 0 4 0 7 16
1 3 5 5 4 0
2 5 1 0 0 9
0 1 2 3 4 # 2nd column
0 0 2 0 9 7
1 5 5 5 2 0
2 5 1 0 0 10
0 1 2 3 4
0 0 3 0 8 9
1 2 8 4 6 0
2 5 1 0 6 10
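To collect all 10 dataframes in a list rather than printing them, one option is to move the sample axis to the front and iterate over it. This is only a sketch: it uses a deterministic stand-in array (`arr3`, a hypothetical name) instead of the Poisson draws so the result is easy to check by eye.

```python
import numpy as np
import pandas as pd

# Stand-in data: a deterministic (3, 5, 10) array rather than real Poisson draws.
arr3 = np.arange(150).reshape(3, 5, 10)

# Move the last (sample) axis to the front: (3, 5, 10) -> (10, 3, 5).
stacked = np.moveaxis(arr3, -1, 0)

# Iterating over the first axis yields one (3, 5) slice per sample index.
frames = [pd.DataFrame(layer) for layer in stacked]

print(len(frames))
print(frames[0])
```

Here `frames[i]` holds the same values as `arr3[:, :, i]`.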
I can also generate the poisson values for all of mean_route1 with a single call:
In [97]: np.random.poisson(lam=mean_route1, size=(10,15))
Out[97]:
array([[ 0, 2, 0, 4, 11, 5, 9, 2, 8, 0, 10, 0, 0, 1, 5],
[ 0, 4, 0, 3, 9, 3, 11, 3, 4, 0, 4, 0, 2, 0, 7],
[ 0, 4, 0, 4, 6, 1, 7, 4, 2, 0, 5, 1, 0, 0, 5],
...
[ 0, 9, 0, 6, 12, 3, 3, 5, 3, 0, 6, 1, 1, 1, 6]])
Or transpose it to get the (15,10) shape I had in Out[91]:
In [98]: np.random.poisson(lam=mean_route1, size=(10,15)).T
Out[98]:
array([[ 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[ 1, 4, 5, 6, 7, 1, 6, 2, 0, 2],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 4, 5, 4, 6, 3, 9, 1, 10, 3, 4],
....
[10, 8, 5, 13, 7, 10, 5, 10, 7, 9]])
Or with a (3,5) array for lam:
In [100]: np.random.poisson(lam=mean_route1.reshape(3,5), size=(10,3,5))
Out[100]:
array([[[ 0, 1, 0, 2, 9],
[ 1, 7, 2, 6, 0],
[ 3, 0, 0, 1, 10]],
[[ 0, 5, 0, 7, 8],
[ 2, 6, 2, 8, 0],
[ 5, 2, 0, 1, 11]],
[[ 0, 7, 0, 7, 11],
[ 2, 7, 2, 4, 0],
[ 7, 1, 1, 1, 10]],
....
[ 7, 1, 1, 3, 12]]])
Again, making dataframes, this time iterating on the first dimension:
In [101]: for i in range(3):
     ...:     print(pd.DataFrame(_100[i,:,:]))
     ...:
0 1 2 3 4
0 0 1 0 2 9
1 1 7 2 6 0
2 3 0 0 1 10
0 1 2 3 4
0 0 5 0 7 8
1 2 6 2 8 0
2 5 2 0 1 11
0 1 2 3 4
0 0 7 0 7 11
1 2 7 2 4 0
2 7 1 1 1 10
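Put together as a standalone script, the single-call approach might look like the sketch below. It assumes the `np.random.default_rng` generator API and an arbitrary seed in place of the legacy `np.random` functions used above; `lam`, `draws`, `frames` and `many` are illustrative names. The last part is a rough sanity check that each position really uses its own lam.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # seed is an assumption, for reproducibility only
lam = rng.uniform(0, 10, size=(3, 5))

# lam has shape (3, 5); it broadcasts against size=(10, 3, 5),
# so draws[k, i, j] is one Poisson sample with mean lam[i, j].
draws = rng.poisson(lam=lam, size=(10, 3, 5))
frames = [pd.DataFrame(d) for d in draws]  # 10 dataframes, each (3, 5)

# Sanity check: with many draws, the per-position average approaches lam.
many = rng.poisson(lam=lam, size=(20000, 3, 5))
print(np.abs(many.mean(axis=0) - lam).max())
```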
Answer 1 (score: 1)
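See whether this helps. I managed to create d (the input for each dataframe); the remaining step is to build a dataframe from each sublist in d. I will keep working on it, but for now this is as far as I got: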
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
mean_route1 = pd.DataFrame(np.random.uniform(0, 10, size=(3,5)))
print(mean_route1)
N=10
a = []
c = []
for m in np.nditer(mean_route1):
    m3 = list(np.random.poisson(lam = m, size=N))
    print(m3)
    a.append(m3)
This is the output for each list:
[4, 6, 8, 12, 4, 10, 8, 7, 9, 13]
[12, 11, 12, 8, 9, 4, 7, 10, 11, 6]
[2, 1, 2, 0, 4, 3, 2, 3, 0, 3]
[4, 4, 7, 2, 9, 3, 9, 5, 10, 11]
[6, 9, 11, 6, 10, 14, 14, 6, 10, 7]
[5, 7, 4, 8, 4, 7, 9, 3, 6, 2]
[3, 3, 4, 7, 5, 7, 5, 4, 2, 3]
[6, 3, 6, 4, 7, 3, 4, 1, 4, 2]
[1, 1, 1, 1, 0, 2, 4, 2, 0, 1]
[6, 5, 7, 6, 5, 8, 10, 6, 8, 4]
[3, 2, 3, 4, 5, 3, 2, 1, 1, 5]
[5, 5, 5, 2, 6, 11, 8, 13, 6, 11]
[4, 6, 4, 4, 4, 4, 7, 6, 8, 6]
[7, 5, 11, 3, 8, 7, 5, 10, 3, 7]
[12, 5, 7, 10, 8, 4, 5, 6, 8, 4]
Now I build one big list with all the values, but in the order you asked for, a bit like a "transposed" list.
for b in range(10):
    for i in range(len(a)):
        c.append(a[i][b])
print(c)
Output:
[4, 12, 2, 4, 6, 5, 3, 6, 1, 6, 3, 5, 4, 7, 12, 6, 11, 1, 4, 9, 7, 3, 3, 1, 5, 2, 5, 6, 5, 5, 8, 12, 2, 7, 11, 4, 4, 6, 1, 7, 3, 5, 4, 11, 7, 12, 8, 0, 2, 6, 8, 7, 4, 1, 6, 4, 2, 4, 3, 10, 4, 9, 4, 9, 10, 4, 5, 7, 0, 5, 5, 6, 4, 8, 8, 10, 4, 3, 3, 14, 7, 7, 3, 2, 8, 3, 11, 4, 7, 4, 8, 7, 2, 9, 14, 9, 5, 4, 4, 10, 2, 8, 7, 5, 5, 7, 10, 3, 5, 6, 3, 4, 1, 2, 6, 1, 13, 6, 10, 6, 9, 11, 0, 10, 10, 6, 2, 4, 0, 8, 1, 6, 8, 3, 8, 13, 6, 3, 11, 7, 2, 3, 2, 1, 4, 5, 11, 6, 7, 4]
Split this big list into groups of 15, one group per new dataframe:
d = []
for i in range(10):
    d.append(c[(i)*15:((i+1)*15)])
print(d)
Output:
[[4, 12, 2, 4, 6, 5, 3, 6, 1, 6, 3, 5, 4, 7, 12], [6, 11, 1, 4, 9, 7, 3, 3, 1, 5, 2, 5, 6, 5, 5], [8, 12, 2, 7, 11, 4, 4, 6, 1, 7, 3, 5, 4, 11, 7], [12, 8, 0, 2, 6, 8, 7, 4, 1, 6, 4, 2, 4, 3, 10], [4, 9, 4, 9, 10, 4, 5, 7, 0, 5, 5, 6, 4, 8, 8], [10, 4, 3, 3, 14, 7, 7, 3, 2, 8, 3, 11, 4, 7, 4], [8, 7, 2, 9, 14, 9, 5, 4, 4, 10, 2, 8, 7, 5, 5], [7, 10, 3, 5, 6, 3, 4, 1, 2, 6, 1, 13, 6, 10, 6], [9, 11, 0, 10, 10, 6, 2, 4, 0, 8, 1, 6, 8, 3, 8], [13, 6, 3, 11, 7, 2, 3, 2, 1, 4, 5, 11, 6, 7, 4]]
Finally, to create each dataframe, this is what I would do:
df1 = pd.DataFrame({'row1':d[0][:5],'row2':d[0][5:10],'row3':d[0][10:15]}).T
print(df1)
      0   1  2  3   4
row1  4  12  2  4   6
row2  5   3  6  1   6
row3  3   5  4  7  12
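You would then repeat this for every index of d, the list of 15-element sublists. It feels far from ideal, but this is how I managed to solve the problem.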
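The manual steps above (transpose, split into groups of 15, split each group into 3 rows of 5) can also be written more compactly with numpy. The sketch below is one possible way, not the answer's own code: it uses a deterministic stand-in for `a` so the layout is easy to verify, and `frames` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Stand-in for `a`: 15 lists of 10 values each (deterministic, not Poisson draws).
a = [[10 * i + j for j in range(10)] for i in range(15)]

# Transpose (15, 10) -> (10, 15), then split each row of 15 into 3 rows of 5.
d = np.array(a).T.reshape(10, 3, 5)

# One dataframe per sample index, labelled like the manual version above.
frames = [pd.DataFrame(block, index=["row1", "row2", "row3"]) for block in d]

print(frames[0])
```

With this stand-in data, `frames[b]` contains `a[i][b]` for i = 0..14 laid out row by row, which is the same order the nested loops produce.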
Answer 2 (score: 1)
Here is a solution that uses numpy functionality.
import numpy as np
import pandas as pd
mean_route1 = pd.DataFrame(np.random.uniform(0, 10, size=(3,5)))
print(mean_route1)
N=10
a = [np.random.poisson(lam = m, size=N) for m in np.nditer(mean_route1)]
b = np.stack(a)
c = [pd.DataFrame(np.reshape(arr, (3, 5))) for arr in b.T]
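If you print a, b and c, you will see:
a consists of the rows you called m3, except that it is a list of ndarrays. The list has 15 elements, each an ndarray of length 10 generated by np.random.poisson.
b is the stack of a: a 2D array whose rows are the arrays in a.
c is your expected result, a list of dataframes. It is created by transposing b (b.T is the transposed matrix) and iterating over each row of the transposed b (that is, each column of the original b). Each row is reshaped into a (3,5) matrix, converted to a pandas dataframe, and added to c.
For example, if a is: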
[array([5, 4, 6, 6, 3, 0, 2, 7, 5, 3]),
array([ 3, 2, 5, 9, 6, 6, 8, 14, 3, 4]),
array([ 1, 4, 2, 2, 10, 3, 4, 1, 5, 1]),
array([ 8, 8, 3, 2, 4, 12, 3, 3, 2, 4]),
array([5, 4, 1, 5, 8, 0, 4, 3, 5, 1]),
array([ 3, 7, 7, 6, 12, 12, 10, 4, 2, 9]),
array([4, 0, 3, 2, 5, 1, 3, 4, 0, 7]),
array([6, 8, 4, 6, 2, 7, 4, 4, 7, 7]),
array([3, 7, 3, 4, 9, 4, 6, 5, 3, 3]),
array([0, 3, 0, 0, 2, 1, 1, 0, 1, 0]),
array([0, 0, 2, 0, 1, 1, 0, 1, 0, 3]),
array([4, 7, 7, 7, 7, 7, 2, 7, 8, 7]),
array([11, 15, 11, 10, 7, 4, 5, 9, 14, 10]),
array([10, 7, 9, 8, 7, 9, 8, 13, 8, 8]),
array([7, 4, 4, 6, 9, 5, 6, 5, 8, 6])]
then the first dataframe in c (c[0]) is:
0 1 2 3 4
0 5 3 1 8 5
1 3 4 6 3 0
2 0 4 11 10 7