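I have a list of 5 pandas DataFrames, each of size 4 x 3. I want to set the third row of each one to its index in the list (a row of 0s, 1s, 2s, and so on), whatever that index is. I wrote a for loop to do this, and judging by the printed results it seems to work: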
import numpy as np
import pandas as pd
A = pd.DataFrame(np.random.randn(4, 3))
AList = 5 * [A]
AList[0].iloc[2, :]
for kkk0 in range(0, len(AList)):
    AList[kkk0].iloc[2, :] = kkk0 * np.ones((1, 3))
    print(AList[kkk0])
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 0.000000 0.000000 0.000000
3 1.664176 -0.696303 0.239165
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 1.000000 1.000000 1.000000
3 1.664176 -0.696303 0.239165
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 2.000000 2.000000 2.000000
3 1.664176 -0.696303 0.239165
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 3.000000 3.000000 3.000000
3 1.664176 -0.696303 0.239165
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165
But now the absurdity begins: when I inspect the contents of AList after the for loop has finished, the third row of every DataFrame in the list is all 4s!
AList
Out[3]:
[ 0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165,
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165,
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165,
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165,
0 1 2
0 -0.168639 0.300507 2.823529
1 0.608844 0.017578 -0.342164
2 4.000000 4.000000 4.000000
3 1.664176 -0.696303 0.239165]
Any ideas?
Answer 0 (score: 0)
import numpy as np
import pandas as pd
A = pd.DataFrame(np.random.randn(4, 3))
AList = 5 * [A]
# concatenate the list of frames into one frame
df = pd.concat(AList)
# use loc to assign values to every row labelled 2
# use numpy's transpose with arange, since the size of each pandas frame is known
df.loc[2, :] = np.transpose([np.arange(0, 5)] * 3)
# use numpy's split to split the frame back into a list of frames
AList_new = np.split(df, len(AList))
[ 0 1 2
0 1.687788 -0.770912 -0.027720
1 -1.868220 -0.475117 -0.266580
2 0.000000 0.000000 0.000000
3 -0.537249 0.414133 1.623596,
0 1 2
0 1.687788 -0.770912 -0.027720
1 -1.868220 -0.475117 -0.266580
2 1.000000 1.000000 1.000000
3 -0.537249 0.414133 1.623596,
0 1 2
0 1.687788 -0.770912 -0.027720
1 -1.868220 -0.475117 -0.266580
2 2.000000 2.000000 2.000000
3 -0.537249 0.414133 1.623596,
0 1 2
0 1.687788 -0.770912 -0.027720
1 -1.868220 -0.475117 -0.266580
2 3.000000 3.000000 3.000000
3 -0.537249 0.414133 1.623596,
0 1 2
0 1.687788 -0.770912 -0.027720
1 -1.868220 -0.475117 -0.266580
2 4.000000 4.000000 4.000000
3 -0.537249 0.414133 1.623596]
Answer 1 (score: 0)
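This is not absurd at all. The reason for the observed behavior is that the list you created contains the same object 5 times. Although you use different indices to access AList, you always reach the same object, so when you print it at the end, row 2 holds the last value assigned to it, which is 4.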
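The aliasing described above can be checked directly. The following is a minimal sketch, not part of the original answer; the names `same_object`, `distinct_objects`, and `row_seen_via_last` are illustrative only, and the value 7.0 is an arbitrary marker.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(np.random.randn(4, 3))
AList = 5 * [A]  # five references to one DataFrame, not five DataFrames

# Every slot in the list is the very same object as A.
same_object = all(item is A for item in AList)
distinct_objects = len({id(item) for item in AList})
print(same_object, distinct_objects)  # True 1

# Writing through one index is therefore visible through every other index.
AList[0].iloc[2, :] = 7.0
row_seen_via_last = AList[4].iloc[2, :].tolist()
print(row_seen_via_last)  # [7.0, 7.0, 7.0]
```

List repetition with `*` copies references, so this applies to any mutable object placed in the list, not only to DataFrames.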
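If you follow @It_is_Chris's logic, the 5 objects are concatenated and then split apart again. That is a roundabout way of producing copies; you can achieve the same thing with a minimal change to your code: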
import numpy as np
import pandas as pd
A = pd.DataFrame(np.random.randn(4, 3))
# instead of creating a list with 5 identical
# objects using 5 * [A], create 5 copies
AList = [A.copy() for _ in range(5)]
AList[0].iloc[2, :]
for kkk0 in range(0, len(AList)):
    AList[kkk0].iloc[2, :] = kkk0 * np.ones((1, 3))
    print(AList[kkk0])
AList
Output:
[ 0 1 2
0 0.319473 -0.503133 -0.394476
1 -1.032836 -1.212072 -0.771076
2 0.000000 0.000000 0.000000
3 0.173137 0.387402 -1.256148,
0 1 2
0 0.319473 -0.503133 -0.394476
1 -1.032836 -1.212072 -0.771076
2 1.000000 1.000000 1.000000
3 0.173137 0.387402 -1.256148,
0 1 2
0 0.319473 -0.503133 -0.394476
1 -1.032836 -1.212072 -0.771076
2 2.000000 2.000000 2.000000
3 0.173137 0.387402 -1.256148,
0 1 2
0 0.319473 -0.503133 -0.394476
1 -1.032836 -1.212072 -0.771076
2 3.000000 3.000000 3.000000
3 0.173137 0.387402 -1.256148,
0 1 2
0 0.319473 -0.503133 -0.394476
1 -1.032836 -1.212072 -0.771076
2 4.000000 4.000000 4.000000
3 0.173137 0.387402 -1.256148]
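To confirm that the copies really are independent, one could run a check along these lines. This is a sketch under the same setup as above; `original_row`, `distinct_objects`, `third_rows`, and `source_untouched` are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(np.random.randn(4, 3))
original_row = A.iloc[2, :].copy()  # remember A's third row before the loop

# DataFrame.copy() is deep by default, so each element owns its own data.
AList = [A.copy() for _ in range(5)]

for k, frame in enumerate(AList):
    frame.iloc[2, :] = float(k)  # fill the third row with the list index

distinct_objects = len({id(frame) for frame in AList})
third_rows = [float(frame.iloc[2, 0]) for frame in AList]
source_untouched = A.iloc[2, :].equals(original_row)

print(distinct_objects)   # 5
print(third_rows)         # [0.0, 1.0, 2.0, 3.0, 4.0]
print(source_untouched)   # True
```

Each frame keeps its own value after the loop, and the original `A` is left unchanged, which is the difference from the `5 * [A]` version.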