Iterate over DataFrame rows and assign values to lists based on the variables in each list?

Time: 2019-05-01 10:52:37

Tags: python-3.x pandas list dataframe

I have a DataFrame, available here: https://drive.google.com/file/d/1qcQRwmFIkTJHPaknXjV1vNlDScw1Fxf6/view?usp=sharing

 Kyphosis  Age  Number  Start  prob_Age  prob_Number  prob_Start
50   absent   68       5     10  0.993964     0.208729    0.916693
51   absent    9       2     17  0.997321     0.904427    0.047178
52  present  139      10      6  0.004772     0.001366    0.964974
53   absent    2       2     17  0.997710     0.904427    0.047178
54   absent  140       4     15  0.004711     0.779213    0.072759
55   absent   72       5     15  0.993830     0.208729    0.072759
56   absent    2       3     13  0.997710     0.829827    0.090356
57  present  120       5      8  0.005786     0.208729    0.939803
58   absent   51       7      9  0.994754     0.072175    0.927241
59   absent  102       3     13  0.006362     0.829827    0.090356
60  present  130       4      1  0.005290     0.779213    0.996493
61  present  114       7      8  0.006029     0.072175    0.939803
62   absent   81       4      1  0.993617     0.779213    0.996493
63   absent  118       3     16  0.005872     0.829827    0.060197
64   absent  118       4     16  0.005872     0.779213    0.060197
65   absent   17       4     10  0.996844     0.779213    0.916693
66   absent  195       2     17  0.001558     0.904427    0.047178
67   absent  159       4     13  0.003517     0.779213    0.090356
68   absent   18       4     11  0.996783     0.779213    0.909644
69   absent   15       5     16  0.996966     0.208729    0.060197
70   absent  158       5     14  0.003580     0.208729    0.083307
71   absent  127       4     12  0.005449     0.779213    0.092836
72   absent   87       4     16  0.993547     0.779213    0.060197
73   absent  206       4     10  0.001135     0.779213    0.916693
74   absent   11       3     15  0.997205     0.829827    0.072759
75   absent  178       4     15  0.002387     0.779213    0.072759
76  present  157       3     13  0.003643     0.829827    0.090356
77   absent   26       7     13  0.996282     0.072175    0.090356
78   absent  120       2     13  0.005786     0.904427    0.090356
79  present   42       7      6  0.995277     0.072175    0.964974
80   absent   36       4     13  0.995648     0.779213    0.090356

I have these lists:

A=0,S=0,N=0
X3=[A,S]
X7=[N,A,A,A,S,S]
X5=[S,N,A,A,S,A,S]
X4=[N,S,N,A,A,S,A,S]
X9=[N,S,N,A,A,S,A,S]
X10=[A,A,A,S,S]
list=[ X7,  X7,  X5,  X7,  X7,  X7,  X7,  X5,  X7,  X7,  X5,  X5,  X3,  X7,  X7,  X7, X10, X10,  X7,  X7, X10,  X7,  X7, X10,  X7, X10,  X9,  X7,  X7,  X4,X7]

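Now, my goal is to go through df and, for each record, put the values of the columns "prob_Age", "prob_Number", and "prob_Start" into the corresponding entry of "list".

I tried the following code: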

A=0,S=0,N=0
X3=[N,A,S]
X7=[A,S]
X5=[A,S]
X4=[N,A,S]
X9=[A,A,S]
X10=[A,A,S]
list=[ X7,  X7,  X5,  X7,  X7,  X7,  X7,  X5,  X7,  X7,  X5,  X5,  X3,  X7,  X7,  X7, X10, X10,  X7,  X7, X10,  X7,  X7, X10,  X7, X10,  X9,  X7,  X7,  X4,X7]
list1=[]
for i in df.iterrows():
    A=df['prob_Age']
    S=df['prob_Number']
    N=df['prob_Start']
    print(list)

Expected output:

list=[ [0.993964,0.916693],  [0.997321,0.047178],  [0.004772,0.964974],  [0.997710,0.047178],  [0.004711,0.072759], 
      [0.993830,0.072759],  [0.997710,0.090356],  [0.005786,0.939803],  [0.994754,0.927241],  [0.006362,0.090356],  
      [0.005290,0.996493],  [0.006029,0.939803],  [0.993617,0.779213,0.996493],  [0.005872,0.060197],  [0.005872,0.060197],
      [0.996844,0.916693], [0.001558,0.001558,0.047178], [0.003517,0.090356],  [ 0.996783,0.909644],  [0.996966,0.060197], 
      [0.003580,0.003580,0.083307],  [0.005449,0.092836],  [0.993547,0.060197], [0.001135,0.001135,0.916693], [0.997205,0.072759], 
      [ 0.002387,0.002387,0.072759],  [0.003643,0.003643 ,0.090356],  [0.996282 ,0.090356],  [0.005786,0.090356],  [0.995277,0.072175,0.964974],[0.995648,0.090356]]

I got the answer, thanks everyone:

list=[]
c=0
for _, x in df.iterrows():
    A, S, N = x[['prob_Age', 'prob_Start', 'prob_Number']].values
    X3=[N,A,S]
    X7=[A,S]
    X5=[A,S]
    X4=[N,A,S]
    X9=[A,A,S]
    X10=[A,A,S]
    l=[ X7,  X7,  X5,  X7,  X7,  X7,  X7,  X5,  X7,  X7,  X5,  X5,  X3,  X7,  X7,  X7, X10, X10,  X7,  X7, X10,  X7,  X7, X10,  X7, X10,  X9,  X7,  X7,  X4, X7]
    list.append(l[c])
    c=c+1
print(list)

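3 answers:

Answer 0 (score: 1)

First, list is a built-in name in Python, so you really shouldn't use it as a variable name. Second, although you reassign the variables A, S, N on every iteration (in practice they never change, because each iteration assigns them the same value, the whole column), you never change the contents of any of the lists. So, to get the output you want on each iteration, you should do something like this: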

for _, x in df.iterrows():
    A, S, N = x[['prob_Age', 'prob_Number', 'prob_Start']].values
    X3=[N,A,S]
    X7=[A,S]
    X5=[A,S]
    X4=[N,A,S]
    X9=[A,A,S]
    X10=[A,A,S]
    l=[ X7,  X7,  X5,  X7,  X7,  X7,  X7,  X5,  X7,  X7,  X5,  X5,  X3,  X7,  X7,  X7, X10, X10,  X7,  X7, X10,  X7,  X7, X10,  X7, X10,  X9,  X7,  X7,  X4,X7]
    print(l)
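
The point about the lists never changing can be seen without pandas. Here is a minimal sketch, with values picked only for illustration: a list stores the objects its names referred to when it was built, so rebinding A or S afterwards has no effect on it, and the list has to be rebuilt once the row's values are known.

```python
# Start with the placeholder values, as in the question.
A, S = 0, 0
X7 = [A, S]                 # stores the objects 0 and 0, not the names A and S

# Rebinding the names afterwards does not reach into X7.
A, S = 0.993964, 0.916693
X7_before = X7.copy()
print(X7_before)            # still the placeholders

# Rebuilding the list after the assignment picks up the new values.
X7_after = [A, S]
print(X7_after)
```

This is why the loop above recreates X3, X7 and the other lists inside the loop body, after A, S, N have been read from the current row.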

Depending on what your end goal is, I'm fairly sure there is a better solution.

Edit: this might be a bit better:

inds = [
    'X7', 'X7', 'X5', 'X7', 'X7', 'X7', 'X7', 'X5', 'X7', 'X7',
    'X5', 'X5', 'X3', 'X7', 'X7', 'X7', 'X10', 'X10', 'X7', 'X7',
    'X10', 'X7', 'X7', 'X10', 'X7', 'X10', 'X9', 'X7', 'X7', 'X4', 'X7'
]
def fill_in(idx, row):
    A, S, N = row[['prob_Age', 'prob_Number', 'prob_Start']].values
    d = {
        'X3': [N,A,S],
        'X7': [A,S],
        'X5': [A,S],
        'X4': [N,A,S],
        'X9': [A,A,S],
        'X10': [A,A,S]
    }
    return d[inds[idx]]

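# Use the row position, not the index label: the labels in the question start at 50,
# so indexing inds with them would raise an IndexError.
l = [fill_in(i, x) for i, (_, x) in enumerate(df.iterrows())]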
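
A possible alternative to calling fill_in once per row, sketched under a few assumptions: the three-row frame below is a stand-in for the question's data (rows 50 to 52), templates is a hypothetical name, and the letter-to-column mapping (A = prob_Age, S = prob_Start, N = prob_Number) follows the asker's final code and expected output rather than the unpacking order used above. Pairing pattern names with rows by position through zip also avoids depending on the index labels.

```python
import pandas as pd

# Stand-in for the question's DataFrame: rows 50-52 only.
df = pd.DataFrame(
    {
        "prob_Age": [0.993964, 0.997321, 0.004772],
        "prob_Number": [0.208729, 0.904427, 0.001366],
        "prob_Start": [0.916693, 0.047178, 0.964974],
    },
    index=[50, 51, 52],
)

# Hypothetical templates: each pattern name lists the columns to read, in order.
# Assumed mapping: A = prob_Age, S = prob_Start, N = prob_Number.
templates = {
    "X7": ["prob_Age", "prob_Start"],                  # [A, S]
    "X5": ["prob_Age", "prob_Start"],                  # [A, S]
    "X3": ["prob_Number", "prob_Age", "prob_Start"],   # [N, A, S]
}

# One pattern name per row, in row order (shortened to fit the stand-in frame).
inds = ["X7", "X7", "X5"]

# to_dict("records") yields one {column: value} dict per row, in row order.
records = df.to_dict("records")
result = [[rec[col] for col in templates[name]] for name, rec in zip(inds, records)]
print(result)
```

Keeping column names in the templates, instead of the values themselves, means the dict is built once rather than once per row.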

Answer 1 (score: 0)

This may not be the best way to do it.

#A=0,S=0,N=0 This is not required
#X3=[N,A,S]
#X7=[A,S]
#X5=[A,S]
#X4=[N,A,S]
#X9=[A,A,S]
#X10=[A,A,S]
l =[ 'X7',  'X7',  'X5',  'X7',  'X7',  'X7',  'X7',  'X5'........]
# as pointed out above list should not be used as a variable
# change l values into string, there are easy ways to do this.
idx = 0

for i, row in df.iterrows():    # iterrows() yields (index, row) pairs, so unpack both
    A = row['prob_Age']
    S = row['prob_Number']
    N = row['prob_Start']
    X3 = [N, A, S]
    X7 = [A, S]
    X5 = [A, S]
    X4 = [N, A, S]
    # l holds the pattern names as strings, so no str() call is needed.
    if l[idx] == 'X7':
        l[idx] = X7
    elif l[idx] == 'X3':
        l[idx] = X3
    elif l[idx] == 'X5':
        l[idx] = X5
    elif l[idx] == 'X4':
        l[idx] = X4
    ### Add more conditions here
    idx += 1

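Answer 2 (score: 0)

The question isn't entirely clear, but if I understand it correctly and you want a list containing one list per row (val 1, val 2, val 3), then this works. I mocked up a similar df (called "data" in the example: https://imgur.com/a/MhTuUbh) to test it.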


list_to_fill = []

length = len(data)

row = 0
col = 4
cell = data.iat[row, col]

for r in range (length):

    temp_row_list = []
    for i in range(3):
        cell = data.iat[row, col]
        temp_row_list.append(cell)
        col = col + 1    
    list_to_fill.append(temp_row_list)

    col = 4
    row = row + 1

print ('final list =', list_to_fill)

This gives you:

final list = [[4, 5, 1], [7, 3, 2], [9, 0, 8]]
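
For comparison, the same nested list can probably be obtained without explicit loops by slicing the three columns by position. This is only a sketch: the frame below is a hypothetical stand-in, built so that columns 4 to 6 reproduce the output shown above, and it is not the data from the linked image.

```python
import pandas as pd

# Hypothetical stand-in frame: four leading columns, then the three value columns.
data = pd.DataFrame(
    {
        "c0": ["a", "b", "c"],
        "c1": [0, 0, 0],
        "c2": [0, 0, 0],
        "c3": [0, 0, 0],
        "val1": [4, 7, 9],
        "val2": [5, 3, 0],
        "val3": [1, 2, 8],
    }
)

# Every row, columns at positions 4, 5 and 6, converted to nested Python lists.
list_to_fill = data.iloc[:, 4:7].values.tolist()
print("final list =", list_to_fill)
```

The slice 4:7 plays the role of the col counter in the loop version, which starts at 4 and advances three times per row.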