Python Pandas: inconsistent results from a nested for-each iteration

Asked: 2014-05-19 19:47:11

Tags: python pandas

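This is my first post, so apologies in advance; I am struggling to understand what is going on.

I am using Pandas and Python for data analysis and have run into a problem I don't understand (which also makes it hard to explain).

Inside a for-each loop I call a function that tries to insert values into a Pandas DataFrame based on specified start and end column positions. The statement is: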

cfg_broll_temp.ix[cfg_broll_temp['aged_debt'] == colb,scolb:ecolb] = tmp_brollb

The error is:

ValueError: Must have equal len keys and value when setting with an iterable
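
For context, here is a minimal sketch of how this kind of error can arise. It assumes a modern pandas, where `.loc` replaces the deprecated `.ix`. The frame `demo` and its `m1` to `m3` columns are made up for illustration. Assigning a list whose length differs from the number of selected columns is rejected:

```python
import pandas as pd

# Hypothetical frame: one label column plus three numeric columns.
demo = pd.DataFrame({
    "aged_debt": ["item1", "item2"],
    "m1": [0.0, 0.0],
    "m2": [0.0, 0.0],
    "m3": [0.0, 0.0],
})

row_mask = demo["aged_debt"] == "item1"
target_cols = ["m1", "m2", "m3"]

# Three values for three columns: accepted.
demo.loc[row_mask, target_cols] = [0.1, 0.2, 0.3]

# Two values for three columns: pandas raises ValueError.
try:
    demo.loc[row_mask, target_cols] = [0.1, 0.2]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
```

So the error itself only says that the number of values and the number of target columns disagree; it does not say which of the two is off.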

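I have traced this to the iteration inside the function I am calling. I run a for-each loop over a list of items, and for each item a nested for-each steps through a Pandas Series (cfg_broll_sr[srowb:erowb]) to build a list of results (tmp_brollb), which I then write into the DataFrame in the failing statement.

My problem is that, even though I call this function with the same starting conditions, it seems to produce a list of inconsistent length. For example, with srowb = 1 and erowb = 18 it produces a list (tmp_brollb) where len(tmp_brollb) is sometimes 17 and sometimes 18.

I am banging my head against the wall trying to isolate whatever is varying and causing this.

The specific block of code I am having trouble with is: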

# Set starting conditions

srowb = 1 # starting row
if np.isnan(variable) :
    erowb = 18 
else:
    erowb = int(variable)

# iterate through list
tmp_brollb = [] # Generate empty array
z = 0 # reset starting value
for itemr in cfg_broll_sr[srowb:erowb]:
    z = min(z + itemr,1) # prevent more than 100%
    tmp_brollb.append(z)

# Increment

srowb = srowb + 1
erowb = min(erowb + 1,20)
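
The accumulation step in this block can be written as a small standalone function, which makes it easier to test separately from the slicing. This is only a sketch: the name `capped_running_total` and the sample inputs are illustrative and not part of the original code.

```python
from itertools import accumulate

def capped_running_total(values, cap=1.0):
    """Return running totals of `values`, each step clamped to at most `cap`."""
    # `initial=0` mirrors `z = 0`; the leading 0 is dropped afterwards.
    totals = accumulate(values, lambda z, item: min(z + item, cap), initial=0)
    return list(totals)[1:]

sample = capped_running_total([0.25, 0.5, 0.5, 0.25])
```

The result always has exactly one entry per input value, so any variation in `len(tmp_brollb)` has to come from the slice `cfg_broll_sr[srowb:erowb]` and not from the accumulation.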

More detail on the code:

# Set maximum number of rows and starting row
rowsb = 18
srowb = 1 # starting row

# check if a event exists and set maximum row to cease value
if np.isnan(event_month) :
    erowb = 18 
else:
    erowb = int(event_month) 

# Set starting conditions for range insertion 
basecolb = 7 # starting column for roll data
scolb = basecolb # Getting an expanding range of columns from 1 to 18
ecolb = basecolb + erowb

# item tuple for step through
dcolsb = ('item1','item2','item3')

# step through and update values
for colb in dcolsb :

    # Generate accumulated arrays
    tmp_brollb = None # clear variable
    tmp_brollb = [] # Generate empty array
    z = 0 # reset starting value
    for itemr in cfg_broll_sr[srowb:erowb]:
        z = min(z + itemr,1) # prevent more than 100%
        tmp_brollb.append(z)

    # insert accumulated values into selected row and column range
    cfg_broll_temp.ix[cfg_broll_temp['item_column'] == colb,scolb:ecolb] = tmp_brollb
    # carry last value from tmp_brollb forward to end of column range
    cfg_broll_temp.ix[cfg_broll_temp['item_column'] == colb,ecolb:] = tmp_brollb[-1]

    # increment all positional variables
    srowb = srowb + 1
    erowb = min(erowb + 1,20)
    scolb = scolb + 1
    ecolb = min(ecolb + 1,25)
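
One way to make the failing assignment easier to debug is to compare the width of the column range with the number of accumulated values before assigning. The sketch below assumes a modern pandas (where `.ix` has been removed) and uses a hypothetical helper, `assign_row_range`. The frame is toy data, not the original `cfg_broll_temp`.

```python
import numpy as np
import pandas as pd

def assign_row_range(frame, row_mask, start_col, end_col, values):
    """Write `values` into columns [start_col, end_col) of the masked rows.

    Raises a descriptive error instead of relying on pandas' own message
    when the number of values does not match the number of columns.
    """
    end_col = min(end_col, frame.shape[1])  # positional slices clip silently
    width = end_col - start_col
    if width != len(values):
        raise ValueError(
            f"column range {start_col}:{end_col} has {width} columns "
            f"but {len(values)} values were supplied"
        )
    rows = np.flatnonzero(np.asarray(row_mask))
    frame.iloc[rows, start_col:end_col] = values

toy = pd.DataFrame({
    "item_column": ["item1", "item2"],
    "c1": [0.0, 0.0],
    "c2": [0.0, 0.0],
    "c3": [0.0, 0.0],
})
assign_row_range(toy, toy["item_column"] == "item1", 1, 4, [0.2, 0.6, 1.0])
```

Using positional indexing for both rows and columns keeps the check and the assignment in the same coordinate system, so a mismatch is reported with the actual numbers involved.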

If I print the following, I get:

    print "inconsistent length issue", len(tmp_brollb), erowb - srowb, srowb, erowb


run strategy 17
inconsistent length issue 18 17 1 18 # len(tmp_brollb) inconsistent despite same erowb - srowb value
inconsistent length issue 17 17 2 19
inconsistent length issue 16 17 3 20
inconsistent length issue 15 16 4 20
inconsistent length issue 14 15 5 20
inconsistent length issue 13 14 6 20
inconsistent length issue 12 13 7 20
inconsistent length issue 11 12 8 20
inconsistent length issue 10 11 9 20
inconsistent length issue 9 10 10 20
inconsistent length issue 8 9 11 20
inconsistent length issue 7 8 12 20
inconsistent length issue 6 7 13 20
run strategy 17 again via parent iterator
inconsistent length issue 17 17 1 18 # len(tmp_brollb) inconsistent despite same erowb - srowb value
inconsistent length issue 17 17 2 19
inconsistent length issue 16 17 3 20
inconsistent length issue 15 16 4 20
inconsistent length issue 14 15 5 20
inconsistent length issue 13 14 6 20
inconsistent length issue 12 13 7 20
inconsistent length issue 11 12 8 20
inconsistent length issue 10 11 9 20
inconsistent length issue 9 10 10 20
inconsistent length issue 8 9 11 20
inconsistent length issue 7 8 12 20
inconsistent length issue 6 7 13 20
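
One possible reading of the two runs above (an assumption, not something confirmed in the post) is that both sets of lengths are what a 19-element Series would give. The first run matches label-based slicing, which includes the end label. The second matches positional slicing, which excludes the end position. The sketch below uses a made-up Series `sr` with the default index 0 to 18, and `.loc` / `.iloc` to make the two behaviours explicit. The helper name `slice_lengths` is hypothetical.

```python
import pandas as pd

sr = pd.Series([0.05] * 19)  # made-up data, default integer index 0..18

def slice_lengths(series, srowb=1, erowb=18, steps=13):
    """Return (label_len, positional_len) per step, incrementing like the loop."""
    results = []
    for _ in range(steps):
        label_len = len(series.loc[srowb:erowb])        # end label included
        positional_len = len(series.iloc[srowb:erowb])  # end position excluded
        results.append((label_len, positional_len))
        srowb = srowb + 1
        erowb = min(erowb + 1, 20)
    return results

lengths = slice_lengths(sr)
label_lengths = [a for a, _ in lengths]
positional_lengths = [b for _, b in lengths]
```

If this reading applies, the first things to try would be comparing `type(cfg_broll_sr.index)` and `len(cfg_broll_sr)` between the two runs, and slicing with `.iloc` explicitly.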

Any help would be greatly appreciated, as I now have a four-hour headache and it is still growing.

0 answers:

No answers