Updating a DataFrame in pandas while iterating

Time: 2019-09-26 07:41:50

Tags: python-3.x pandas numpy

I need to iterate over the values in a DataFrame. The DataFrame has old value and newvalue columns:

old value    newvalue  date       casenumber
aab          baa       1/1/2019     001
acb          bca       2/2/2019     002
abc          cba       1/7/2109     003
acd          dca       2/8/2019     004
aab          bca       2/23/2019    005
acb          baa       4/6/2019     006
abc          dca       4/9/2019     007
aab          baa       1/23/2019    008

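I want to iterate over the values in old value to find out, for each old value such as aab, how many times it changed to each new value in each month.

Expected output: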

         jan   feb  march  April
aab-baa   2                 1    
aab-bca         1
acb-bca         1
acb-baa                     4
abc-cba    1
abc-dca                      4
acd-dca         1

The code I used to get the output:

import pandas as pd

df = pd.read_excel(r"")
# every time, I change the old value here manually
f8 = df[df['Old Value'] == 'aab']
# one filter per possible new value
f9 = f8[f8['New Value'] == 'baa']
f1 = f8[f8['New Value'] == 'bca']
f2 = f8[f8['New Value'] == 'cba']
f3 = f8[f8['New Value'] == 'dca']
f4 = f8[f8['New Value'] == 'abc']

# stitch the filtered pieces back together
d1 = pd.concat([f9, f1])
d2 = pd.concat([f2, f3])
d3 = pd.concat([d1, d2])
d4 = pd.concat([d3, f4])

# count the cases per new value
df10 = d4[['Case Number', 'Old Value', 'New Value']]
f9 = df10.set_index(["New Value", "Old Value"]).groupby(level="New Value").count()


Output:

             jan   feb  march  April
    aab-baa   2                  1    
    aab-bca         1
    acb-bca         1
    acb-baa                      4
    abc-cba   1
    abc-dca                      4
    acd-dca         1

1 answer:

Answer 0 (score: 1)

First convert the date column to datetimes, then reshape with crosstab and Series.dt.month so that the months come out in the correct order. Add DataFrame.reindex to include any missing months (if necessary), convert the column labels to month names, and finally turn the MultiIndex into the first two columns:

df['date'] = pd.to_datetime(df['date'])
df = (pd.crosstab([df['old value'], df['newvalue']], df['date'].dt.month)
        .reindex(columns=range(1, 13), fill_value=0)
        .rename(columns=lambda x: pd.to_datetime(x, format='%m').strftime('%b'))
        .reset_index()
        .rename_axis(None, axis=1))
print (df)

  old value newvalue  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  \
0       aab      baa    2    0    0    0    0    0    0    0    0    0    0
1       aab      bca    0    1    0    0    0    0    0    0    0    0    0
2       abc      cba    1    0    0    0    0    0    0    0    0    0    0
3       abc      dca    0    0    0    1    0    0    0    0    0    0    0
4       acb      baa    0    0    0    1    0    0    0    0    0    0    0
5       acb      bca    0    1    0    0    0    0    0    0    0    0    0
6       acd      dca    0    1    0    0    0    0    0    0    0    0    0

   Dec
0    0
1    0
2    0
3    0
4    0
5    0
6    0

It is possible to replace 0 with empty strings, but the columns then hold numbers mixed with strings, which is likely to cause problems in later processing.
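As a self-contained sketch of the approach above, the snippet below rebuilds the question's sample rows in memory (the Excel file is not available, so the inline DataFrame is an assumption), counts the transitions per month, and then shows what replacing 0 with an empty string does to the column dtype. The MONTHS list and the names counts and blanked are illustrative and do not come from the original post; the month labels are taken from a fixed list rather than parsed with to_datetime, which is a substitution for the rename step in the answer.

```python
import pandas as pd

# Sample rows copied from the question's table (assumed stand-in for the Excel file).
df = pd.DataFrame({
    "old value": ["aab", "acb", "abc", "acd", "aab", "acb", "abc", "aab"],
    "newvalue": ["baa", "bca", "cba", "dca", "bca", "baa", "dca", "baa"],
    "date": ["1/1/2019", "2/2/2019", "1/7/2109", "2/8/2019",
             "2/23/2019", "4/6/2019", "4/9/2019", "1/23/2019"],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Fixed month labels, so the result does not depend on locale settings.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

counts = (
    # rows: (old value, newvalue) pairs; columns: month numbers present in the data
    pd.crosstab([df["old value"], df["newvalue"]], df["date"].dt.month)
    # add the months that never occur, filled with zero counts
    .reindex(columns=range(1, 13), fill_value=0)
    # month number -> short month name
    .rename(columns=lambda m: MONTHS[m - 1])
    # move the two index levels into ordinary columns
    .reset_index()
    .rename_axis(None, axis=1)
)

# Cosmetic variant: zeros shown as empty strings.
blanked = counts.replace(0, "")

print(counts)
print(counts["Jan"].dtype)   # integer dtype: sums and comparisons work
print(blanked["Jan"].dtype)  # mixed ints and strings, no longer numeric
```

Keeping counts for any further calculation and using blanked only for display avoids the mixed-type problem mentioned above.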