Python dataframe updating flags

Time: 2019-06-19 21:36:20

Tags: python

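I am creating four columns labeled flagMin, flagMax, flagLow, and flagUp. I update these dataframe columns each time the loop runs, but the earlier data is overwritten on every pass. I want to keep the previous data in the four columns, since they contain a 1 wherever the condition was true.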

import pandas as pd
import numpy as np
df = pd.read_excel('help test 1.xlsx')

#groupby function separates the different Name parameters within the Name column and performing functions like finding the lowest of the "minimum" and "lower" columns and highest of the "maximum" and "upper" columns.

flagMin = df.groupby(['Name'], as_index=False)['Min'].min()
flagMax = df.groupby(['Name'], as_index=False)['Max'].max()
flagLow = df.groupby(['Name'], as_index=False)['Lower'].min()
flagUp = df.groupby(['Name'], as_index=False)['Upper'].max()
print(flagMin)
print(flagMax)
print(flagLow)
print(flagUp)

num = len(flagMin)  #size of 2, works for all flags in this case

for i in range(num):
    #iterating through each row of parameters and column number 1(min,max,lower,upper column) 
    colMin = flagMin.iloc[i, 1]
    colMax = flagMax.iloc[i, 1]
    colLow = flagLow.iloc[i, 1]
    colUp = flagUp.iloc[i, 1]

    #setting flags if any column's parameter matches the flag dataframe's parameter, sets a 1 if true, sets a 0 if false
    df['flagMin'] = np.where(df['Min'] == colMin, '1', '0') 
    df['flagMax'] = np.where(df['Max'] == colMax, '1', '0')
    df['flagLow'] = np.where(df['Lower'] == colLow, '1', '0')
    df['flagUp'] = np.where(df['Upper'] == colUp, '1', '0')
    print(df)

4 Dataframes for each flag printed above
    Name       Min
0     Vo       12.8
1     Vi      -51.3

    Name       Max
0     Vo       39.9
1     Vi      -25.7

    Name       Low
0     Vo      -46.0
1     Vi      -66.1

   Name        Up
0     Vo       94.3
1     Vi      -14.1

Output of the first iteration

      flagMax    flagLow   flagUp  
0        0         0         0  
1        0         0         0  
2        0         0         0  
3        1         0         0  
4        0         0         0  
5        0         0         0  
6        0         0         1  
7        0         1         0  
8        0         0         0  
9        0         0         0  
10       0         0         0  
11       0         0         0  
12       0         0         0  
13       0         0         0  
14       0         0         0  
15       0         0         0  
16       0         0         0  
17       0         0         0 

Output of the second iteration

      flagMax   flagLow   flagUp
0        0         0         0  
1        0         0         0  
2        0         0         0  
3        0         0         0  
4        0         0         0  
5        0         0         0  
6        0         0         0  
7        0         0         0  
8        0         0         0  
9        1         0         1  
10       0         0         0  
11       0         0         0  
12       0         0         0  
13       0         0         0  
14       0         0         0  
15       0         1         0  
16       0         0         0  
17       0         0         0  

I lose the 1s in rows 3, 6, and 7. I want to keep the 1s from both sets of data. Thanks.

1 answer:

Answer 0 (score: 0)

Set only the elements that need updating to '1', instead of reassigning the entire column on every iteration:

import pandas as pd
import numpy as np
df = pd.read_excel('help test 1.xlsx')

#groupby function separates the different Name parameters within the Name column and performing functions like finding the lowest of the "minimum" and "lower" columns and highest of the "maximum" and "upper" columns.

flagMin = df.groupby(['Name'], as_index=False)['Min'].min()
flagMax = df.groupby(['Name'], as_index=False)['Max'].max()
flagLow = df.groupby(['Name'], as_index=False)['Lower'].min()
flagUp = df.groupby(['Name'], as_index=False)['Upper'].max()
print(flagMin)
print(flagMax)
print(flagLow)
print(flagUp)

num = len(flagMin)  #size of 2, works for all flags in this case

df['flagMin'] = '0'
df['flagMax'] = '0'
df['flagLow'] = '0'
df['flagUp'] = '0'

for i in range(num):
    #iterating through each row of parameters and column number 1(min,max,lower,upper column) 
    colMin = flagMin.iloc[i, 1]
    colMax = flagMax.iloc[i, 1]
    colLow = flagLow.iloc[i, 1]
    colUp = flagUp.iloc[i, 1]

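    #setting a flag to '1' only where the column's value matches the flag dataframe's value; rows flagged in earlier iterations are left untouched
    #.loc is used rather than chained indexing (df['flagMin'][mask] = '1'), which raises SettingWithCopyWarning and may not modify df
    df.loc[df['Min'] == colMin, 'flagMin'] = '1'
    df.loc[df['Max'] == colMax, 'flagMax'] = '1'
    df.loc[df['Lower'] == colLow, 'flagLow'] = '1'
    df.loc[df['Upper'] == colUp, 'flagUp'] = '1'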
    print(df)
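
The spreadsheet from the question is not available, so here is a minimal sketch on made-up data that isolates the difference between the two approaches. The column values and the `targets` list are assumptions for illustration only; `overwritten` and `kept` are hypothetical column names.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the two "iterations" look for 1.0 and then 3.0.
df = pd.DataFrame({"Min": [1.0, 2.0, 3.0]})
targets = [1.0, 3.0]

# Overwriting: np.where rebuilds the whole column on each pass,
# so only the matches for the last target survive.
for t in targets:
    df["overwritten"] = np.where(df["Min"] == t, "1", "0")

# Accumulating: initialise the column once, then set only the matching rows.
df["kept"] = "0"
for t in targets:
    df.loc[df["Min"] == t, "kept"] = "1"

print(df)
```

After the loops, `overwritten` holds a '1' only in the last row, while `kept` holds a '1' in both the first and last rows.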

P.S. I don't know why you are using the strings '0' and '1' rather than just the integers 0 and 1, but that's up to you.
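
If integer flags are acceptable, the loop can probably be dropped altogether. The sketch below is an alternative technique, not the method from the answer above: it uses `groupby(...).transform` to broadcast each group's extreme back onto its rows and compares in one step. The sample data and the `rules` list are invented for illustration. Note one behavioral difference: this compares each row only against the extreme of its own `Name` group, whereas the loop above compares every row against every group's extreme.

```python
import pandas as pd

# Hypothetical sample data standing in for 'help test 1.xlsx'.
df = pd.DataFrame({
    "Name":  ["Vo", "Vo", "Vo", "Vi", "Vi", "Vi"],
    "Min":   [12.8, 20.0, 15.5, -40.0, -51.3, -45.0],
    "Max":   [30.0, 39.9, 35.0, -25.7, -30.0, -28.0],
    "Lower": [-46.0, -10.0, -20.0, -60.0, -66.1, -50.0],
    "Upper": [80.0, 94.3, 90.0, -14.1, -20.0, -18.0],
})

# (flag column, source column, aggregation) for each of the four flags.
rules = [
    ("flagMin", "Min", "min"),
    ("flagMax", "Max", "max"),
    ("flagLow", "Lower", "min"),
    ("flagUp", "Upper", "max"),
]

for flag, col, agg in rules:
    # transform returns a Series aligned with df, holding each row's group extreme.
    extreme = df.groupby("Name")[col].transform(agg)
    # True where the row holds its group's extreme; stored as integer 0/1.
    df[flag] = (df[col] == extreme).astype(int)

print(df)
```

Because each flag column is built once from a full comparison, there is nothing to overwrite between iterations.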