Pandas True/False matching

Time: 2020-08-16 17:52:01

Tags: pandas dataframe boolean calculated-columns

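Given this table:

[image: table showing col_1 and desired_output, with red and green arrows]

I want to generate the "desired_output" column. One way to do this might be:

  1. All True values in col_1 are carried over directly to desired_output (red arrows)
  2. In desired_output, a True value is placed in the row above every existing True value (green arrows)

Code I tried: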

df['desired_output']=df.col_1.apply(lambda x: True if x.shift()==True else False)

Thanks

3 answers:

Answer 0: (score: 5)

You can combine the original Series with its shifted values (Series.shift) using | for bitwise OR:

import pandas as pd

d = {"col1":[False,True,True,True,False,True,False,False,True,False,False,False]}
df = pd.DataFrame(d)
df['new'] = df.col1 | df.col1.shift(-1)
print (df)

     col1    new
0   False   True
1    True   True
2    True   True
3    True   True
4   False   True
5    True   True
6   False  False
7   False   True
8    True   True
9   False  False
10  False  False
11  False  False
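A variant of the same idea, as a minimal sketch: passing `fill_value=False` to `Series.shift` pads the last row with False rather than NaN, so no missing values enter the OR. The column name `col_1` follows the question; the sample data is made up for illustration.

```python
import pandas as pd

# Illustrative sample data, not the asker's actual table.
df = pd.DataFrame({"col_1": [False, True, False, False, True, True, False]})

# shift(-1) moves every value up one row; fill_value=False pads the
# last row with False instead of NaN.
next_row = df["col_1"].shift(-1, fill_value=False)

# A row is True if it is True itself or if the row below it is True.
df["desired_output"] = df["col_1"] | next_row

print(df)
```

The last row can only be True if it was already True in `col_1`, since there is no row below it.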


Answer 1: (score: 2)

Try

df['desired_output'] = df['col_1']
# OR each row with the row below it; the result belongs to every row except the last
df.loc[df.index[:-1], 'desired_output'] = df.col_1[1:].values | df.col_1[:-1].values
print(df)
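The question's rule (a row is True if it or the row below it is True) can also be written on the underlying NumPy array, which sidesteps index labels entirely. This is a minimal sketch, assuming `col_1` holds real booleans; the sample data and the letter index are illustrative only.

```python
import pandas as pd

# Illustrative data with a non-default index to show that labels do not matter.
df = pd.DataFrame(
    {"col_1": [False, True, False, False, True]},
    index=list("abcde"),
)

values = df["col_1"].to_numpy()
out = values.copy()  # work on a copy so col_1 itself is left untouched

# Row i becomes True when row i+1 is True; the last row has no successor.
out[:-1] |= values[1:]

df["desired_output"] = out
print(df)
```

Because the slices are positional, this works for any index, not only the default 0..n-1 range.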

Answer 2: (score: 1)

This handles the case where the values are stored as strings ('True'/'False'). Input:

   col_1
0   True
1   True
2   False
3   True
4   True
5   False
6   False
7   True
8   False

Code:

df['desired']=df['col_1']
for i, e in enumerate(df['col_1']):
    # A 'True' in row i also marks the row above it; row 0 has no row above, so skip it
    if e=='True' and i > 0:
        df.at[i-1,'desired']=df.at[i,'col_1']
df

Output:

   col_1    desired
0   True    True
1   True    True
2   False   True
3   True    True
4   True    True
5   False   False
6   False   True
7   True    True
8   False   False
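If the strings can be converted up front, the loop can be avoided. The following is a rough sketch, assuming every value is exactly the string 'True' or 'False'; it maps the strings to booleans with `Series.map` and then applies the shift-and-OR idea. Note that `desired` ends up holding booleans rather than strings.

```python
import pandas as pd

df = pd.DataFrame({"col_1": ["True", "True", "False", "True", "True",
                             "False", "False", "True", "False"]})

# Map the strings to real booleans; any other string would become NaN,
# so this assumes the column contains only 'True' and 'False'.
as_bool = df["col_1"].map({"True": True, "False": False})

# A row is True if it is True itself or if the row below it is True.
df["desired"] = as_bool | as_bool.shift(-1, fill_value=False)

print(df)
```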