Select Pandas rows where more than one column is not NaN

Date: 2019-01-04 14:40:18

Tags: python pandas

I have a dataframe with four columns: header_1, header_2, header_3 and header_4.

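How can I select the rows where header_3 and header_4 are both not NaN, using column indices (the column names change)? header_3 and header_4 are integers.

Thanks

2 answers:

Answer 0 (score: 4)

If there may be several columns, define them in a list, test the selected columns for non-missing values with notnull, and use DataFrame.all with axis=1 to keep only the rows where every value is True: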

cols = ['header_3','header_4']

df = df[df[cols].notnull().all(axis=1)]
print (df)
  header_1 header_2  header_3  header_4
1        b        c       9.0      10.0
# df[df[['header_3', 'header_4']].notnull().all(axis=1)]  # same filter without a separate cols list
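
For a self-contained run, here is a minimal sketch. The sample values are hypothetical (the question's original data is not shown here); they were chosen only so that the row with index 1 is the one that survives, as in the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only row 1 has values in both header_3 and header_4.
df = pd.DataFrame({
    "header_1": ["a", "b", "c"],
    "header_2": ["b", "c", "d"],
    "header_3": [np.nan, 9, 5],
    "header_4": [7, 10, np.nan],
})

cols = ["header_3", "header_4"]

# notnull() gives a boolean frame; all(axis=1) reduces it to one boolean per row.
mask = df[cols].notnull().all(axis=1)
filtered = df[mask]
print(filtered)
```

Because NaN is a float, integer columns that contain missing values are stored as float64 by default, which is why the values print as 9.0 and 10.0.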

To select the last 2 columns, use iloc for selection by position:

df = df[df.iloc[:, -2:].notnull().all(axis=1)]

The columns can also be specified by their positional index:

# Python counts from 0
df = df[df.iloc[:, [2,3]].notnull().all(axis=1)]
# df[df.loc[:, ['header_3', 'header_4']].notnull().all(axis=1)]  # or use loc with the column names directly
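
Since the question says the column names change, selecting by position may be the more robust choice. A small sketch with hypothetical data and hypothetical replacement names ("x" and "y"), showing that the positional mask picks the same rows before and after a rename:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only row 1 is complete in the last two columns.
df = pd.DataFrame({
    "header_1": ["a", "b", "c"],
    "header_2": ["b", "c", "d"],
    "header_3": [np.nan, 9, 5],
    "header_4": [7, 10, np.nan],
})

# Filter on the columns at positions 2 and 3, whatever they are called.
by_position = df[df.iloc[:, [2, 3]].notnull().all(axis=1)]

# Rename the columns (hypothetical new names) and apply the same positional filter.
renamed = df.rename(columns={"header_3": "x", "header_4": "y"})
by_position_renamed = renamed[renamed.iloc[:, [2, 3]].notnull().all(axis=1)]

print(by_position.index.tolist())
print(by_position_renamed.index.tolist())
```

The positional filter depends on column order rather than on labels, so it would break if the columns were reordered instead of renamed.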

Or, if there are only 2 columns, chain the two conditions with & (element-wise AND):

df = df[df['header_3'].notnull() & df['header_4'].notnull()]

Answer 1 (score: 2)

There is also .dropna:

subset = ['header_3', 'header_4']
df.dropna(subset=subset, thresh=len(subset))

#  header_1 header_2  header_3  header_4
#1        b        c       9.0      10.0
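
To combine dropna with the position-based selection the question asks for, one option is to look the labels up from df.columns first. This is a sketch on hypothetical sample data, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only row 1 is complete in the last two columns.
df = pd.DataFrame({
    "header_1": ["a", "b", "c"],
    "header_2": ["b", "c", "d"],
    "header_3": [np.nan, 9, 5],
    "header_4": [7, 10, np.nan],
})

# Translate positions 2 and 3 into whatever the current column labels are.
subset = list(df.columns[[2, 3]])

# With the default how="any", a row is dropped if any subset column is NaN.
result = df.dropna(subset=subset)

# Requiring len(subset) non-NaN values within the subset gives the same rows.
same = df.dropna(subset=subset, thresh=len(subset))
print(result)
```

Note that dropna returns a new dataframe and leaves df unchanged, so the result has to be assigned if it is needed later.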