Drop columns with more than 'N' NA values - python

Time: 2016-03-22 23:25:09

Tags: python python-2.7 numpy pandas

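Suppose I use df.isnull().sum() and get the count of 'NA' values in every column of the dataframe df. I want to drop the columns whose NA count is above 'K'.

For example,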

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                   'B': [0, np.nan, np.nan, 0, 0, 0],
                   'C': [0, 0, 0, 0, 0, 0.0],
                   'D': [5, 5, np.nan, np.nan, 5.6, 6.8],
                   'E': [0, np.nan, np.nan, np.nan, np.nan, np.nan]})
df.isnull().sum()

A    1
B    2
C    0
D    2
E    5
dtype: int64

Suppose I want to drop the columns that contain '2' or more NA values. How should I approach this? My output should be,

df.columns
A,C

Can someone help me do this?

Thanks

3 Answers:

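Answer 0: (score: 3)

Call dropna, passing axis=1 to drop column-wise and thresh=len(df)-K. thresh sets the minimum number of non-NaN values a column must have to be kept; here that is the number of rows minus K, so any column with more than K NaN values is dropped. The example below uses K=1.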

In [22]:

df.dropna(axis=1, thresh=len(df)-1)
Out[22]:
     A  C
0  1.0  0
1  2.1  0
2  NaN  0
3  4.7  0
4  5.6  0
5  6.8  0

If you only want the columns:

In [23]:
df.dropna(axis=1, thresh=len(df)-1).columns

Out[23]:
Index(['A', 'C'], dtype='object')

Or simply mask the columns against the output of the null counts:

In [28]:
df.columns[df.isnull().sum() <2]

Out[28]:
Index(['A', 'C'], dtype='object')
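
To make the relationship between K and thresh concrete, here is a minimal sketch that wraps both approaches from this answer in helper functions. The function names and the max_nans parameter are hypothetical (they are not part of pandas); max_nans plays the role of K, the largest number of NaN values a column may contain and still be kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                   'B': [0, np.nan, np.nan, 0, 0, 0],
                   'C': [0, 0, 0, 0, 0, 0.0],
                   'D': [5, 5, np.nan, np.nan, 5.6, 6.8],
                   'E': [0, np.nan, np.nan, np.nan, np.nan, np.nan]})

def drop_by_thresh(frame, max_nans):
    # Hypothetical helper: keep columns that have at least
    # len(frame) - max_nans non-NaN values.
    return frame.dropna(axis=1, thresh=len(frame) - max_nans)

def drop_by_mask(frame, max_nans):
    # Hypothetical helper: keep columns whose NaN count is at most max_nans.
    return frame.loc[:, frame.isnull().sum() <= max_nans]

kept_thresh = drop_by_thresh(df, 1)
kept_mask = drop_by_mask(df, 1)
print(list(kept_thresh.columns))
print(list(kept_mask.columns))
```

With max_nans=1 both helpers should keep only columns A and C, matching the output requested in the question.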

Answer 1: (score: 1)

You could do something like:

df = df.reindex(columns=[x for x in df.columns.values if df[x].isnull().sum() < threshold])

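This just builds a list of the columns that meet your requirement (fewer null values than the threshold) and then reindexes the dataframe with that list. So if you set the threshold to 1: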

threshold = 1
df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
            'B': [0, np.nan, np.nan, 0, 0, 0],
            'C': [0, 0, 0, 0, 0, 0.0],
            'D': [5, 5, np.nan, np.nan, 5.6, 6.8],
            'E': ['NA', 'NA', 'NA', 'NA', 'NA', 'NA'],})
df = df.reindex(columns=[x for x in df.columns.values if df[x].isnull().sum() < threshold])
df.count()

will yield:

C    6
E    6
dtype: int64
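
Column E survives in this output because the string 'NA' is not a missing value as far as isnull() is concerned. The following is a minimal sketch, assuming those strings are meant to stand for missing data: it converts them to np.nan before applying the same reindex filter. The names cleaned, keep and result are illustrative only.

```python
import numpy as np
import pandas as pd

threshold = 1
df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                   'B': [0, np.nan, np.nan, 0, 0, 0],
                   'C': [0, 0, 0, 0, 0, 0.0],
                   'D': [5, 5, np.nan, np.nan, 5.6, 6.8],
                   'E': ['NA', 'NA', 'NA', 'NA', 'NA', 'NA']})

# The literal string 'NA' is ordinary data, so isnull() does not count it.
string_null_count = int(df['E'].isnull().sum())

# Assumption: 'NA' strings represent missing data, so replace them with NaN.
cleaned = df.mask(df == 'NA')

# Same filter as in the answer: keep columns with fewer nulls than threshold.
keep = [col for col in cleaned.columns
        if cleaned[col].isnull().sum() < threshold]
result = cleaned.reindex(columns=keep)
print(list(result.columns))
```

Under that assumption only column C remains when the threshold is 1, since every other column then contains at least one NaN.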

Answer 2: (score: 0)

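The dropna function has a thresh argument that lets you specify the number of non-NaN values a column must have, so this gives the output you want:

df.dropna(axis=1,thresh=5).count()

A    5
C    6
E    6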

If you only want C & E, you would have to change thresh to 6 in this case.
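
As a quick check of the two thresh values discussed in this answer, here is a minimal sketch. It assumes the DataFrame from Answer 1, where column E holds the string 'NA' and therefore counts as fully non-null; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumption: same data as in Answer 1 (E contains strings, not NaN).
df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                   'B': [0, np.nan, np.nan, 0, 0, 0],
                   'C': [0, 0, 0, 0, 0, 0.0],
                   'D': [5, 5, np.nan, np.nan, 5.6, 6.8],
                   'E': ['NA', 'NA', 'NA', 'NA', 'NA', 'NA']})

# Keep columns with at least 5 non-NaN values.
at_least_5 = df.dropna(axis=1, thresh=5)

# Keep columns with at least 6 non-NaN values (every row filled in).
at_least_6 = df.dropna(axis=1, thresh=6)

# With 6 rows, thresh=6 matches the default behaviour of dropping
# any column that contains a NaN.
no_nans = df.dropna(axis=1)

print(list(at_least_5.columns))
print(list(at_least_6.columns))
```

Because the frame has six rows, thresh=6 is the same as requiring a column to have no NaN values at all.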