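I have a large data frame with many rows and columns, and I want to remove every row in which at least one column is NA / NaN. Here is a small example of the data frame I am working with: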
team_id athlete_id GP tm_STL tm_TOV player_WS
1 13304 75047 1 2 8 NaN
2 13304 75048 1 2 8 0.28563827
3 13304 75049 1 2 8 NaN
4 13304 75050 1 2 8 NaN
5 13304 75053 1 2 8 0.03861989
6 13304 75060 1 2 8 -0.15530707
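...admittedly a poor example, since here all of the NaNs happen to be in the last column. I know how to use which(is.na(df$column_name)) to find the rows with NA values in a single column, but I want to do this for rows where at least one column in the row has an NA value.
Thanks!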
Answer 0 (score: 13)
Try using complete.cases.
> df <- data.frame(col1 = c(1, 2, 3, NA, 5), col2 = c('A', 'B', NA, 'C', 'D'),
col3 = c(9, NaN, 8, 7, 6))
> df
col1 col2 col3
1 1 A 9
2 2 B NaN
3 3 <NA> 8
4 NA C 7
5 5 D 6
> df[complete.cases(df), ]
col1 col2 col3
1 1 A 9
5 5 D 6
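To make the mechanism a bit more explicit, here is a minimal sketch that reuses the toy data frame from this answer. The names keep, clean and clean_subset are only illustrative, and restricting the check to col1 and col3 is an assumption about what one might want, not something the question asked for.

```r
# Same toy data frame as in the answer above.
df <- data.frame(col1 = c(1, 2, 3, NA, 5),
                 col2 = c("A", "B", NA, "C", "D"),
                 col3 = c(9, NaN, 8, 7, 6))

# complete.cases() returns one logical value per row:
# TRUE when the row has no NA/NaN in any column.
keep <- complete.cases(df)
print(keep)

# Indexing with that logical vector keeps only the complete rows.
clean <- df[keep, ]
print(clean)

# Illustrative variant: only check some of the columns
# (here col1 and col3), so an NA in col2 does not drop the row.
clean_subset <- df[complete.cases(df[, c("col1", "col3")]), ]
print(clean_subset)
```

The column-restricted form can be handy when only some columns matter for the analysis, since rows that are missing values elsewhere are kept.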
Answer 1 (score: 7)
na.omit works:
na.omit(df)
## team_id athlete_id GP tm_STL tm_TOV player_WS
## 2 13304 75048 1 2 8 0.28563827
## 5 13304 75053 1 2 8 0.03861989
## 6 13304 75060 1 2 8 -0.15530707
It is more convenient than complete.cases because it does not need a separate subsetting step, such as bracket indexing or dplyr::filter.
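One detail about na.omit that may be useful: the result carries an "na.action" attribute recording which rows were dropped. A small sketch, using the toy data frame from Answer 0 rather than the question's data (the variable names are just for illustration):

```r
# Toy data frame borrowed from Answer 0 (not the question's data).
df <- data.frame(col1 = c(1, 2, 3, NA, 5),
                 col2 = c("A", "B", NA, "C", "D"),
                 col3 = c(9, NaN, 8, 7, 6))

# na.omit() drops every row that contains at least one NA/NaN.
clean <- na.omit(df)
print(clean)

# The indices of the removed rows are stored on the result.
dropped <- attr(clean, "na.action")
print(as.integer(dropped))
```

This makes it easy to check afterwards how many rows were removed and which ones they were.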
Answer 2 (score: 5)
You can use this:
df[rowSums(is.na(df))==0,]
# team_id athlete_id GP tm_STL tm_TOV player_WS
#2 13304 75048 1 2 8 0.28563827
#5 13304 75053 1 2 8 0.03861989
#6 13304 75060 1 2 8 -0.15530707
This counts the number of NAs in each row, and you keep only the rows whose NA count is zero.
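The intermediate steps can be inspected one at a time. The sketch below uses the toy data frame from Answer 0 instead of the question's data, and na_per_row is just an illustrative name.

```r
# Toy data frame borrowed from Answer 0 (not the question's data).
df <- data.frame(col1 = c(1, 2, 3, NA, 5),
                 col2 = c("A", "B", NA, "C", "D"),
                 col3 = c(9, NaN, 8, 7, 6))

# is.na() on a data frame gives a logical matrix of the same shape;
# NaN counts as missing here as well.
missing <- is.na(df)

# Summing each row of that matrix gives the number of NAs per row.
na_per_row <- rowSums(missing)
print(na_per_row)

# Keep only the rows with no missing values at all.
clean <- df[na_per_row == 0, ]
print(clean)
```

Because the per-row count is available, the same approach can be relaxed to a threshold (for example na_per_row <= 1) if a few missing values per row are acceptable.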