Plotting two columns, omitting missing values

Asked: 2013-02-10 18:46:15

Tags: r plot missing-data

Here is my data frame, as produced by dput():

structure(list(Year = 1900:1903, Top.10..income.share = structure(c(82L, 
81L, 76L, 75L), .Label = c("", "30,3", "30,65", "30,8", "31,3", 
"31,37", "31,38", "31,4", "31,5", "31,51", "31,52", "31,55", 
"31,62", "31,64", "31,66", "31,67", "31,69", "31,75", "31,77", 
"31,8", "31,81", "31,82", "31,85", "31,9", "31,98", "32,01", 
"32,03", "32,04", "32,05", "32,07", "32,11", "32,12", "32,2", 
"32,35", "32,36", "32,42", "32,43", "32,44", "32,5", "32,62", 
"32,64", "32,67", "32,72", "32,82", "32,87", "33,02", "33,22", 
"33,4", "33,69", "33,72", "33,76", "33,87", "33,95", "34,25", 
"34,4", "34,57", "34,62", "34,71", "35,49", "36,3", "36,48", 
"37,26", "37,3", "37,73", "37,77", "37,78", "37,84", "37,92", 
"38,01", "38,1", "38,2", "38,38", "38,4", "38,47", "38,52", "38,59", 
"38,6", "38,63", "38,84", "38,91", "38,99", "39,13", "39,31", 
"39,48", "39,6", "39,82", "39,9", "40,29", "40,54", "40,59", 
"40,75", "41,02", "41,16", "41,52", "41,73", "41,98", "42,12", 
"42,23", "42,36", "42,67", "42,76", "42,86", "42,95", "43", "43,07", 
"43,11", "43,26", "43,35", "43,39", "43,64", "43,76", "44,07", 
"44,17", "44,4", "44,43", "44,57", "44,67", "44,77", "44,94", 
"45,03", "45,16", "45,47", "45,5", "45,67", "45,96", "46,09", 
"46,3", "46,35", "46,54"), class = "factor")), .Names = c("Year", 
"Top.10..income.share"), row.names = c(NA, 4L), class = "data.frame")

My sessionInfo():

R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] rstudio_0.97.248 tools_2.15.0  

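What I want to do is plot the Top10 column against the Year column, but because there are quite a few missing values, the plot looks pretty bad. This is the code I have written so far: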

require(stats)

tid_a = read.csv("tid-a.csv", header=TRUE, sep=";")
germany <- tid_a[1:111, ]
usa <- tid_a[112:210, ]

germany_year <- germany[, 2]
germany_top10 <- na.omit(germany[, 3])

plot(germany_year, germany_top10, type="l")

I tried a few examples I found online for omitting missing values, but I am a newbie and could not get them to work.

1 answer:

Answer 0 (score: 2)

This question is not reproducible, but something like this should work:

gdat <- na.omit(subset(germany,select=c(Year,Top10)))
plot(Top10~Year,data=gdat,type="l")

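(Using subset this way is fairly uncommon; germany[,c("Year","Top10")] is more common, but I like it because it is readable. A simpler version would be plot(...,data=na.omit(germany)), but that would drop rows that have an NA in any column of the data frame.)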
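
One caveat worth checking: the dput() output in the question shows the share column stored as a factor with decimal commas and an empty-string level, not as numbers with NA. If that is the case, na.omit alone may not drop anything. Below is a minimal sketch of one way to handle it. The toy data frame and the column names Country, Top10 and Other are made up for illustration and only stand in for the real germany subset.

```r
# Toy stand-in for the real `germany` subset; all values are made up.
# Top10 mimics the dput() output: decimal commas, "" for missing entries.
germany <- data.frame(
  Country = c("Germany", "Germany", "Germany", "Germany"),
  Year    = 1900:1903,
  Top10   = c("45,03", "", "44,4", "44,17"),
  Other   = c(NA, 1, 2, 3),   # hypothetical extra column with its own NA
  stringsAsFactors = TRUE     # the default in R 2.15
)

# Factor -> character -> numeric, swapping the decimal comma for a point.
# Empty strings turn into NA in the process.
germany$Top10 <- as.numeric(
  sub(",", ".", as.character(germany$Top10), fixed = TRUE)
)

# Drop rows only where Year or Top10 is missing.
gdat <- na.omit(germany[, c("Year", "Top10")])

# For comparison: na.omit on the whole data frame also drops
# the row where the unrelated `Other` column is NA.
gall <- na.omit(germany)

plot(Top10 ~ Year, data = gdat, type = "l")
```

Here gdat keeps every row that has both a year and a share, while gall loses an extra row because of the unrelated column. Reading the file with read.csv2(), whose defaults are sep=";" and dec=",", might make the conversion step unnecessary, but that is untested against the original tid-a.csv.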