Data frame column names from array values

Asked: 2012-04-10 03:18:47

Tags: arrays r dataframe

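I have an array of names that I want to use as the column names of a data frame, but I am running into errors. I am not sure exactly how to do this; here is what I have so far.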

windspeeds = data.frame()
cities <- c("albuquerque_nm", "boston_ma", "charlotte_nc", "dallas_tx", "denver_co", "helena_mt", "louisville_ky", "pittsburgh_pa", "salt_lake_city_ut", "seattle_wa")
for(i in 1:10){
  fastest <- read.delim(paste("http://www.itl.nist.gov/div898/winds/data/nondirectional/datasets/", cities[i], ".prn", sep=""), col.names=c("NULL", "fastest", "NULL", "NULL"), skip=4, header=F, sep="")$fastest
  windspeeds$cities[i] = fastest
}

I get this error:

Error in `$<-.data.frame`(`*tmp*`, "cities", value = 59L) : 
replacement has 1 rows, data has 0
In addition: Warning message:
In windspeeds$cities[i] = fastest :
number of items to replace is not a multiple of replacement length

Do I have to convert the array to some kind of string or constant?

1 Answer:

Answer 0: (score: 3)

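One of your problems is that the query does not return the same number of records for each city (disclaimer: I know nothing about your data or what it should look like). In any case, here is one way to read the data into a list object, which is arguably a more "R-ish" way of doing things: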

x <- lapply(cities, function(x) 
  read.delim(paste("http://www.itl.nist.gov/div898/winds/data/nondirectional/datasets/", x, ".prn", sep=""), 
             col.names=c("NULL", "fastest", "NULL", "NULL"), skip=4, header=FALSE, sep="")$fastest
            )

x now looks like:

> str(x)
List of 10
 $ : int [1:46] 59 49 51 52 57 52 45 54 49 64 ...
 $ : int [1:42] 50 55 79 56 53 41 51 51 65 62 ...
 $ : int [1:29] 33 42 40 52 42 48 52 51 46 51 ...
 $ : int [1:32] 48 45 46 45 43 53 43 58 46 43 ...
 $ : int [1:33] 42 51 49 44 47 47 50 44 44 54 ...
 $ : int [1:48] 58 58 58 58 70 55 56 62 59 70 ...
 $ : int [1:39] 40 39 50 53 50 51 54 54 51 50 ...
 $ : int [1:18] 47 56 60 44 54 50 42 52 47 47 ...
 $ : int [1:46] 53 49 40 53 55 40 49 46 61 41 ...
 $ : int [1:10] 38 44 35 46 42 45 41 45 42 43

And it has these descriptive statistics:

> do.call(rbind, lapply(x, summary))
      Min. 1st Qu. Median  Mean 3rd Qu. Max.
 [1,]   45   49.50   53.0 55.02   57.00   85
 [2,]   41   49.25   54.5 56.26   60.75   85
 [3,]   33   39.00   42.0 44.86   51.00   65
 [4,]   39   45.75   48.0 49.16   51.50   67
 [5,]   42   44.00   48.0 48.67   51.00   61
 [6,]   42   49.00   55.0 54.04   58.00   71
 [7,]   38   43.50   49.0 48.74   52.50   66
 [8,]   39   45.00   47.0 48.44   53.50   60
 [9,]   40   45.25   49.0 50.41   54.00   69
[10,]   35   41.25   42.5 42.10   44.75   46
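
The rows of that summary are numbered rather than labeled by city. A minimal sketch of attaching the city names to the list so they carry through to the summary, assuming small made-up vectors in place of the downloaded data (the values and the shortened `cities` vector here are illustrative only, not the real records):

```r
# Illustrative stand-in data; NOT the real NIST values
cities <- c("albuquerque_nm", "boston_ma", "charlotte_nc")
x <- list(c(59L, 49L, 51L, 52L), c(50L, 55L, 79L), c(33L, 42L))

# Name each list element after its city
names(x) <- cities

# do.call() passes the list elements as named arguments to rbind(),
# so the row names of the result come from the list names
stats <- do.call(rbind, lapply(x, summary))
print(stats)
```

With the real data the same `names(x) <- cities` line should apply unchanged, since `lapply` returns one element per entry of `cities` in the same order.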

Whether you should have the same number of records for each city is unclear, but hopefully this gets you on the right path.
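
If a data frame with one column per city is still wanted, the unequal lengths have to be dealt with first. One possible approach, sketched here with made-up vectors standing in for the downloaded data, is to pad the shorter vectors with `NA`; whether padding is appropriate depends on what the records mean, which this sketch does not try to decide. It also illustrates a second issue in the question's loop: `windspeeds$cities` refers to a column literally named "cities", whereas `windspeeds[[cities[i]]]` uses the value stored in `cities[i]` as the column name.

```r
# Illustrative stand-in data; NOT the real NIST values
cities <- c("albuquerque_nm", "boston_ma", "charlotte_nc")
x <- list(c(59L, 49L, 51L, 52L), c(50L, 55L, 79L), c(33L, 42L))
names(x) <- cities

# Pad every vector with NA up to the length of the longest one
# (an assumption: the answer above does not say padding is the right fix)
n <- max(lengths(x))
padded <- lapply(x, function(v) c(v, rep(NA, n - length(v))))

# A named list of equal-length vectors converts to one column per name
windspeeds <- as.data.frame(padded)
print(names(windspeeds))

# [[ ]] evaluates its argument, so the column is looked up by the city name
second_city <- windspeeds[[cities[2]]]

# $ takes the name literally: there is no column called "cities"
literal_lookup <- windspeeds$cities
```

Keeping the data as a named list, as in the answer, avoids the padding question entirely; the data frame form is mainly useful if later code expects columns.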