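I have a dataset with one value per month from 1980 to 2004 (part of it is shown below), but I don't know how to read it from a CSV file and convert it into a matrix of the form data[lat, lon, time], where time runs from 1 to (2004-1980)*12...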
Answer 0 (score: 2)
The data are already available in an .rda data file, so reading them in is easy. Starting from a clean workspace, do the following:
load("fedfire8004.rda")
ls() ## What objects were read in?
# [1] "fedfire8004"
str(fedfire8004) ## What does that object look like?
# List of 10
# $ lon : num [1:24] -124 -124 -122 -122 -120 ...
# $ lat : num [1:18] 31.5 32.5 33.5 34.5 35.5 36.5 37.5 38.5 39.5 40.5 ...
# $ x : num [1:25] -125 -124 -123 -122 -121 -120 -119 -118 -117 -116 ...
# $ y : num [1:19] 31 32 33 34 35 36 37 38 39 40 ...
# $ year : int [1:300] 1980 1980 1980 1980 1980 1980 1980 1980 1980 1980 ...
# $ month: int [1:300] 1 2 3 4 5 6 7 8 9 10 ...
# $ acres: num [1:24, 1:18, 1:300] NA NA NA NA NA NA NA NA NA NA ...
# ..- attr(*, "dimnames")=List of 3
# .. ..$ lon : chr [1:24] "-124.5" "-123.5" "-122.5" "-121.5" ...
# .. ..$ lat : chr [1:18] "31.5" "32.5" "33.5" "34.5" ...
# .. ..$ month: chr [1:300] "1980.1" "1980.2" "1980.3" "1980.4" ...
# $ fires: num [1:24, 1:18, 1:300] NA NA NA NA NA NA NA NA NA NA ...
# ..- attr(*, "dimnames")=List of 3
# .. ..$ lon : chr [1:24] "-124.5" "-123.5" "-122.5" "-121.5" ...
# .. ..$ lat : chr [1:18] "31.5" "32.5" "33.5" "34.5" ...
# .. ..$ month: chr [1:300] "1980.1" "1980.2" "1980.3" "1980.4" ...
# $ meta : chr "USFS, NPS, BLM, BIA total fires and acres on 1 degree monthly grid 1980-2004"
# $ cite : chr "Westerling, A.L., T.J. Brown, A. Gershunov, D.R. Cayan and M.D. Dettinger, 2003: Climate and Wildfire in the Western United Sta"| __truncated__
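As you can see, the core data appear to be the acres and fires list items. It may be more convenient to reshape them into a long dataset. The most direct way is probably melt from the "reshape2" package.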
library(reshape2)
Acres <- melt(fedfire8004$acres)
Fires <- melt(fedfire8004$fires)
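To make explicit what melt does to a three-dimensional array, here is a minimal sketch in Python (standard library only). The tiny 2 x 2 x 3 array and its labels are made up for illustration and are not taken from fedfire8004, None stands in for NA, and the helper name melt_3d is hypothetical. As in R, the first dimension (lon) varies fastest in the resulting rows.

```python
from itertools import product

def melt_3d(arr, lon_labels, lat_labels, month_labels):
    """Flatten arr[lon][lat][month] into long-format rows.

    Rows are emitted with lon varying fastest, then lat, then month,
    mirroring R's column-major order.
    """
    rows = []
    # product() varies its LAST argument fastest, so list month first, lon last
    for (k, month), (j, lat), (i, lon) in product(
        enumerate(month_labels), enumerate(lat_labels), enumerate(lon_labels)
    ):
        rows.append((lon, lat, month, arr[i][j][k]))
    return rows

# Hypothetical toy data: 2 longitudes x 2 latitudes x 3 months
lon_labels = [-124.5, -123.5]
lat_labels = [31.5, 32.5]
month_labels = ["1980.1", "1980.2", "1980.3"]
arr = [
    [[None, 0.0, 1.5], [2.0, None, 3.0]],   # lon = -124.5
    [[4.0, 5.0, None], [6.0, 7.5, 8.0]],    # lon = -123.5
]

long_rows = melt_3d(arr, lon_labels, lat_labels, month_labels)
for row in long_rows[:4]:
    print(row)
```

Each output row carries one (lon, lat, month, value) combination, which is the same layout as the Acres and Fires data frames.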
Let's look at the first and last few rows of Acres and Fires.
head(Acres)
# lon lat month value
# 1 -124.5 31.5 1980.1 NA
# 2 -123.5 31.5 1980.1 NA
# 3 -122.5 31.5 1980.1 NA
# 4 -121.5 31.5 1980.1 NA
# 5 -120.5 31.5 1980.1 NA
# 6 -119.5 31.5 1980.1 NA
tail(Acres)
# lon lat month value
# 129595 -106.5 48.5 2004.12 0
# 129596 -105.5 48.5 2004.12 0
# 129597 -104.5 48.5 2004.12 71
# 129598 -103.5 48.5 2004.12 NA
# 129599 -102.5 48.5 2004.12 NA
# 129600 -101.5 48.5 2004.12 NA
head(Fires)
# lon lat month value
# 1 -124.5 31.5 1980.1 NA
# 2 -123.5 31.5 1980.1 NA
# 3 -122.5 31.5 1980.1 NA
# 4 -121.5 31.5 1980.1 NA
# 5 -120.5 31.5 1980.1 NA
# 6 -119.5 31.5 1980.1 NA
tail(Fires)
# lon lat month value
# 129595 -106.5 48.5 2004.12 0
# 129596 -105.5 48.5 2004.12 0
# 129597 -104.5 48.5 2004.12 2
# 129598 -103.5 48.5 2004.12 NA
# 129599 -102.5 48.5 2004.12 NA
# 129600 -101.5 48.5 2004.12 NA
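The question asks for data[lat, lon, time], whereas the acres and fires arrays are indexed [lon, lat, month]. In R, aperm(fedfire8004$acres, c(2, 1, 3)) should swap the first two dimensions. As an illustration of what that axis swap means, here is a minimal Python sketch on nested lists; the toy values and the helper name swap_first_two_axes are hypothetical.

```python
def swap_first_two_axes(arr):
    """Turn arr[lon][lat][time] into out[lat][lon][time]."""
    n_lon = len(arr)
    n_lat = len(arr[0])
    return [
        [list(arr[i][j]) for i in range(n_lon)]  # copy each time series
        for j in range(n_lat)
    ]

# Hypothetical toy array: 3 longitudes x 2 latitudes x 2 months
by_lon_lat = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[9, 10], [11, 12]],
]

by_lat_lon = swap_first_two_axes(by_lon_lat)

# The same cell is reached with the first two indices exchanged
print(by_lon_lat[2][1][0], by_lat_lon[1][2][0])
```

Only the order of the first two indices changes; the time axis and the values themselves are untouched.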
Answer 1 (score: 0)
You should (always) try to reorganize your data so that each column contains a single type of information:
Year Month Lat Lon Value
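A Python script is probably the best way to do this... Once the data are in this layout, they are easy to import and analyze in R.
I wrote a script that reorganizes the data for you... but it is not clear whether you will be able to run it easily. What system are you using?
Here is the script, with its output below: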
#!/usr/bin/env python
import csv

# Input layout: row 0 holds the latitudes and row 1 the longitudes (both
# starting in the third column); every later row is year, month, then one
# value per grid cell.
year, month, data = [], [], []
with open('originaldata.txt', 'r', newline='') as file_obj:
    for line_no, items in enumerate(csv.reader(file_obj, delimiter='\t')):
        if line_no == 0:
            lat = items[2:]
        elif line_no == 1:
            lon = items[2:]
        else:
            year.append(items[0])
            month.append(items[1])
            data.append(items[2:])

# Print the header, then one row per (grid cell, month) combination
print("%s\t%s\t%s\t%s\t%s" % ("Year", "Month", "Lat", "Lon", "Data"))
for ind, (la, lo) in enumerate(zip(lat, lon)):
    for y, m, d in zip(year, month, data):
        print("%s\t%s\t%s\t%s\t%s" % (y, m, la, lo, d[ind]))
Script output:
Year Month Lat Lon Data
1980 1 31.5 -111.5 0
1980 2 31.5 -111.5 0
1980 3 31.5 -111.5 0
1980 4 31.5 -111.5 0
1980 5 31.5 -111.5 8.1
1980 6 31.5 -111.5 5.1
1980 7 31.5 -111.5 0
1980 8 31.5 -111.5 0
1980 9 31.5 -111.5 0
1980 10 31.5 -111.5 0
1980 11 31.5 -111.5 0
1980 12 31.5 -111.5 0
1981 1 31.5 -111.5 0
1981 2 31.5 -111.5 0
1981 3 31.5 -111.5 0
1981 4 31.5 -111.5 0
1981 5 31.5 -111.5 0
1981 6 31.5 -111.5 0
1981 7 31.5 -111.5 0
1981 8 31.5 -111.5 0
1981 9 31.5 -111.5 0
1981 10 31.5 -111.5 0
1981 11 31.5 -111.5 0
1981 12 31.5 -111.5 0
1980 1 31.5 -110.5 0
1980 2 31.5 -110.5 0
1980 3 31.5 -110.5 0
1980 4 31.5 -110.5 881
1980 5 31.5 -110.5 794.1
1980 6 31.5 -110.5 644.4
1980 7 31.5 -110.5 85.2
1980 8 31.5 -110.5 0.1
1980 9 31.5 -110.5 0
1980 10 31.5 -110.5 0
1980 11 31.5 -110.5 0
1980 12 31.5 -110.5 0
1981 1 31.5 -110.5 0
1981 2 31.5 -110.5 0
1981 3 31.5 -110.5 0
1981 4 31.5 -110.5 0
1981 5 31.5 -110.5 0
1981 6 31.5 -110.5 0
1981 7 31.5 -110.5 0
1981 8 31.5 -110.5 0
1981 9 31.5 -110.5 0
1981 10 31.5 -110.5 0
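Once the data are in this long layout, getting back to something indexed by latitude, longitude and a running time index is mostly bookkeeping. The following is a minimal sketch, not a definitive implementation: it reads a few rows copied from the script output above, assumes the series starts in January 1980, and uses a hypothetical helper time_index to map (year, month) to a 1-based month counter. Note that 1980 through 2004 inclusive is 25 years, so the counter ends at 300, which matches the third dimension of the acres array shown in Answer 0, whereas (2004-1980)*12 gives only 288.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the script output above; in practice this would be
# the full tab-separated file produced by the script.
sample = (
    "Year\tMonth\tLat\tLon\tData\n"
    "1980\t4\t31.5\t-111.5\t0\n"
    "1980\t5\t31.5\t-111.5\t8.1\n"
    "1980\t4\t31.5\t-110.5\t881\n"
    "1980\t5\t31.5\t-110.5\t794.1\n"
)

FIRST_YEAR = 1980  # assumption: the series starts in January 1980

def time_index(year, month):
    """Map (year, month) to a 1-based running month index."""
    return (year - FIRST_YEAR) * 12 + month

# data[(lat, lon)][time] -> value
data = defaultdict(dict)
for row in csv.DictReader(io.StringIO(sample), delimiter="\t"):
    key = (float(row["Lat"]), float(row["Lon"]))
    t = time_index(int(row["Year"]), int(row["Month"]))
    data[key][t] = float(row["Data"])

print(data[(31.5, -110.5)][4])   # value for April 1980 at that grid cell
print(time_index(2004, 12))      # index of the last month of the series
```

A dictionary keyed by (lat, lon) is used here only to keep the sketch short; a dense three-dimensional array would be the closer analogue of data[lat, lon, time].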
Answer 2 (score: 0)
Loading the file is easy:
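meaningful.name <- read.csv(file.choose(new = FALSE))
# Add the running time index while the object is still a data frame;
# `$<-` does not add a column to a matrix
meaningful.name$time <- seq_len(nrow(meaningful.name))
meaningful.name <- as.matrix(meaningful.name)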
Beyond that, I'm not sure what you are after. Could you clarify?