I have a precipitation dataset for the US. It contains latitude, longitude, and precipitation amounts, and is laid out as follows:
        lon  -124  -125  -126  -127  -128
lat 45        120   110    NA   230   145
    44         NA   130   205   240   195
    43        120   110    NA   235   185
    42        170   140   204    NA   155
Here is a link to the dataset:
https://www.dropbox.com/s/1xxy2ospr9xvy8n/Pmaxupscaled.csv
Using R, I would like to convert it to this format:
precipitation lat lon
120 45 -124
110 45 -125
NA 45 -126
Answer 0 (score: 0)
You may want to use the 'reshape2' package:
library(reshape2)
df<- read.table(textConnection('
lat -124 -125 -126 -127 -128
45 120 110 NA 230 145
44 NA 130 205 240 195
43 120 110 NA 235 185
42 170 140 204 NA 155'), header = TRUE)
df2 <- melt(df, id.vars = 'lat')
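You will also need a workaround to recover the original column names. With the default check.names = TRUE, R does not accept a column name like -124 and turns it into X.124. In this specific case, the following restores the longitudes as negative numbers:
df2$variable <- as.numeric(gsub(x = df2$variable, pattern = "^X\\.", replacement = "-"))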
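To see where the "X." prefix comes from, here is a minimal sketch on a cut-down copy of the sample data (the `txt`, `mangled`, and `kept` names are made up for illustration). It assumes the prefix is produced by the name check that read.table() applies by default, and shows that passing check.names = FALSE leaves the headers alone:

```r
# make.names() is the function behind check.names = TRUE
make.names("-124")   # "X.124"

# Hypothetical cut-down sample of the grid from the question
txt <- "lat -124 -125 -126
45 120 110 NA
44 NA 130 205"

# Default behaviour: headers are made syntactically valid
mangled <- read.table(text = txt, header = TRUE)

# With check.names = FALSE the headers are kept exactly as written
kept <- read.table(text = txt, header = TRUE, check.names = FALSE)

names(mangled)
names(kept)
```

If the headers are kept as written, no gsub() step is needed later; the longitudes only have to be converted from text to numbers.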
Answer 1 (score: 0)
I'm not a heavy reshape2 user, but this worked for me.
library(reshape2)
a <- read.csv("~/Documents/Pmaxupscaled.csv")
# Convert to matrix
a <- as.matrix(a)
# Replace row names with the values from the first column
dimnames(a)[[1]] <- a[, 1]
# Drop the first column
a <- a[, -1]
# Melt the matrix into a data frame.
b <- melt(a, varnames = c("Lat", "Lon"))
# Get rid of "X."
b$Lon <- gsub("X\\.", "", b$Lon)
# Format longitude as negative number
b$Lon <- as.numeric(b$Lon)
b$Lon <- -1 * b$Lon
# Rename precipitation column
names(b)[3] <- "precipitation"
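The same matrix-based idea can probably also be done without reshape2. The sketch below is an alternative, not the code above: it assumes the grid is already a numeric matrix whose dimnames hold the latitudes and longitudes (the matrix `m` is a hypothetical two-row sample), and uses base R's as.table() and as.data.frame() to get one row per cell:

```r
# Hypothetical sample: two rows of the grid from the question,
# with latitudes and longitudes stored as dimnames
m <- matrix(
  c(120, 110,  NA, 230, 145,
     NA, 130, 205, 240, 195),
  nrow = 2, byrow = TRUE,
  dimnames = list(lat = c("45", "44"),
                  lon = c("-124", "-125", "-126", "-127", "-128"))
)

# One row per matrix cell; the cell value goes into "precipitation"
b <- as.data.frame(as.table(m), responseName = "precipitation",
                   stringsAsFactors = FALSE)

# Dimnames come back as character strings, so convert them to numbers
b$lat <- as.numeric(b$lat)
b$lon <- as.numeric(b$lon)

# Reorder the columns to match the layout requested in the question
b <- b[, c("precipitation", "lat", "lon")]
head(b)
```

Because the longitudes are stored with their sign in the dimnames, there is no "X." prefix to strip and no need to multiply by -1.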
Answer 2 (score: 0)
Since the other answers already cover 'reshape2', here is an option in base R:
a <- read.csv("path/to/Pmaxupscaled.csv", check.names = FALSE)
out <- cbind(lat = a[, 1], setNames(stack(a[-1]), c("precip", "lon")))
head(out)
# lat precip lon
# 1 45.0 77.63 -105
# 2 42.5 76.15 -105
# 3 40.0 72.18 -105
# 4 37.5 78.60 -105
# 5 35.0 80.93 -105
# 6 32.5 87.29 -105
tail(out)
# lat precip lon
# 99 40.0 136.05 -75
# 100 37.5 NA -75
# 101 35.0 NA -75
# 102 32.5 NA -75
# 103 30.0 NA -75
# 104 27.5 NA -75
For the record, this would be my approach with 'reshape2':
library(reshape2)
a <- read.csv("path/to/Pmaxupscaled.csv", check.names = FALSE)
names(a)[1] <- "lat"
out <- melt(a, id.vars="lat", value.name="precip", variable.name="lon")
head(out)
# lat precip lon
# 1 45.0 77.63 -105
# 2 42.5 76.15 -105
# 3 40.0 72.18 -105
# 4 37.5 78.60 -105
# 5 35.0 80.93 -105
# 6 32.5 87.29 -105
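One thing to watch for: both stack() and melt() return the former column names as a factor, so `lon` prints as -105 but is not yet a number. A minimal sketch of the conversion, using a hypothetical cut-down copy of the sample data and the base R approach (the `out_complete` name is made up for illustration):

```r
# Hypothetical cut-down sample of the grid from the question
a <- read.table(text = "lat -124 -125 -126
45 120 110 NA
44 NA 130 205", header = TRUE, check.names = FALSE)

out <- cbind(lat = a[, 1], setNames(stack(a[-1]), c("precip", "lon")))

# Convert via as.character() so the labels are parsed,
# not the factor's internal integer codes
out$lon <- as.numeric(as.character(out$lon))

# Optionally drop grid cells that have no measurement
out_complete <- out[!is.na(out$precip), ]

str(out)
```

Calling as.numeric() directly on the factor would return the level codes (1, 2, 3, ...) instead of the longitudes, which is why the as.character() step is there.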