I want to compare two CSV files and run a function/calculation when four conditions are met.
File 1:
SN CY Year Month Day Hour Lat Lon
196101 1 1961 1 14 12 8.3 134.7
196101 1 1961 1 14 18 8.8 133.4
196101 1 1961 1 15 0 9.1 132.5
196101 1 1961 1 15 6 9.3 132.2
196101 1 1961 1 15 12 9.5 132
196101 1 1961 1 15 18 9.9 131.8
196125 1 1961 1 14 12 10.0 136
196125 1 1961 1 14 18 10.5 136.5
File 2:
Year Month Day RR Hour Lat Lon
1961 1 14 0 0 14.0917 121.055
1961 1 14 0 6 14.0917 121.055
1961 1 14 0 12 14.0917 121.055
1961 1 14 0 18 14.0917 121.055
1961 1 15 0 0 14.0917 121.055
1961 1 15 0 6 14.0917 121.055
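I am trying to compute the distance between the Lat-Lon points in the two files wherever they share the same year, month, day, and hour. Here is my code: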
jtwc <-read.csv("file1.csv",header=T,sep=",")
stn <-read.csv("file2.csv",header=T,sep=",")
dms_to_rad <- function(d, m, s) (d + m / 60 + s / 3600) * pi / 180
great_circle_distance <- function(lat1, long1, lat2, long2) {
a <- sin(0.5 * (lat2 - lat1))
b <- sin(0.5 * (long2 - long1))
12742 * asin(sqrt(a * a + cos(lat1) * cos(lat2) * b * b))
}
jtwc$dist<- great_circle_distance(dms_to_rad(jtwc$Lat,0,0),dms_to_rad(jtwc$Lon,0,0),dms_to_rad(stn$Lat,0,0),dms_to_rad(stn$Lon,0,0))
write.csv(stn,file="dist.csv",row.names=T)
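The "SN" column is the unique identifier in file 1. What I want to do:
[1] Compute the distance (jtwc$dist) when the two files have the same year, month, day, and hour.
[2] If rows in file 1 share the same year, month, day, and hour but have different SN numbers, each of them should use the file 2 row with the same year, month, day, and hour to compute the distance.
The output should look like this: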
SN CY Year Month Day Hour Lat Lon dist
196101 1 1961 1 14 12 8.3 134.7 1620.961
196101 1 1961 1 14 18 8.8 133.4 1467.859
196101 1 1961 1 15 0 9.1 132.5 1334.382
196101 1 1961 1 15 6 9.3 132.2 1324.915
196125 1 1961 1 14 12 10.0 136 1687.127
196125 1 1961 1 14 18 10.5 136.5 1724.351
Any suggestions on how to do this correctly?
Answer 0 (score: 2)
If I understand you correctly, you can try this solution:
library(tidyverse)
#functions
dms_to_rad <- function(d, m, s) (d + m / 60 + s / 3600) * pi / 180
great_circle_distance <- function(lat1, long1, lat2, long2) {
a <- sin(0.5 * (lat2 - lat1))
b <- sin(0.5 * (long2 - long1))
12742 * asin(sqrt(a * a + cos(lat1) * cos(lat2) * b * b))
}
#read file
dir1 = 'path_to_your_files'  # path to file 1
dir2 = 'path_to_your_files'  # path to file 2
jtwc <- read.csv(dir1) %>%
  unite('key', c('Year', 'Month', 'Day', 'Hour'))
stn <- read.csv(dir2) %>%
  unite('key', c('Year', 'Month', 'Day', 'Hour'))
#aggregating
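stn <- left_join(jtwc, stn, by = 'key') %>%
  drop_na() %>%
  # convert the four coordinate columns from decimal degrees to radians
  # (the function is passed directly because funs() is deprecated in recent dplyr)
  mutate_at(vars(Lat.x, Lon.x, Lat.y, Lon.y), dms_to_rad, m = 0, s = 0) %>%
  mutate(dist = great_circle_distance(Lat.x, Lon.x, Lat.y, Lon.y))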
write.csv(stn,file="dist.csv",row.names=T)
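For comparison, here is a minimal base R sketch of the same idea that keeps the Year, Month, Day, and Hour columns in the result. It is only an illustration under a few assumptions: the sample rows from the question are typed inline instead of being read from file1.csv and file2.csv, the coordinates are taken to be decimal degrees, and the names `keys`, `merged`, and `result` are made up for this example. `merge()` performs an inner join by default, so file 1 rows with no matching time in file 2 are dropped, which seems to be what the expected output in the question shows.

```r
# Haversine distance in km; inputs are decimal degrees (Earth diameter taken as 12742 km).
great_circle_distance <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  a <- sin(0.5 * (lat2 - lat1) * to_rad)
  b <- sin(0.5 * (lon2 - lon1) * to_rad)
  12742 * asin(sqrt(a * a + cos(lat1 * to_rad) * cos(lat2 * to_rad) * b * b))
}

# Inline copies of the sample rows from the question (stand-ins for the two CSV files).
jtwc <- read.table(header = TRUE, text = "
SN CY Year Month Day Hour Lat Lon
196101 1 1961 1 14 12 8.3 134.7
196101 1 1961 1 14 18 8.8 133.4
196101 1 1961 1 15 0 9.1 132.5
196101 1 1961 1 15 6 9.3 132.2
196101 1 1961 1 15 12 9.5 132
196101 1 1961 1 15 18 9.9 131.8
196125 1 1961 1 14 12 10.0 136
196125 1 1961 1 14 18 10.5 136.5
")
stn <- read.table(header = TRUE, text = "
Year Month Day RR Hour Lat Lon
1961 1 14 0 0 14.0917 121.055
1961 1 14 0 6 14.0917 121.055
1961 1 14 0 12 14.0917 121.055
1961 1 14 0 18 14.0917 121.055
1961 1 15 0 0 14.0917 121.055
1961 1 15 0 6 14.0917 121.055
")

# Join on the four time columns; the clashing Lat/Lon columns get .x (file 1) and .y (file 2).
keys <- c("Year", "Month", "Day", "Hour")
merged <- merge(jtwc, stn, by = keys)
merged$dist <- great_circle_distance(merged$Lat.x, merged$Lon.x,
                                     merged$Lat.y, merged$Lon.y)

# Restore the file 1 layout: sort by storm and time, keep the file 1 coordinates plus dist.
result <- merged[order(merged$SN, merged$Day, merged$Hour),
                 c("SN", "CY", keys, "Lat.x", "Lon.x", "dist")]
names(result)[names(result) %in% c("Lat.x", "Lon.x")] <- c("Lat", "Lon")
print(result, row.names = FALSE)
```

Because the join is on the time columns only, two storms (different SN) reported at the same time each pick up the same file 2 row, which covers point [2] of the question. With real data, replacing the two `read.table()` calls with `read.csv()` on the actual files should be the only change needed, provided the column names match.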