Using the measurements package

Date: 2017-07-24 17:52:21

Tags: r units-of-measurement

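Short version: I have a data frame with two columns, latitude and longitude, both given in degrees, minutes and seconds. After reading some similar questions, I decided to use the measurements package for the conversion, but when converting from one unit to the other the results get scrambled (see below).

Long version:

Given the following data frame, in which the two columns Latitude and Longitude are expressed in degrees, minutes and seconds: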

df = data.frame(
      Latitude = c("15° 33' 9\"",NA,"52° 58' 13\"", NA, "21° 1' 28\"", "21° 2' 26\"", "10° 47' 31\"", NA, "-34° 53' 38\"", "41° 7' 56\""), 
      Longitude = c("48° 30' 59\"", NA, "-3° 10' 13\"", NA, "105° 50' 34\"", "105° 47' 52\"", "106° 41' 29\"", NA, "-56° 8' 16\"", "-104° 46' 30\""))

I want to use the measurements package to convert these values to decimal degrees, as follows:

library(measurements)
library(stringr)  # provides str_replace

# Strip the degree, minute and second symbols, leaving the space-separated
# values that measurements::conv_unit expects.
df$Latitude = str_replace(df$Latitude, "°", "")
df$Latitude = str_replace(df$Latitude, "'", "")
df$Latitude = str_replace(df$Latitude, "\"", "")

df$Longitude = str_replace(df$Longitude, "°", "")
df$Longitude = str_replace(df$Longitude, "'", "")
df$Longitude = str_replace(df$Longitude, "\"", "")


# Use measurements::conv_unit to convert to decimal degrees.
df$Latitude = conv_unit(df$Latitude, "deg_min_sec", "dec_deg")
df$Longitude = conv_unit(df$Longitude, "deg_min_sec", "dec_deg")

However, I get the following output:

> df
       Latitude     Longitude     Latitude_dec    Longitude_dec
1    15° 33' 9"   48° 30' 59"          15.5525 48.5163888888889
2          <NA>          <NA>             <NA>             <NA>
3   52° 58' 13"   -3° 10' 13"             <NA>             <NA>
4          <NA>          <NA>           1.4725 50.5958333333333
5    21° 1' 28"  105° 50' 34" 2.43611111111111 47.8961111111111
6    21° 2' 26"  105° 47' 52"             <NA>             <NA>
7   10° 47' 31"  106° 41' 29" 34.8938888888889 56.1377777777778
8          <NA>          <NA> 41.1322222222222          104.775
9  -34° 53' 38"   -56° 8' 16"               -0               -0
10   41° 7' 56" -104° 46' 30"                0               -0

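As you can see, the calculated fields in the first row appear to be correct, but from row 3 onward the results are scrambled and therefore completely useless.

I have read ?conv_unit several times, but I can't find anything wrong. What am I doing wrong?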

1 Answer:

Answer 0 (score: 2)

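conv_unit apparently breaks when NA values are present, presumably because of the way it parses its input with unlist(strsplit(...)), as in this line of the source code: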

secs = lapply(split(as.numeric(unlist(strsplit(x, 
                " "))) * c(3600, 60, 1), f = rep(1:length(x), 
                each = 3)), sum)
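
To see the misalignment concretely, here is a minimal base-R sketch. The vector x is made up for illustration, and the sketch only mirrors the strsplit/unlist step quoted above rather than calling conv_unit itself. strsplit returns a single NA for a missing entry instead of three pieces, so the flattened values no longer line up with the c(3600, 60, 1) multipliers or with the each = 3 grouping.

```r
# Space-separated degrees/minutes/seconds with one missing value
# (illustrative data, not taken from the question).
x <- c("15 33 9", NA, "52 58 13")

# strsplit() yields one NA for the missing entry, not three pieces.
parts <- strsplit(x, " ")
lengths(parts)
# 3 1 3

# Flattening gives 7 values where the grouping expects 9, so every value
# after the NA is paired with the wrong multiplier and the wrong row.
flat <- as.numeric(unlist(parts))
length(flat)
# 7
```

This is consistent with the question's output, where row 1 is correct and the rows after the first NA are assembled from the wrong pieces.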

So I think you need to skip the NA values when converting, like this:

library(measurements)

df = data.frame(
   Latitude = c("15° 33' 9\"",NA,"52° 58' 13\"", NA, "21° 1' 28\"", "21° 2' 26\"", "10° 47' 31\"", NA, "-34° 53' 38\"", "41° 7' 56\""), 
   Longitude = c("48° 30' 59\"", NA, "-3° 10' 13\"", NA, "105° 50' 34\"", "105° 47' 52\"", "106° 41' 29\"", NA, "-56° 8' 16\"", "-104° 46' 30\""))

# Strip the degree, minute and second symbols, leaving the space-separated
# values that measurements::conv_unit expects.
# NOTE: this can be done in one or two lines using the regex "or" operator (|).
#  - I would think this could be done with stringr::str_replace too,
#  - but I don't know how.
df$Latitude = gsub("°|'|\"", "", df$Latitude)
df$Longitude = gsub("°|'|\"", "", df$Longitude)

# Use measurements::conv_unit to convert to decimal degrees.
not_na <- !is.na(df$Latitude)  # identify non-NA rows (assuming Longitude is NA in the same rows)
# Convert only the non-NA values.
df$Latitude[not_na] = conv_unit(df$Latitude[not_na], "deg_min_sec", "dec_deg")
df$Longitude[not_na] = conv_unit(df$Longitude[not_na], "deg_min_sec", "dec_deg")

which gives:

df
            Latitude         Longitude
1            15.5525  48.5163888888889
2               <NA>              <NA>
3   52.9702777777778 -3.17027777777778
4               <NA>              <NA>
5   21.0244444444444  105.842777777778
6   21.0405555555556  105.797777777778
7   10.7919444444444  106.691388888889
8               <NA>              <NA>
9  -34.8938888888889 -56.1377777777778
10  41.1322222222222          -104.775
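
If relying on how conv_unit treats NA is a concern, the conversion can also be written by hand. The following is a sketch in base R, not the measurements package's own method. dms_to_decimal is a hypothetical helper name, and the code assumes every non-NA value has exactly three space-separated parts once the symbols are stripped. Unlike the code above, which assigns into a character column and so leaves the results as strings, it returns a numeric vector.

```r
# Hypothetical helper, not part of the measurements package.
# Converts "deg min sec" strings to numeric decimal degrees; NA stays NA.
dms_to_decimal <- function(x) {
  x <- trimws(as.character(x))   # as.character() also handles factor columns
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)
  if (!any(ok)) return(out)

  # 3 x n matrix: rows are degrees, minutes, seconds.
  vals <- vapply(strsplit(x[ok], " +"),
                 function(p) as.numeric(p[1:3]),
                 numeric(3))

  # The sign applies to the whole coordinate, so take it from the string
  # and add the minutes and seconds to the absolute degrees.
  sgn <- ifelse(startsWith(x[ok], "-"), -1, 1)
  out[ok] <- sgn * (abs(vals[1, ]) + vals[2, ] / 60 + vals[3, ] / 3600)
  out
}

dms_to_decimal(c("15 33 9", NA, "-34 53 38"))
```

It could be applied as df$Latitude = dms_to_decimal(df$Latitude) after the gsub step. For the stripping itself, stringr::str_replace_all accepts the same "°|'|\"" pattern and replaces every match.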