I have the following California housing dataset:
head(calif_cluster,15)
MedianHouseValue MedianIncome MedianHouseAge TotalRooms TotalBedrooms Population
1 190300 4.20510 16 2697.00 490.00 1462
2 150800 2.54810 33 2821.00 652.00 1206
3 252600 6.08290 17 6213.20 1276.05 3288
4 269700 4.03680 52 919.00 213.00 413
5 91200 1.63680 28 3072.00 790.00 1375
6 66200 2.18980 30 744.00 156.00 410
7 148800 2.63640 39 620.95 136.00 348
8 384800 4.46150 20 2270.00 498.00 1070
9 153200 2.75000 22 1931.00 445.00 1009
10 66200 1.60057 36 973.00 219.00 613
11 461500 3.78130 43 3070.00 668.00 1240
12 144600 2.85000 22 5175.00 1213.00 2804
13 143700 5.09410 8 6213.20 1276.05 3288
14 195500 5.30620 16 2918.00 444.00 1697
15 268800 2.42110 22 620.95 136.00 348
Households Latitude Longitude cluster_kmeans gender_dom marital race edu_level rental
1 515 38.48 -122.47 1 M other black jrcollege rented
2 640 38.00 -122.13 1 F other hispanic doctorate owned
3 1162 33.88 -117.79 3 M other white jrcollege owned
4 193 37.85 -122.25 1 M single others jrcollege owned
5 705 38.13 -122.26 1 F single white doctorate rented
6 165 38.96 -122.21 1 F single others jrcollege owned
7 125 34.01 -118.18 2 M married others postgrad owned
8 521 33.83 -118.38 2 F single white graduate rented
9 407 38.95 -121.04 1 M married others postgrad leased
10 187 35.34 -119.01 2 M single hispanic doctorate owned
11 646 33.76 -118.12 2 F other others highschl leased
12 1091 37.95 -122.05 3 M other white graduate rented
13 1162 36.87 -119.75 3 M other others postgrad leased
14 444 32.93 -117.13 2 M other asian jrcollege owned
15 125 37.71 -120.98 1 F single asian postgrad leased
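Since the dataset contains latitude and longitude, I would like to use R to extract the corresponding county for each location. If possible, I would also like to get the county seat (or largest city) of each extracted county. This should make my stratified analysis more insightful; I plan to do some clustering and mapping exercises.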
Answer 0 (score: 3)
Take a look at ggmap::revgeocode.
Code
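library(ggmap)
library(dplyr)
library(tidyr)     # separate() comes from tidyr
library(magrittr)  # provides the %<>% assignment pipe
# Recent ggmap versions need a Google API key first: register_google(key = "...")
revgeocode(c(-122.47, 38.48)) # longitude first, then latitude
# [1] "2233 Sulphur Springs Ave, St Helena, CA 94574, USA"
# Add the full address for each row using the Google API through ggmap
df12 %<>% rowwise %>% mutate(address = revgeocode(c(Longitude, Latitude))) %>% ungroup
# Split the address on commas; note that the third part, named "county" here,
# actually holds the state and ZIP code rather than a county name
df12 %<>% separate(address, c("street_address", "city", "county", "country"), remove = FALSE, sep = ",")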
Result
df12 %>% select(Longitude,Latitude,address,county)
# A tibble: 15 x 4
# Longitude Latitude address county
# * <dbl> <dbl> <chr> <chr>
# 1 -122.47 38.48 2233 Sulphur Springs Ave, St Helena, CA 94574, USA CA 94574
# 2 -122.13 38.00 3400-3410 Brookside Dr, Martinez, CA 94553, USA CA 94553
# 3 -117.79 33.88 19721 Bluefield Plaza, Yorba Linda, CA 92886, USA CA 92886
# 4 -122.25 37.85 6365 Florio St, Oakland, CA 94618, USA CA 94618
# 5 -122.26 38.13 119 Mimosa Ct, Vallejo, CA 94589, USA CA 94589
# 6 -122.21 38.96 Unnamed Road, Arbuckle, CA 95912, USA CA 95912
# 7 -118.18 34.01 4360-4414 Noakes St, Los Angeles, CA 90023, USA CA 90023
# 8 -118.38 33.83 903 Serpentine St, Redondo Beach, CA 90277, USA CA 90277
# 9 -121.04 38.95 14666-14690 Musso Rd, Auburn, CA 95603, USA CA 95603
# 10 -119.01 35.34 800 Ming Ave, Bakersfield, CA 93307, USA CA 93307
# 11 -118.12 33.76 6211-6295 E Marina Dr, Long Beach, CA 90803, USA CA 90803
# 12 -122.05 37.95 1120 Carey Dr, Concord, CA 94520, USA CA 94520
# 13 -119.75 36.87 1815-1899 E Pryor Dr, Fresno, CA 93720, USA CA 93720
# 14 -117.13 32.93 9010-9016 Danube Ln, San Diego, CA 92126, USA CA 92126
# 15 -120.98 37.71 748-1298 Claribel Rd, Modesto, CA 95356, USA CA 95356
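Note that the county column in the result above holds the state and ZIP code, not a county name. As one possible way to get the county itself, here is a minimal sketch that is not part of the original answer: it assumes the maps package is installed and uses its map.where() function to look up which county polygon contains each point. The data frame pts and its three rows (copied from the question's coordinates) are only for illustration, and points that fall outside the package's fairly coarse polygons (for example, right on the coast) come back as NA.

```r
library(maps)  # assumption: the maps package is installed

# First three coordinate pairs from the question's data
pts <- data.frame(
  Longitude = c(-122.47, -122.13, -117.79),
  Latitude  = c(38.48, 38.00, 33.88)
)

# map.where() returns "state,county" for each point, or NA when the
# point is not inside any polygon of the chosen database
region <- map.where(database = "county", x = pts$Longitude, y = pts$Latitude)

# Drop the leading "state," part and any ":subregion" suffix,
# leaving only the lower-case county name
pts$county <- sub(":.*$", "", sub("^[^,]*,", "", region))
pts
```

This approach works offline and needs no API key, at the cost of less precise boundaries than a geocoding service.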
Data
df1 <- read.table(text = "MedianHouseValue MedianIncome MedianHouseAge TotalRooms TotalBedrooms Population
1 190300 4.20510 16 2697.00 490.00 1462
2 150800 2.54810 33 2821.00 652.00 1206
3 252600 6.08290 17 6213.20 1276.05 3288
4 269700 4.03680 52 919.00 213.00 413
5 91200 1.63680 28 3072.00 790.00 1375
6 66200 2.18980 30 744.00 156.00 410
7 148800 2.63640 39 620.95 136.00 348
8 384800 4.46150 20 2270.00 498.00 1070
9 153200 2.75000 22 1931.00 445.00 1009
10 66200 1.60057 36 973.00 219.00 613
11 461500 3.78130 43 3070.00 668.00 1240
12 144600 2.85000 22 5175.00 1213.00 2804
13 143700 5.09410 8 6213.20 1276.05 3288
14 195500 5.30620 16 2918.00 444.00 1697
15 268800 2.42110 22 620.95 136.00 348",header=T,stringsAsFactors=F)
df2 <- read.table(text = "Households Latitude Longitude cluster_kmeans gender_dom marital race edu_level rental
1 515 38.48 -122.47 1 M other black jrcollege rented
2 640 38.00 -122.13 1 F other hispanic doctorate owned
3 1162 33.88 -117.79 3 M other white jrcollege owned
4 193 37.85 -122.25 1 M single others jrcollege owned
5 705 38.13 -122.26 1 F single white doctorate rented
6 165 38.96 -122.21 1 F single others jrcollege owned
7 125 34.01 -118.18 2 M married others postgrad owned
8 521 33.83 -118.38 2 F single white graduate rented
9 407 38.95 -121.04 1 M married others postgrad leased
10 187 35.34 -119.01 2 M single hispanic doctorate owned
11 646 33.76 -118.12 2 F other others highschl leased
12 1091 37.95 -122.05 3 M other white graduate rented
13 1162 36.87 -119.75 3 M other others postgrad leased
14 444 32.93 -117.13 2 M other asian jrcollege owned
15 125 37.71 -120.98 1 F single asian postgrad leased",header=T,stringsAsFactors=F)
df12 <- cbind(df1,df2)
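I don't think the library offers an option to get the county seat or largest city of a county, but building a lookup table from information available online shouldn't be much trouble.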
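A lookup table like that could be joined on with base R's merge(). The sketch below is illustrative only: the names blocks and county_seats are hypothetical, the county values are filled in by hand from the cities in the addresses above, and the lookup table is deliberately left incomplete to show what happens to counties with no match.

```r
# Hypothetical lookup table; in practice it would be filled in from a
# public list of California counties and their county seats
county_seats <- data.frame(
  county      = c("napa", "contra costa", "orange"),
  county_seat = c("Napa", "Martinez", "Santa Ana"),
  stringsAsFactors = FALSE
)

# A few rows of the data with a county column already attached
# (values entered by hand here for illustration)
blocks <- data.frame(
  Longitude = c(-122.47, -122.13, -117.79, -119.01),
  Latitude  = c(38.48, 38.00, 33.88, 35.34),
  county    = c("napa", "contra costa", "orange", "kern"),
  stringsAsFactors = FALSE
)

# Left join: all.x = TRUE keeps every row of blocks, and counties
# missing from the lookup table get NA as their county seat
blocks_with_seat <- merge(blocks, county_seats, by = "county", all.x = TRUE)
blocks_with_seat
```

Keeping unmatched rows as NA makes it easy to spot which counties still need an entry in the lookup table.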