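I have a dataset containing geographic data, and I am splitting it into subsets by region. I would like to group the many scattered foreign cities and countries into a single International data frame. I have written code that does this, but it is long and laborious. Essentially, I just need code that finds every value that is not in the other 7 regional data frames I have created.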
{r}
table(AdmitsCleaned$State)
AE                  Ajman               Anhui               Ar Riyadh
2                   1                   1                   1
AS                  AZ                  Ba Dinh             Beijing
1                   14                  1                   2
CA                  Casablanca-Settat   Central Visayas     Chiba
70                  1                   1                   1
CO                  CT                  DC                  DE
10                  28                  7                   20
Delhi               Distrito Nacional   FL                  Fukuoka
1                   4                   12                  1
GA                  Gujarat             Gyeonggi            Ha Noi
7                   2                   1                   1
Hanoi               Henan               HI                  Hyogo
1                   1                   1                   2
IA                  IL                  IN                  Iwate
1                   21                  7                   1
Jeonrabuk           Jiangsu             Jiangxi             Kanagawa
1                   1                   1                   2
Karnataka           Kathmandu           Kerala              KY
1                   1                   1                   1
LA                  Lalitpur            Lam Dong            MA
2                   1                   1                   29
Madrid              Maharashtra         MD                  ME
1                   8                   123                 8
MI                  MN                  MO                  Nairobi
4                   5                   4                   1
NC                  ND                  NH                  Nicosia
14                  1                   9                   1
NJ                  NM                  NV                  NY
123                 4                   4                   122
OGUN                OH                  OK                  OR
1                   17                  2                   10
ouest               Overijssel          PA                  Punjab
1                   1                   795                 2
^ This is an overview of the State variable.
Here are the regions I have created:
{r}
Southwest_df = c("AZ" , "OK" , "NM" , "TX")
sum(AdmitsCleaned$State %in% Southwest_df)
{r}
Midwest_df = c("WI" , "IA" , "IN" , "ND" , "IL" , "MI" , "OR" , "MN" , "OH")
sum(AdmitsCleaned$State %in% Midwest_df)
And so on...
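The per-region counts above could also be produced in one pass by keeping the region vectors in a named list. This is only a minimal sketch: `state` is a made-up stand-in for `AdmitsCleaned$State`, and the `regions` list and `region_counts` names are hypothetical, not part of my actual code.

```r
# Toy stand-in for AdmitsCleaned$State (made-up values, for illustration only)
state <- c("AZ", "OK", "IL", "IL", "Beijing", "PA")

# Hypothetical named list holding the region vectors defined above
regions <- list(
  Southwest = c("AZ", "OK", "NM", "TX"),
  Midwest   = c("WI", "IA", "IN", "ND", "IL", "MI", "OR", "MN", "OH")
)

# For each region, count how many rows have a State code in that region
region_counts <- sapply(regions, function(codes) sum(state %in% codes))
print(region_counts)
```

Adding a region then only means adding one entry to the list, rather than another copy of the `sum(... %in% ...)` line.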
For the international students, I had to type in every unique value by hand, like this:
{r}
# Spellings must match the values in AdmitsCleaned$State exactly, otherwise %in% will not count them
International_df = c("Ajman", "Anhui", "Ar Riyadh", "Ba Dinh", "Beijing", "Casablanca-Settat",
                     "Central Visayas", "Chiba", "Delhi", "Distrito Nacional", "Fukuoka", "Gujarat",
                     "Gyeonggi", "Ha Noi", "Hanoi", "Henan", "Hyogo", "Iwate", "Jeonrabuk", "Jiangsu",
                     "Jiangxi", "Kanagawa", "Karnataka", "Kathmandu", "Kerala", "Lalitpur", "Lam Dong",
                     "Madrid", "Maharashtra", "Nairobi", "Nicosia", "OGUN", "ouest", "Overijssel",
                     "Punjab", "Rajasthan", "Seoul", "Sichuan", "Thai Nguyen")
sum(AdmitsCleaned$State %in% International_df)
Is there a better way?
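One possible direction, sketched here with made-up data, is to treat "international" as everything that is not in any of the domestic region vectors, using `%in%` and `setdiff()` from base R. The `state` vector, the `regions` list (including the `Northeast` entry), and the names `domestic`, `international_values`, and `n_international` are all assumptions for illustration, not my real data.

```r
# Toy stand-in for AdmitsCleaned$State (made-up values, for illustration only)
state <- c("AZ", "IL", "Beijing", "Madrid", "PA", "Beijing")

# Hypothetical region vectors; the real ones would be the 7 regional vectors
regions <- list(
  Southwest = c("AZ", "OK", "NM", "TX"),
  Midwest   = c("IL", "OH"),
  Northeast = c("PA", "NJ", "NY")
)

# Pool every domestic code into one character vector
domestic <- unlist(regions, use.names = FALSE)

# Unique values that belong to no domestic region
international_values <- setdiff(unique(state), domestic)

# Number of rows that fall outside every domestic region
# (note: an NA state would also be counted here, since NA %in% x is FALSE)
n_international <- sum(!state %in% domestic)

print(international_values)
print(n_international)
```

With the real data, `domestic` would presumably be the concatenation of all seven regional vectors, and anything left over (including codes such as `AE` or `AS` that I did not assign to a region) would land in the international group, so the leftover list is worth checking by eye.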