Negating the "%in%" operator

Time: 2018-09-03 21:26:05

Tags: filter subset negate

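I have a dataset containing geographic data, which I am subsetting into regions. I would like to group the many scattered foreign cities and countries into a single International data frame. I have written code that does this, but it is long and laborious. Essentially, I just need code that finds every value that is not in the other 7 regional data frames I created.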

{r} table(AdmitsCleaned$State)

               AE             Ajman             Anhui         Ar Riyadh 
                2                 1                 1                 1 
               AS                AZ           Ba Dinh           Beijing 
                1                14                 1                 2 
               CA Casablanca-Settat   Central Visayas             Chiba 
               70                 1                 1                 1 
               CO                CT                DC                DE 
               10                28                 7                20 
            Delhi Distrito Nacional                FL           Fukuoka 
                1                 4                12                 1 
               GA           Gujarat          Gyeonggi            Ha Noi 
                7                 2                 1                 1 
            Hanoi             Henan                HI             Hyogo 
                1                 1                 1                 2 
               IA                IL                IN             Iwate 
                1                21                 7                 1 
        Jeonrabuk           Jiangsu           Jiangxi          Kanagawa 
                1                 1                 1                 2 
        Karnataka         Kathmandu            Kerala                KY 
                1                 1                 1                 1 
               LA          Lalitpur          Lam Dong                MA 
                2                 1                 1                29 
           Madrid       Maharashtra                MD                ME 
                1                 8               123                 8 
               MI                MN                MO           Nairobi 
                4                 5                 4                 1 
               NC                ND                NH           Nicosia 
               14                 1                 9                 1 
               NJ                NM                NV                NY 
              123                 4                 4               122 
             OGUN                OH                OK                OR 
                1                17                 2                10 
            ouest        Overijssel                PA            Punjab 
                1                 1               795                 2 

^ This is an overview of the State variable.

Here are the regions I created:

{r}
Southwest_df = c("AZ", "OK", "NM", "TX")
sum(AdmitsCleaned$State %in% Southwest_df)

{r}
Midwest_df = c("WI", "IA", "IN", "ND", "IL", "MI", "OR", "MN", "OH")
sum(AdmitsCleaned$State %in% Midwest_df)

and so on...
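The per-region vectors could also be kept in one named list, so that the counts for every region come from a single call. This is only a sketch: the toy `AdmitsCleaned` data frame and the `regions` list are made up for illustration, and only the two regions shown above are included.

```r
# Toy stand-in for the real data (hypothetical values, not the real dataset)
AdmitsCleaned <- data.frame(
  State = c("AZ", "TX", "IL", "OH", "Beijing", "Madrid", "PA"),
  stringsAsFactors = FALSE
)

# One named list holding the region definitions from the question
# (the name `regions` is an assumption for this sketch)
regions <- list(
  Southwest = c("AZ", "OK", "NM", "TX"),
  Midwest   = c("WI", "IA", "IN", "ND", "IL", "MI", "OR", "MN", "OH")
)

# Count the rows that fall in each region
region_counts <- sapply(
  regions,
  function(codes) sum(AdmitsCleaned$State %in% codes)
)
print(region_counts)
```

Keeping the definitions in one list also makes it easy to get every assigned code at once with `unlist(regions)`.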

For the international students, I had to type out every unique value by hand, as shown below:

{r}
International_df = c("Ajman", "Anhui", "Ar Riyadh", "Ba Dinh", "Beijing", "Casablanca-Settat", "Central Visayas", "Chiba", "Delhi", "Distrito Nacional", "Fukuoka", "Gujarat", "Gyeonggi", "Ha Noi", "Hanoi", "Henan", "Hyogo", "Iwate", "Jeonrabuk", "Jiangsu", "Jiangxi", "Kanagawa", "Karnataka", "Kathmandu", "Kerala", "Lalitpur", "Lam Dong", "Madrid", "Maharashtra", "Nairobi", "Nicosia", "OGUN", "ouest", "Overijssel", "Punjab", "Rajasthan", "Seoul", "Sichuan", "Thai Nguyen")
sum(AdmitsCleaned$State %in% International_df)
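
A hand-typed vector like this has to match the data character for character, so it may be safer to derive it from the data. One possibility is to take the unique values of `State` and drop everything already assigned to a region with `setdiff()`. A sketch under assumptions: the data frame below is toy data, and `all_regions` is a hypothetical name for the concatenation of all the regional vectors (only two are shown here).

```r
# Toy stand-in for the real data (hypothetical values)
AdmitsCleaned <- data.frame(
  State = c("AZ", "TX", "IL", "Beijing", "Madrid", "Beijing"),
  stringsAsFactors = FALSE
)

Southwest_df <- c("AZ", "OK", "NM", "TX")
Midwest_df   <- c("WI", "IA", "IN", "ND", "IL", "MI", "OR", "MN", "OH")

# Every code that already belongs to some region; in the real script
# this would concatenate all of the regional vectors
all_regions <- c(Southwest_df, Midwest_df)

# Values that occur in the data but in none of the regions
International_df <- setdiff(unique(AdmitsCleaned$State), all_regions)
print(International_df)

# Same count as before, without typing the values by hand
international_count <- sum(AdmitsCleaned$State %in% International_df)
print(international_count)
```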

Is there a better way?
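
One direction that may work, going by the title: `%in%` returns a logical vector, so it can be negated with `!`. The sketch below uses toy data, and `domestic_codes`, `is_international`, and `%notin%` are hypothetical names introduced here; in the real script `domestic_codes` would combine all 7 regional vectors.

```r
# Toy stand-in for the real data (hypothetical values)
AdmitsCleaned <- data.frame(
  State = c("AZ", "TX", "IL", "Beijing", "Madrid", "Beijing"),
  stringsAsFactors = FALSE
)

Southwest_df <- c("AZ", "OK", "NM", "TX")
Midwest_df   <- c("WI", "IA", "IN", "ND", "IL", "MI", "OR", "MN", "OH")
domestic_codes <- c(Southwest_df, Midwest_df)

# TRUE for rows whose State is in none of the regional vectors
is_international <- !(AdmitsCleaned$State %in% domestic_codes)
n_international <- sum(is_international)
print(n_international)

# Subset the rows into their own data frame
International <- AdmitsCleaned[is_international, , drop = FALSE]
print(International)

# Optional: a reusable "not in" operator built with base R's Negate()
`%notin%` <- Negate(`%in%`)
n_international_alt <- sum(AdmitsCleaned$State %notin% domestic_codes)
```

Note that `NA %in% x` is `FALSE`, so rows with a missing `State` would be counted as international by this approach unless they are filtered out first.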

0 answers:

No answers