R: Split variables from a data frame and find the unique ones

Time: 2018-06-03 20:40:24

Tags: r list dataframe unique

I have a tibble with 28 rows:

> al
# A tibble: 28 x 1
   lang_name                                               
   <chr>                                                   
 1 Objective-C,Swift,Other                                 
 2 Ruby,Shell                                              
 3 Ruby,HTML,Shell                                         
 4 Java,HTML,Kotlin,Other                                  
 5 TypeScript,JavaScript,CSS,Inno Setup,Shell,HTML         
 6 Vue,JavaScript,CSS,HTML                                 
 7 HTML,JavaScript,CSS                                     
 8 JavaScript,HTML,CSS,Other                               
 9 NA                                                      
10 Vim script,Ruby,Shell,Python,CoffeeScript,Makefile,Other
# ... with 18 more rows

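I got it by slicing another data frame with al <- gh[,'lang_name']. I want to extract the data from every row and put it all in one list so that I can find the unique values.

How can I do this?

I tried splitting it with al <- str_split(al, ","), but that returns the following list: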

[[1]]
  [1] "c(\"Objective-C"  "Swift"            "Other\""          " \"Ruby"         
  [5] "Shell\""          " \"Ruby"          "HTML"             "Shell\""         
  [9] " \"Java"          "HTML"             "Kotlin"           "Other\""         
 [13] " \"TypeScript"    "JavaScript"       "CSS"              "Inno Setup"      
 [17] "Shell"            "HTML\""           " \"Vue"           "JavaScript"      
 [21] "CSS"              "HTML\""           " \"HTML"          "JavaScript"      
 [25] "CSS\""            " \"JavaScript"    "HTML"             "CSS"             
 [29] "Other\""          " NA"              " \"Vim script"    "Ruby"            
 [33] "Shell"            "Python"           "CoffeeScript"     "Makefile"        
 [37] "Other\""          " \"PHP\""         " \"JavaScript"    "TypeScript"      
 [41] "Other\""          " \"JavaScript"    "Other\""          " \"JavaScript"   
 [45] "CSS"              "Shell\""          " \"Ruby"          "JavaScript"      
 [49] "HTML"             "Vue"              "CSS"              "Shell\""         
 [53] " \"Go"            "Assembly"         "HTML"             "C"               
 [57] "Shell"            "Perl\""           " \"Go"            "HCL"             
 [61] "Other\""          " \"JavaScript\""  " \"C++"           "JavaScript"      
 [65] "Python"           "Go"               "Shell"            "C\""             
 [69] " \n\"JavaScript"  "CSS"              "HTML"             "Other\""         
 [73] " \"C++"           "Cuda"             "C"                "CMake"           
 [77] "Java"             "Python"           "Other\""          " \"JavaScript"   
 [81] "GLSL\""           " \"JavaScript"    "TypeScript"       "CSS\""           
 [85] " \"Kotlin"        "C"                "Makefile"         "HTML"            
 [89] "C++"              "Java"             "Other\""          " \"Java"         
 [93] "Other\""          " \"Python"        "Jupyter Notebook" "C++"             
 [97] "HTML"             "Shell"            "JavaScript\""     " \"CSS"          
[101] "JavaScript"       "HTML"             "Other\""          " \"HTML"         
[105] "CSS"              "JavaScript\")"   

unique(al) just returns the same strings.
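As a rough sketch of what seems to be going on (an assumption based on the output above, not something confirmed in this thread): the one-column table is still a list of columns, so coercing it to character deparses the whole column into a single string beginning with c(. The two-row gh below is a hypothetical stand-in, and base R's as.character() and strsplit() are used here to imitate the coercion.

```r
# Hypothetical two-row stand-in for `gh`; the values are taken from the question.
gh <- data.frame(
  lang_name = c("Ruby,Shell", "Java,HTML"),
  stringsAsFactors = FALSE
)

# Single-bracket indexing by name keeps the data frame wrapper (a list of columns).
one_col <- gh["lang_name"]

# Coercing that list to character deparses the whole column into one string.
coerced <- as.character(one_col)
coerced

# Splitting the deparsed string yields pieces with stray quotes and a leading "c(".
strsplit(coerced, ",", fixed = TRUE)[[1]]

# Extracting the column itself gives a plain character vector, one element per row.
col_vec <- gh[["lang_name"]]
col_vec
```

Splitting col_vec instead of the whole table avoids the stray quotes.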

I also tried combining it all into a single character string:

al <- gh[1,'lang_name']
i = 2
while(i < nrow(gh)) {
    al <- paste(al, ",", gh[i+1,'lang_name'])
    i = i + 1
}

This results in the following character string: [1] "Objective-C,Swift,Other , Ruby,HTML,Shell , Java,HTML,Kotlin,Other , TypeScript,JavaScript,CSS,Inno Setup,Shell,HTML , Vue,JavaScript,CSS,HTML , HTML,JavaScript,CSS , JavaScript,HTML,CSS,Other , NA , Vim script,Ruby,Shell,Python,CoffeeScript,Makefile,Other , PHP , JavaScript,TypeScript,Other , JavaScript,Other , JavaScript,CSS,Shell , Ruby,JavaScript,HTML,Vue,CSS,Shell , Go,Assembly,HTML,C,Shell,Perl , Go,HCL,Other , JavaScript , C++,JavaScript,Python,Go,Shell,C , JavaScript,CSS,HTML,Other , C++,Cuda,C,CMake,Java,Python,Other , JavaScript,GLSL , JavaScript,TypeScript,CSS , Kotlin,C,Makefile,HTML,C++,Java,Other , Java,Other , Python,Jupyter Notebook,C++,HTML,Shell,JavaScript , CSS,JavaScript,HTML,Other , HTML,CSS,JavaScript"

I don't know how to convert this into separate strings so that I can run unique on it.

2 answers:

Answer 0: (score: 3)

I hope this gives you what you want:

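One way to do this in base R, offered as a minimal sketch and not necessarily the code this answer originally contained: split the column on commas, flatten the resulting list, and de-duplicate. The small al below is a hypothetical stand-in built from a few rows of the question.

```r
# Hypothetical stand-in for the asker's `al` (a few rows copied from the question).
al <- data.frame(
  lang_name = c("Ruby,Shell", "Ruby,HTML,Shell", NA, "Java,HTML,Kotlin,Other"),
  stringsAsFactors = FALSE
)

# Split each row on commas; strsplit() returns a list with one vector per row.
parts <- strsplit(al$lang_name, ",", fixed = TRUE)

# Flatten the list into one character vector, trim stray spaces, and de-duplicate.
langs <- unique(trimws(unlist(parts)))
langs
```

The key point is to pass the column (al$lang_name), not the whole table, to the splitting function. A missing row stays in the result as NA.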

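Answer 1: (score: 1)

If you like tidyverse / purrr functions, you can do this in a single piped step. stringr::str_split is a convenient wrapper around stringi::stri_split. purrr::reduce lets you apply a function repeatedly, in this case c, until the whole list of vectors returned by str_split has been reduced to a single character vector. unlist from base R would also work in place of reduce; I have a habit of being very purrr-focused for tasks like this, but that isn't necessarily the right default for a simple task.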

library(tidyverse)

al$lang_name %>%
  str_split(",") %>%
  reduce(c) %>%
  unique()
#>  [1] "Objective-C"  "Swift"        "Other"        "Ruby"        
#>  [5] "Shell"        "HTML"         "Java"         "Kotlin"      
#>  [9] "TypeScript"   "JavaScript"   "CSS"          "Inno Setup"  
#> [13] "Vue"          NA             "Vim script"   "Python"      
#> [17] "CoffeeScript" "Makefile"

Created on 2018-06-03 by the reprex package (v0.2.0).
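To illustrate the remark that unlist can stand in for reduce, here is a small base R sketch. It uses Reduce(), the base analogue of purrr::reduce, on a hypothetical list shaped like the output of str_split, and it also drops the NA entry, on the assumption that a missing row should not count as a language.

```r
# Hypothetical list, shaped like the result of splitting each row on ",".
pieces <- list(
  c("Ruby", "Shell"),
  c("Ruby", "HTML", "Shell"),
  NA_character_,
  c("Java", "HTML")
)

# Base R's Reduce() folds c() over the list, much as purrr::reduce(c) does.
via_reduce <- Reduce(c, pieces)

# unlist() flattens the same list in a single call.
via_unlist <- unlist(pieces)

# Both routes give the same character vector.
same <- identical(via_reduce, via_unlist)
same

# Drop the missing value before de-duplicating.
langs <- unique(via_unlist[!is.na(via_unlist)])
langs
```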