Split strings in R and put them into new columns in a new data frame

Time: 2014-10-12 09:35:52

Tags: r split character

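I would like to do this in R. I ran some code to extract the paths of all folders and subfolders, and got the list shown below. I want to apply a set of rules to it:

  1. If a line contains exactly one "/", replace that "/" with "/Folder/".
  2. If a line contains exactly two "/", do nothing.
  3. If a line contains three or more "/", leave the first and last "/" alone and replace all the remaining "/" with "-".

The code I ran to extract the file paths:

      b<-list.files(path="/Users/Mohit/Desktop/Company/Database",recursive=TRUE)
    
       [1] "Accounts/Academic History.pdf"                       "Accounts/Contract.pdf"                              
       [3] "Accounts/Credit/Analyst/Banking/TFileOutput.txt"     "Accounts/Credit/Analyst/untitled.jpg"               
       [5] "Accounts/Credit/background.jpg"                      "Accounts/Credit/background.xcf"                     
       [7] "Accounts/Debit/index.html"                           "Human Resources/RStudio-0.98.1073.dmg"              
       [9] "Information Technology/Iti.pdf"                      "Logistics/1610085_10152585224658626_398303669_n.jpg"
      [11] "Sales/947309_10152376144413626_1056138683_n.jpg"    
    

I can't work out which function to use. Perhaps the stringr package with sapply? I want to put the result in columns with headers and export it as a text file.

Any help is much appreciated.

Many thanks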

1 Answer:

Answer 0: (score: 1)

This might help:

    library(stringr)

    # Count the "/" separators in each path
    Ct <- str_count(b, "/")

    # One "/": insert "Folder". Three or more: keep the first and last "/"
    # and turn the ones in between into "-". Otherwise leave the path alone.
    b1 <- ifelse(Ct == 1, gsub("[/]", "/Folder/", b),
          ifelse(Ct >= 3, gsub("(^([^/]+[/])|([/][^/]+)$)(*SKIP)(*F)|[/]", "-", b,
                               perl = TRUE), b))

    b1
    #[1] "Accounts/Folder/Academic History.pdf"                      
    #[2] "Accounts/Folder/Contract.pdf"                              
    #[3] "Accounts/Credit-Analyst-Banking/TFileOutput.txt"           
    #[4] "Accounts/Credit-Analyst/untitled.jpg"                      
    #[5] "Accounts/Credit/background.jpg"                            
    #[6] "Accounts/Credit/background.xcf"                            
    #[7] "Accounts/Debit/index.html"                                 
    #[8] "Human Resources/Folder/RStudio-0.98.1073.dmg"              
    #[9] "Information Technology/Folder/Iti.pdf"                     
    #[10] "Logistics/Folder/1610085_10152585224658626_398303669_n.jpg"
    #[11] "Sales/Folder/947309_10152376144413626_1056138683_n.jpg"    
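
The `(*SKIP)(*F)` pattern above is compact but hard to read. As a rough alternative sketch (not part of the original answer), the same three rules can be applied by splitting each path on "/" and re-joining the pieces. The helper name `rename_path` is made up for illustration, and the sketch assumes paths have no trailing "/", which holds for `list.files()` output.

```r
# Hypothetical helper: apply the three rules to a single path
rename_path <- function(p) {
  parts <- strsplit(p, "/", fixed = TRUE)[[1]]
  n_sep <- length(parts) - 1L            # number of "/" in the path
  if (n_sep == 1L) {
    # Rule 1: one "/" becomes "/Folder/"
    paste(parts[1], "Folder", parts[2], sep = "/")
  } else if (n_sep >= 3L) {
    # Rule 3: keep the first and last "/", join the middle pieces with "-"
    last <- length(parts)
    middle <- paste(parts[2:(last - 1L)], collapse = "-")
    paste(parts[1], middle, parts[last], sep = "/")
  } else {
    # Rule 2 (and paths with no "/"): leave unchanged
    p
  }
}

paths <- c("Accounts/Contract.pdf",
           "Accounts/Credit/background.jpg",
           "Accounts/Credit/Analyst/Banking/TFileOutput.txt")
renamed <- vapply(paths, rename_path, character(1), USE.NAMES = FALSE)
print(renamed)
```

This version needs no extra packages and no PCRE verbs, at the cost of an explicit loop over the paths via `vapply`.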

If you want to create a data.frame and then export it to a .txt file:

    dat <- data.frame(b, b1, stringsAsFactors=FALSE)
    write.table(dat, file="Mohit.txt", quote=FALSE, row.names=FALSE, sep=",")

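Update

If you need to create 3 columns from b1:

    # quote="" and comment.char="" keep quotes and "#" in file names as literal text
    datN <- setNames(read.table(text=b1, sep="/", quote="", comment.char="",
                                stringsAsFactors=FALSE),
                     c("Class", "Title", "File"))
    head(datN,2)
    #     Class  Title                 File
    #1 Accounts Folder Academic History.pdf
    #2 Accounts Folder         Contract.pdf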
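
If parsing the paths through `read.table` feels indirect, a possible alternative (a sketch, not from the original answer) is to split on "/" with `strsplit` and bind the pieces into rows. The sample vector below is made up for illustration, including a hypothetical file name containing "#", and the sketch assumes every transformed path has exactly two "/".

```r
# Illustrative input: paths already transformed to contain exactly two "/"
b1_demo <- c("Accounts/Folder/Contract.pdf",
             "Accounts/Credit-Analyst/untitled.jpg",
             "Misc/Folder/notes #1.txt")   # hypothetical name with a "#"

parts <- strsplit(b1_demo, "/", fixed = TRUE)

# Guard the assumption that each path splits into exactly three pieces
stopifnot(all(lengths(parts) == 3L))

# Stack the pieces row by row and name the columns
datN_demo <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(datN_demo) <- c("Class", "Title", "File")
print(datN_demo)
```

Because the strings are never parsed as a table, characters such as "#" or quotes in file names need no special handling here.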

Now you can save the file with write.table.
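
A minimal sketch of that last step, assuming a tab-separated file is acceptable (file names may contain spaces or commas, so a tab is a safer separator here). The small data frame and the temporary output path are placeholders for illustration only.

```r
# Placeholder data standing in for the three-column data frame
datN_small <- data.frame(Class = c("Accounts", "Accounts"),
                         Title = c("Folder", "Credit-Analyst"),
                         File  = c("Academic History.pdf", "untitled.jpg"),
                         stringsAsFactors = FALSE)

# Write to a temporary file; replace with a real path as needed
out_file <- tempfile(fileext = ".txt")
write.table(datN_small, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to check that the columns and headers survived
check <- read.delim(out_file, stringsAsFactors = FALSE, quote = "")
print(check)
```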

Data

    b <- c("Accounts/Academic History.pdf", "Accounts/Contract.pdf", "Accounts/Credit/Analyst/Banking/TFileOutput.txt", 
    "Accounts/Credit/Analyst/untitled.jpg", "Accounts/Credit/background.jpg", 
    "Accounts/Credit/background.xcf", "Accounts/Debit/index.html", 
    "Human Resources/RStudio-0.98.1073.dmg", "Information Technology/Iti.pdf", 
    "Logistics/1610085_10152585224658626_398303669_n.jpg","Sales/947309_10152376144413626_1056138683_n.jpg")