Split a string into six new variables in R

Time: 2018-05-23 09:38:32

Tags: r reshape2

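I am trying to split a string of numbers into six different variables. I tried to do this in the original df, but it gave me a lot of problems, so I decided to extract the column I need to split into a temporary data frame (intending to join that data frame back to the original one once the variables are correct):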


statusTemp <- select(recruitDF, Status)
tail(statusTemp, n = 15)
    Status
486 109 ; 0 ; 0 ; 22 ; 0 ; 7
487 63 ; 0 ; 0 ; 2 ; 0 ; 3
488 93 ; 0 ; 0 ; 4 ; 0 ; 2
489 42 ; 0 ; 0 ; 3 ; 0 ; 2
490 13 ; 0 ; 0 ; 5 ; 0 ; 1
491 50 ; 0 ; 0 ; 1 ; 0 ; 3
492 10 ; 0 ; 0 ; 2 ; 0 ; 1
493 56 ; 0 ; 0 ; 3 ; 0 ; 2
494 40 ; 0 ; 0 ; 3 ; 0 ; 0
495 35 ; 0 ; 0 ; 10 ; 0 ; 0
496 134 ; 0 ; 0 ; 5 ; 0 ; 1
497 12 ; 0 ; 0 ; 2 ; 0 ; 1
498 30 ; 0 ; 0 ; 0 ; 0 ; 2
499 49 ; 0 ; 0 ; 6 ; 0 ; 4
500 11 ; 0 ; 0 ; 0 ; 0 ; 0

Then I tried to use colsplit from reshape2 to split the temporary data frame into six new and proper variables, but I am getting it wrong and I cannot figure out why.

Can anyone help me figure out what I am overlooking or doing wrong?

2 answers:

Answer 0: (score: 2)

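library(dplyr)
library(tidyr)    # separate() comes from tidyr, not dplyr
library(stringr)

# split Status on ";" into six columns, then trim the pieces
statusTemp %>%
  separate(Status, c("1", "2", "3", "4", "5", "6"), sep = ";") %>%
  mutate_all(str_trim)   # removes both leading and trailing whitespace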

Answer 1: (score: 1)

Your code works fine. As @Cath mentioned, you need to pass the column statusTemp$Status instead of the whole data frame statusTemp. Here is an example.

library(reshape2)
colsplit(df$Status, ";", names = c("Application",
                                   "Screening",
                                   "Test",
                                   "Interview",
                                   "References",
                                   "Hired"))
# output
#   Application Screening Test Interview References Hired
#1          109         0    0        22          0     7
#2           63         0    0         2          0     3
#3           93         0    0         4          0     2
#4           42         0    0         3          0     2
#...

# data (assigned to df so that the colsplit() call above can be run)
df <- structure(list(Status = structure(c(2L, 14L, 15L, 10L, 5L, 12L, 
1L, 13L, 9L, 8L, 6L, 4L, 7L, 11L, 3L), .Label = c("                        10 ; 0 ; 0 ; 2 ; 0 ; 1 ", 
"                        109 ; 0 ; 0 ; 22 ; 0 ; 7 ", "                        11 ; 0 ; 0 ; 0 ; 0 ; 0", 
"                        12 ; 0 ; 0 ; 2 ; 0 ; 1 ", "                        13 ; 0 ; 0 ; 5 ; 0 ; 1 ", 
"                        134 ; 0 ; 0 ; 5 ; 0 ; 1 ", "                        30 ; 0 ; 0 ; 0 ; 0 ; 2 ", 
"                        35 ; 0 ; 0 ; 10 ; 0 ; 0 ", "                        40 ; 0 ; 0 ; 3 ; 0 ; 0 ", 
"                        42 ; 0 ; 0 ; 3 ; 0 ; 2 ", "                        49 ; 0 ; 0 ; 6 ; 0 ; 4 ", 
"                        50 ; 0 ; 0 ; 1 ; 0 ; 3 ", "                        56 ; 0 ; 0 ; 3 ; 0 ; 2 ", 
"                        63 ; 0 ; 0 ; 2 ; 0 ; 3 ", "                        93 ; 0 ; 0 ; 4 ; 0 ; 2 "
), class = "factor")), .Names = "Status", class = "data.frame", row.names = c(NA, 
-15L))
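
For anyone who would rather avoid an extra package, the same split can be sketched in base R. This is only a minimal illustration, not what either answer above uses: the small recruitDF built here is a hypothetical stand-in (the id column and the three rows are invented for the example; only the shape of Status follows the question), and the stage names are borrowed from the colsplit() call above.

```r
# Hypothetical stand-in for recruitDF; only the Status format mirrors the question.
recruitDF <- data.frame(
  id = 1:3,
  Status = c(" 109 ; 0 ; 0 ; 22 ; 0 ; 7 ",
             " 63 ; 0 ; 0 ; 2 ; 0 ; 3 ",
             " 11 ; 0 ; 0 ; 0 ; 0 ; 0"),
  stringsAsFactors = TRUE
)

stage_names <- c("Application", "Screening", "Test",
                 "Interview", "References", "Hired")

# Status is a factor here, and strsplit() needs character input.
parts <- strsplit(as.character(recruitDF$Status), ";", fixed = TRUE)

# Trim whitespace, convert each piece to integer, and stack the rows into a matrix.
status_mat <- do.call(rbind, lapply(parts, function(p) as.integer(trimws(p))))
colnames(status_mat) <- stage_names

# Attach the new columns to the original data frame, so no separate join is needed.
recruitDF <- cbind(recruitDF, status_mat)
str(recruitDF)
```

Working on the original data frame directly sidesteps the temporary data frame and the later join. Note that this sketch assumes every row has exactly six pieces; rows with a different count would need to be handled before the rbind step.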