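I have more than 1000 rows of strings that I extract from a column of an Excel worksheet. Here is what the data look like (3 rows):
Chicken(31%);Duck(16%);Wild duck(14%);Turkey(10%);Pigeon(4%);Goose(4%);Wild bird(4%);Tree sparrow(2%)
Tree sparrow(2%)
Chicken(1%)
I need to put the data into a table (for this example: 8 columns x 3 rows). Can anyone help?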
x <- c("Chicken(31%);Duck(16%);Wild duck(14%);Turkey(10%);Pigeon(4%);Goose(4%);Wild bird(4%);Tree sparrow(2%)",
"Tree sparrow(2%)", "Chicken(1%)")
Answer 0 (score: 2)
There is most likely a more concise way, but you could try something like this:
library(stringi)
library(data.table)
# Drop empty lines if any
txt <- Filter(function(x) !stri_isempty(stri_trim(x)), x)
# Extract matches
# Use [0-9]+ rather than [1-9]+, so that values containing a zero (e.g. 10%) match
matches <- stri_match_all_regex(txt, "([\\w\\s]+)\\(([0-9]+)%\\);?")
matches[[1]]
##      [,1]               [,2]           [,3]
## [1,] "Chicken(31%);"    "Chicken"      "31"
## [2,] "Duck(16%);"       "Duck"         "16"
## [3,] "Wild duck(14%);"  "Wild duck"    "14"
## [4,] "Turkey(10%);"     "Turkey"       "10"
## [5,] "Pigeon(4%);"      "Pigeon"       "4"
## [6,] "Goose(4%);"       "Goose"        "4"
## [7,] "Wild bird(4%);"   "Wild bird"    "4"
## [8,] "Tree sparrow(2%)" "Tree sparrow" "2"
# Rearrange: one named list per row, with names as column headers
rows <- lapply(
  matches,
  function(x) setNames(as.list(as.numeric(x[, 3])), x[, 2]))
# fill=TRUE pads the columns missing from a row with NA
rbindlist(rows, fill=TRUE)
##    Chicken Duck Wild duck Turkey Pigeon Goose Wild bird Tree sparrow
## 1:      31   16        14     10      4     4         4            2
## 2:      NA   NA        NA     NA     NA    NA        NA            2
## 3:       1   NA        NA     NA     NA    NA        NA           NA
Regex explanation
([\\w\\s]+)  # At least one word character or whitespace *, 1st group
\\(          # Left parenthesis
([0-9]+)     # At least one digit, 2nd group. You can replace + with {1,3}
%            # Percent sign
\\)          # Right parenthesis
;?           # Optional semicolon
* Could probably be \\w[\\w\\s]+
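For comparison, here is a minimal base-R sketch of the same idea, in case installing stringi and data.table is not an option. It is not part of the answer above: the helper `parse_row` and the variables `parsed`, `species`, and `wide` are hypothetical names introduced for illustration, and the code assumes every item has the form `Name(NN%)` with items separated by semicolons.

```r
x <- c("Chicken(31%);Duck(16%);Wild duck(14%);Turkey(10%);Pigeon(4%);Goose(4%);Wild bird(4%);Tree sparrow(2%)",
       "Tree sparrow(2%)", "Chicken(1%)")

# Hypothetical helper: turn one input string into a named numeric vector
parse_row <- function(s) {
  items <- trimws(strsplit(s, ";", fixed = TRUE)[[1]])
  items <- items[nzchar(items)]                 # drop empty pieces
  nm  <- trimws(sub("\\(.*$", "", items))       # text before the "("
  val <- as.numeric(sub("^.*\\(\\s*([0-9]+)\\s*%\\)$", "\\1", items))  # digits before "%)"
  setNames(val, nm)
}

parsed  <- lapply(x, parse_row)
species <- unique(unlist(lapply(parsed, names)))  # columns in order of first appearance

# Indexing a named vector by a name it lacks yields NA, which fills the gaps
wide <- do.call(rbind, lapply(parsed, function(p) unname(p[species])))
colnames(wide) <- species
wide <- data.frame(wide, check.names = FALSE)     # keep names such as "Wild duck"
wide
```

Splitting on the semicolon first keeps each pattern small, at the cost of one pass per row; for about 1000 rows that should be fine.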
Answer 1 (score: 1)
Here is a possible solution:
$query = $mysqli->prepare("CREATE TABLE $tbname (ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY)") or trigger_error($mysqli->error."[$query]");