Say I have the following string:
params <- "var1 /* first, variable */, var2, var3 /* third, variable */"
I want to split it using , as the separator and then extract the "quoted" substrings, so that I end up with 2 vectors as follows:
params_clean <- c("var1","var2","var3")
params_def <- c("first, variable","","third, variable") # note the empty string as a second element.
I use the term "quoted" in a broad sense: any string enclosed between /* and */ protects the substring from being split.
I found a workaround based on read.table and the fact that it allows quoted elements:
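library(magrittr)
params %>%
  gsub("/\\*", "_temp_sep_ '", .) %>%   # turn "/*" into a marker plus an opening quote
  gsub("\\*/", "'", .) %>%              # turn "*/" into a closing quote
  read.table(text = ., stringsAsFactors = FALSE, sep = ",") %>%
  unlist %>%
  unname %>%
  strsplit("_temp_sep_") %>%
  lapply(trimws) %>%
  lapply(`length<-`, 2) %>%             # pad each piece to length 2 with NA
  do.call(rbind, .) %>%
  inset(is.na(.), value = "")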
But it's quite ugly and hackish. Is there a simpler way? I think there must be a regex that can be passed to strsplit for this case.
Answer 0 (score: 2)
You can use
library(stringr)
cmnt_rx <- "(\\w+)\\s*(/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/)?"
res <- str_match_all(params, cmnt_rx)
params_clean <- res[[1]][,2]
params_clean
## => [1] "var1" "var2" "var3"
params_def <- gsub("^/[*]\\s*|\\s*[*]/$", "", res[[1]][,3])
params_def[is.na(params_def)] <- ""
params_def
## => [1] "first, variable" "" "third, variable"
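Details of the main regex (which is essentially (\w+)\s*(COMMENTS_REGEX)?):
(\w+) - capturing group 1: one or more word characters
\s* - zero or more whitespace characters
( - start of capturing group 2
/\* - matches the comment start, /*
[^*]*\*+ - matches zero or more characters other than *, followed by one or more literal *
(?:[^/*][^*]*\*+)* - zero or more sequences of:
    [^/*][^*]*\*+ - a character other than / or * (matched by [^/*]), followed by zero or more non-asterisk characters ([^*]*), followed by one or more asterisks (\*+)
/ - the closing /
)? - end of capturing group 2, repeated 1 or 0 times (that is, the group is optional). See the regex demo.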
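To see why the comment part of the pattern is written this way rather than as a plain lazy match, here is a small sketch that applies only the comment sub-pattern to a comment containing extra asterisks. The test strings and the names comment_rx, found and unterminated are made up for illustration; they are not from the answer.

```r
# Comment sub-pattern taken from cmnt_rx above (without the optional group).
comment_rx <- "/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/"

# Hypothetical input: a comment with stray asterisks inside it.
x <- "x /* a * b **/ y"
found <- regmatches(x, regexpr(comment_rx, x, perl = TRUE))
found

# Hypothetical input: a comment that is never closed yields no match at all.
y <- "x /* never closed"
unterminated <- regmatches(y, regexpr(comment_rx, y, perl = TRUE))
length(unterminated)
```

The first call returns the whole comment, stray asterisks included; the second returns an empty result because the pattern requires a closing */.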
The gsub pattern "^/[*]\\s*|\\s*[*]/$" removes the /* and */ delimiters together with any adjacent whitespace.
The params_def[is.na(params_def)] <- "" part replaces NA with empty strings.
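If stringr is not available, the same regex can probably be used with base R alone. The following is a minimal sketch under that assumption, not part of the original answer; the name hits and the two follow-up sub() patterns are my own.

```r
params <- "var1 /* first, variable */, var2, var3 /* third, variable */"
cmnt_rx <- "(\\w+)\\s*(/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/)?"

# Each hit is a variable name, optionally followed by its comment.
hits <- regmatches(params, gregexpr(cmnt_rx, params, perl = TRUE))[[1]]

# Drop everything from the comment start onwards to keep the bare name.
params_clean <- sub("\\s*/\\*.*$", "", hits, perl = TRUE)

# Keep only the comment body; hits without a comment get "".
params_def <- ifelse(
  grepl("/\\*", hits),
  trimws(sub("^.*?/\\*(.*)\\*/$", "\\1", hits, perl = TRUE)),
  ""
)

params_clean
params_def
```

This trades the capture-group matrix from str_match_all for two extra sub() calls, so it is a little more verbose but has no package dependency.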
Answer 1 (score: 1)
Here you go
<!DOCTYPE html>
<html lang="en">
<head>
<title>Bootstrap Example</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
<link href="https://fonts.googleapis.com/css?family=Black+Han+Sans" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Montserrat:400,700" rel="stylesheet">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
<link href="https://fonts.googleapis.com/css?family=Lora" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Gugi" rel="stylesheet">
<script defer src="https://use.fontawesome.com/releases/v5.0.9/js/all.js" integrity="sha384-8iPTk2s/jMVj81dnzb/iFR2sdA7u06vHJyyLlAd4snFpCl/SnyUjRrbdJsw1pGIl" crossorigin="anonymous"></script>
<link href="https://fonts.googleapis.com/css?family=Saira+Extra+Condensed:400,900" rel="stylesheet">
<style>
body {
position: relative;
}
ul.nav-pills {
top: 100px;
position: fixed;
}
div.col-sm-9 div {
height: 250px;
font-size: 28px;
}
.bg-1 {
background-color: black;
}
.bg-1 ul li {
color: #ecf0f1;
font-family: 'Gugi', cursive;
font-size: 15px;
}
.bg-2 {
width: 86%;
background-color: #d1d8e0;
}
.col-sm-3 {
width: 14% !important;
background-color: white !important;
}
@media screen and (max-width: 810px) {
#about,
#education,
#certifications,
#skills,
#projects,
#experience,
#interest {
margin-left: 150px;
}
}
#myScrollspy {
position: fixed;
align-self: center;
left: 0;
top: 0;
}
#hello {
font-family: 'Black Han Sans', sans-serif;
font-size: 65px;
margin-left: 150px;
margin-top: 200px;
}
#name {
font-family: 'Montserrat', sans-serif;
font-size: 55px;
font-weight: 600;
margin-left: 150px;
margin-top: 30px;
}
#self {
font-family: 'Montserrat', sans-serif;
font-size: 30px;
font-weight: 500;
margin-left: 150px;
margin-top: 30px;
}
#engineer {
font-family: 'Montserrat', sans-serif;
font-size: 30px;
font-weight: 500;
margin-left: 150px;
margin-top: 30px;
}
#intro {
font-family: 'Lora', serif;
font-size: 20px;
color: #d1d8e0;
margin-left: 150px;
margin-top: 15px;
}
hr {
width: 400px;
border-top: 1px solid #4b6584;
border-bottom: 1px solid #4b6584;
}
#education {
margin-top: 0px;
}
#email {
font-size: 10px;
}
#headings {
font-family: 'Saira Extra Condensed', sans-serif;
color: #343a40;
font-weight: 700;
font-size: 50px;
}
#social {
margin-top: -90px;
margin-left: 250px;
}
.image {
margin-left: 20px;
padding: 1px;
}
.subheadings {
font-family: 'Saira Extra Condensed', sans-serif;
color: #20bf6b;
font-weight: 500;
font-size: 40px;
}
#certifications {
margin-top: 350px;
}
.nav-pills>li.active>a,
.nav-pills>li.active>a:focus,
.nav-pills>li.active>a:hover {
color: #fff;
/* background-color: #337ab7; */
background-color: unset !important;
}
</style>
</head>
<body data-spy="scroll" data-target="#myScrollspy" data-offset="20">
<div class="container-fluid">
<div class="row bg-1">
<!-- Left Side Navigation Bar -->
<div class="col-sm-3 text-center" id="backg">
<nav id="myScrollspy">
<ul class="nav nav-pills nav-stacked ">
<img class="img-rounded img-responsive center-block image" src="naqqash.png" height="150" width="150 ">
<li class="active"><a href="#about">ABOUT</a></li>
<li><a href="#education">EDUCATION</a></li>
<li><a href="#certifications">CERTIFICATIONS</a></li>
<li><a href="#skills">SKILLS</a></li>
<li><a href="#projects">PROJECTS</a></li>
<li><a href="#experience">EXPERIENCE</a></li>
<li><a href="#interest">INTEREST</a></li>
</ul>
</nav>
</div>
<!-- Right Side -->
<div class="col-sm-9 bg-2">
<!-- About -->
<div id="about">
<h1 id="hello">hello</h1>
<h3 id="name">I'm Muhammad Naqqash,</h3>
<h3 id="self">a self taught developer.</h3>
<h3 id="engineer">I'm a Computer Engineer. </h3>
<p id="intro">I'm a positive and friendly person. Also, I love to set goals and achieve them.<br> My important qualities: self-motivated, ability overcome difficulties and the <br> ability to learn.</p>
<div id="social">
<a href=""><i class="fab fa-facebook fa-lg"></i></a><span style="display:inline-block; width: 5px;"></span>
<a href=""><i class="fab fa-linkedin fa-lg"></i></a> <span style="display:inline-block; width: 0px;"></span>
<a href=""><i class="fab fa-twitter-square fa-lg"></i></a><span style="display:inline-block; width: 5px;"></span>
<a href=""><i class="fab fa-github-square fa-lg"></i></a><span style="display:inline-block; width: 5px;"></span>
</div>
</div>
<br>
<br>
<br>
<br>
<br>
<br>
<hr>
<br>
<br>
<br>
<!-- education -->
<div id="education">
<br>
<br>
<br>
<br>
<br>
<h1 id="headings">EDUCATION</h1>
<h3 class="subheadings"><i class="fas fa-graduation-cap fa-sm"></i><span style="display:inline-block; width: 20px;"></span>BS Computer Engineering</h3>
<h3 class="subheadings"><i class="fas fa-university"></i><span style="display:inline-block; width: 20px;"></span>NUST College of E&ME</h3>
</div>
<div id="certifications">
<h1 id="headings">Section 3</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
<div id="skills">
<h1 id="headings">Section 4</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
<div id="projects">
<h1 id="headings">Section 5</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
<div id="experience">
<h1 id="headings">Section 6</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
<div id="interest">
<h1 id="headings">Section 7</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
<div>
<h1>Section 7</h1>
<p>Try to scroll this section and look at the navigation list while scrolling!</p>
</div>
</div>
</div>
</div>
</body>
</html>
Answer 2 (score: 1)
You can wrap this in a function and use the (not very well documented) (*SKIP)(*FAIL) mechanism in plain R:
getparams <- function(params) {
  # split on commas, but skip over anything between /* and */
  tmp <- unlist(strsplit(params, "/\\*.*?\\*/(*SKIP)(*FAIL)|,", perl = TRUE))
  params_clean <- character(length(tmp))
  params_def <- character(length(tmp))
  for (i in seq_along(tmp)) {
    # get params_def if available
    match <- regmatches(tmp[i], regexec("/\\*(.*?)\\*/", tmp[i]))
    params_def[i] <- ifelse(identical(match[[1]], character(0)), "", trimws(match[[1]][2]))
    # params_clean: remove the comment, then trim whitespace
    params_clean[i] <- trimws(gsub("/\\*.*?\\*/", "", tmp[i], perl = TRUE))
  }
  return(list(params_clean = params_clean, params_def = params_def))
}
params <- "var1 /* first, variable */, var2, var3 /* third, variable */"
getparams(params)
This splits the initial string using (*SKIP)(*FAIL) (see a demo on regex101.com) and then analyzes the parts:
$params_clean
[1] "var1" "var2" "var3"
$params_def
[1] "first, variable" "" "third, variable"
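To isolate what the (*SKIP)(*FAIL) split contributes, the sketch below compares it with a naive split on every comma. The names naive and guarded are illustrative and not part of the answer.

```r
params <- "var1 /* first, variable */, var2, var3 /* third, variable */"

# Naive split: commas inside the comments also act as separators.
naive <- strsplit(params, ",")[[1]]

# Guarded split: a comment is matched first and then discarded via
# (*SKIP)(*FAIL), so only commas outside comments are used as separators.
guarded <- strsplit(params, "/\\*.*?\\*/(*SKIP)(*FAIL)|,", perl = TRUE)[[1]]

length(naive)
length(guarded)
trimws(guarded)
```

The naive split cuts each comment in two, while the guarded split keeps every variable together with its comment. Note that perl = TRUE is required, since these backtracking control verbs are a PCRE feature.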
Or, shorter, with sapply:
getparams <- function(params) {
  tmp <- unlist(strsplit(params, "/\\*.*?\\*/(*SKIP)(*FAIL)|,", perl = TRUE))
  # one column per parameter: row 1 = name, row 2 = definition ("" if none)
  sapply(tmp, function(x) {
    match <- regmatches(x, regexec("/\\*(.*?)\\*/", x))
    def <- ifelse(identical(match[[1]], character(0)), "", trimws(match[[1]][2]))
    clean <- trimws(gsub("/\\*.*?\\*/", "", x, perl = TRUE))
    c(clean, def)
  }, USE.NAMES = FALSE)
}
This yields a matrix:
     [,1]              [,2]   [,3]
[1,] "var1"            "var2" "var3"
[2,] "first, variable" ""     "third, variable"
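With the latter, you can get the variable names with, e.g., result[1,] (assuming the returned matrix was stored as result <- getparams(params)).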
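If indexing by row number feels fragile, the matrix can be labeled after the fact. This is a small sketch that rebuilds a matrix of the same shape by hand (so it runs on its own); the row names and the defs lookup vector are my own additions, not part of the answer.

```r
# Stand-in for the 2 x 3 matrix returned by the sapply version of getparams().
result <- rbind(c("var1", "var2", "var3"),
                c("first, variable", "", "third, variable"))

# Hypothetical labels so rows can be addressed by name instead of position.
rownames(result) <- c("params_clean", "params_def")
result["params_clean", ]

# Named lookup vector: definition by variable name.
defs <- setNames(result["params_def", ], result["params_clean", ])
defs[["var3"]]
```

Row names make the calling code self-documenting, and the named vector allows looking up a definition directly from the variable name.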