My data looks like this:
*first* *last*
M a rk Twain
Hun ter Stockt on Thompson
The data then continues for n rows. I would like it to look like this:
*first* *last*
Mark Twain
Hunter Stockton Thompson
I know I can use gsub to remove all spaces like this:
gsub(" ", "", x, fixed = TRUE)
I can also identify the pattern with a regular expression.
But how do I combine the two and tell gsub: remove all spaces except in the cases that match the regular expression?
Answer 0 (score: 1)
The simplest approach:
txt <- c("M a rk", "Twain", "Hun ter", "Stockt on Thompson")
gsub("\\s([a-z])", "\\1", txt)
## [1] "Mark" "Twain" "Hunter" "Stockton Thompson"
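The pattern works because `\\s([a-z])` matches a whitespace character together with the lowercase letter that follows it, and the replacement `\\1` puts back only the letter. A minimal sketch of an equivalent form that uses a lookahead, so nothing has to be captured; it assumes `perl = TRUE` and, like the original, assumes that a space before a capital letter always marks a real word boundary:

```r
txt <- c("M a rk", "Twain", "Hun ter", "Stockt on Thompson")

# Capture-group form: match the whitespace plus the lowercase letter,
# then put the letter back with \\1.
with_capture <- gsub("\\s([a-z])", "\\1", txt)

# Lookahead form: match only the whitespace, provided a lowercase
# letter comes right after it.
with_lookahead <- gsub("\\s(?=[a-z])", "", txt, perl = TRUE)

# Both forms give the same result on this input.
identical(with_capture, with_lookahead)
```

Either form leaves the space in "Stockton Thompson" alone, because the character after it is uppercase.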
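If you want to apply this to several variables in a data.frame, you can do so with lapply and the data.frame's list-style replacement function. (Note: you really should not use asterisks in data.frame column names.)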
df <- data.frame("*first*" = c("M a rk", "Hun ter"),
"*last*" = c("Twain", "Stockt on Thompson"),
check.names = FALSE, stringsAsFactors = FALSE)
# names of the text columns you want to clean up
varsToModify <- c("*first*", "*last*")
df[varsToModify] <- lapply(df[varsToModify],
function(x) gsub("\\s([a-z])", "\\1", x))
df
## *first* *last*
## 1 Mark Twain
## 2 Hunter Stockton Thompson
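If the columns to clean are not known in advance, one option is to pick out every character column instead of listing names by hand. A sketch, assuming all character columns should be cleaned the same way; `clean_spaces` and `is_text` are hypothetical names used only for illustration:

```r
df <- data.frame("*first*" = c("M a rk", "Hun ter"),
                 "*last*" = c("Twain", "Stockt on Thompson"),
                 check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: drop whitespace that precedes a lowercase letter.
clean_spaces <- function(x) gsub("\\s([a-z])", "\\1", x)

# Logical vector marking which columns hold character data.
is_text <- vapply(df, is.character, logical(1))

# Replace only those columns; other columns are left untouched.
df[is_text] <- lapply(df[is_text], clean_spaces)
df
```

Using `vapply` with `logical(1)` rather than `sapply` keeps the column selector a plain logical vector even for a data.frame with zero columns.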
Answer 1 (score: 0)
df <- data.frame(`*first*` = c("M a rk", "Hun ter"), `*last*` = c("Twain", "Stockt on Thompson"), check.names = FALSE, stringsAsFactors = FALSE)
df
## *first* *last*
## 1 M a rk Twain
## 2 Hun ter Stockt on Thompson
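I would use a Perl negative lookahead assertion, which removes a space unless it is followed by an uppercase letter: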
for (ci in seq_along(df)) df[[ci]] <- gsub(" (?![A-Z])", "", df[[ci]], perl = TRUE)
df
## *first* *last*
## 1 Mark Twain
## 2 Hunter Stockton Thompson
See "Regular Expressions as used in R" (the ?regex help page). The discussion of Perl assertions is near the bottom of the page.
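The two patterns in this thread are not interchangeable on every input. The sketch below runs both on a few made-up strings (the vector `x` is hypothetical, not from the question) to show where they differ:

```r
# Hypothetical inputs: a trailing space, a digit, and a name from the question.
x <- c("Hun ter ", "Apt 4 B", "Stockt on Thompson")

# Positive form: remove whitespace only when a lowercase letter follows.
keep_unless_lower <- gsub("\\s([a-z])", "\\1", x)

# Negative lookahead form: remove a space unless an uppercase letter follows.
drop_unless_upper <- gsub(" (?![A-Z])", "", x, perl = TRUE)

keep_unless_lower
drop_unless_upper
```

On this input the first call keeps the trailing space in "Hunter " and leaves "Apt 4 B" unchanged, while the second returns "Hunter" and "Apt4 B", because a space at the end of the string or before a digit is not followed by an uppercase letter. Both give "Stockton Thompson" for the last element.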