Split a string every n words in R

Time: 2017-09-17 03:08:06

Tags: r regex string

I need to split a string in R roughly every five words. Given the input:

x <- c("one, two, three, four, five, six, seven, eight, nine, ten")

I want the output:

[1] "one, two, three, four, five"
[2] "six, seven, eight, nine, ten"

Is there a regular expression or function that does this?

4 answers:

Answer 0 (score: 3)

Here is one possible approach. We can split the string into words, work out the number of groups, and then use tapply with toString to generate the output.

Note that this solution still works when the number of words is not a multiple of 5. Here is an example.

x <- c("one, two, three, four, five, six, seven, eight, nine, ten")

# Split the string
y <- strsplit(x, split = ", ")[[1]]

# Number of complete groups of 5
group_num <- length(y) %/% 5
# Number of words left over
group_last <- length(y) %% 5

# Generate the output; seq_len() keeps this correct when there are fewer than 5 words
z <- tapply(y, c(rep(seq_len(group_num), each = 5),
                 rep(group_num + 1, times = group_last)),
            toString)
z
                             1                              2
 "one, two, three, four, five" "six, seven, eight, nine, ten"

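The remainder handling noted above can also be sketched with a single grouping index. This is a minimal sketch rather than the answer's own code: ceiling(seq_along(y) / 5) is swapped in for the two rep() calls, and the 12-word input is made up to show a final group with fewer than 5 words.

```r
# Hypothetical 12-word input, so the last group holds only 2 words
x <- "one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve"

# Split the string into words
y <- strsplit(x, split = ", ")[[1]]

# Group index: words 1-5 get 1, words 6-10 get 2, words 11-12 get 3
group <- ceiling(seq_along(y) / 5)

# Collapse each group back into one string; as.vector() drops the names
z <- as.vector(tapply(y, group, toString))

# z now holds three strings:
#   "one, two, three, four, five"
#   "six, seven, eight, nine, ten"
#   "eleven, twelve"
```

The grouping vector always has the same length as y, so no separate remainder count is needed.
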
Answer 1 (score: 1)

Here is a function that works for an x of length one.

x <- c("one, two, three, four, five, six, seven, eight, nine, ten")

#' @param x Vector
#' @param n Number of elements in each vector
#' @param pattern Pattern to split on
#' @param ... Passed to strsplit
#' @param collapse String to collapse the result into
split_every <- function(x, n, pattern, collapse = pattern, ...) {
  x_split <- strsplit(x, pattern, perl = TRUE, ...)[[1]]
  out <- character(ceiling(length(x_split) / n))
  for (i in seq_along(out)) {
    entry <- x_split[seq((i - 1) * n + 1, i * n, by = 1)]
    out[i] <- paste0(entry[!is.na(entry)], collapse = collapse)
  }
  out
}

library(testthat)
expect_equal(split_every(x, 5, pattern = ", "),
             c("one, two, three, four, five",
               "six, seven, eight, nine, ten"))

Answer 2 (score: 0)

Are you after something like this:

lapply(1:ceiling(length(x)/5), function(i) x[(5*(i-1)+1):min(length(x),(5*i))])

i.e. you don't know the length of your vector x in advance, but you want to be able to handle any eventuality?
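The one-liner above appears to assume that x is already a vector of words rather than the single string from the question. A minimal end-to-end sketch under that assumption follows; words, n, chunks and result are illustrative names, not part of the answer.

```r
x <- "one, two, three, four, five, six, seven, eight, nine, ten"

# Turn the single string into a vector of words first
words <- strsplit(x, ", ", fixed = TRUE)[[1]]
n <- 5

# Same indexing idea as the one-liner: chunk i covers words n*(i-1)+1 .. n*i,
# capped at the last word so a short final chunk is handled
chunks <- lapply(seq_len(ceiling(length(words) / n)), function(i) {
  words[(n * (i - 1) + 1):min(length(words), n * i)]
})

# Paste each chunk back into one comma-separated string
result <- vapply(chunks, paste, character(1), collapse = ", ")
```

Here result is a character vector with one element per chunk, which matches the shape of output asked for in the question.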

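Answer 3 (score: 0)

Another approach: find every fifth instance of the pattern, replace it with an arbitrary character, and then split the string on that character.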

x <- c("one, two, three, four, five, six, seven, eight, nine, ten")

library(stringr)
pattern <- ","
numobs <- 5                                                  # split after every fifth instance of pattern
index <- as.data.frame(str_locate_all(x, pattern))           # find all positions of pattern
index <- index[seq(numobs, nrow(index), by=numobs),]$start   # filter to every fifth instance of pattern
stopifnot(grepl("!", x)==FALSE)    # throws error in case arbitrary symbol to split on is already present 
str_sub(x, index, index) <- "!"    # arbitrary symbol to split on
ans <- unlist(strsplit(x, "! "))   # split on new symbol 
# [1] "one, two, three, four, five"  
# [2] "six, seven, eight, nine, ten"
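
Since the question also asks about a regular expression, here is a base-R sketch that needs no placeholder character. It is not taken from any of the answers: split_every_regex is a hypothetical helper name, and it assumes the words contain no commas and are separated by exactly ", ".

```r
# Hypothetical helper: split a comma-separated string into groups of up to n words
split_every_regex <- function(x, n = 5) {
  # Up to n runs of non-comma characters, each followed by ", " or the end of the string
  pattern <- sprintf("(?:[^,]+(?:, |$)){1,%d}", n)
  pieces <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  # Every piece except the last still ends in ", "; strip that separator
  sub(", $", "", pieces)
}

x_demo <- "one, two, three, four, five, six, seven, eight, nine, ten"
res <- split_every_regex(x_demo)
# res holds "one, two, three, four, five" and "six, seven, eight, nine, ten"
```

Because the quantifier is {1,n}, a final group with fewer than n words is matched as well.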