Question

Here is what I am trying to do; hopefully this sheds more light on what I want. Suppose you have data like the following:

Region      Open    Store
120..141       +    France
145..2115      +    Germany
3322..5643     +    Wales
5646..7451     -    Scotland
7454..8641     -    Mexico
8655..9860     -    India
9980..11413    +    Zambia
11478..1261    -    Nicaragua
12978..1318    +    Sweeden

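What I want is to find the difference between the second element of one row (141) and the first element of the next row (145). If that difference is within a certain value and the two rows have the same sign (+ or -), the stores should be grouped together.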

The desired output (if the numeric difference is less than 40 and the stores have the same sign, i.e. both + or both -) should be:

 4 (France and Germany)
 3,14 (Scotland and Mexico and India)

Answer 1

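The vector of "second elements" (a "character" vector) is sapply( strsplit(dat$Region, "\\.\\.") , "[", 2) and the vector of "first elements" is sapply( strsplit(dat$Region, "\\.\\.") , "[", 1). Presumably the first such difference (in the case where data.frame made the first column the default factor class) is: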

 as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[2]) - 
       as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 2)[1])
#[1] 4

[Note: "\\.\\." is needed as the 'split' argument because the 'split' argument is interpreted as a regular expression.] The vector of all differences:

as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[-1]) - 
as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 2)[-length(dat$Region)])
#[1]     4  1207     3     3    14   120    65 11717

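The rest of your question (and the edits) fails to convey your intent (possibly owing to the lack of a shared natural language). (I have no idea what the phrases "all of their stores" and "store signs" might mean.) Please make an effort to communicate idiomatically.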
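For anyone who wants to reproduce the numbers, the following is a minimal, self-contained sketch rather than a definitive implementation. It assumes the data frame is called dat and rebuilds it from the table in the question (Region values copied exactly as posted); the names parts, starts, ends and gaps are illustrative only. It also uses fixed = TRUE, which is an alternative to escaping the dots in the regular expression.

```r
# Sample data copied from the question (assumed to be stored as character columns)
dat <- data.frame(
  Region = c("120..141", "145..2115", "3322..5643", "5646..7451", "7454..8641",
             "8655..9860", "9980..11413", "11478..1261", "12978..1318"),
  Open   = c("+", "+", "+", "-", "-", "-", "+", "-", "+"),
  Store  = c("France", "Germany", "Wales", "Scotland", "Mexico",
             "India", "Zambia", "Nicaragua", "Sweeden"),
  stringsAsFactors = FALSE
)

# Split each Region once; fixed = TRUE treats ".." literally, so no escaping is needed
parts  <- strsplit(as.character(dat$Region), "..", fixed = TRUE)
starts <- as.numeric(sapply(parts, "[", 1))  # first element of every row
ends   <- as.numeric(sapply(parts, "[", 2))  # second element of every row

# Gap between each row's second element and the next row's first element
gaps <- starts[-1] - ends[-length(ends)]
gaps[1]  # 145 - 141
```

Splitting once and reusing parts avoids repeating the strsplit call for every vector.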

Answer 2

This works for the data you provided. I'm a noobie at R, so apologies if it's messy.

# Split the string in the first column to make it easier to compare
library(stringr)
bounds <- str_split_fixed(as.character(data$Region), fixed(".."), 2)

data$regionl <- as.numeric(bounds[, 1])
data$regionr <- as.numeric(bounds[, 2])

# We set a threshold for comparison (the question asks for differences below 40)
threshold <- 40

# Loop through the data, comparing each row's right number with the next row's left number.
# If the gap is below the threshold and the signs match, the stores join the current group.
# When the run breaks, we print the group and start a new one.

currentGroup <- NULL

for (i in seq_len(max(nrow(data) - 1, 0)))
{

  # Boolean variables checking against thresholds and signs

  difference <- abs(data$regionr[i] - data$regionl[i + 1]) < threshold
  signs <- (data$Open[i] == data$Open[i + 1])

  # Group things together
  if (difference && signs)
  {
    currentGroup <- c(currentGroup, as.character(data$Store[i]), as.character(data$Store[i + 1]))
  }
  else
  {
    # If it's in a group alone, do not print
    if (!is.null(currentGroup))
    {
      # Print groups
      print(unique(currentGroup))
    }
    # Reset the group holder
    currentGroup <- NULL
  }
}

# Print a group that runs through the last row, which the loop never flushes
if (!is.null(currentGroup)) print(unique(currentGroup))
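The same grouping can also be sketched without an explicit loop. This is one possible alternative, not the method used in this answer: it marks which consecutive rows are linked and uses cumsum to label the groups. It assumes the threshold of 40 from the question and rebuilds the data from the question's table; dat, linked, group_id, groups and group_gaps are hypothetical names chosen for illustration.

```r
# Sample data copied from the question (Region values kept exactly as posted)
dat <- data.frame(
  Region = c("120..141", "145..2115", "3322..5643", "5646..7451", "7454..8641",
             "8655..9860", "9980..11413", "11478..1261", "12978..1318"),
  Open   = c("+", "+", "+", "-", "-", "-", "+", "-", "+"),
  Store  = c("France", "Germany", "Wales", "Scotland", "Mexico",
             "India", "Zambia", "Nicaragua", "Sweeden"),
  stringsAsFactors = FALSE
)

parts  <- strsplit(dat$Region, "..", fixed = TRUE)
starts <- as.numeric(sapply(parts, "[", 1))
ends   <- as.numeric(sapply(parts, "[", 2))
n      <- nrow(dat)

# Gap between each row's end and the next row's start
gaps <- starts[-1] - ends[-n]

# TRUE where row i + 1 continues the group of row i (small gap and same sign)
linked <- abs(gaps) < 40 & dat$Open[-1] == dat$Open[-n]

# Every unlinked row starts a new group, so a running count of breaks labels the groups
group_id <- cumsum(c(TRUE, !linked))

# Keep only the groups that contain more than one store
groups <- split(dat$Store, group_id)
groups <- unname(groups[lengths(groups) > 1])

# Gaps that belong to each multi-store group
group_gaps <- unname(split(gaps[linked], group_id[-1][linked]))

# Print each group in the "gaps (stores)" layout used in the question
for (k in seq_along(groups)) {
  cat(paste(group_gaps[[k]], collapse = ","), " (",
      paste(groups[[k]], collapse = " and "), ")\n", sep = "")
}
```

The vectorized form also reports a group that ends on the last row, since no separate flush step is needed.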

Extracting the second element from each item in a list, and other questions
