Here is what I am trying to do (this should shed more light on it). Suppose you have data like the following:
Region Open Store
120..141 + France
145..2115 + Germany
3322..5643 + Wales
5646..7451 - Scotland
7454..8641 - Mexico
8655..9860 - India
9980..11413 + Zambia
11478..1261 - Nicaragua
12978..1318 + Sweeden
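What I want to do is find the difference between the second element of one row (141) and the first element of the next row (145). If that difference is within a certain value and the two rows have the same sign (+ or -), group their stores together. Example output: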
4 (France and Germany)
3,14 (Scotland and Mexico and India)
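Answer 0 (score: 0)
The "vector of second elements" (a "character" vector) is
sapply( strsplit(dat$Region, "\\.\\.") , "[", 2)
and the "first elements" are
sapply( strsplit(dat$Region, "\\.\\.") , "[", 1)
Presumably the first such difference is the following (as.character is needed in the case where data.frame made the first column the default factor class, because strsplit only accepts character vectors):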
as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[2]) -
as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[1])
#[1] 25
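[Note: the split argument has to be "\\.\\." because strsplit interprets it as a regular expression, where an unescaped dot matches any character.] The vector of all the differences: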
as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[-1]) -
as.numeric(sapply( strsplit(as.character(dat$Region), "\\.\\.") , "[", 1)[-length(dat$Region)])
[1] 25 3177 2324 1808 1201 1325 1498 1500
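The differences above compare the first element of each row with the first element of the next row. The question instead describes the gap between the second element of one row and the first element of the following row. A minimal sketch of that calculation follows, assuming the data frame is called dat and holds the question's values exactly as posted. The names parts, starts, ends and gaps are illustrative only.

```r
# Sample data copied from the question (values kept exactly as posted)
dat <- data.frame(
  Region = c("120..141", "145..2115", "3322..5643", "5646..7451", "7454..8641",
             "8655..9860", "9980..11413", "11478..1261", "12978..1318"),
  Open   = c("+", "+", "+", "-", "-", "-", "+", "-", "+"),
  Store  = c("France", "Germany", "Wales", "Scotland", "Mexico",
             "India", "Zambia", "Nicaragua", "Sweeden"),
  stringsAsFactors = FALSE
)

# fixed = TRUE makes strsplit treat ".." literally, so no regex escaping is needed
parts  <- strsplit(as.character(dat$Region), "..", fixed = TRUE)
starts <- as.numeric(sapply(parts, "[", 1))
ends   <- as.numeric(sapply(parts, "[", 2))

# gaps[i] is the start of row i + 1 minus the end of row i
gaps <- starts[-1] - ends[-length(ends)]
gaps
# values: 4 1207 3 3 14 120 65 11717
```

The first value, 4, matches the France and Germany example in the question, and the 3 and 14 further along match the Scotland, Mexico and India example.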
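The rest of your question (and the edit) fails to convey your intent (possibly because we lack a shared natural language). (I don't know what the phrases "all their stores" and "store sign" might mean.) Please make an effort to communicate in idiomatic language.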
Answer 1 (score: 0)
This works for the data you provided. I'm a noobie at R, so sorry if it is messy.
# Split the Region string into its left and right endpoints to make it easier to compare
library(stringr)
# fixed("..") matches the two dots literally, so exactly two pieces come back
bounds <- str_split_fixed(as.character(data$Region), fixed(".."), 2)
data$regionl <- as.numeric(bounds[, 1])
data$regionr <- as.numeric(bounds[, 2])
# We set a threshold for comparison
threshold <- 100
# Loop through the data and compare each row's right endpoint with the next row's left endpoint
# We check whether the gap is within the threshold and the rows have the same sign
# We keep adding to the group until there is a discrepancy, then we print it
currentGroup <- NULL
for (i in seq_len(nrow(data) - 1)) {
  # Boolean variables checking against signs and thresholds
  difference <- abs(as.numeric(data$regionr[i]) - as.numeric(data$regionl[i + 1])) <= threshold
  signs <- (data$Open[i] == data$Open[i + 1])
  # Group things together
  if (difference && signs) {
    currentGroup <- c(currentGroup, as.character(data$Store[i]), as.character(data$Store[i + 1]))
  } else {
    # A store in a group alone is not printed
    if (!is.null(currentGroup)) {
      # Print groups
      print(unique(currentGroup))
    }
    # Reset the group holder
    currentGroup <- NULL
  }
}
# Print a group that is still open when the loop ends (it would otherwise be lost)
if (!is.null(currentGroup)) {
  print(unique(currentGroup))
}
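The same grouping can also be written without an explicit loop. The following is only a sketch of one possible alternative, not a replacement for the code above. It assumes the question's data as posted and the same threshold of 100, and it uses base R only. The names left, right, linked, group_id and groups are made up for illustration.

```r
# Sample data copied from the question (values kept exactly as posted)
data <- data.frame(
  Region = c("120..141", "145..2115", "3322..5643", "5646..7451", "7454..8641",
             "8655..9860", "9980..11413", "11478..1261", "12978..1318"),
  Open   = c("+", "+", "+", "-", "-", "-", "+", "-", "+"),
  Store  = c("France", "Germany", "Wales", "Scotland", "Mexico",
             "India", "Zambia", "Nicaragua", "Sweeden"),
  stringsAsFactors = FALSE
)
threshold <- 100  # assumed value, same as in the loop version

parts <- strsplit(as.character(data$Region), "..", fixed = TRUE)
left  <- as.numeric(sapply(parts, "[", 1))
right <- as.numeric(sapply(parts, "[", 2))
n     <- nrow(data)

# linked[i] is TRUE when row i + 1 belongs in the same group as row i
linked <- abs(left[-1] - right[-n]) <= threshold & data$Open[-1] == data$Open[-n]

# A new group starts at row 1 and wherever the link to the previous row is broken
group_id <- cumsum(c(TRUE, !linked))

# Keep only the groups that contain more than one store
groups <- split(as.character(data$Store), group_id)
groups <- unname(groups[lengths(groups) > 1])
groups
```

With the sample data this leaves two groups, France with Germany and Scotland with Mexico and India. A group that runs up to the last row is kept as well, since nothing depends on a later mismatch to close it.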