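I need to write a search function in R that finds the start and end positions of certain elements in a large dataset.
My sample dataset looks like this: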
C1 C2 Index
aa J 1
aa J 2
aa J 3
ab O 4
aa O 5
aa J 6
aa J 7
aa J 8
aa J 9
aa K 10
ac K 11
aa J 12
aa J 13
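I would like to write a search function along the lines of search("aa","J"), where "aa" is a value in column C1 and "J" is a value in column C2. The function should first subset the dataset to the rows where C1 is "aa", and then report positions relative to that subset.
The result should be a matrix holding the start and end index of every run found, like this: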
[,1] [,2]
[1,] 1 3
[2,] 5 8
[3,] 10 11
Thank you very much.
I tried to modify the code that was provided, but it does not work correctly. Could you take a look?
get_inds <- function(test, C1, C2) {
test <- subset(test, test$C1 == C1)
inds <- rle(test$C1 == C1 & test$C2 == C2)
end = cumsum(inds$lengths)
start = c(1, head(end, -1) + 1)
data.frame(start, end)[inds$values, ]
}
get_inds(test, 'aa', 'J')
Answer 0 (score: 1)
The link provided by @markus solves your problem; you just need to adapt it to your needs.
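You need to change the condition in subset() so that the rows that do not satisfy it are actually removed. In your version the argument is named C1, the same as the column, so inside subset() the expression test$C1 == C1 compares the column with itself and is always TRUE. Renaming the arguments fixes this:
get_inds <- function(test, a, b) {
  test <- subset(test, C1 == a)
  inds <- rle(test$C1 == a & test$C2 == b)
  end = cumsum(inds$lengths)
  start = c(1, head(end, -1) + 1)
  df = data.frame(start, end)[inds$values, ]
  row.names(df) <- NULL
  df
}
get_inds(test, 'aa', 'J')
#  start end
#1     1   3
#2     5   8
#3    10  11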
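To check this end to end, here is a minimal reproducible sketch. It rebuilds the sample data from the question (the data frame name `test` and the `stringsAsFactors = FALSE` setting are assumptions on my part) and prints the intermediate `rle` result, so you can see how the runs turn into start and end positions.

```r
# Rebuild the sample data from the question.
# Assumption: the data frame is called `test` and holds character columns.
test <- data.frame(
  C1 = c("aa", "aa", "aa", "ab", "aa", "aa", "aa", "aa", "aa", "aa", "ac", "aa", "aa"),
  C2 = c("J", "J", "J", "O", "O", "J", "J", "J", "J", "K", "K", "J", "J"),
  Index = 1:13,
  stringsAsFactors = FALSE
)

get_inds <- function(test, a, b) {
  # Keep only the rows whose C1 equals `a`; all positions below are
  # relative to this subset, not to the original row numbers.
  test <- subset(test, C1 == a)
  # Run-length encode the match vector into consecutive TRUE/FALSE runs.
  inds <- rle(test$C2 == b)
  # The end of each run is the cumulative sum of the run lengths;
  # each start is one past the previous run's end.
  end <- cumsum(inds$lengths)
  start <- c(1, head(end, -1) + 1)
  # Keep only the runs where the condition was TRUE.
  df <- data.frame(start, end)[inds$values, ]
  row.names(df) <- NULL
  df
}

# Intermediate step: the runs of matches within the "aa" subset.
runs <- rle(subset(test, C1 == "aa")$C2 == "J")
print(runs$lengths)  # 3 1 4 1 2
print(runs$values)   # TRUE FALSE TRUE FALSE TRUE

res <- get_inds(test, "aa", "J")
print(res)
```

Since the subset already guarantees C1 equals `a`, the sketch only tests C2 inside `rle`; the result is the same as with the combined condition. If you need a matrix rather than a data frame, as.matrix(res) should do.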