I am trying to subset a data set based on a word contained in a string column. Using the stringr package, I tried subsetting with str_detect as follows:
subdat <- dat %>% filter(str_detect(de, index$Full[1]))
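This produces the correct data table: the subset in which the first "Full" entry of index is detected. However, when the same code is placed in a for loop, with "i" replacing the literal index so that I can iterate over all names, the subset no longer detects the correct string.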
for (i in 1){
subdat <- dat %>% filter(str_detect(de, index$Full[i]))
}
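On top of that, every iteration detects the same incorrect subset. When I test the "i" variable outside the for loop, the same problem shows up in str_detect. With i equal to 1, R returns TRUE for the following code: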
index$Full[i] == index$Full[1]
Yet the following two lines return different data sets:
subdati <- dat %>% filter(str_detect(de, index$Full[i]))
subdat1 <- dat %>% filter(str_detect(de, index$Full[1]))
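Since my index is about 70 entries long, I would like to get the for loop working so that I can eventually write each subset to a CSV (that part is not a coding problem). I hope this is enough information. This is my first time asking, so I am happy to clarify anything if needed.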
Added dput output for a reproducible example:
> dput(droplevels(dat))
structure(list(evt = structure(c(3L, 4L, 1L, 5L, 2L), .Label = c("112",
"150", "22", "41", "320"), class = "factor"), cl = structure(c(2L,
1L, 5L, 4L, 3L), .Label = c("08:49", "10:32", "11:21", "10:31",
"02:28"), class = "factor"), de = c("[BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte)",
"[BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte)", "[OKC] Westbrook Foul: Shooting (1 PF) (1 FTA) (K Scott)",
"[SAS] Paul Foul: Personal (2 PF) (B Forte)", "[DAL] Harris Foul: Shooting (2 PF) (1 FTA) (B Forte)"
), i = c(1, 1, 36, 383, 461)), .Names = c("evt", "cl", "de",
"i"), row.names = c(1L, 4L, 1599L, 16358L, 18269L), class = "data.frame")
> dput(droplevels(index))
structure(list(First = structure(1:2, .Label = c("B", "K"), class = "factor"),
Last = structure(1:2, .Label = c("Forte", "Scott"), class = "factor"),
Full = c("B Forte", "K Scott")), .Names = c("First", "Last",
"Full"), row.names = c(1L, 36L), class = "data.frame")
With this data, I get the following output:
> subdat <- dat %>% filter(str_detect(de, index$Full[1]))
> subdat
evt cl de i
1 22 10:32 [BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte) 1
2 41 08:49 [BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte) 1
3 320 10:31 [SAS] Paul Foul: Personal (2 PF) (B Forte) 383
4 150 11:21 [DAL] Harris Foul: Shooting (2 PF) (1 FTA) (B Forte) 461
> for (i in 1){
+ subdatloop <- dat %>% filter(str_detect(de, index$Full[i]))
+ }
> subdatloop
evt cl de i
1 22 10:32 [BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte) 1
2 41 08:49 [BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte) 1
> index$Full[i] == index$Full[1]
[1] TRUE
> subdati <- dat %>% filter(str_detect(de, index$Full[i]))
> subdati
evt cl de i
1 22 10:32 [BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte) 1
2 41 08:49 [BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte) 1
> subdat1
evt cl de i
1 22 10:32 [BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte) 1
2 41 08:49 [BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte) 1
3 320 10:31 [SAS] Paul Foul: Personal (2 PF) (B Forte) 383
4 150 11:21 [DAL] Harris Foul: Shooting (2 PF) (1 FTA) (B Forte) 461
Edit: added a reproducible example and the expected output.
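
For anyone who wants to run this directly, here is a trimmed, self-contained sketch of the reproduction, keeping only the de and i columns of dat and the Full column of index. It also makes one detail visible: dat has a column named i, the same name as my loop variable, so inside filter() the expression index$Full[i] may be picking up that column instead of the loop counter. That is only a guess on my part. The names k and pattern are introduced here for illustration and are not in my original code.

```r
library(dplyr)
library(stringr)

# Trimmed copies of the data from the dput() output above
dat <- data.frame(
  de = c("[BOS] Tatum Foul: Defense 3 Second (1 PF) (1 FTA) (B Forte)",
         "[BOS] Hayward Foul: Shooting (1 PF) (2 FTA) (B Forte)",
         "[OKC] Westbrook Foul: Shooting (1 PF) (1 FTA) (K Scott)",
         "[SAS] Paul Foul: Personal (2 PF) (B Forte)",
         "[DAL] Harris Foul: Shooting (2 PF) (1 FTA) (B Forte)"),
  i = c(1, 1, 36, 383, 461),
  stringsAsFactors = FALSE
)
index <- data.frame(Full = c("B Forte", "K Scott"), stringsAsFactors = FALSE)

# Original form: `i` is both a global variable and a column of dat
i <- 1
masked <- dat %>% filter(str_detect(de, index$Full[i]))

# Assumed workaround: look the pattern up outside filter(), using a
# counter name (k) that is not a column of dat
k <- 1
pattern <- index$Full[k]
unmasked <- dat %>% filter(str_detect(de, fixed(pattern)))

print(nrow(masked))
print(nrow(unmasked))
```

If the guess is right, the same lookup inside the real loop (for (k in seq_len(nrow(index)))) should give one correct subset per name.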