Question

I have a small dataset, and I am trying to subset a data.frame using the grepl function.

I have:

year_list <- list("2013", "2014", "2015", "2016", "2017")

test.2013 <- subset(searches[, 1:2], grepl(year_list[1], searches$date))
test.2014 <- subset(searches[, 1:2], grepl(year_list[2], searches$date))
test.2015 <- subset(searches[, 1:2], grepl(year_list[3], searches$date))
test.2016 <- subset(searches[, 1:2], grepl(year_list[4], searches$date))
test.2017 <- subset(searches[, 1:2], grepl(year_list[5], searches$date))

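I am trying to write a loop that subsets columns 1 and 2 (the date and hits columns) into new data.frames, one per year.

In other words, I want to take each year in year_list, apply grepl to the date column of the searches data.frame, and return the matching rows as a new data.frame, but with a loop or something less repetitive than what I have now.

The data frame: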

         date hits keyword   geo gprop category
1: 2013-01-06   23   Price world   web        0
2: 2013-01-13   23   Price world   web        0
3: 2013-01-20   40   Price world   web        0
4: 2013-01-27   25   Price world   web        0
5: 2013-02-03   21   Price world   web        0
6: 2013-02-10   19   Price world   web        0

Answer 1

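If I understand correctly, you want to split the data.frame into multiple data.frames based on the entries in the date column. If so, you might consider the following solution, which uses split to produce a list containing the desired data.frame subsets. I used your data (as a data.frame rather than a data.table) and added two rows representing an additional year.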

df <- read.table(text = "
date hits         keyword   geo gprop category
2013-01-06   23  Price world   web        0
2013-01-13   23  Price world   web        0
2013-01-20   40  Price world   web        0
2013-01-27   25  Price world   web        0
2013-02-03   21  Price world   web        0
2013-02-10   19  Price world   web        0
2014-02-03   21  Price world   web        0
2014-02-10   19  Price world   web        0
", header = TRUE, stringsAsFactors = FALSE)

# Extract only the first four digits (the year) from the date column
# and use them as the splitting groups
df_split <- split(df[, c("date", "hits")], gsub("(\\d{4})(.*$)", "\\1", df$date))

df_split
# $`2013`
#         date hits
# 1 2013-01-06   23
# 2 2013-01-13   23
# 3 2013-01-20   40
# 4 2013-01-27   25
# 5 2013-02-03   21
# 6 2013-02-10   19
# 
# $`2014`
#         date hits
# 7 2014-02-03   21
# 8 2014-02-10   19
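
If you would rather keep the grepl approach from the question, the five repeated lines can probably be collapsed into a single lapply call. The following is only a minimal sketch under some assumptions: `searches` here is a small hypothetical stand-in built as a plain data.frame (the original appears to be a data.table), and `years` is a character vector rather than the list used in the question.

```r
# Hypothetical stand-in for the asker's `searches` data (date and hits only)
searches <- data.frame(
  date = c("2013-01-06", "2013-01-13", "2014-02-03", "2014-02-10"),
  hits = c(23, 23, 21, 19),
  stringsAsFactors = FALSE
)

# A character vector is enough here; a list is not required
years <- c("2013", "2014", "2015")

# Build one data.frame per year. The "^" anchors the pattern to the start
# of the date, so "2013" cannot match somewhere later in the string.
by_year <- lapply(years, function(y) {
  searches[grepl(paste0("^", y), searches$date), c("date", "hits")]
})
names(by_year) <- years

# Individual years are then available by name
by_year[["2013"]]
```

Unlike split, this version keeps an (empty) entry for every year listed in `years`, even when no rows match, which may or may not be what you want.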
