Question

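I am trying to extract part of my data in R in order to work on another problem. I am not sure how to extract a subset of the data that is read in from the folders.

My data is currently read in with the following code: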

library(data.table, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

################
## PARAMETERS ##
################

# Set path of major source folder for raw transaction data
in_directory <- "C:/Users/NAME/Documents/Raw Data/"

# List names of sub-folders (currently grouped by first two characters of CUST_ID)
in_subfolders <- list("AA-CA", "CB-HZ", "IA-IL", "IM-KZ", "LA-MI", "MJ-MS",
                  "MT-NV", "NW-OH", "OI-PZ", "QA-TN", "TO-UZ",
                  "VA-WA", "WB-ZZ")

# Set location for output
out_directory <- "C:/Users/NAME/Documents/YTD Master/"
out_filename <- "OUTPUT.csv"

# Set beginning and end of date range to be collected - year-month-day format
date_range <- interval(as.Date("2018-01-01"), as.Date("2018-05-31"))

# Enable or disable filtering of raw files to only grab items bought within
# certain months to save space.
# If FALSE, all files will be scanned for unique items, which will take
# longer and produce a larger file.
date_filter <- TRUE

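I wish I could provide a dataset so that I could give a reproducible example.

I work with a large amount of data, so I pull the information from a database and store it in folders by date. Everything is then set up so that whatever dates are needed can be pulled from the data.

I have included more of the code than is strictly necessary, but this is the first part, before I do anything with the data.

How do I get a subset of the data that is read in from the folders?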
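
To make the question concrete, here is a minimal sketch of the kind of thing being asked for, not the actual pipeline. It builds a tiny stand-in folder tree in a temporary directory, reads every CSV in the listed sub-folders, and keeps only the rows whose date falls inside `date_range`. The CSV file format, the `TRANS_DATE` column, the file name `part1.csv`, and the helper `read_subset` are all assumptions made up for illustration; only `CUST_ID`, `in_directory`, `in_subfolders`, `date_range` and `date_filter` come from the code above.

```r
library(data.table, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

# Toy stand-in for the raw data folder (hypothetical layout and file names)
in_directory <- file.path(tempdir(), "Raw Data")
in_subfolders <- list("AA-CA", "CB-HZ")
for (sub in in_subfolders) {
  dir.create(file.path(in_directory, sub), recursive = TRUE, showWarnings = FALSE)
}

# Two small files; TRANS_DATE is an assumed column name
fwrite(data.table(CUST_ID = c("AB1", "BC2"),
                  TRANS_DATE = as.Date(c("2018-02-15", "2017-12-31"))),
       file.path(in_directory, "AA-CA", "part1.csv"))
fwrite(data.table(CUST_ID = c("DE3", "HZ4"),
                  TRANS_DATE = as.Date(c("2018-05-31", "2018-06-01"))),
       file.path(in_directory, "CB-HZ", "part1.csv"))

# Same parameters as in the question
date_range <- interval(as.Date("2018-01-01"), as.Date("2018-05-31"))
date_filter <- TRUE

# Hypothetical helper: read one file and, if date_filter is TRUE,
# keep only the rows whose TRANS_DATE lies inside date_range
read_subset <- function(path) {
  dt <- fread(path)
  dt[, TRANS_DATE := as.Date(TRANS_DATE)]
  if (date_filter) dt <- dt[TRANS_DATE %within% date_range]
  dt
}

# Collect the CSV files from every listed sub-folder, then stack the subsets
files <- unlist(lapply(in_subfolders, function(sub) {
  list.files(file.path(in_directory, sub), pattern = "\\.csv$", full.names = TRUE)
}))
subset_dt <- rbindlist(lapply(files, read_subset))
print(subset_dt)
```

Filtering each file as it is read, rather than after everything has been combined, keeps the combined table small. Other conditions (for example on `CUST_ID`) could presumably be added inside the same helper.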

0 answers: