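I have read many Stack Overflow questions and answers, but I still can't find a solution to my problem: I want to read 5 columns from each of approximately 80 .csv files into R without typing all the code by hand, and then combine those files into one data frame. That data frame then needs to be combined with another data frame that has the same number of columns.
So I tried to do this with a for loop, but I can't get any further with my calculations. This is what I did, and I can see that the files are being read: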
filenames <- list.files(path = getwd(), pattern = "*.csv")
for (i in filenames) {
  filepath <- file.path(getwd(), i)
  assign(i, fread(filepath, select = c(1, 2, 3, 25, 29), sep = ","))
}
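I don't know how to access the files I just read in, i.e. which variable name to type (e.g. df2). How can I combine them into one data frame, to which I can then assign the column names of the other data frame I want to merge it with?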
Answer 0 (score: 0)
Well, you can choose the CSV file interactively.
filename <- file.choose()
data <- read.csv(filename, skip=1)
name <- basename(filename)
Or, hard-code the path.
# Read CSV into R
MyData <- read.csv(file="c:/your_path_here/Data.csv", header=TRUE, sep=",")
For joining and merging, here are some good rules of thumb.
Inner join: merge(df1, df2) will work for these examples because R automatically joins the frames by common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId") to make sure that you were matching on only the fields you desired. You can also use the by.x and by.y parameters if the matching variables have different names in the different data frames.
Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)
Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)
Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)
Cross join: merge(x = df1, y = df2, by = NULL)
For details, see the following link.
How to join (merge) data frames (inner, outer, left, right)?
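The join types listed above can be tried out on two small data frames. This is only a sketch: the contents of df1 and df2 are made up for illustration, and only the CustomerId column name is taken from the examples above.

```r
# Hypothetical example data; only the CustomerId name comes from the text above.
df1 <- data.frame(CustomerId = c(1, 2, 3),
                  Product = c("Toaster", "Radio", "Kettle"))
df2 <- data.frame(CustomerId = c(2, 3, 4),
                  State = c("Alabama", "Ohio", "Texas"))

inner <- merge(df1, df2, by = "CustomerId")                # only ids found in both
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)    # all ids from either frame
left  <- merge(df1, df2, by = "CustomerId", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)  # every row of df2
cross <- merge(df1, df2, by = NULL)                        # every pairing of rows
```

Rows without a match in the other frame get NA in the columns that come from that frame.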
Answer 1 (score: 0)
You can use map_df from purrr:
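library(data.table)  # provides fread
library(purrr)       # provides map_df
# "\\.csv$" is a regular expression matching file names that end in .csv
filenames <- list.files(path = getwd(), pattern = "\\.csv$", full.names = TRUE)
# Read only the columns of interest from a single file
reader <- function(x) {
  fread(x, select = c(1, 2, 3, 25, 29), sep = ",")
}
# Read every file and row-bind the results into one data frame
reading_files <- map_df(filenames, reader)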
map_df will read in all of your files and combine them using the very efficient bind_rows.
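If purrr is not available, the same read-into-a-list-then-bind idea can be sketched in base R. It also covers the last step of the question: giving the combined data frame the column names of the other data frame before appending it. Everything specific here is an assumption made so the sketch runs on its own: the temporary directory, the demo files, the column positions in wanted, and the data frame called other. With real data, fread with select could replace read.csv.

```r
# Create a few small CSV files in a temporary directory so the sketch is runnable.
tmp_dir <- file.path(tempdir(), "csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (k in 1:3) {
  write.csv(data.frame(a = k, b = k * 2, c = k * 3, d = k * 4),
            file.path(tmp_dir, paste0("file", k, ".csv")),
            row.names = FALSE)
}

# "\\.csv$" is a regular expression matching names that end in .csv.
filenames <- list.files(path = tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file into one list element, keeping only the wanted columns.
wanted <- c(1, 2, 4)  # hypothetical column positions
tables <- lapply(filenames, function(f) read.csv(f)[, wanted])

# Stack the list into a single data frame.
combined <- do.call(rbind, tables)

# Reuse the column names of the other data frame, then append the rows.
other <- data.frame(x = 0, y = 0, z = 0)  # hypothetical second data frame
names(combined) <- names(other)
result <- rbind(other, combined)
```

Keeping the files in a list rather than creating one variable per file with assign means there are no individual variable names to look up: tables[[2]] is the second file, and the whole list can be bound in one call.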