I have four Excel files. I listed them in R with list.files and read them with lapply. My code is:
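library(readxl)  # provides read_excel
# pattern is a regular expression, not a glob, so anchor on the extension
my_files <- list.files(pattern = "\\.xlsx$")
my_list <- lapply(my_files, read_excel)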
The files contain many different columns:
> lapply(my_list ,colnames)
[[1]]
[1] "JobCard Branch" "Customer Name" "Primary Contact No" "Alt No 1"
[5] "Alt No 2" "Reg No"
[[2]]
[1] "CUSTOMER" "Primary Contact No" "Alt No 1" "REG NO#"
[5] "VehModel" "Last Service Outlet"
[[3]]
[1] "Company Name" "JobCard Branch" "Service_Branch"
[4] "HUB" "Customer Code" "Address"
[7] "Address Line2" "Primary Contact No" "Alt No 1"
[10] "Alt No 2" "Alt No 3" "Zip"
[13] "Source" "City" "Vehicle Model"
[16] "Make" "Reg No" "Chasis No"
[[4]]
[1] "Last Call Date" "Reg.No" "Model" "Customer Name" "Contact Number" "Booked Outlet"
>
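Could someone let me know whether rbind, or some other function, can be used to extract only the registration-number column ("Reg No", "REG NO#", "Reg No", "Reg.No") from all of these tibbles?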
Answer 0: (score: 1)
You could try using grep in case-insensitive mode:
lapply(my_list, function(x) {
y <- colnames(x)
y[grep("\\breg\\b", y, ignore.case=TRUE)]
})
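This uses the regular expression \breg\b in case-insensitive mode to find the column names that match what you want. The \b anchors are word boundaries, so "reg" has to appear as a whole word, which covers "Reg No", "REG NO#" and "Reg.No". Note that this returns the matching column names, not the column values.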
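To go from the matched names to the values themselves, the same grep call can be used to index into each data frame. The following is a minimal sketch under stated assumptions, not part of the original answer: the toy data frames and their values are hypothetical stand-ins for the tibbles returned by read_excel, and it assumes at most one registration column per file (only the first match is used).

```r
# Hypothetical toy data standing in for the tibbles read from Excel
my_list <- list(
  data.frame(`Customer Name` = c("A", "B"), `Reg No` = c("KA01", "KA02"),
             check.names = FALSE),
  data.frame(CUSTOMER = "C", `REG NO#` = "KA03", check.names = FALSE),
  data.frame(`Reg.No` = c("KA04", "KA05"), Model = c("X", "Y"),
             check.names = FALSE)
)

reg_values <- lapply(my_list, function(x) {
  # positions of the columns whose name contains "reg" as a whole word
  idx <- grep("\\breg\\b", colnames(x), ignore.case = TRUE)
  # a file with no matching column contributes nothing
  if (length(idx) == 0) return(character(0))
  # take the first match as a plain character vector
  as.character(x[[idx[1]]])
})

# stack the values from every file into one single-column data frame
reg_no <- data.frame(reg_no = unlist(reg_values), stringsAsFactors = FALSE)
```

Converting to character before combining avoids surprises when one file stores the registration number as text and another as a number.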
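Answer 1: (score: 0)
We can create a vector (cols) of the column names we want to extract, then use lapply to loop over the list of data frames and subset the columns whose names match cols.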
cols <- c("Reg No","REG NO#","Reg No","Reg.No")
data.frame(unlist(lapply(my_list, function(x)
x[names(x) %in% cols]), use.names = FALSE))
Reproducible example
df1 <- data.frame(a = 1:5, b = 2:6)
df2 <- data.frame(a1 = 1:4, new_s = 2:5)
df3 <- data.frame(abc = 1:4)
list_df <- list(df1, df2, df3)
cols <- c("a", "a1", "abc")
data.frame(new = unlist(lapply(list_df, function(x)
x[names(x) %in% cols]),use.names = FALSE))
# new
# 1 1
# 2 2
# 3 3
# 4 4
# 5 5
# 6 1
# 7 2
# 8 3
# 9 4
#10 1
#11 2
#12 3
#13 4
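Once the values are stacked, the output no longer shows which file each row came from. If that matters, one possible variation is to build one small data frame per file and rbind them. This is only a sketch: the file names in files are hypothetical, the toy data is made up for illustration, and it assumes every data frame contains at least one of the names in cols.

```r
# Hypothetical toy data standing in for the sheets read from Excel
list_df <- list(
  data.frame(a = 1:2, b = 3:4),
  data.frame(a1 = 5:6, new_s = 7:8),
  data.frame(abc = 9L)
)
files <- c("one.xlsx", "two.xlsx", "three.xlsx")  # hypothetical file names
cols  <- c("a", "a1", "abc")

pieces <- Map(function(df, f) {
  # names of the columns in this data frame that appear in cols
  hit <- names(df)[names(df) %in% cols]
  # keep the first match and record the source file alongside it
  data.frame(file = f, new = df[[hit[1]]], stringsAsFactors = FALSE)
}, list_df, files)

# bind the per-file pieces row-wise and reset the row names
combined <- do.call(rbind, pieces)
rownames(combined) <- NULL
```

With real data, my_files from the question could take the place of files, since it is in the same order as my_list.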