I am reading two large .txt files and filtering them on specific codes. The code is in column 16 of each file.
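# header = TRUE assumes the first row of each file holds the column names
Colleges <- read.table("Colleges.txt", sep = "|", header = TRUE, fill = TRUE)
Majors <- read.table("Majors.txt", sep = "|", header = TRUE, fill = TRUE)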
The data looks like this:
bld_name  dpt_name  majors      admin  code  college  year
MLK       English   Literature  Ms. W  T     A&S      18
Freedom   Math      Stats       Ms. B  R     STEM     18
MLK       Math      CALC        Ms. B  P     STEM     18
After subsetting and appending the two files, I want to create a unique ID from bld_name and dpt_name.
# Keep only the rows whose code (column 16) is "T" or "R"
college_sub <- subset(Colleges, Colleges[[16]] %in% c("T", "R"))
majors_sub <- subset(Majors, Majors[[16]] %in% c("T", "R"))
combine <- rbind(college_sub, majors_sub) # append both files
# Build the ID as "<bld_name>-<dpt_name>", with no spaces around the hyphen
combine$id <- paste(combine$bld_name, combine$dpt_name, sep = "-")
cols_g <- c("id", "majors", "admin", "code", "college", "year")
combine <- combine[, cols_g]
It should look like this:
Unique ID    majors      admin  code  college  year
MLK-English  Literature  Ms. W  T     A&S      18
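
For reference, here is a minimal, self-contained sketch of the result I am after, using made-up rows in place of the real files. The `colleges_demo` and `majors_demo` data frames, the way the sample rows are split between them, and the lookup of `code` by name are all assumptions for illustration; in the real files the code sits in column 16.

```r
# Hypothetical stand-ins for the two files (only 7 columns, so the code
# column is referenced by name here rather than by position 16).
colleges_demo <- data.frame(
  bld_name = c("MLK", "Freedom"),
  dpt_name = c("English", "Math"),
  majors   = c("Literature", "Stats"),
  admin    = c("Ms. W", "Ms. B"),
  code     = c("T", "R"),
  college  = c("A&S", "STEM"),
  year     = c(18, 18),
  stringsAsFactors = FALSE
)
majors_demo <- data.frame(
  bld_name = "MLK", dpt_name = "Math", majors = "CALC", admin = "Ms. B",
  code = "P", college = "STEM", year = 18,
  stringsAsFactors = FALSE
)

# Keep only rows with the wanted codes, then append the two subsets.
keep_codes <- c("T", "R")
combined <- rbind(
  colleges_demo[colleges_demo$code %in% keep_codes, ],
  majors_demo[majors_demo$code %in% keep_codes, ]
)

# sep = "-" joins the two fields without the spaces that paste() adds by default.
combined$id <- paste(combined$bld_name, combined$dpt_name, sep = "-")

# Put the ID first and drop the two columns it was built from.
result <- combined[, c("id", "majors", "admin", "code", "college", "year")]
print(result)
```

The row with code "P" is filtered out before the ID is built, so only the "T" and "R" rows end up in `result`.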