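I have two files containing SQL data, and I want to remove the rows from the second file whose course code and student number match a row in the first file. The files look like this:
File 1: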
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','3941021693');
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','5044463260');
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','5923001715');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','7539643746');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9604636424');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9649249670');
File 2:
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3941021693','1354811709');
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','5044463260','1378352712');
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9604636424','1374587707');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9649249670','1370542279');
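I arranged the files so that the course and student fields match on the first two and the last two lines. On the first line, for example, both files have the same course (BKE974) and student (3941021693) values. If these values do not match, I want to print the whole line from File 2 to a new file.
I have been trying to solve this with a bash script, since I want to learn more about bash, so I would prefer a bash solution. I have tried grep, awk and cut, but my knowledge of bash is very limited :P
Edit: So the result I want is these two lines printed to a new file: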
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
Answer 0 (score: 1)
Try this:
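#!/bin/bash
# For each row of file1.txt, delete the matching rows from file2.txt.
# Note: sed -i edits file2.txt in place (GNU sed syntax), so work on a copy.
while IFS= read -r line
do
    # Extract the value list, e.g. 'BKE974','3941021693'
    x=$(printf '%s\n' "$line" | sed -n "s/.*VALUES (\(.*\));.*/\1/p")
    # Skip lines that have no VALUES clause
    [ -n "$x" ] || continue
    # Anchor on "VALUES (" and the trailing comma so only the first two values can match
    sed -i "/VALUES ($x,/d" file2.txt
done < file1.txt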
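Since the question asks for the non-matching lines to go to a new file, a non-destructive variant of the same idea might look like the sketch below. It collects the course/student pairs from file 1 as fixed strings and lets grep filter file 2 in a single pass. The names keys.txt and new_file.txt, the temporary working directory, and the shortened sample data are assumptions made for illustration.

```shell
#!/bin/bash
# Work in a scratch directory so no existing files are overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened sample data in the format shown in the question.
cat > file1.txt <<'EOF'
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','3941021693');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9604636424');
EOF
cat > file2.txt <<'EOF'
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3941021693','1354811709');
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9604636424','1374587707');
EOF

# Turn each file1 row into the fixed string ('course','student', so that it
# can only match the first two values of a file2 row.
sed -n "s/.*VALUES \(('[^']*','[^']*'\));.*/\1,/p" file1.txt > keys.txt

# -F: treat keys as fixed strings, -f: read them from a file,
# -v: keep only the lines that match none of them.
grep -vFf keys.txt file2.txt > new_file.txt

cat new_file.txt
```

This leaves file2.txt untouched and reads it once, instead of rewriting it once per line of file 1.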
Answer 1 (score: 1)
Here's one way using GNU awk:
awk -F "[()]" 'FNR==NR { a[$(NF-1)]++; next } !(gensub(/(.*),.*/,"\\1","g",$(NF-1)) in a)' File1 File2
Result:
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
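gensub is a GNU awk extension. If gawk is not available, the same logic can probably be written with the standard sub function, as in the sketch below. The array name seen, the variable key, the output file new_file.txt, the temporary working directory, and the shortened sample data are assumptions made for illustration.

```shell
#!/bin/bash
# Work in a scratch directory so no existing files are overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened sample data in the format shown in the question.
cat > File1 <<'EOF'
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','7539643746');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9604636424');
EOF
cat > File2 <<'EOF'
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9604636424','1374587707');
EOF

# Splitting on parentheses makes $(NF-1) the value list of each row.
awk -F "[()]" '
    # First file: remember every course,student pair.
    FNR == NR { seen[$(NF-1)]; next }
    # Second file: drop the last value (registrationDate) to rebuild the pair.
    { key = $(NF-1); sub(/,[^,]*$/, "", key) }
    # Print the row only if its pair was not seen in the first file.
    !(key in seen)
' File1 File2 > new_file.txt

cat new_file.txt
```

Like the original one-liner, this relies on FNR == NR to detect the first file, so it assumes File1 is not empty.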