I have two files:
F1
bar
foo
egg
F2
"egg","apple","green"
"egg","orange","red"
"egg","apple","green"
"bar","spam","orange"
"bar","orange","blue"
"bacon","red","orange"
"foo","apple","green"
"foo","blue","apple"
"spam","apple","yellow"
"spam","green","egg"
I want to sort F2 according to the order in F1, and every line in F2 whose first element is not present in F1 should go to the end. So I would get:
"bar","spam","orange"
"bar","orange","blue"
"foo","apple","green"
"foo","blue","apple"
"egg","apple","green"
"egg","orange","red"
"egg","apple","green"
"bacon","red","orange"
"spam","apple","yellow"
"spam","green","egg"
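I would prefer a python3 solution, but I am also open to a solution in awk.
Answer 0 (score: 1)
Could you please try the following and let me know if it helps. It assumes that you want to match the first field of each line in F2 against the first field of F1 (which, going by the sample shown, has only one field per line).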
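awk -F'"' '
# First file (F2): collect lines grouped by their first quoted field.
FNR==NR{
  a[$2]=(a[$2]?a[$2] ORS:"")$0
  b[$2]
  next
}
# Second file (F1): print the group for each key, in F1 order.
($0 in b){
  print a[$0]
  c[$0]
}
# Finally print the groups whose key never appeared in F1.
END{
  for(i in a){
    if(!(i in c)){ print a[i] }
  }
}' F2 F1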
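Since the question asks for python3 first, the same idea (group the F2 lines by their first field, emit the groups in F1 order, then emit whatever was never matched) could be sketched roughly as below. The function name `reorder` and the inline sample lists are assumptions made for illustration; in practice the two lists would come from reading F1 and F2. Unlike the awk `for(i in a)` loop, whose traversal order is unspecified, the dict here keeps the leftover groups in the order they first appear in F2.

```python
import csv


def reorder(f1_lines, f2_lines):
    """Return the F2 lines grouped by first field: F1 order first, the rest last."""
    groups = {}
    for line in f2_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Parse the quoted CSV line and take its first field as the key.
        key = next(csv.reader([line]))[0]
        groups.setdefault(key, []).append(line)

    result = []
    seen = set()
    # Emit the groups in the order their keys appear in F1.
    for key in (k.strip() for k in f1_lines):
        if key in groups and key not in seen:
            result.extend(groups[key])
            seen.add(key)
    # Emit the groups that F1 never mentioned, in first-seen order.
    for key, lines in groups.items():
        if key not in seen:
            result.extend(lines)
    return result


# Sample data copied from the question (stand-ins for the real files).
f1 = ["bar", "foo", "egg"]
f2 = [
    '"egg","apple","green"',
    '"egg","orange","red"',
    '"egg","apple","green"',
    '"bar","spam","orange"',
    '"bar","orange","blue"',
    '"bacon","red","orange"',
    '"foo","apple","green"',
    '"foo","blue","apple"',
    '"spam","apple","yellow"',
    '"spam","green","egg"',
]

for out_line in reorder(f1, f2):
    print(out_line)
```

To run it on the actual files, something like `reorder(open("F1"), open("F2"))` should work, since file objects iterate line by line.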
Answer 1 (score: 1)
list1 = ['bar', 'foo', 'egg']
list2 = [["egg", "apple", "green"],
         ["egg", "orange", "red"],
         ["egg", "apple", "green"],
         ["bar", "spam", "orange"],
         ["bar", "orange", "blue"],
         ["bacon", "red", "orange"],
         ["foo", "apple", "green"],
         ["foo", "blue", "apple"],
         ["spam", "apple", "yellow"],
         ["spam", "green", "egg"]]
list_to_sort = []
list_not_to_sort = []
for element in list2:
    # Rows whose first element is in list1 get sorted; the rest keep their order.
    if element[0] in list1:
        list_to_sort.append(element)
    else:
        list_not_to_sort.append(element)
# Note: sort() orders the rows lexicographically, not in the order of list1.
list_to_sort.sort()
print(list_to_sort + list_not_to_sort)
Output:
[['bar', 'orange', 'blue'],
['bar', 'spam', 'orange'],
['egg', 'apple', 'green'],
['egg', 'apple', 'green'],
['egg', 'orange', 'red'],
['foo', 'apple', 'green'],
['foo', 'blue', 'apple'],
['bacon', 'red', 'orange'],
['spam', 'apple', 'yellow'],
['spam', 'green', 'egg']]
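Note that `list_to_sort.sort()` orders the matched rows alphabetically (bar, egg, foo) and also reorders the rows inside each group, whereas the question asks for the F1 order (bar, foo, egg). A minimal sketch that keeps the F1 order, assuming the same data, is a stable sort keyed on each row's position in the reference list. The names `sort_by_reference` and `rank` are invented here for illustration.

```python
def sort_by_reference(rows, reference):
    """Stable-sort rows by the position of their first element in reference.

    Rows whose first element is missing from reference all get the same
    (largest) rank, so they end up last in their original relative order.
    """
    rank = {name: i for i, name in enumerate(reference)}
    return sorted(rows, key=lambda row: rank.get(row[0], len(rank)))


# Same sample data as in the answer above.
list1 = ["bar", "foo", "egg"]
list2 = [
    ["egg", "apple", "green"],
    ["egg", "orange", "red"],
    ["egg", "apple", "green"],
    ["bar", "spam", "orange"],
    ["bar", "orange", "blue"],
    ["bacon", "red", "orange"],
    ["foo", "apple", "green"],
    ["foo", "blue", "apple"],
    ["spam", "apple", "yellow"],
    ["spam", "green", "egg"],
]

for row in sort_by_reference(list2, list1):
    print(row)
```

Because `sorted` is stable, rows that share a first element stay in their original relative order, which is what the expected output in the question shows.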