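Sort lines by whether they contain digits, ignoring digits attached to certain letters
I need to sort the lines in a file so that every line containing at least one digit (0-9) is moved to the end of the file. Digits 1-5 do not count when they are immediately preceded by one of the letters "a", "e", "g", "i", "n", "o", "r", "u", "v", or by "u:" (u followed by a colon).
Here is an example file: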
I want to buy some food.
I want 3 chickens.
I have no3 basket for the eggs.
I have no3 basket which can hold 24 eggs.
Move the king to A3.
Can you move the king to a6?
For the example file, these annotations show which lines match:
I want to buy some food. % does not match
I want 3 chickens. % matches
I have no3 basket for the eggs. % does not match, because "3" is preceded by "o"
I have no3 basket which can hold 24 eggs. % matches, because contains "24"
Move the king to A3. % matches, words preceded by "A" are not ignored.
Can you move the king to a6? % matches, 6 is not 1-5
The output puts all matching lines at the bottom:
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
Move the king to A3.
Can you move the king to a6?
I have no3 basket which can hold 24 eggs.
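Preferably (though not required), the solution would also sort the lines with the most matching digits to the very end. For example, "I have 10 chickens and 12 bats." (4 digits) would appear after "I have 99 chickens." (2 digits).
Solutions using BASH, Perl, Python 2.7, Ruby, sed, awk, or grep are all fine.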
Answer 0 (score: 5)
If your grep supports the -P (perl-regexp) option:
pat='(?<=[^0-9]|^)((?<!u:)(?<![aeginoruv])[1-5]|[06-9])'
{ grep -vP "$pat" input.txt; grep -P "$pat" input.txt; } >output.txt
If you have ssed (super sed) installed:
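ssed -nR '
/(?<=[^0-9]|^)((?<!u:)(?<![aeginoruv])[1-5]|[06-9])/{
# matching line: append to the hold space, print nothing yet
H
$!d
b end
}
# non-matching line: print right away (including when it is the last line)
p
:end
${
# at end of input, emit the held matching lines, if there are any
g
s/\n//
/./p
}' input.txt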
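To see how the pattern behaves on the sample lines, here is a minimal Python 3 sketch of the same two-pass idea. It is an illustration, not part of the original answer. Python's re module only accepts fixed-width lookbehinds, so the sketch swaps (?<=[^0-9]|^) for the negative lookbehind (?<![0-9]), which should be equivalent when each line is tested on its own. The names PAT and partition are made up for this example.

```python
import re

# Assumption: (?<![0-9]) stands in for (?<=[^0-9]|^), because Python's re
# rejects lookbehinds whose alternatives have different widths.
PAT = re.compile(r"(?<![0-9])(?:(?<!u:)(?<![aeginoruv])[1-5]|[06-9])")

def partition(lines):
    """Return non-matching lines first, then matching lines, order preserved."""
    kept = [ln for ln in lines if not PAT.search(ln)]   # like grep -vP
    moved = [ln for ln in lines if PAT.search(ln)]      # like grep -P
    return kept + moved

sample = [
    "I want to buy some food.",
    "I want 3 chickens.",
    "I have no3 basket for the eggs.",
    "I have no3 basket which can hold 24 eggs.",
    "Move the king to A3.",
    "Can you move the king to a6?",
]

for line in partition(sample):
    print(line)
```

Like the grep version, this only moves matching lines to the end; it does not order them by how many digits they contain.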
Answer 1 (score: 3)
When this program is run on your dataset:
#!/usr/bin/env perl
use strict;
use warnings;
my @moved = ();
my $pat = qr{
      [67890]               # these big digits anywhere, or else...
    | (?<! [aeginoruv] )    # none of those letters before
      (?<! u: )             # nor a "u:" before
      [12345]               # these little digits
}x;
while (<>) {
    if (/$pat/) {
        push @moved, $_;
    } else {
        print;
    }
}
print @moved;
It produces the output you want:
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
I have no3 basket which can hold 24 eggs.
Move the king to A3.
Can you move the king to a6?
To incorporate the sort, change the final print to:
print for sort {
$a =~ y/0-9// <=> $b =~ y/0-9//
} @moved;
Now the output will be:
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
Move the king to A3.
Can you move the king to a6?
I have no3 basket which can hold 24 eggs.
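Answer 2 (score: 1)
This sounds like a job for perl!
Seriously, sed will have a hard time with the "u:" special case and with moving lines to the end of the file. sed is really line-based. awk could do it, but perl is probably better.
Use \d+ to match the lines that contain digits,
then use [aeginorv]\d+ to filter out your letters,
and u:\d+ to handle your special case of u:stuff (you will have to buffer the matches [e.g. just store the matching lines in an array] so you can output them at the end).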
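The buffering idea described above could be sketched as follows in Python 3. This is only one reading of the answer: it assumes, following the question, that only the digits 1-5 are excused and that each digit is judged individually by the text right before it. The names EXCUSED_PREFIX, counts_as_match, and reorder are hypothetical.

```python
import re

# Assumption: a digit 1-5 is ignored when the text immediately before it
# ends in "u:" or in one of the listed letters.
EXCUSED_PREFIX = re.compile(r"(?:u:|[aeginoruv])$")

def counts_as_match(line):
    """True if the line has at least one digit that is not excused."""
    for m in re.finditer(r"\d", line):
        digit, before = m.group(), line[:m.start()]
        if digit in "12345" and EXCUSED_PREFIX.search(before):
            continue  # excused digit, keep looking
        return True
    return False

def reorder(lines):
    """Yield non-matching lines as they arrive, then the buffered matches."""
    buffered = []
    for line in lines:
        if counts_as_match(line):
            buffered.append(line)  # hold back until the input is exhausted
        else:
            yield line
    yield from buffered

demo = [
    "I want 3 chickens.",
    "I have no3 basket for the eggs.",
    "What is u:3?",
    "Can you move the king to a6?",
]
for line in reorder(demo):
    print(line)
```

Only the matching lines need to be kept in memory; everything else can be written out as soon as it is read.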
Answer 3 (score: 1)
[Edited, because everyone else's code accepts a file argument:]
For a non-regex solution in Python, how about:
# Python 2.7
import sys
def keyfunc(s):
    ignores = ("a", "e", "g", "i", "n", "o", "r", "u", "v", "u:")
    # count the digits that are not 1-5 preceded by one of the ignored prefixes
    return sum(c.isdigit() and not (1 <= int(c) <= 5 and s[:i].endswith(ignores))
               for i,c in enumerate(s))
with open(sys.argv[1]) as infile:
    for line in sorted(infile, key=keyfunc):
        print line,
This produces:
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
Move the king to A3.
Can you move the king to a6?
I have no3 basket which can hold 24 eggs.
I have 99 chickens.
I have 10 chickens and 12 bats.
Answer 4 (score: 1)
use strict;
use v5.10.1;
my @matches;
my @no_matches;
while (my $line = <DATA>) {
    chomp $line;
    if ($line =~ / \d+\W/) {
        #say "MATCH $line";
        push @matches, $line;
    }
    elsif ($line =~ /u:[1-5]+\b/) {
        #say "NOMATCH $line";
        push @no_matches, $line;
    }
    elsif ($line =~ /[^aeginoruv][1-5]+\b/) {
        #say "MATCH $line";
        push @matches, $line;
    }
    elsif ($line =~ /.[6-90]/) {
        #say "MATCH $line";
        push @matches, $line;
    }
    else {
        #say "NOMATCH $line";
        push @no_matches, $line;
    }
}
foreach (@no_matches){
    say $_;
}
foreach (@matches){
    say $_;
}
__DATA__
I want to buy some food.
I want 3 chickens.
I have no3 basket for the eggs.
I have no3 basket which can hold 24 eggs.
What is u:34? <- custom test
Move the king to A3.
Can you move the king to a6?
prompt> perl regex.pl
I want to buy some food.
I have no3 basket for the eggs.
What is u:34? <- custom test
I want 3 chickens.
I have no3 basket which can hold 24 eggs.
Move the king to A3.
Can you move the king to a6?
Answer 5 (score: 1)
(Edit: now includes the optional sort)
matches = []
non_matches = []
File.open("lines.txt").each do |line|
  if line.match(/[67890]|(?<![aeginoruv])(?<!u:)[12345]/)
    matches.push line
  else
    non_matches.push line
  end
end
puts non_matches + matches.sort_by{|m| m.scan(/\d/).length}
This produces:
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
Move the king to A3.
Can you move the king to a6?
I have no3 basket which can hold 24 eggs.
Answer 6 (score: 1)
This might work for you:
sed 'h;s/[aeginoruv][1-5]\|u:[1-5]//g;s/[^0-9]//g;s/^$/0/;G;s/\n/\t/' file |
sort -sn |
sed 's/^[^\t]*\t//'
I want to buy some food.
I have no3 basket for the eggs.
I want 3 chickens.
Move the king to A3.
Can you move the king to a6?
I have no3 basket which can hold 24 eggs.
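The pipeline above reads as a decorate, sort, undecorate pattern: the first sed prefixes each line with a numeric key, sort -sn orders by that key while keeping ties in input order, and the last sed strips the key again. A rough Python 3 sketch of the same idea, with the hypothetical helper names sort_key and reorder, might look like this:

```python
import re

def sort_key(line):
    """Mimic the first sed command: build a numeric key from the line."""
    # Drop excused digits together with their prefix letter or "u:".
    stripped = re.sub(r"[aeginoruv][1-5]|u:[1-5]", "", line)
    # Keep only the remaining digits; no digits left means key 0.
    digits = re.sub(r"[^0-9]", "", stripped)
    return int(digits or "0")

def reorder(lines):
    # sorted() is stable, like sort -s, so equal keys keep their input order.
    return sorted(lines, key=sort_key)

sample = [
    "I want to buy some food.",
    "I want 3 chickens.",
    "I have no3 basket for the eggs.",
    "I have no3 basket which can hold 24 eggs.",
    "Move the king to A3.",
    "Can you move the king to a6?",
]

for line in reorder(sample):
    print(line)
```

Note that the key is the numeric value of the remaining digits run together (24 for the fourth sample line), not the number of digits, so the ordering among matching lines can differ from a count-based sort on other inputs.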
Basically, it works in three steps:
-s