Given the following list of presidents, produce a top-ten word count in the smallest program possible:
INPUT FILE
Washington Washington Adams Jefferson Jefferson Madison Madison Monroe Monroe John Quincy Adams Jackson Jackson Van Buren Harrison DIES Tyler Polk Taylor DIES Fillmore Pierce Buchanan Lincoln Lincoln DIES Johnson Grant Grant Hayes Garfield DIES Arthur Cleveland Harrison Cleveland McKinley McKinley DIES Teddy Roosevelt Teddy Roosevelt Taft Wilson Wilson Harding Coolidge Hoover FDR FDR FDR FDR Dies Truman Truman Eisenhower Eisenhower Kennedy DIES Johnson Johnson Nixon Nixon ABDICATES Ford Carter Reagan Reagan Bush Clinton Clinton Bush Bush Obama
To start things off, in bash, 97 characters:
cat input.txt | tr " " "\n" | tr -d "\t " | sed 's/^$//g' | sort | uniq -c | sort -n | tail -n 10
Output:
2 Nixon
2 Reagan
2 Roosevelt
2 Truman
2 Washington
2 Wilson
3 Bush
3 Johnson
4 FDR
7 DIES
Break ties however you wish! Happy Fourth!
For those who care, more information on the presidents can be found here.
Answer 0 (score: 12)
C#, 153:
Reads the file at p and prints the results to the console:
File.ReadLines(p)
.SelectMany(s=>s.Split(' '))
.GroupBy(w=>w)
.OrderBy(g=>-g.Count())
.Take(10)
.ToList()
.ForEach(g=>Console.WriteLine(g.Count()+"|"+g.Key));
If it only produces the list without printing to the console, it is 93 characters.
6|DIES
4|FDR
3|Johnson
3|Bush
2|Washington
2|Adams
2|Jefferson
2|Madison
2|Monroe
2|Jackson
Answer 1 (score: 11)
A shorter shell version:
xargs -n1 < input.txt | sort | uniq -c | sort -nr | head
If you want a case-insensitive ranking, change uniq -c to uniq -ci.
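As a rough illustration of what a case-insensitive ranking changes, here is a minimal Python sketch. The helper name `count_words` and the sample string are hypothetical. It folds every word to lower case, whereas `uniq -ci` keeps the spelling of the first line in each group, so the analogy is only approximate.

```python
from collections import Counter

def count_words(text, ignore_case=False):
    """Count whitespace-separated words, optionally folding case first."""
    words = text.split()
    if ignore_case:
        # Assumption: lower-casing is an acceptable way to merge spellings.
        words = [w.lower() for w in words]
    return Counter(words)

sample = "FDR FDR Dies Truman Kennedy DIES"
print(count_words(sample))                    # 'Dies' and 'DIES' are counted separately
print(count_words(sample, ignore_case=True))  # both are merged under 'dies'
```

The input list above contains one "Dies" next to several "DIES", so this distinction can change the top count by one.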
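Slightly shorter still, if you are happy for the ranking to be reversed and for readability to suffer from the lack of spaces. This one clocks in at 46 characters: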
xargs -n1<input.txt|sort|uniq -c|sort -n|tail
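(If you are allowed to rename the input file to simply "i" first, it can be cut down to 38.)
Observing that, in this particular case, no word occurs more than 9 times, we can shave off 3 more characters by dropping the '-n' argument from the final sort: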
xargs -n1<input.txt|sort|uniq -c|sort|tail
This brings the solution down to 43 characters without renaming the input file. (Or 35, if you do.)
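The reason dropping '-n' is safe only for single-digit counts can be sketched in Python. This is an illustration under a simplifying assumption: it compares the bare count strings and ignores the leading padding that `uniq -c` prints, whose effect on a plain `sort` depends on the locale. The function name is hypothetical.

```python
def lexical_matches_numeric(counts):
    """Return True if sorting the counts as text gives the numeric order."""
    as_text = sorted(str(c) for c in counts)        # what a plain text sort sees
    as_numbers = [str(c) for c in sorted(counts)]   # what a numeric sort produces
    return as_text == as_numbers

print(lexical_matches_numeric([2, 6, 4, 3, 1]))  # True: single digits sort the same way
print(lexical_matches_numeric([2, 10, 9]))       # False: "10" sorts before "2" as text
```

With any count of 10 or more, the shortened command could rank words in the wrong order.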
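Using xargs -n1 to split the file into one word per line is preferable to the tr \  \\n solution, since that produces lots of blank lines. This means that solution is incorrect: it misses Nixon and shows a blank string appearing 256 times. A blank string, however, is not a "word".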
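The difference between the two splitting strategies can be sketched in Python. The sample line is hypothetical, and the analogy is loose: `str.split(" ")` behaves like translating every space into a newline, while `str.split()` collapses runs of whitespace much as `xargs -n1` does (xargs additionally interprets quotes and backslashes, which this sketch ignores).

```python
line = " Nixon  Nixon ABDICATES "  # hypothetical line with leading, doubled, and trailing spaces

naive = line.split(" ")  # every single space ends a field, so empty strings appear
words = line.split()     # runs of whitespace are treated as one separator

print(naive)  # ['', 'Nixon', '', 'Nixon', 'ABDICATES', '']
print(words)  # ['Nixon', 'Nixon', 'ABDICATES']
```

Counting the first list would rank the empty string alongside real words, which is the failure described above.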
Answer 2 (score: 7)
vim 60
:1,$!tr " " "\n"|tr -d "\t "|sort|uniq -c|sort -n|tail -n 10
Answer 3 (score: 7)
Vim 36
:%s/\W/\r/g|%!sort|uniq -c|sort|tail
Answer 4 (score: 5)
Haskell, 102 characters (wow, so close to matching the original):
import List
(take 10.map snd.sort.map(\(x:y)->(-length y,x)).group.sort.words)`fmap`readFile"input.txt"
J, only 55 characters:
10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
(I have yet to figure out how to perform text manipulation elegantly in J... it is much better at array-structured data.)
NB. read the file <1!:1<'input.txt' +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------... | Washington Washington Adams Jefferson Jefferson Madison Madison Monroe Monroe John Quincy Adams Jackson Jackson Van Buren Harrison DIES Tyler Polk Taylor DIES Fillmore Pierce ... +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------... NB. split into lines <;._2[1!:1<'input.txt' +--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----... | Washington| Washington| Adams| Jefferson| Jefferson| Madison| Madison| Monroe| Monroe| John Quincy Adams| Jackson| Jackson| Van Buren| Harrison DIES| Tyler| Polk| Taylor DIES| Fillmore| Pierce| ... +--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----... NB. split into words ;;:&.><;._2[1!:1<'input.txt' +----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---... 
|Washington|Washington|Adams|Jefferson|Jefferson|Madison|Madison|Monroe|Monroe|John|Quincy|Adams|Jackson|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|DIES|Fillmore|Pierce|Buchanan|Lincoln|Lincoln|DIES|Johnson|Grant|Grant|Hayes|Garfield|DIES|Arthur|Cle... +----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---... NB. count reptititions |:~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt' +----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------... |2 |2 |2 |2 |2 |1 |1 |2 |1 |1 |2 |6 |1 |1 |1 |1 |1 |1 |2 |3 |2 |1 |1 |1 |2 |2 |2 |1 |2 |1 |1 |1 |4 |2 |2 ... +----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------... |Washington|Adams|Jefferson|Madison|Monroe|John|Quincy|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|Fillmore|Pierce|Buchanan|Lincoln|Johnson|Grant|Hayes|Garfield|Arthur|Cleveland|McKinley|Roosevelt|Taft|Wilson|Harding|Coolidge|Hoover|FDR|Truman|Eisenh... +----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------... NB. 
sort |:\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt' +----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----... |6 |4 |3 |3 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |2 |1 |1 |1 |1 |1 |1 |1 |1 |1 |1 |1 |1 |1 |1 ... +----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----... |DIES|FDR|Johnson|Bush|Wilson|Washington|Truman|Roosevelt|Reagan|Nixon|Monroe|McKinley|Madison|Lincoln|Jefferson|Jackson|Harrison|Grant|Eisenhower|Clinton|Cleveland|Adams|Van|Tyler|Taylor|Taft|Quincy|Polk|Pierce|Obama|Kennedy|John|Hoover|Hayes|Harding|Garf... +----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----... NB. take 10 10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt' +-+----------+ |6|DIES | +-+----------+ |4|FDR | +-+----------+ |3|Johnson | +-+----------+ |3|Bush | +-+----------+ |2|Wilson | +-+----------+ |2|Washington| +-+----------+ |2|Truman | +-+----------+ |2|Roosevelt | +-+----------+ |2|Reagan | +-+----------+ |2|Nixon | +-+----------+
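For readers who do not know J, the same stages (split into words, count repetitions, sort, take the top entries) can be sketched in Python. This is an ungolfed illustration, not a translation of the J expression. The function name `top_words` and the sample string are hypothetical, and ties come out in first-seen order rather than in the order J produces.

```python
from collections import Counter

def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    # Split into words: any run of whitespace separates two words.
    words = text.split()
    # Count repetitions; Counter remembers first-seen order (Python 3.7+).
    counts = Counter(words)
    # Sort by descending count; sorted() is stable, so ties keep first-seen order.
    ranked = sorted(counts.items(), key=lambda pair: -pair[1])
    # Take the first n entries.
    return ranked[:n]

sample = "Washington Washington Adams FDR FDR FDR FDR DIES Adams DIES DIES"
for word, count in top_words(sample, 3):
    print(count, word)
```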
Answer 5 (score: 3)
Perl: 90
Perl: 114 (including perl, the command-line switches, the single quotes, and the file name)
perl -nle'$h{$_}++for split/ /;END{$i++<=10?print"$h{$_} $_":0for reverse sort{$h{$a}cmp$h{$b}}keys%h}' input.txt
Answer 6 (score: 3)
The lack of AWK is disturbing.
xargs -n1<input.txt|awk '{c[$1]++}END{for(p in c)print c[p],p|"sort|tail"}'
75 characters.
If you want to get a bit more AWKy, you can forget xargs:
awk -v RS='[^a-zA-Z]' /./'{c[$1]++}END{for(p in c)print c[p],p|"sort|tail"}' input.txt
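Answer 7 (score: 2)
This is obviously not the smallest solution, but I decided to post it anyway, just for fun. :) Note: the batch file uses a temporary file named $ to store intermediate results.
Original uncompressed version with comments: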
@echo off
setlocal enableextensions enabledelayedexpansion
set infile=%1
set cnt=%2
set tmpfile=$
set knownwords=
rem Calculate word count
for /f "tokens=*" %%i in (%infile%) do (
    for %%w in (%%i) do (
        rem If the word hasn't already been processed, ...
        echo !knownwords! | findstr "\<%%w\>" > nul
        if errorlevel 1 (
            rem Count the number of the word's occurrences and save it to a temp file
            for /f %%n in ('findstr "\<%%w\>" %infile% ^| find /v "" /c') do (
                echo %%n^|%%w >> %tmpfile%
            )
            rem Then add the word to the known words list
            set knownwords=!knownwords! %%w
        )
    )
)
rem Print top 10 word count
for /f %%i in ('sort /r %tmpfile%') do (
    echo %%i
    set /a cnt-=1
    if !cnt!==0 goto end
)
:end
del %tmpfile%
Compressed & obfuscated version, 317 characters:
@echo off&setlocal enableextensions enabledelayedexpansion&set n=%2&set l=
for /f "tokens=*" %%i in (%1)do for %%w in (%%i)do echo !l!|findstr "\<%%w\>">nul||for /f %%n in ('findstr "\<%%w\>" %1^|find /v "" /c')do echo %%n^|%%w>>$&set l=!l! %%w
for /f %%i in ('sort /r $')do echo %%i&set /a n-=1&if !n!==0 del $&exit /b
If echo is already off and command extensions and delayed variable expansion are already on, it can be shortened to 258 characters:
set n=%2&set l=
for /f "tokens=*" %%i in (%1)do for %%w in (%%i)do echo !l!|findstr "\<%%w\>">nul||for /f %%n in ('findstr "\<%%w\>" %1^|find /v "" /c')do echo %%n^|%%w>>$&set l=!l! %%w
for /f %%i in ('sort /r $')do echo %%i&set /a n-=1&if !n!==0 del $&exit /b
Usage:
> filename.bat input.txt 10 & pause
Output:
6|DIES
4|FDR
3|Johnson
3|Bush
2|Wilson
2|Washington
2|Truman
2|Roosevelt
2|Reagan
2|Nixon
Answer 8 (score: 2)
Ruby
115 characters
w = File.read($*[0]).split
w.uniq.map{|x| [w.select{|y|x==y}.size,x]}.sort.last(10).each{|z| puts "#{z[1]} #{z[0]}"}
Answer 9 (score: 2)
Ruby 66B
puts (a=$<.read.split).uniq.map{|x|"#{a.count x} "+x}.sort.last 10
Answer 10 (score: 2)
Perl
86 characters; 94 if you count the input file name.
perl -anE'$_{$_}++for@F;END{say"$_{$_} $_"for@{[sort{$_{$b}<=>$_{$a}}keys%_]}[0..10]}' test.in
If you don't care how many results you get, it is only 75, excluding the file name.
perl -anE'$_{$_}++for@F;END{say"$_{$_} $_"for sort{$_{$b}<=>$_{$a}}keys%_}' test.in
Answer 11 (score: 2)
A revision of the previous entry that should save 10 characters:
h = {}
File.open('f.1').each {|l|l.split(/ /).each{|e|h[e]==nil ?h[e]=1:h[e]+=1}}
h.sort{|a,b|a[1]<=>b[1]}.last(10).each{|e|puts"#{e[1]} #{e[0]}"}
Answer 12 (score: 2)
vim 38, works for all inputs
:%!xargs -n1|sort|uniq -c|sort -n|tail
Answer 13 (score: 2)
Python 2.6, 104 characters:
l=open("input.txt").read().split()
for c,n in sorted(set((l.count(w),w) for w in l if w))[-10:]:print c,n
Answer 14 (score: 2)
Python 3.1 (88 characters)
import collections
collections.Counter(open('input.txt').read().split()).most_common(10)
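Note that the expression above only displays its result when typed into an interactive interpreter. Run as a script, it computes the list and discards it. A minimal, ungolfed sketch of a script form follows; the function names `top_ten` and `print_top_ten` are hypothetical, and the "count word" output format is an assumption modelled on the other answers.

```python
import collections

def top_ten(path):
    """Return the ten most common whitespace-separated words in the file."""
    with open(path) as handle:
        words = handle.read().split()
    return collections.Counter(words).most_common(10)

def print_top_ten(path):
    """Print one 'count word' line per entry, most frequent first."""
    for word, count in top_ten(path):
        print(count, word)
```

Calling `print_top_ten("input.txt")` would print the ranking for the question's input file, assuming that file exists in the working directory.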
Answer 15 (score: 2)
My best attempt so far with Ruby, 166 characters:
h = Hash.new
File.open('f.l').each_line{|l|l.split(/ /).each{|e|h[e]==nil ?h[e]=1:h[e]+=1}}
h.sort{|a,b|a[1]<=>b[1]}.last(10).each{|e|puts"#{e[1]} #{e[0]}"}
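I'm surprised no one has posted a crazy J solution yet.
Answer 16 (score: 2)
Here is a compressed version of the shell script. Under a reasonable interpretation of the input data (no leading or trailing blanks), the second 'tr' and the 'sed' command in the original do not change the data (verified by inserting 'tee out.N' at suitable points and checking the output file sizes, which were identical). The shell needs fewer spaces than humans do, and using cat instead of input I/O redirection wastes space.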
tr \  \\n<input.txt|sort|uniq -c|sort -n|tail -10
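This weighs in at 50 characters, including the newline at the end of the script.
Two further observations (taken from other people's answers):
tail by itself is equivalent to 'tail -10', and
for this input the '-n' on the final sort can be dropped, since no count exceeds 9.
Together these shrink the script by another 7 characters (to 43 including the trailing newline):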
tr \  \\n<input.txt|sort|uniq -c|sort|tail
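Using 'xargs -n1' (with no command prefix given) in place of 'tr' is extremely clever; it handles leading, trailing, and multiple embedded spaces (which this solution does not).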