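I have a file with a list of adjectives, A-Z.
How can I print the first word starting with A, then the first word starting with B, and so on through Z?
I think grep might be the way to go, but I'm open to others: awk, python, ... anything else.
Some sample output: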
$ cat adjectives.txt | head
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive
$ cat adjectives.txt | grep '^[ABCDE]' | head
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive
So my sample output would be:
Adamant: unyielding; a very hard substance
Baleful: deadly, foreboding
...
Irksome: annoying
Jejune: dull, puerile
...
Wheedling: flattering
Zealous: eager, devoted
The complete file (from here):
$ cat adjectives.txt
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive
Calamitous: disastrous
Caustic: corrosive, sarcastic; a corrosive substance
Cerulean: sky blue
Comely: attractive
Concomitant: accompanying
Contumacious: rebellious
Corpulent: obese
Crapulous: immoderate in appetite
Defamatory: maliciously misrepresenting
Didactic: conveying information or moral instruction
Dilatory: causing delay, tardy
Dowdy: shabby, old-fashioned; an unkempt woman
Efficacious: producing a desired effect
Effulgent: brilliantly radiant
Egregious: conspicuous, flagrant
Endemic: prevalent, native, peculiar to an area
Equanimous: even, balanced
Execrable: wretched, detestable
Fastidious: meticulous, overly delicate
Feckless: weak, irresponsible
Fecund: prolific, inventive
Friable: brittle
Fulsome: abundant, overdone, effusive
Garrulous: wordy, talkative
Guileless: naive
Gustatory: having to do with taste or eating
Heuristic: learning through trial-and-error or problem solving
Histrionic: affected, theatrical
Hubristic: proud, excessively self-confident
Incendiary: inflammatory, spontaneously combustible, hot
Insidious: subtle, seductive, treacherous
Insolent: impudent, contemptuous
Intransigent: uncompromising
Inveterate: habitual, persistent
Invidious: resentful, envious, obnoxious
Irksome: annoying
Jejune: dull, puerile
Jocular: jesting, playful
Judicious: discreet
Lachrymose: tearful
Limpid: simple, transparent, serene
Loquacious: talkative
Luminous: clear, shining
Mannered: artificial, stilted
Mendacious: deceptive
Meretricious: whorish, superficially appealing, pretentious
Minatory: menacing
Mordant: biting, incisive, pungent
Munificent: lavish, generous
Nefarious: wicked
Noxious: harmful, corrupting
Obtuse: blunt, stupid
Parsimonious: frugal, restrained
Pendulous: suspended, indecisive
Pernicious: injurious, deadly
Pervasive: widespread
Petulant: rude, ill humored
Platitudinous: resembling or full of dull or banal comments
Precipitate: steep, speedy
Propitious: auspicious, advantageous, benevolent
Puckish: impish
Querulous: cranky, whining
Quiescent: inactive, untroublesome
Rebarbative: irritating, repellent
Recalcitrant: resistant, obstinate
Redolent: aromatic, evocative
Rhadamanthine: harshly strict
Risible: laughable
Ruminative: contemplative
Sagacious: wise, discerning
Salubrious: healthful
Sartorial: relating to attire, especially tailored fashions
Sclerotic: hardening
Serpentine: snake-like, winding, tempting or wily
Spasmodic: having to do with or resembling a spasm, excitable,
intermittent
Strident: harsh, discordant; obtrusively loud
Taciturn: closemouthed, reticent
Tenacious: persistent, cohesive,
Tremulous: nervous, trembling, timid, sensitive
Trenchant: sharp, penetrating, distinct
Turbulent: restless, tempestuous
Turgid: swollen, pompous
Ubiquitous: pervasive, widespread
Uxorious: inordinately affectionate or compliant with a wife
Verdant: green, unripe
Voluble: glib, given to speaking
Voracious: ravenous, insatiable
Wheedling: flattering
Withering: devastating
Zealous: eager, devoted
Answer 0 (score: 8)
awk to the rescue!
$ awk '!a[tolower(substr($0,1,1))]++' file
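This creates a counter for each initial character and prints a line only when its counter is still zero (i.e., the first instance). tolower() makes the match case-insensitive; you can remove it if you don't need that. substr($0,1,1) extracts the first character of the line. awk's implicit loop repeats this for every line of the input file.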
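For readers less familiar with awk, the same counter idiom can be sketched in Python. This is only an illustration and not part of the original answer: the function name `first_per_letter` and the sample lines are hypothetical, and skipping empty lines is an assumption (awk would key them on the empty string).

```python
def first_per_letter(lines, ignore_case=True):
    """Return the first line seen for each initial character, in input order."""
    seen = set()  # plays the role of the awk array a[]
    result = []
    for line in lines:
        if not line:
            continue  # assumption: skip empty lines
        key = line[0].lower() if ignore_case else line[0]
        if key not in seen:  # true exactly when awk's !a[key]++ is true
            seen.add(key)
            result.append(line)
    return result


# Hypothetical sample data, taken from the top of adjectives.txt
sample = [
    "Adamant: unyielding; a very hard substance",
    "Adroit: clever, resourceful",
    "Baleful: deadly, foreboding",
    "Bellicose: quarrelsome",
]
for line in first_per_letter(sample):
    print(line)
```

Like the awk one-liner, this makes a single pass and does not require the input to be sorted.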
With a slight change to the script
$ awk '++a[substr($0,1,1)]==2' file
you can get the second record for each letter (if it exists), or use <3 instead of ==2 to get the first 2 records.
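The two variants above differ only in the comparison applied to the per-letter count. A rough Python sketch of that difference, assuming case-sensitive matching as in the second awk command (the names `nth_per_letter` and `first_n_per_letter` are made up for illustration):

```python
from collections import Counter


def nth_per_letter(lines, n=2):
    """Lines that are the n-th occurrence of their first character (awk: ==n)."""
    counts = Counter()
    picked = []
    for line in lines:
        if not line:
            continue  # assumption: ignore empty lines
        counts[line[0]] += 1
        if counts[line[0]] == n:
            picked.append(line)
    return picked


def first_n_per_letter(lines, n=2):
    """Lines among the first n occurrences of their first character (awk: <n+1)."""
    counts = Counter()
    picked = []
    for line in lines:
        if not line:
            continue  # assumption: ignore empty lines
        counts[line[0]] += 1
        if counts[line[0]] <= n:
            picked.append(line)
    return picked


# Hypothetical sample data
words = ["Adamant", "Adroit", "Amatory", "Baleful", "Zealous", "Bellicose"]
print(nth_per_letter(words))      # second entry per letter, where one exists
print(first_n_per_letter(words))  # up to two entries per letter
```

Letters with only one entry (Z in this sample) contribute nothing to the first result and a single line to the second.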
If your file is already sorted and the letter case is consistent, you can opt for a simpler script
$ uniq -w1 file
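The uniq command keeps the first instance of each run of equal compared values; here -w1 (a GNU extension) limits the comparison to the first character. So it extracts the first entry for every letter in one go. If the case is not consistent, add the -i flag to ignore case.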
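The "already sorted" condition matters because uniq only compares adjacent lines. A small Python sketch of that behavior, written as an approximation of `uniq -w1` rather than a faithful reimplementation (the function name `uniq_first_char` and the sample lists are hypothetical):

```python
def uniq_first_char(lines, ignore_case=False):
    """Keep a line only if its first character differs from the previous line's."""
    result = []
    previous_key = None
    for line in lines:
        key = line[:1]  # first character, or "" for an empty line
        if ignore_case:
            key = key.lower()
        if key != previous_key:  # only the immediately preceding line is checked
            result.append(line)
            previous_key = key
    return result


sorted_input = ["Adamant", "Adroit", "Baleful", "Boorish"]
unsorted_input = ["Adamant", "Baleful", "Adroit", "Boorish"]

print(uniq_first_char(sorted_input))    # one line per letter
print(uniq_first_char(unsorted_input))  # letters repeat: nothing is adjacent
```

On unsorted input every line survives, which is why the awk version with its lookup table is the safer choice when the ordering is not guaranteed.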
Scanning the file once is enough; there is no need for multiple scans...
Answer 1 (score: 3)
Python version:
import itertools

with open('adjectives.txt') as fp:
    # Group lines by first letter. If the lines weren't already sorted,
    # you could replace fp with sorted(fp).
    groups = itertools.groupby(fp, key=lambda line: line[0])
    for first_letter, group in groups:
        # Each line keeps its trailing newline, so suppress print's own.
        print(next(group), end='')
Answer 2 (score: 1)
Perhaps, with bash:
for i in {A..Z}; do grep -m1 "^$i" adjectives.txt; done
Answer 3 (score: 0)
with open("adjectives.txt") as f:
    lines = f.readlines()

# get rid of trailing \n
lines = [x.strip() for x in lines]
# stable sort
lines.sort(key=lambda s: s[0])

d = {}
for line in lines:
    key = line[0]
    # only the first occurrence
    if key not in d:
        d[key] = line

for key in sorted(d.keys()):
    print(d[key])