I have a sequence file in which each record is a line of information followed by the sequence itself, for example:
someinformation length=50
JJJIJJJJJJJJJIJGIJJJJJJIJJIJJJJJIJJJJHHHHHFFFFFCCC
someotherinformation length=50
GEFE?BEDHCBBACEBHAFEBFEBFHFFDDDFD@@@
[...]
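I want to replace length=50 (the number may differ) with the actual length of the next line, not counting the newline character. So something like: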
sed -i "s/length=[0-9]+/length=length_next_line/" infile
Is it possible to get the length of the next line in sed?
I ran a very simple timing test with a 20-line input.txt file (for i in $(seq 10000); do COMMAND input.txt > /dev/null; done, where COMMAND is the command from each answer).
Suku's answer:
real 0m54.932s
user 0m4.678s
sys 0m35.969s
Ed Morton's answer:
real 0m53.983s
user 0m3.789s
sys 0m33.574s
real 0m55.565s
user 0m5.929s
sys 0m36.049s
NeronLeVelu's first and second answers:
real 0m54.688s
user 0m3.812s
sys 0m36.884s
real 0m55.066s
user 0m3.929s
sys 0m36.850s
Answer 0 (score: 1)
It is not possible to get the length of a line with sed. You can use awk instead.
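As a minimal sketch of the awk approach (not necessarily the command this answer originally used), the following assumes that every header line ends in length=<digits> and is immediately followed by its sequence line. The function name fix_lengths is hypothetical.

```shell
# Hypothetical helper: rewrite each "length=N" header with the real
# length of the sequence line that follows it.
fix_lengths() {
    awk '
        # Header line: strip the stale number and keep the rest for later.
        /length=[0-9]+$/ { sub(/[0-9]+$/, ""); header = $0; next }
        # Sequence line: emit the saved header plus the measured length,
        # then the sequence line itself, unchanged.
        { print header length($0); print }
    ' "$@"
}
```

It can be run as fix_lengths infile > outfile. Plain awk does not edit files in place the way sed -i does, so the result is written to a new file that can then be moved over the original.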
Answer 1 (score: 1)
$ cat input.txt
someinformation length=50
JJJIJJJJJJJJJIJGIJJJJJJIJJIJJJJJIJJJJHHHHHFFFFFCCC
someotherinformation length=50
GEFE?BEDHCBBACEBHAFEBFEBFHFFDDDFD@@@
$ awk '{ if(NR%2 == 1) {sub(/=[0-9]+$/,"=",$0); s=$0; next} print s length($0) ORS $0 }' input.txt
someinformation length=50
JJJIJJJJJJJJJIJGIJJJJJJIJJIJJJJJIJJJJHHHHHFFFFFCCC
someotherinformation length=36
GEFE?BEDHCBBACEBHAFEBFEBFHFFDDDFD@@@
sub in awk modifies the input string in place, here $0.
next in awk skips the rest of the script and moves on to the next input line.
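To see those two pieces in isolation, here is a small sketch. The function name strip_header_number and the "header:"/"sequence:" labels are made up for illustration, and it assumes the same strict header/sequence alternation as the command above.

```shell
# Hypothetical demo: sub() edits $0 in place; next skips the remaining rules.
strip_header_number() {
    awk '
        # Odd lines are headers: drop the number after "=", then stop
        # processing this line so the rule below never sees it.
        NR % 2 == 1 { sub(/=[0-9]+$/, "="); print "header: " $0; next }
        # Even lines are sequences and reach this rule untouched.
        { print "sequence: " $0 }
    ' "$@"
}
```

Without next, each header line would fall through to the second rule and be printed a second time with the "sequence:" label.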
Answer 2 (score: 1)
No. sed is for simple substitutions on individual lines, that is all. Anything more interesting is a job for awk.
Answer 3 (score: 1)
Just for the timing challenge :-)
awk -F "length=" 'NF > 1 {Head=$1;next}
{print Head "length=" length($0) ORS $0}' YourFile
And with another tweak:
awk -F "length=" 'NF > 1 {printf "%slength=", $1;next}
{print length($0) ORS $0}' YourFile