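I have a text file containing FeII emission transition lines. The header is: n_high, n_low, wavelength, intensity (where n_high and n_low are the upper and lower levels of each transition, running from
2 --> 1, ..., 371 --> 1, 3 --> 2, ..., 371 --> 2, ... and so on until the last chunk, 371 --> 370).
The input file looks like this: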
#n_hi n_lo WL(A) logI
2 1 259811.86 1.158
3 1 149730.41 -2.054
4 1 115894.98 -2.134
5 1 102320.80 -2.389
6 1 53387.13 0.256
7 1 41138.69 -0.277
8 1 35226.70 -1.585
9 1 32068.36 -1.741
10 1 12566.77 2.323
.
.
.
.
369 1 1069.66 1.461
370 1 1065.75 -7.901
371 1 1065.64 -8.011
3 2 353390.47 0.759
4 2 209224.17 -2.390
5 2 168797.89 -2.607
.
.
.
370 369 291200.84 -10.337
371 369 283465.88 -10.436
371 370 10672868.00 -12.012
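There are 68635 rows in total.
The task is to select only the transitions whose wavelength falls within a given range, say [x1, x2], and print each matching row in full to another file.
So far, all I have managed is to sketch an algorithm for it: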
for n_low from 1 to 370:
    for n_hi from n_low+1 to 371:
        if x1 <= wavelength <= x2:
            print this row to file
        else:
            skip this row
I want to implement this in Python.
Answer 0: (score: 3)
You can use the powerful pandas library. I use io.StringIO to simulate a file holding data, but you should pass your filename instead of f.
data = '''2 1 259811.86 1.158
3 1 149730.41 -2.054
4 1 115894.98 -2.134
5 1 102320.80 -2.389
6 1 53387.13 0.256
7 1 41138.69 -0.277
8 1 35226.70 -1.585
9 1 32068.36 -1.741
10 1 12566.77 2.323
369 1 1069.66 1.461
370 1 1065.75 -7.901
371 1 1065.64 -8.011
3 2 353390.47 0.759
4 2 209224.17 -2.390
5 2 168797.89 -2.607
370 369 291200.84 -10.337
371 369 283465.88 -10.436
371 370 10672868.00 -12.012'''
import pandas as pd
# simulate a file
import io
f = io.StringIO(data)
# use your filename instead of `f`
# read the data using runs of whitespace as separators
# and add the headers 'n_hi', 'n_lo', 'WL(A)', 'logI';
# comment='#' skips the '#n_hi ...' header line of the real file
df = pd.read_csv(f, names=['n_hi', 'n_lo', 'WL(A)', 'logI'], sep=r'\s+', comment='#')
#print(df)
# get the rows which have 1000 <= WL <= 25000 (between() is inclusive)
selected = df[df['WL(A)'].between(1000, 25000)]
print(selected)
# index=False keeps the DataFrame index out of the output file
selected.to_csv('result.csv', sep=' ', header=False, index=False)
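For a real file on disk, the same idea could be wrapped in a small function. This is only a minimal sketch: the name `select_by_wavelength` and its parameters are hypothetical (they are not part of the answer above), and it assumes the file is whitespace-separated with a header line that starts with `#`, as in the question.

```python
import pandas as pd

COLUMNS = ['n_hi', 'n_lo', 'WL(A)', 'logI']


def select_by_wavelength(input_path, output_path, wl_min, wl_max):
    """Hypothetical helper: keep rows with wl_min <= WL(A) <= wl_max."""
    # comment='#' makes pandas skip the '#n_hi n_lo WL(A) logI' header line.
    df = pd.read_csv(input_path, names=COLUMNS, sep=r'\s+', comment='#')
    # Series.between() includes both end points by default.
    selected = df[df['WL(A)'].between(wl_min, wl_max)]
    # index=False so only the four original columns are written.
    selected.to_csv(output_path, sep=' ', header=False, index=False)
    return selected
```

Note that pandas parses and re-serialises the numbers, so a value such as 102320.80 would be written back as 102320.8. If the output rows must match the input text exactly, a line-by-line approach that copies the original line is the safer choice.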
Answer 1: (score: 3)
If you want to use standard Python, the following function should work (assuming the data is tab-separated):
def filter_wavelength(x1, x2, input_path, output_path):
    with open(output_path, 'w') as output_file:
        with open(input_path) as input_file:
            for line in input_file:
                try:
                    # use line.split() instead if the columns are separated by spaces
                    tokens = line.split('\t')
                    wave_length = float(tokens[2])
                    if x1 <= wave_length <= x2:
                        output_file.write(line)
                except Exception as e:
                    # the header line and any malformed lines end up here
                    print(str(e))
Call it like this:
filter_wavelength(1,2,'path/to/input', 'path/to/output')
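Whether `split('\t')` is the right call depends on the file. The sample rows in the question appear to be space-separated; if so, splitting on tabs returns the whole line as a single token and `tokens[2]` raises an `IndexError`. A small illustration using one row from the question (the variable names are just for this example):

```python
space_row = '10 1 12566.77 2.323\n'
tab_row = '10\t1\t12566.77\t2.323\n'

# Splitting on a tab only works when the columns really are tab-separated.
print(len(space_row.split('\t')))  # 1: the whole line is one token
print(len(tab_row.split('\t')))    # 4

# split() with no argument splits on any run of whitespace,
# so it handles both layouts.
print(float(space_row.split()[2]))  # 12566.77
print(float(tab_row.split()[2]))    # 12566.77
```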
Answer 2: (score: 1)
If WL(A) is all you care about and you don't need to look at n_hi and n_lo, try this:
def extract_wave_lengths(x1, x2, input_file, output_file):
    with open(input_file, 'r') as ifile, open(output_file, 'w') as ofile:
        next(ifile)  # Skip header
        for line in ifile:
            parts = line.split()
            wave_length = float(parts[2])
            # note the order: x1 is the upper bound, x2 the lower bound
            if x2 <= wave_length <= x1:
                ofile.write(line)
Then you can call it like this:
extract_wave_lengths(100000, 5000, "/path/to/input/file", "/path/to/output/file")
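Because this function expects the upper bound first, it is easy to pass the arguments the wrong way round and silently get an empty output file. One possible variation, shown here only as a sketch, sorts the two bounds and skips `#` header lines instead of assuming exactly one header row. The name `extract_in_range` and the returned row count are hypothetical additions, not part of the answer above.

```python
def extract_in_range(bound_a, bound_b, input_path, output_path):
    """Hypothetical variant: accept the two bounds in either order."""
    lo, hi = sorted((bound_a, bound_b))
    kept = 0
    with open(input_path) as ifile, open(output_path, 'w') as ofile:
        for line in ifile:
            # Skip blank lines and '#' comment/header lines.
            if not line.strip() or line.lstrip().startswith('#'):
                continue
            wave_length = float(line.split()[2])
            if lo <= wave_length <= hi:
                ofile.write(line)  # copy the original row unchanged
                kept += 1
    return kept
```

Calling `extract_in_range(100000, 5000, ...)` and `extract_in_range(5000, 100000, ...)` would then select the same rows.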