I'm having trouble formatting some output in Python. Here is my code:
keys = ['(Lag)=(\d+\.?\d*)','\t','(Autocorrelation Index): (\d+\.?\d*)', '(Autocorrelation Index): (\d+\.?\d*)', '(Semivariance): (\d+\.?\d*)']
import re
string1 = ''.join(open("dummy.txt").readlines())
found = []
for key in keys:
    found.extend(re.findall(key, string1))
for result in found:
    print '%s = %s' % (result[0],result[1])
raw_input()
So far, I get this output:
Lag = 1
Lag = 2
Lag = 3
Autocorrelation Index = #value
...
...
Semivariance = #value
But the output I want is:
Lag    AutoCorrelation Index    AutoCorrelation Index    Semivariance
1      #value                   #value                   #value
2      #value                   #value                   #value
3      #value                   #value                   #value
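It would be great if the output could go to a CSV or txt file!
I think this comes down to how the results are output inside the loop, but I'm not that good with loops.
Based on @mutzmatron's answer: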
keys = ['(Lag)=(\d+\.?\d*)',
'(Autocorrelation Index): (\d+\.?\d*)',
'(Semivariance): (\d+\.?\d*)']
import re
string1 = open("dummy.txt").readlines().join()
found = []
for key in keys:
    found.extend(re.findall(key, string1))
raw_input()
for result in found:
    print '%s = %s' % (result[0], result[1])
raw_input()
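It doesn't run yet! I'm using IDLE with Python 2.6, and because I don't know how to pause at the prompt, I can't see the error message!
I'm new to Python and have a question. I'm trying to process a large text file. This is just a small part of it: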
Band: WDRVI20((0.2*b4-b3)/((0.2*b4)+b3))
Basic Statistics:
Min: -0.963805
Max: 0.658219
Mean: 0.094306
Standard Deviation: 0.131797
Spatial Statistics, ***Lag=1***:
Total Number of Observations (Pixels): 769995
Number of Neighboring Pairs: 1538146
Moran's I:
***Autocorrelation Index: 0.8482564597***
Expected Value, if band is uncorrelated: -0.000001
Standard Deviation of Expected Value (Normalized): 0.000806
Standard Deviation of Expected Value (Randomized): 0.000806
Z Significance Test (Normalized): 1052.029088
Z Significance Test (Randomized): 1052.034915
Geary's C:
***Autocorrelation Index: 0.1517324729***
Expected Value, if band is uncorrelated: 1.000000
Standard Deviation of Expected Value (Normalized): 0.000807
Standard Deviation of Expected Value (Randomized): 0.000809
Z Significance Test (Normalized): 1051.414163
Z Significance Test (Randomized): 1048.752451
***Semivariance: 0.0026356529***
Spatial Statistics, Lag=2:
Total Number of Observations (Pixels): 769995
Number of Neighboring Pairs: 3068924
Moran's I:
Autocorrelation Index: 0.6230691635
Expected Value, if band is uncorrelated: -0.000001
Standard Deviation of Expected Value (Normalized): 0.000571
Standard Deviation of Expected Value (Randomized): 0.000571
Z Significance Test (Normalized): 1091.521976
Z Significance Test (Randomized): 1091.528022
Geary's C:
Autocorrelation Index: 0.3769372504
Expected Value, if band is uncorrelated: 1.000000
Standard Deviation of Expected Value (Normalized): 0.000574
Standard Deviation of Expected Value (Randomized): 0.000587
Z Significance Test (Normalized): 1085.700399
Z Significance Test (Randomized): 1061.931158
Semivariance: 0.0065475488
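I need to extract the values marked with stars (***), for example the Autocorrelation Index and Semivariance values, and process them. I would also like to write them to a separate text file or an Excel file. Can I do that? Any help is much appreciated.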
Answer 0 (score: 1)
Populate a list with the keys you are looking for, written as regular expressions. For example:
keys = ['(Lag)=(\d+\.?\d*)',
'(Autocorrelation Index): (\d+\.?\d*)',
'(Semivariance): (\d+\.?\d*)']
Then search for them using regular expressions:
import re
# FILE is a placeholder for the path to your input file
string1 = ''.join(open(FILE).readlines())
found = []
for key in keys:
    found.extend(re.findall(key, string1))
for result in found:
    print('%s = %s' % (result[0], result[1]))
You should then have a list of the entries you want, which you can use however you need!
Result:
Lag = 1
Autocorrelation Index = 0.8482564597
Autocorrelation Index = 0.1517324729
Semivariance = 0.0026356529
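To illustrate why each result is indexed with [0] and [1]: when a pattern contains two capture groups, re.findall returns a list of 2-tuples, one per match. The following is a minimal sketch written for Python 3. The `sample` string is a shortened stand-in adapted from the question's excerpt, not the real file.

```python
import re

# Inline sample adapted from the question's excerpt (an assumption for illustration).
sample = (
    "Spatial Statistics, Lag=1:\n"
    "Autocorrelation Index: 0.8482564597\n"
    "Autocorrelation Index: 0.1517324729\n"
    "Semivariance: 0.0026356529\n"
)

pattern = r'(Autocorrelation Index): (\d+\.?\d*)'
matches = re.findall(pattern, sample)

# With two capture groups, each match is a (name, value) tuple.
for name, value in matches:
    print('%s = %s' % (name, value))
```

Because every key is searched across the whole text in turn, all matches for one key come out before any match for the next key.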
CSV
To output to CSV, use the csv module:
import csv
outfile = open('fileout.csv', 'w')
wrt = csv.writer(outfile)
wrt.writerows(found)
outfile.close()
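On Python 3, the same idea might look like the sketch below. Here `found` is a hand-written stand-in for the list built by re.findall, and `out_path` is a hypothetical temporary location. The `newline=''` argument is what the csv module documentation recommends for files handed to csv.writer. Note that writing `found` directly produces one (name, value) pair per row, not one row per lag.

```python
import csv
import os
import tempfile

# Stand-in for the `found` list built by re.findall (values copied from the question).
found = [('Lag', '1'),
         ('Autocorrelation Index', '0.8482564597'),
         ('Autocorrelation Index', '0.1517324729'),
         ('Semivariance', '0.0026356529')]

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), 'fileout.csv')

# newline='' lets the csv module control line endings itself.
with open(out_path, 'w', newline='') as outfile:
    csv.writer(outfile).writerows(found)

# Read the file back to check what was written.
with open(out_path, newline='') as infile:
    rows = list(csv.reader(infile))
```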
Answer 1 (score: 1)
To format the data by section, it is probably easiest to process the text one section at a time, as follows:
import csv
import re

keys = [r'(Lag)=(\d+\.?\d*)',
        r'(Autocorrelation Index): (\d+\.?\d*)',
        r'(Semivariance): (\d+\.?\d*)']

string1 = ''.join(open("dummy.txt").readlines())
# Each "Spatial Statistics" block holds the values for one lag.
sections = string1.split('Spatial Statistics')
output = []
heads = []
for sec in sections:
    found = []
    for key in keys:
        found.extend(re.findall(key, sec))
    if not found:
        # Skip the text before the first block; it contains none of the keys.
        continue
    for result in found:
        print('%s = %s' % (result[0], result[1]))
    # One CSV row per section, holding only the values.
    output.append([result[1] for result in found])
    if not heads:
        # Take the column headers from the first section that has matches.
        heads = [result[0] for result in found]
fout = open('output.csv', 'w')
wrt = csv.writer(fout)
wrt.writerow(heads)
wrt.writerows(output)
fout.close()
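For reference, here is a self-contained Python 3 sketch of the same section-by-section approach that can be run without the input file. The helper name `parse_sections` and the shortened `sample` text are assumptions made for illustration. The CSV is written to an in-memory buffer, but a real file opened with `newline=''` should work the same way.

```python
import csv
import io
import re

KEYS = [r'(Lag)=(\d+\.?\d*)',
        r'(Autocorrelation Index): (\d+\.?\d*)',
        r'(Semivariance): (\d+\.?\d*)']


def parse_sections(text):
    """Return (headers, rows), with one row per 'Spatial Statistics' block."""
    heads, rows = [], []
    for sec in text.split('Spatial Statistics'):
        found = []
        for key in KEYS:
            found.extend(re.findall(key, sec))
        if not found:
            continue  # e.g. the band header before the first block
        if not heads:
            heads = [name for name, _ in found]
        rows.append([value for _, value in found])
    return heads, rows


# Shortened sample built from the question's excerpt (an assumption).
sample = """Band: WDRVI20
Spatial Statistics, Lag=1:
Moran's I:
Autocorrelation Index: 0.8482564597
Geary's C:
Autocorrelation Index: 0.1517324729
Semivariance: 0.0026356529
Spatial Statistics, Lag=2:
Moran's I:
Autocorrelation Index: 0.6230691635
Geary's C:
Autocorrelation Index: 0.3769372504
Semivariance: 0.0065475488
"""

heads, rows = parse_sections(sample)

# Write to an in-memory buffer instead of a file for this demonstration.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(heads)
writer.writerows(rows)
print(buf.getvalue())
```

The patterns only match unsigned numbers, so a negative index value would need the value group extended, for example with an optional leading minus sign.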