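I am a beginner in Python and I am struggling to find the right solution to this problem. I went through all the similar posts on Stack Overflow but could not find one. I have an ".ext" file. I need to skip the first two lines. The third line contains the column names of the table. I need to find the columns named omega(n,n) and sigma(n,n), where n can be any number (e.g. sigma(1,1), omega(2,2)), and check the values of those columns in the row that begins with "-1000000000". If a value is < 0.001, the output should be "true".
My code is: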
import numpy as np
array=[]
array1=[]
b = np.genfromtxt(r'C:/nm73/proj/one.ext', delimiter=' ', names=True,dtype=None)[3:,:]
for n in range(len(b)-1):
array=b['Sigma(n,n)']
array1=b['omega(n,n)']
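I do not know how to check the elements.
The one.ext file looks like this. I apologize if the file is not formatted correctly; I am new to Stack Overflow. Any help is much appreciated.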
TABLE NO. 1: First Order Conditional Estimation with Interaction: Goal Function=MINIMUM VALUE OF OBJECTIVE FUNCTION: Problem=1 Subproblem=0 Superproblem1=0 Iteration1=0 Superproblem2=0 Iteration2=0
ITERATION THETA1 THETA2 SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ
0 2.50000E-01 1.00000E+01 1.00000E-01 0.00000E+00 1.00000E-01 1.00000E-01 0.00000E+00 1.00000E-01 9436.65314342255
5 2.34948E-01 3.67675E+00 9.04159E-02 0.00000E+00 2.74933E+00 1.98686E-01 0.00000E+00 1.75724E-01 8745.97204613658
10 2.11090E-01 4.30565E+00 1.34312E-01 0.00000E+00 1.12619E+00 1.32484E-01 0.00000E+00 1.36824E-02 8595.43106384756
15 2.10696E-01 4.35495E+00 1.23897E-01 0.00000E+00 1.29124E+00 1.28600E-01 0.00000E+00 1.24441E-02 8591.51400321872
20 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
-1000000000 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
-1000000001 8.07565E-03 6.97861E-02 5.28558E-03 1.00000E+10 4.20370E-01 1.78706E-02 1.00000E+10 3.15324E-03 0.000000000000000E+000
-1000000004 0.00000E+00 0.00000E+00 3.52538E-01 0.00000E+00 1.13460E+00 3.58908E-01 0.00000E+00 1.11450E-01 0.000000000000000E+000
-1000000005 0.00000E+00 0.00000E+00 7.49648E-03 1.00000E+10 1.85250E-01 2.48957E-02 1.00000E+10 1.41465E-02 0.000000000000000E+000
Answer 0 (score: 1)
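If you do not specify delimiter, every run of consecutive whitespace is treated as a single separator. If you specify delimiter=' ', then literally every single space acts as a separator. That leads to a ValueError, because genfromtxt finds the wrong number of columns.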
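The difference between the two delimiter modes can be seen on a tiny example. This is only a sketch: the three-line table below is made up for illustration (it is not the one.ext file), and the str.split calls are shown as an analogy for how the fields get separated.

```python
import io
import numpy as np

# Hypothetical table: the last row uses two spaces between its values.
text = "A B C\n1 2 3\n4  5  6\n"

# Plain str.split shows the two behaviours side by side.
print("4  5  6".split())     # runs of whitespace collapse into one separator
print("4  5  6".split(" "))  # every space splits, leaving empty fields

# Default delimiter: both data rows parse into three named columns.
parsed = np.genfromtxt(io.StringIO(text), names=True)
print(parsed.dtype.names)

# delimiter=' ': the last row no longer has three fields, so parsing fails.
try:
    np.genfromtxt(io.StringIO(text), names=True, delimiter=" ")
    raised = False
except ValueError as error:
    raised = True
    print(error)
```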
So if you use:
In [396]: b = np.genfromtxt(filename, names=True, dtype=None, skip_header=1)
then you end up with a structured array like this:
In [397]: b
Out[397]:
array([(0, 0.25, 10.0, 0.1, 0.0, 0.1, 0.1, 0.0, 0.1, 9436.65314342255),
(5, 0.234948, 3.67675, 0.0904159, 0.0, 2.74933, 0.198686, 0.0, 0.175724, 8745.97204613658),
(10, 0.21109, 4.30565, 0.134312, 0.0, 1.12619, 0.132484, 0.0, 0.0136824, 8595.43106384756),
(15, 0.210696, 4.35495, 0.123897, 0.0, 1.29124, 0.1286, 0.0, 0.0124441, 8591.51400321872),
(20, 0.211129, 4.36325, 0.124283, 0.0, 1.28733, 0.128815, 0.0, 0.0124211, 8591.5002233277),
(-1000000000, 0.211129, 4.36325, 0.124283, 0.0, 1.28733, 0.128815, 0.0, 0.0124211, 8591.5002233277),
(-1000000001, 0.00807565, 0.0697861, 0.00528558, 10000000000.0, 0.42037, 0.0178706, 10000000000.0, 0.00315324, 0.0),
(-1000000004, 0.0, 0.0, 0.352538, 0.0, 1.1346, 0.358908, 0.0, 0.11145, 0.0),
(-1000000005, 0.0, 0.0, 0.00749648, 10000000000.0, 0.18525, 0.0248957, 10000000000.0, 0.0141465, 0.0)],
dtype=[('ITERATION', '<i4'), ('THETA1', '<f8'), ('THETA2', '<f8'), ('SIGMA11', '<f8'), ('SIGMA21', '<f8'), ('SIGMA22', '<f8'), ('OMEGA11', '<f8'), ('OMEGA21', '<f8'), ('OMEGA22', '<f8'), ('OBJ', '<f8')])
Note the dtype at the end. The column names cannot contain parentheses or commas, so SIGMA(1,1) becomes SIGMA11. You can access this column like this:
In [398]: b['SIGMA11']
Out[398]:
array([ 0.1 , 0.0904159 , 0.134312 , 0.123897 , 0.124283 ,
0.124283 , 0.00528558, 0.352538 , 0.00749648])
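To get from here to the check asked for in the question, one possible approach is to pick out the diagonal columns from the original header line and then look up their values in the -1000000000 row by position. The sketch below is an assumption-laden illustration, not part of the answer above: check_diagonals and SAMPLE are hypothetical names, the sample is a shortened copy of the table from the question (title line trimmed, three data rows kept), and matching by column position is just one way to sidestep the renamed fields.

```python
import io
import re
import numpy as np

# Shortened copy of the table from the question (title, header, three rows).
SAMPLE = """\
TABLE NO. 1: First Order Conditional Estimation with Interaction
ITERATION THETA1 THETA2 SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ
20 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
-1000000000 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
-1000000001 8.07565E-03 6.97861E-02 5.28558E-03 1.00000E+10 4.20370E-01 1.78706E-02 1.00000E+10 3.15324E-03 0.000000000000000E+000
"""

# Matches SIGMA(n,n) / OMEGA(n,n): the backreference forces both indices to be equal.
DIAGONAL = re.compile(r"^(?:SIGMA|OMEGA)\((\d+),\1\)$", re.IGNORECASE)


def check_diagonals(source, row_id=-1000000000, threshold=0.001):
    """Return {original column name: value < threshold} for the chosen row."""
    text = source.read()
    # The header is the second line; keep the original names with parentheses.
    header = text.splitlines()[1].split()
    data = np.atleast_1d(np.genfromtxt(io.StringIO(text), skip_header=1, names=True))
    # Select the single row whose ITERATION value equals row_id.
    row = data[data["ITERATION"] == row_id][0]
    return {
        name: bool(row[position] < threshold)
        for position, name in enumerate(header)
        if DIAGONAL.match(name)
    }


result = check_diagonals(io.StringIO(SAMPLE))
for name, is_small in result.items():
    print(name, is_small)
```

For a real file, the same function could be called with an open file handle, for example `with open(path) as handle: check_diagonals(handle)`, where path is wherever one.ext lives.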
Answer 1 (score: 1)
import pandas as p
f = r'C:\Documents and Settings\Joaquin\Escritorio\one.ext'
# read your table and set the first column as index
table = p.read_csv(f, sep=' ', header=1, skipinitialspace=True)
table = table.set_index('ITERATION')
# get the two cells corresponding to the columns you want at row -1000000000
print(table.xs(-1000000000)[['SIGMA(1,1)', 'OMEGA(1,1)']])
which gives:
SIGMA(1,1) 0.124283
OMEGA(1,1) 0.128815
Name: -1000000000, dtype: float64
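Since pandas keeps the original column names, the lookup above could be extended to every sigma(n,n) and omega(n,n) column and to the < 0.001 test from the question. The following is a rough sketch under a few assumptions: diagonal_flags and SAMPLE are hypothetical names, the sample is a shortened copy of the table from the question, and sep=r"\s+" is used here instead of sep=' ' so that any run of whitespace counts as one separator.

```python
import io
import re
import pandas as pd

# Shortened copy of the table from the question (title, header, two rows).
SAMPLE = """\
TABLE NO. 1: First Order Conditional Estimation with Interaction
ITERATION THETA1 THETA2 SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ
20 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
-1000000000 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770
"""

# Matches SIGMA(n,n) / OMEGA(n,n): the backreference forces both indices to be equal.
DIAGONAL = re.compile(r"^(?:SIGMA|OMEGA)\((\d+),\1\)$", re.IGNORECASE)


def diagonal_flags(source, row_id=-1000000000, threshold=0.001):
    """Return a boolean Series: True where a diagonal value is below threshold."""
    # header=1 uses the second line as column names and drops the title line.
    table = pd.read_csv(source, sep=r"\s+", header=1, index_col="ITERATION")
    diagonal = [name for name in table.columns if DIAGONAL.match(name)]
    return table.loc[row_id, diagonal] < threshold


flags = diagonal_flags(io.StringIO(SAMPLE))
print(flags)
```

Passing a file path instead of the StringIO object should work the same way, because read_csv accepts both.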