Question

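I am trying to extract some data from a file. To do that, I wrote a script that reads the file, starts copying when it detects a certain keyword, and stops copying when it reaches an empty line. I don't think it is too bad, but it does not work.

The Python script I wrote is: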

def out_to_mop (namefilein, namefileout):
    print namefilein
    filein=open(namefilein, "r")
    fileout=open(namefileout, "w")
    lines = filein.readlines()
    filein.close()
    #look for keyword "CURRENT.." to start copying
    try:
        indexmaxcycle = lines.index("          CURRENT BEST VALUE OF HEAT OF FORMATION")
        indexmaxcycle += 5
    except:
        indexmaxcycle = 0
    if indexmaxcycle != 0:
        while lines[indexmaxcycle]!=" \n":
            linediv = lines[indexmaxcycle].split()
            symbol = linediv[0]
            x = float(linediv[1])
            indexmaxcycle += 1
            fileout.write("%s \t %3.8f 1 \n" %(symbol, x))
    else:
        print "structure not found"
        exit()
    fileout.close()

This function is used to extract the information from this file, called:

file1.out

But it prints "structure not found".

Could you help me out a little?

Answer 1

You are trying to find the start of the structure with this line of code:

indexmaxcycle = lines.index("          CURRENT BEST VALUE OF HEAT OF FORMATION")

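The documentation for the index method says: "Return zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item." However, the string you are searching for is not one of the lines in the file. The actual line in the file is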

          CURRENT BEST VALUE OF HEAT OF FORMATION =  -1161.249249

Note the number at the end, which is not in your search string. The index method therefore raises an exception, and your indexmaxcycle value is set to zero.
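A minimal sketch of this behaviour, written in Python 3 and using a made-up two-line list in place of the real file contents (the sample lines are assumptions, not taken from file1.out). It shows that list.index compares whole items for equality, so a string that is only the beginning of a line is never found. Lines returned by readlines() also keep their trailing newline, which is one more reason an exact comparison tends to fail.

```python
# Hypothetical stand-in for the result of filein.readlines().
lines = [
    " SOME OTHER LINE\n",
    "          CURRENT BEST VALUE OF HEAT OF FORMATION =  -1161.249249\n",
]

search = "          CURRENT BEST VALUE OF HEAT OF FORMATION"

try:
    found_at = lines.index(search)  # compares whole items for equality
except ValueError:
    found_at = None                 # no item equals the search string exactly

# A prefix test succeeds where the equality test failed.
prefix_matches = [i for i, line in enumerate(lines) if line.startswith(search)]
```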

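Since you evidently do not know the full contents of the line in advance, you should loop over the input lines yourself and use the in operator to find the line that contains the search string. You could also use the startswith string method, like this: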

for j, line in enumerate(lines):
    if line.startswith("          CURRENT BEST VALUE OF HEAT OF FORMATION"):
        indexmaxcycle = j + 5
        break
else:
    indexmaxcycle = 0

I removed the try..except structure here because I do not think this code can raise an exception. Of course, I could be wrong.
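The in operator variant mentioned in this answer might look roughly like the sketch below (Python 3). The helper name find_line_containing and the sample list are made up for illustration. Because in tests for a substring anywhere in the line, the search text no longer has to reproduce the exact number of leading spaces.

```python
def find_line_containing(lines, text):
    """Return the index of the first line that contains text, or None."""
    for j, line in enumerate(lines):
        if text in line:  # substring test, leading spaces do not matter
            return j
    return None


# Hypothetical sample standing in for the lines read from the file.
sample = [
    "header\n",
    "     CURRENT BEST VALUE OF HEAT OF FORMATION =  -1161.249249\n",
]
index = find_line_containing(sample, "CURRENT BEST VALUE OF HEAT OF FORMATION")
missing = find_line_containing(sample, "NOT IN THE FILE")
```

Returning None instead of 0 for "not found" avoids confusing a missing keyword with a keyword on the first line.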

Answer 2

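You are looking for an exact match, but the lines in the text file are longer than the pattern you are searching for. Try matching the beginning of the line instead: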

pattern = "          CURRENT BEST VALUE OF HEAT OF FORMATION"
try:
    indexmaxcycle = [i for (i,s) in enumerate(lines) if s.startswith(pattern)][0]
    indexmaxcycle += 5 
etc.

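[i for (i,s) in enumerate(lines) if s.startswith(pattern)] gives you the indices of all elements that start with your pattern. Adding [0] gives you the first one.

I just noticed that you can speed this up by using a generator expression instead of a list comprehension: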

pattern = "          CURRENT BEST VALUE OF HEAT OF FORMATION"
try:
    # next() raises StopIteration if no line starts with the pattern
    indexmaxcycle = next(i for (i,s) in enumerate(lines) if s.startswith(pattern)) + 5
except StopIteration:
    etc.

This searches the list only until it finds the first match.
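Putting the pieces together, the extraction step could be sketched as follows in Python 3. This is only an illustration under stated assumptions: the name extract_structure is made up, the offset of 5 lines and the output format are copied from the question's code, and a line counts as blank when it contains only whitespace (the question compares against " \n" instead). It works on a list of lines and returns the formatted rows, so reading and writing the files is left to the caller.

```python
PATTERN = "          CURRENT BEST VALUE OF HEAT OF FORMATION"


def extract_structure(lines, pattern=PATTERN, offset=5):
    """Return formatted rows for the block starting `offset` lines after the
    first line that begins with `pattern`; return [] if no line matches."""
    # next() stops at the first match; the second argument is the fallback.
    start = next((i for i, s in enumerate(lines) if s.startswith(pattern)), None)
    if start is None:
        return []
    rows = []
    for line in lines[start + offset:]:
        if not line.strip():  # assumption: a whitespace-only line ends the block
            break
        fields = line.split()
        rows.append("%s \t %3.8f 1 \n" % (fields[0], float(fields[1])))
    return rows


# Hypothetical input: keyword line, four filler lines, two data lines, blank line.
demo = [PATTERN + " =  -1161.249249\n"] + ["filler\n"] * 4 + [
    "C 1.0 2.0\n",
    "H 0.5 1.0\n",
    " \n",
    "X 9.0 9.0\n",
]
rows = extract_structure(demo)
```

Passing a default to next() makes the try..except unnecessary, and slicing the list means the loop cannot run past the end of the file if the closing blank line is missing.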

Unable to correctly extract information from a file