Code runs on only half of the files

Time: 2015-08-26 15:24:25

Tags: python text pandas

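I am using the following code to loop through text files, make some changes to them, and save them to a new folder. For some reason, the code stops running partway through my list of txt files. I have 54 files, and only 30 of them were processed before this error was returned: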

IOError: Initializing from file failed

The code I am using is:

import pandas as pd
import os

d={'Band 1$': '1984137',
    'Band 2$': '1984185',
    'Band 3$': '1984249',
    'Band 4$': '1985139',
    'Band 5$': '1985171',
    'Band 6$': '1986206',
    'Band 7$': '1986238',
    'Band 8$': '1987241',
    'Band 9$': '1987257',
    'Band 10$': '1987273',
    'Band 11$': '1988212'}



pth = r'D:\Sheyenne\Statistics\NDVI_allotment\Text' # path to files
new = os.path.join(pth,"new") 
os.mkdir(new)  # create new dir for new files
# loop over each file and update
for f in os.listdir(pth):
     if not os.path.isfile(os.path.join(pth,f)):
          df = pd.read_csv(os.path.join(pth, f), sep='\t', nrows=80,    skiprows=2)
          #replace string names
          df=df.replace(d)
          #sort data
          df.sort(columns='Basic Stats', axis=0, ascending=True, inplace=True)
          #save data to csv
          df.to_csv(os.path.join(new, "new_{}".format(f)), index=False, sep="\t")

print 'Done Processing'

An example of the first 1000 characters of one of the text files:

'Filename: F:\\Sheyenne\\Atmospherically Corrected Landsat\\Indices\\Main\\NDVI\\NDVI_stack\nROI: EVF: Layer: Main_allotments.shp (allotment1=A. Annex) [White] 3984 points\n\nBasic Stats\t      Min\t     Max\t    Mean\t   Stdev\t  Num\tEigenvalue\n     Band 1\t 0.428944\t0.843916\t0.689923\t0.052534\t    1\t  0.229509\n     Band 2\t-0.000000\t0.689320\t0.513170\t0.048885\t    2\t  0.119217\n     Band 3\t 0.336438\t0.743478\t0.592622\t0.052544\t    3\t  0.059111\n     Band 4\t 0.313259\t0.678561\t0.525667\t0.048047\t    4\t  0.051338\n     Band 5\t 0.374522\t0.746828\t0.583513\t0.055989\t    5\t  0.027913\n     Band 6\t-0.000000\t0.749325\t0.330068\t0.314351\t    6\t  0.022561\n     Band 7\t-0.000000\t0.819288\t0.600136\t0.170060\t    7\t  0.018126\n     Band 8\t-0.000000\t0.687823\t0.450559\t0.084678\t    8\t  0.012942\n     Band 9\t 0.332637\t0.776398\t0.549870\t0.085212\t    9\t  0.009261\n    Band 10\t 0.386589\t0.848977\t0.635024\t0.087712\t   10\t  0.006628\n    Band 11\t 0.265165\t0.822361\t0.594286\t0.075730\t   11\t  0.004517\n    Band 12\t 0.191882\t0.539559\t0.343836\t0.0'

Edit:

The full error returned is:

runfile('F:/docs/ESSP 502/Final Project/Codes/try2.py', wdir='F:/docs/ESSP 502/Final Project/Codes')
Traceback (most recent call last):

  File "<ipython-input-7-95e6eea0c3e4>", line 1, in <module>
    runfile('F:/docs/ESSP 502/Final Project/Codes/try2.py', wdir='F:/docs/ESSP 502/Final Project/Codes')

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "F:/docs/ESSP 502/Final Project/Codes/try2.py", line 18, in <module>
    df = pd.read_csv(os.path.join(pth, f), sep='\t', nrows=80, skiprows=2)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\io\parsers.py", line 474, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\io\parsers.py", line 250, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\io\parsers.py", line 566, in __init__
    self._make_engine(self.engine)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\io\parsers.py", line 705, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1072, in __init__
    self._reader = _parser.TextReader(src, **kwds)

  File "pandas\parser.pyx", line 350, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:3173)

  File "pandas\parser.pyx", line 595, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:5926)

IOError: Initializing from file failed

1 answer:

Answer 0 (score: 1)

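The error is probably caused by the output directory being included in the loop and passed to read_csv as if it were a CSV file. os.listdir returns every entry in the folder, including the "new" subdirectory created inside it. Skip directories by adding a check to the for loop that each entry is actually a file: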

for f in os.listdir(pth):
    if not os.path.isfile(os.path.join(pth,f)):
        continue

    df = pd.read_csv(os.path.join(pth, f), sep='\t', nrows=80, skiprows=2)
    # ...
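
The same idea can be pulled out into two small helpers. This is only a sketch, not part of the original answer: the names list_regular_files and ensure_output_dir are hypothetical, and it assumes the output folder lives inside the input folder, as in the question. It uses only the standard library, so the file filtering can be checked separately from the pandas step.

```python
import os


def list_regular_files(folder):
    """Return the sorted names of entries in `folder` that are regular files.

    Subdirectories (such as an output folder created inside `folder`)
    are left out, so they are never handed to a file parser.
    """
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )


def ensure_output_dir(folder, name="new"):
    """Create the output subdirectory if it is missing and return its path.

    Unlike a bare os.mkdir call, this does not raise when the script
    is run a second time and the directory already exists.
    """
    out_dir = os.path.join(folder, name)
    if not os.path.isdir(out_dir):
        os.mkdir(out_dir)
    return out_dir
```

With these in place, the loop from the question could start with new = ensure_output_dir(pth) and then iterate with for f in list_regular_files(pth):, leaving the read_csv, replace, and to_csv calls unchanged.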