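Reading file names from a text file in Python (double backslash problem)

Asked: 2013-12-10 08:17:34

Tags: python numpy

I am trying to read a list of files from a text file. I use the following code to do this: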

import os
import numpy as np

filelist = input("Please Enter the filelist: ")
flist = open(os.path.normpath(filelist), "r")
fname = []
for curline in flist:
    # check if it's a comment - do comment parsing in this if block
    if curline.startswith('#'):
        continue
    fname.append(os.path.normpath(curline))
flist.close()  # close the list file

# read the slave files 100MB at a time to generate stokes vectors
tmp = fname[0].rstrip()
t = np.fromfile(tmp, dtype='float', count=100*1000)

This runs fine, and I get the following list:

'H:\\Shaunak\\TerraSAR_X- Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\i_HH_mst_08Oct2012.bin\n'
'H:\\Shaunak\\TerraSAR_X- Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\i_HH_mst_08Oct2012.bin\n'
'H:\\Shaunak\\TerraSAR_X- Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\q_HH_slv3_08Oct2012.bin\n'
'H:\\Shaunak\\TerraSAR_X- Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\q_VV_slv3_08Oct2012.bin'

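The problem is that the '\' characters are escaped and the strings have a trailing '\n'. I use str.rstrip() to get rid of the '\n', which works, but that still leaves the problem of the two backslashes.

I have tried the following to get rid of them:

  1. Using codecs.unicode_escape_decode(), but I get this error: UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 56-57: malformed \N character escape. Clearly this is not the right approach, because I only want to decode the backslashes, not the rest of the string.

  2. This does not work either: tmp = fname[0].rstrip().replace(r'\\','\\');

  3. Is there a way to make readline() read a raw string?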
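On point 3: raw strings are a feature of Python source literals only. Text read from a file is never escape-processed, so each backslash on disk arrives as a single character. A minimal sketch to check this, assuming a throwaway temporary list file and a made-up path (`H:\data\NEST\i_HH.bin` is hypothetical, not one of the files above):

```python
import os
import tempfile

# Hypothetical path, for illustration only. The raw literal makes the
# source code itself contain single backslashes.
expected = r"H:\data\NEST\i_HH.bin"

# Write a small list file: one comment line and one path line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("# comment line\n")
    f.write(expected + "\n")
    list_path = f.name

try:
    with open(list_path, "r") as flist:
        # Strip the trailing newline and skip blank and comment lines.
        names = [line.strip() for line in flist
                 if line.strip() and not line.startswith("#")]
finally:
    os.remove(list_path)

first = names[0]
print(first)        # printed form: single backslashes
print(repr(first))  # repr form: backslashes doubled for display only

same_as_raw_literal = (first == expected)
backslash_count = first.count("\\")
```

If `same_as_raw_literal` is true, the string read from the file is character-for-character the same as the raw literal, so no extra unescaping step is needed after reading.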


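    Update:

    Basically, I have a text file containing 4 file names, and I want to open those files and read data from them in Python. The text file contains: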

    H:\Shaunak\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\NEST_oregistration\Glacier_coreg_Cnv\i_HH_mst_08Oct2012.bin
    H:\Shaunak\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\NEST_oregistration\Glacier_coreg_Cnv\i_HH_mst_08Oct2012.bin
    H:\Shaunak\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\NEST_oregistration\Glacier_coreg_Cnv\q_HH_slv3_08Oct2012.bin
    H:\Shaunak\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\NEST_oregistration\Glacier_coreg_Cnv\q_VV_slv3_08Oct2012.bin 
    

    I want to open each file one by one and read 100MB of data from it. When I use this command: np.fromfile(flist[0],dtype='float',count=100) I get this error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    FileNotFoundError: [Errno 2] No such file or directory: 'H:\\Shaunak\\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\i_HH_mst_08Oct2012.bin'
    

    Update

    Full traceback:

    Please Enter the filelist: H:/Shaunak/TerraSAR_X- Sep2012-Glacier_Velocity_Gangotri/NEST_oregistration/Glacier_coreg_Cnv/filelist.txt
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "G:\WinPython-32bit-3.3.2.3\python-3.3.2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 581, in runfile
        execfile(filename, namespace)
      File "G:\WinPython-32bit-3.3.2.3\python-3.3.2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 41, in execfile
        exec(compile(open(filename).read(), filename, 'exec'), namespace)
      File "H:/Shaunak/Programs/Arnab_glacier_vel/Stokes_generation_2.py", line 28, in <module>
        t = np.fromfile(tmp,dtype='float',count=100*1000)
    FileNotFoundError: [Errno 2] No such file or directory: 'H:\\Shaunak\\TerraSAR_X-Sep2012-Glacier_Velocity_Gangotri\\NEST_oregistration\\Glacier_coreg_Cnv\\i_HH_mst_08Oct2012.bin'
    >>> 
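The doubled backslashes in the error message are only how Python displays the string, so what needs checking is whether the path itself exists. A hedged sketch of such a check follows; `find_missing` is a hypothetical helper (not part of the original script) that strips each entry and reports the ones the operating system cannot find. The file names in the demo are made up.

```python
import os
import tempfile

def find_missing(paths):
    """Return the cleaned entries of `paths` that do not name an existing file.

    Hypothetical helper, for illustration only.
    """
    missing = []
    for raw in paths:
        # Remove the trailing newline and surrounding spaces, then normalize.
        cleaned = os.path.normpath(raw.strip())
        if not os.path.isfile(cleaned):
            missing.append(cleaned)
    return missing

# Demo with one file that exists and one that does not (both names made up).
with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "exists.bin")
    with open(real, "wb") as f:
        f.write(b"\x00" * 8)
    entries = [real + "\n", os.path.join(d, "no_such_file.bin") + "\n"]
    missing = find_missing(entries)
    for name in missing:
        # print() shows the path with single backslashes on Windows.
        print("not found:", name)
```

Comparing each reported name character by character with the real directory listing is a reasonable next step, since a stray space or one differing character in a folder name is enough to trigger FileNotFoundError.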
    

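2 answers:

Answer 0 (score: 0)

As @volcano said, the doubled backslash is only how Python displays the string (its repr); the string itself contains a single backslash. If you print the string, the extra backslash disappears, and if you write it to a file there is only one '\'.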

>>> string_with_double_backslash = "Here is a double backslash: \\"
>>> print(string_with_double_backslash)
Here is a double backslash: \
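The same point can be checked by counting characters. A small sketch, assuming a shortened path (`file.bin` is a made-up name) instead of the full one from the question:

```python
# In a normal (non-raw) literal, each "\\" in the source is ONE backslash
# in the resulting string.
path = "H:\\Shaunak\\file.bin"

shown_by_print = str(path)   # what print() displays
shown_by_repr = repr(path)   # what the prompt and error messages display

print(shown_by_print)
print(shown_by_repr)
print(len(path))             # length counts each backslash once

single_count = path.count("\\")           # backslashes actually stored
doubled_count = shown_by_repr.count("\\")  # backslashes in the repr text
```

Here `single_count` is 2 while `doubled_count` is 4: the doubling exists only in the repr text, which is what FileNotFoundError prints, not in the string passed to the operating system.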

Answer 1 (score: 0)

Try this:

import codecs

a_escaped = 'attachment; filename="Nuovo Cinema Paradiso 1988 Director\\\'s Cut"'
# decode the backslash escapes contained in the string itself
a_unescaped = codecs.getdecoder("unicode_escape")(a_escaped)[0]

This yields:

'attachment; filename="Nuovo Cinema Paradiso 1988 Director\'s Cut"'
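One caveat when applying this to Windows paths: the unicode_escape codec treats every backslash sequence as an escape, which is what produced the "malformed \N character escape" error in the question. A minimal sketch, using made-up paths, of the two ways this can go wrong:

```python
import codecs

# Hypothetical path containing "\N", which the codec reads as the start
# of a named-character escape such as \N{BULLET}.
windows_path = r"H:\NEST_oregistration\new_file.bin"

try:
    decoded = codecs.decode(windows_path, "unicode_escape")
    failed = False
except UnicodeDecodeError as exc:
    decoded = None
    failed = True
    print("decode failed:", exc)

# Even without "\N", a path can be silently altered: "\n" in "\new_file"
# is decoded into a newline character.
altered = codecs.decode(r"C:\new_file.bin", "unicode_escape")
contains_newline = "\n" in altered
```

So this decoder suits strings that really contain escape sequences, such as the header value above, but not file paths read from a text file, which need no decoding at all.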