Python: using glob.glob to support multiple file types

Asked: 2016-08-02 14:24:17

Tags: python python-2.7 glob

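I am trying to use glob.glob to support more than one file type. My code is supposed to pick up files with the extensions '.pdf', '.xls' and '.xlsx' that reside in the directory '/mnt/Test' and, once a matching file is found, run the code below.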

When I replace the existing for loop with

for filename in glob.glob("*.xlsx"):
     print filename

it works fine.

But when I try to run the following code:

def main():
    os.chdir("/mnt/Test")
    extensions = ("*.xls", ".xlsx", ".pdf")
    filename = []
    for files in extensions:
        filename.extend(glob.glob(files))
        print filename
        sys.stdout.flush()
        doc_id, version = doc_placeholder(filename)

        print 'doc_id:', doc_id, 'version:', version

        workspace_upload(doc_id, version, filename)

        print "%s has been found. Preparing next phase..." % filename
        ftp_connection.cwd(remote_path)
        fh = open(filename, 'rb')
        ftp_connection.storbinary('STOR %s' % timestr + '_' + filename, fh)
        fh.close()

        send_email(filename)

I get the following error:

Report /mnt/Test/fileTest.xlsx has been added.
[]
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/dist-packages/watchdog/observers/api.py", line 199, in run
self.dispatch_events(self.event_queue, self.timeout)
File "/usr/local/lib/python2.7/dist-packages/watchdog/observers/api.py", line 368, in dispatch_events
handler.dispatch(event)
File "/usr/local/lib/python2.7/dist-packages/watchdog/events.py", line 330, in dispatch
_method_map[event_type](event)
File "observe.py", line 14, in on_created
fero.main()
File "/home/tesuser/project-a/testing.py", line 129, in main
doc_id, version = doc_placeholder(filename)
File "/home/testuser/project-a/testing.py", line 58, in doc_placeholder
payload = {'documents':[{'document':{'name':os.path.splitext(filename)[0],'parentId':parent_id()}}]}
File "/usr/lib/python2.7/posixpath.py", line 105, in splitext
return genericpath._splitext(p, sep, altsep, extsep)
File "/usr/lib/python2.7/genericpath.py", line 91, in _splitext
sepIndex = p.rfind(sep)
AttributeError: 'list' object has no attribute 'rfind'

How can I edit the code above to achieve what I need?

Thanks in advance. I appreciate the help.

1 Answer:

Answer 0: (score: 0)

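doc_placeholder contains the snippet os.path.splitext(filename). The filename you pass in is a list, so you are handing os.path.splitext a list where it expects a string. That is what the last line of the traceback is reporting.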
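To make the failure concrete, here is a minimal sketch of what os.path.splitext does with a string versus a list. It is written for Python 3 so it can be run as-is; the file name is just the one from the traceback, and the `failed` variable is only there for illustration. On Python 2.7 the list case raises AttributeError (as in the question), while Python 3 raises TypeError, so the sketch catches both.

```python
import os.path

# A single path string is what os.path.splitext expects.
root, ext = os.path.splitext("fileTest.xlsx")
print(root, ext)

# Passing a list fails: AttributeError on Python 2.7, TypeError on Python 3.
try:
    os.path.splitext(["fileTest.xlsx"])
    failed = False
except (AttributeError, TypeError) as exc:
    failed = True
    print("splitext rejected the list:", type(exc).__name__)
```
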

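Fix this by iterating over the filenames one at a time instead of trying to process the whole list at once. Note also that every pattern needs its leading *: a pattern such as ".xlsx" only matches a file literally named ".xlsx".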

import glob
import os
import sys

# doc_placeholder, workspace_upload, send_email, ftp_connection,
# remote_path and timestr are defined elsewhere in the asker's script.

def main():
    os.chdir("/mnt/Test")
    extensions = ("*.xls", "*.xlsx", "*.pdf")
    filenames = []  # made 'filename' plural to indicate it's a list

    # build the list of filenames in its own loop
    for pattern in extensions:
        filenames.extend(glob.glob(pattern))

    # process each filename individually
    for filename in filenames:
        print filename
        sys.stdout.flush()
        doc_id, version = doc_placeholder(filename)

        print 'doc_id:', doc_id, 'version:', version

        workspace_upload(doc_id, version, filename)

        print "%s has been found. Preparing next phase..." % filename
        ftp_connection.cwd(remote_path)
        # 'with' closes the file even if the upload raises
        with open(filename, 'rb') as fh:
            ftp_connection.storbinary('STOR %s_%s' % (timestr, filename), fh)

        send_email(filename)
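
The collection step can also be tried in isolation. The following is a rough sketch, not the asker's code: find_files and demo are hypothetical names, it is written for Python 3, and it builds a throwaway temporary directory instead of touching /mnt/Test. It gathers matches for several patterns into one list and also shows that patterns without a leading * match nothing here.

```python
import glob
import os
import tempfile


def find_files(directory, patterns):
    """Return a sorted list of paths in `directory` matching any pattern."""
    matches = []
    for pattern in patterns:
        # Joining the directory into the pattern avoids calling os.chdir.
        matches.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(matches)


def demo():
    # Build a throwaway directory with a few empty files.
    workdir = tempfile.mkdtemp()
    for name in ("a.xls", "b.xlsx", "c.pdf", "d.txt"):
        open(os.path.join(workdir, name), "w").close()

    found = find_files(workdir, ("*.xls", "*.xlsx", "*.pdf"))
    # Without the wildcard, only files literally named ".xlsx" or ".pdf" match.
    no_wildcard = find_files(workdir, (".xlsx", ".pdf"))
    return [os.path.basename(p) for p in found], no_wildcard


if __name__ == "__main__":
    names, no_wildcard = demo()
    print(names)
    print(no_wildcard)
```

Each element of the returned list is a single string, so it can be passed to a function like doc_placeholder one at a time. Building the full path into the pattern is a design choice: it keeps the process's working directory unchanged, which may matter when the code is called from a watcher thread.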