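How to open every file in a folder?

Time: 2013-08-15 21:36:23

Tags: python file pipe stdout stdin

I have a Python script, parse.py, which opens a file, say file1, and then does something with it, such as printing the total number of characters.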

filename = 'file1'
f = open(filename, 'r')
content = f.read()
print filename, len(content)

Right now I am using stdout to redirect the result to my output file, named output:

python parse.py >> output

However, I don't want to do this manually, file by file. Is there a way to take care of every file automatically? Something like

ls | awk '{print}' | python parse.py >> output 

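Then the problem is how to read the file names from standard input. Or are there already some built-in functions that easily do what ls does?

Thanks!

6 Answers:

Answer 0: (score: 281)

You can list all files in the current directory using os.listdir: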

import os

for filename in os.listdir(os.getcwd()):
    # do your stuff
    pass
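
As a minimal sketch of what "do your stuff" could look like for the task in the question (printing each file name with its character count), assuming the files can be read as text. The function name `count_chars_in_dir` is made up for this illustration. Note that os.listdir returns bare names and also includes subdirectories, so the sketch joins each name with the directory and skips anything that is not a regular file.

```python
import os


def count_chars_in_dir(directory):
    """Return a list of (filename, character_count) for regular files in directory."""
    results = []
    for filename in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, filename)
        # os.listdir also returns subdirectories; skip anything that is not a file
        if not os.path.isfile(full_path):
            continue
        # errors="replace" is an assumption so that undecodable bytes do not raise
        with open(full_path, "r", errors="replace") as f:
            results.append((filename, len(f.read())))
    return results
```

Calling `count_chars_in_dir(os.getcwd())` and printing each pair would give roughly the output the question's parse.py produces, once per file.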

Or you can list only some of the files, depending on a file pattern, using the glob module:

import glob

for filename in glob.glob('*.txt'):
    # do your stuff
    pass

It doesn't have to be the current directory; you can list the files in any path you want:

import glob
import os

path = '/some/path/to/file'

for filename in os.listdir(path):
    # do your stuff
    pass

for filename in glob.glob(os.path.join(path, '*.txt')):
    # do your stuff
    pass
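
One difference between the two loops above is easy to miss: os.listdir yields bare names, while glob.glob with a joined pattern yields paths that already include the directory. The sketch below makes that visible; the helper names `names_via_listdir` and `paths_via_glob` are hypothetical and not part of either module.

```python
import glob
import os


def names_via_listdir(path):
    """Bare entry names; join each with path before opening it."""
    return sorted(os.listdir(path))


def paths_via_glob(path, pattern="*.txt"):
    """Matching paths, each already prefixed with the directory."""
    return sorted(glob.glob(os.path.join(path, pattern)))
```

So a name from `names_via_listdir(path)` has to go through `os.path.join(path, name)` before `open`, whereas a result of `paths_via_glob(path)` can be passed to `open` directly.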

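Or you can even use the pipe, as you specified, with fileinput:

import fileinput

for line in fileinput.input():
    # each line read from standard input is one file name
    filename = line.rstrip('\n')
    # do your stuff

And then use it with piping: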

ls -1 | python parse.py
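
To connect this back to the question, here is a sketch of how parse.py could treat each line arriving on standard input as a file name, assuming one name per line as produced by `ls -1`. The helper name `char_counts` is hypothetical. Lines that name directories or missing files are skipped.

```python
import os


def char_counts(lines):
    """Yield (filename, character_count) for each file named in lines."""
    for line in lines:
        filename = line.rstrip("\n")  # ls -1 emits one name per line
        if not os.path.isfile(filename):
            continue  # skip directories, blank lines and missing files
        # errors="replace" is an assumption so that undecodable bytes do not raise
        with open(filename, "r", errors="replace") as f:
            yield filename, len(f.read())


# In parse.py, the generator would be driven by standard input (after import sys):
#     for filename, count in char_counts(sys.stdin):
#         print(filename, count)
```

The generator takes any iterable of lines, so the same function works with sys.stdin, fileinput.input(), or a plain list of names.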

Answer 1: (score: 28)

You should try using os.walk:

import os

yourpath = 'path'

for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        print(os.path.join(root, name))
        # do your stuff with each file
    for name in dirs:
        print(os.path.join(root, name))
        # do your stuff with each directory
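
Because os.walk yields a (root, dirs, files) tuple for every directory under the starting path, the same loop can gather every file in a whole tree. A sketch, using the hypothetical helper `all_files`; it relies on the default top-down order because the visiting order does not matter here.

```python
import os


def all_files(top):
    """Return the paths of all files under top, including subdirectories."""
    paths = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # names in files are relative to root, so join them
            paths.append(os.path.join(root, name))
    return sorted(paths)
```

Each returned path can be opened directly, since it already contains the directory it was found in.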

Answer 2: (score: 7)

You can actually just use the os module to do both:

  1. List all files in a folder
  2. Sort files by file type, file name, etc.

Here is a simple example:

    import os

    location = os.getcwd()   # directory to scan (the current working directory)
    counter = 0              # count of all entries found
    csvfiles = []            # all csv files found at location
    filebeginwithhello = []  # non-csv files whose names begin with 'hello'
    otherfiles = []          # anything that matches neither criterion

    for file in os.listdir(location):
        if file.endswith(".csv"):
            # this branch also covers files that start with 'hello' and end with '.csv'
            print("csv file found:\t", file)
            csvfiles.append(file)
        elif file.startswith("hello"):
            print("hello file found:\t", file)
            filebeginwithhello.append(file)
        else:
            otherfiles.append(file)
        counter += 1

    print("Total files found:\t", counter)
    

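Now you have not only listed all the files in the folder but also (optionally) sorted them by starting name, file type, and so on. Then just iterate over each list and do your stuff.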
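
If more categories are needed than a few hand-written lists, one possible generalization is to group the names by extension in a dictionary. This is only a sketch and not part of the answer; `group_by_extension` is a hypothetical name, and like os.listdir itself it does not distinguish files from subdirectories.

```python
import os
from collections import defaultdict


def group_by_extension(location):
    """Map each extension (e.g. '.csv') to the sorted entry names that have it."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(location)):
        # os.path.splitext splits 'data.csv' into ('data', '.csv');
        # a name without an extension gives an empty string
        extension = os.path.splitext(name)[1]
        groups[extension].append(name)
    return dict(groups)
```

Iterating over `group_by_extension(os.getcwd()).items()` then gives one list per extension to process.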

Answer 3: (score: 6)

This is the answer I was looking for:

import glob
import os

folder_path = '/some/path/to/file'
for filename in glob.glob(os.path.join(folder_path, '*.htm')):
    with open(filename, 'r') as f:
        text = f.read()
        print(filename)
        print(len(text))

You can also choose '*.txt' or any other file name ending.

Answer 4: (score: 1)

import pyautogui
import keyboard
import time
import os
import pyperclip

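os.chdir("target directory")

# get the current directory
cwd = os.getcwd()

files = []

# walk the tree and collect the full path of every file
for root, dirs, names in os.walk(cwd):
    for name in names:
        # join with root; os.path.abspath(name) would be wrong for files in subdirectories
        files.append(os.path.join(root, name))

# raw string so the backslashes in the Windows path are not treated as escapes
os.startfile(r"C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe")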
time.sleep(1)


for i in files:
    print(i)
    pyperclip.copy(i)
    keyboard.press('ctrl')
    keyboard.press_and_release('o')
    keyboard.release('ctrl')
    time.sleep(1)

    keyboard.press('ctrl')
    keyboard.press_and_release('v')
    keyboard.release('ctrl')
    time.sleep(1)
    keyboard.press_and_release('enter')
    keyboard.press('ctrl')
    keyboard.press_and_release('p')
    keyboard.release('ctrl')
    keyboard.press_and_release('enter')
    time.sleep(3)
    keyboard.press('ctrl')
    keyboard.press_and_release('w')
    keyboard.release('ctrl')
    pyperclip.copy('')

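Answer 5: (score: 0)

The following code reads all the text files available in the current directory (the directory the script is run from). It opens each text file and stores the words from its lines in a list. After storing the words, we print them one per line.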

import os, fnmatch

listOfFiles = os.listdir('.')
pattern = "*.txt"
store = []
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        # the with block closes the file once it has been read
        with open(entry, "r") as textFile:
            content = textFile.read()
        # split() with no argument splits on any whitespace, so newlines
        # are never stored as words or left attached to them
        store.extend(content.split())

for i in store:
    print(i)