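I have a folder named documents that contains 3,000 text files and two subdirectories, which themselves contain thousands of text files.
I am trying to write code that searches the contents of the files in that directory and its subdirectories.
For example: I want a Python script to search all of the text files for a string and, if it is found, output the path, the text file name, and the string.
The code I have so far is: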
import os
import glob

# Raw string so the backslashes in the Windows path are not treated as escapes
os.chdir(r"C:\Users\Dawn Philip\Documents\documents")
for files in glob.glob("*.txt"):
    with open(files, 'r') as f:
        file_contents = f.read()
    if "x" in file_contents:
        print(f.name)
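When I run it, it shows me the names of all the text files that contain "x", but I need it to search the text files for a string and output the path of each file that contains the string.
My question is: how do I get the code to search the contents of the text files for a string and print "String Found> Path C:/ X / Y / Z"?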
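Answer 0 (score: 0)
At least for me, glob.glob() only searched the top-level directory. The code below therefore builds a list of the main folder plus its immediate subfolders and runs glob on each of them.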
import os
import glob

# Sets the main directory
main_path = "C:\\Users\\Dawn Philip\\Documents\\documents"
# Gets a list of everything in the main directory, including folders
main_directory = os.listdir(main_path)
# This list will hold all of the folders to search through, including the main folder
sub_directories = []
# Adds the main folder to the list of folders
sub_directories.append(main_path)
# Loops through everything in the main folder, searching for subfolders
for item in main_directory:
    # Creates the full path to each item
    item_path = os.path.join(main_path, item)
    # Checks each item to see if it is a directory
    if os.path.isdir(item_path):
        # If it is a folder, it is added to the list
        sub_directories.append(item_path)
# Searches the .txt files in the main folder and in each subfolder one level down
for directory in sub_directories:
    for files in glob.glob(os.path.join(directory, "*.txt")):
        with open(files, 'r') as f:
            file_contents = f.read()
        if "x" in file_contents:
            # f.name is the full path, because glob was given a full path
            print(f.name)
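The code above only looks one level below the main folder. If the subfolders may be nested deeper, or if you want output closer to the "String Found> Path ..." format from the question, something like the following sketch might work. It assumes Python 3 and swaps the listdir/glob approach for os.walk; the function name find_string and its parameters are made up for illustration, and "x" is just a stand-in for the real search string.

```python
import fnmatch
import os


def find_string(root, needle, pattern="*.txt"):
    """Yield the full path of every matching file under root that contains needle."""
    # os.walk visits root and every directory below it, at any depth
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, name)
            # errors="ignore" drops undecodable bytes instead of raising
            with open(path, "r", errors="ignore") as f:
                if needle in f.read():
                    yield path


if __name__ == "__main__":
    # Same folder as in the question; "x" stands in for the real search string
    root = r"C:\Users\Dawn Philip\Documents\documents"
    for path in find_string(root, "x"):
        print("String Found> {} Path {}".format("x", path))
```

Each file is read fully into memory, the same as in the code above, which should be fine for ordinary text files. For very large files, checking line by line would be a safer choice.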