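I need some input on how to do this, and I would really appreciate your help. I looked at other posts, but none of them matched my requirement.
How to remove line from the file in python
Remove lines from textfile with python
I need to match a multi-line comment in a file based on a supplied input string.
Example:
Suppose the file "test.txt" contains a multi-line comment. If inputstring = "This is a test, script written", the comment containing that text needs to be deleted from the file.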
import os
import sys
import re
import fnmatch

def find_and_remove(haystack, needle):
    pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
    return re.sub(pattern, "", haystack)

for path,dirs,files in os.walk(sys.argv[1]):
    for fname in files:
        for pat in ['*.cpp','*.c','*.h','*.txt']:
            if fnmatch.fnmatch(fname,pat):
                fullname = os.path.join(path,fname)
                with open(fullname, "r") as f:
                    find_and_remove(f, r"This is a test, script written")
Error:
Traceback (most recent call last):
  File "comment.py", line 16, in <module>
    find_and_remove(f, r"This is a test, script written")
  File "comment.py", line 8, in find_and_remove
    return re.sub(pattern, "", haystack)
  File "/usr/lib/python2.6/re.py", line 151, in sub
    return _compile(pattern, 0).sub(repl, string, count)
TypeError: expected string or buffer
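Answer 0 (score: 3)
When I saw this question, the first thing that came to mind was "state machine", and whenever I think "state machine" in Python, the first thing that comes to mind is "generator", a.k.a. yield: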
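import sys

def skip_comments(f):
    """
    Emit all the lines that are not part of a multi-line comment.
    """
    is_comment = False
    for line in f:
        if line.strip().startswith('/*'):
            is_comment = True
        # only a line inside a comment can close it, so code lines that
        # merely end with a trailing /* ... */ are still emitted
        if is_comment and line.strip().endswith('*/'):
            is_comment = False
        elif is_comment:
            pass
        else:
            yield line

def print_file(file_name):
    # open() and sys.stdout.write() behave the same on Python 2 and 3
    with open(file_name, 'r') as f:
        skipper = skip_comments(f)
        for line in skipper:
            sys.stdout.write(line)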
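Edit: user1927396 upped the ante by specifying that only a specific block (one containing specific text) should be excluded. Since that text sits inside the comment block, we don't know up front whether the block has to be rejected.
My first thought was buffering. Ack. Pooh. My second thought was a haunting refrain that I have been carrying around in my head for 15 years and had never used until now: "stacked state machines"...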
import sys

def squelch_comment(f, first_line, exclude_if):
    """
    Comment is a multi-line comment that we may want to suppress
    """
    # None means "suppress this comment"; the opening line is checked too
    comment = None if exclude_if in first_line else [first_line]
    if not first_line.strip().endswith('*/'):
        for line in f:
            if exclude_if in line:
                comment = None
            if comment is not None:
                comment.append(line)
            if line.strip().endswith('*/'):
                break
    if comment:
        for comment_line in comment:
            # retained comment lines are marked with a '...' prefix
            yield '...' + comment_line

def skip_comments(f):
    """
    Emit all the lines that are not part of a multi-line comment.
    """
    for line in f:
        if line.strip().startswith('/*'):
            # hand off to the nested, comment-handling, state machine
            for comment_line in squelch_comment(f, line, 'This is a test'):
                yield comment_line
        else:
            yield line

def print_file(file_name):
    # open() and sys.stdout.write() behave the same on Python 2 and 3
    with open(file_name, 'r') as f:
        for line in skip_comments(f):
            sys.stdout.write(line)
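For comparison, the plain buffering idea mentioned above can be sketched as follows. This is only a minimal Python 3 sketch, not the answer's code: the name drop_matching_comments is made up for illustration, and it assumes (like the generators above) that a comment opens at the start of a stripped line and closes at the end of one.

```python
import io

def drop_matching_comments(lines, needle):
    """Yield lines, omitting every /* ... */ block that contains needle."""
    buffered = None  # None outside a comment, else the comment lines seen so far
    for line in lines:
        stripped = line.strip()
        if buffered is None:
            if not stripped.startswith('/*'):
                yield line          # ordinary line, pass it through
                continue
            buffered = [line]       # a comment starts here
        else:
            buffered.append(line)   # still inside the comment
        if stripped.endswith('*/'):
            # comment is complete: emit it only if the needle is absent
            if not any(needle in part for part in buffered):
                for part in buffered:
                    yield part
            buffered = None
    if buffered is not None:
        # unterminated comment at end of input: keep what was read
        for part in buffered:
            yield part

sample = io.StringIO(
    "/*\n * This is a test, script written\n */\n"
    "some code\n"
    "/*\n * keep me\n */\n"
    "more code\n"
)
print(''.join(drop_matching_comments(sample, 'This is a test')), end='')
```

The whole comment is held in memory until its closing line is seen, which is the cost of buffering; for ordinary source files that is unlikely to matter.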
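Answer 1 (score: 1)
This should work in principle:
def skip(file, lines):
    cline = 0
    result = ""
    # iterate over the file line by line (file.read() would yield single characters)
    for fileLine in file:
        if cline not in lines:
            result += fileLine
        cline += 1
    return result
lines must be a list of line numbers (counted from 0), and file must be an open file.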
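A possible way to call something like this is sketched below, assuming Python 3. The name skip_lines and the sample text are made up for illustration; the sketch uses enumerate() for the counter and a set for the lookups, and io.StringIO stands in for an open file.

```python
import io

def skip_lines(lines, skip_indices):
    """Return the text of lines, leaving out the 0-based indices in skip_indices."""
    skip_indices = set(skip_indices)  # set membership tests are O(1)
    return ''.join(line for index, line in enumerate(lines)
                   if index not in skip_indices)

sample = io.StringIO("first\nsecond\nthird\n")
print(skip_lines(sample, [1]), end='')
```

The caller still has to work out which line numbers belong to the comment, so this only covers the removal step.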
Answer 2 (score: 1)
This one does what was asked: it removes every multi-line comment that contains the given string.
Put this in a file named program.txt:
/*
* This is a test, script written
* This is a comment line
* Multi-line comment
* Last comment
*
*/
some code
/*
* This is a comment line
* And should
* not be removed
*
*/
more code
Then search and replace. Just make sure needle does not introduce any regex special characters.
import re

def find_and_remove(haystack, needle):
    # raw strings on both sides keep the backslashes intact
    pattern = re.compile(r'/\*.*?' + needle + r'.*?\*/', re.DOTALL)
    return re.sub(pattern, "", haystack)

# assuming your program is in a file called program.txt
program = open("program.txt", "r").read()
print(find_and_remove(program, r"This is a test, script written"))
Result:
some code
/*
* This is a comment line
* And should
* not be removed
*
*/
more code
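One caveat with the pattern above: with re.DOTALL, the lazy .*? may begin at an earlier comment and run across its closing */ when the needle only appears in a later comment, which would also delete the code in between. A minimal Python 3 sketch of one way around that is to match each comment on its own and decide in a replacement callback. The name remove_comments_containing is made up for illustration, and the sketch assumes /* never occurs inside a string literal.

```python
import re

# matches exactly one /* ... */ comment, since .*? stops at the first */
COMMENT_RE = re.compile(r'/\*.*?\*/', re.DOTALL)

def remove_comments_containing(source, needle):
    """Delete every /* ... */ comment whose text contains needle literally."""
    def replace(match):
        comment = match.group(0)
        # plain substring test, so needle needs no regex escaping
        return '' if needle in comment else comment
    return COMMENT_RE.sub(replace, source)

source = "/* keep */\ncode\n/* drop (me) */\nmore\n"
print(remove_comments_containing(source, "drop (me)"), end='')
```

Because the needle is compared with the in operator rather than spliced into the pattern, parentheses and other special characters can be passed as they are.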
Edit: the last part of your code then becomes:
for path,dirs,files in os.walk(sys.argv[1]):
    for fname in files:
        for pat in ['*.cpp','*.c','*.h','*.txt']:
            if fnmatch.fnmatch(fname,pat):
                fullname = os.path.join(path,fname)
                # put all the text into f and read and replace...
                f = open(fullname).read()
                result = find_and_remove(f, r"This is a test, script written")
                new_name = fullname + ".new"
                # After testing, then replace newname with fullname in the
                # next line in order to replace the original file.
                handle = open(new_name, 'w')
                handle.write(result)
                handle.close()
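Make sure that all regex special characters in needle are escaped, for example (). If your text contains parentheses, such as (any text), they should appear in needle as \(any text\)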
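If needle is always meant as literal text, the escaping can probably be left to re.escape from the standard library instead of being done by hand. A minimal Python 3 sketch follows; the name find_and_remove_literal is made up for illustration, and with this variant needle must be passed unescaped, otherwise the backslashes would be escaped a second time.

```python
import re

def find_and_remove_literal(haystack, needle):
    # re.escape() backslash-escapes regex metacharacters such as ( and ),
    # so needle is matched character for character
    pattern = re.compile(r'/\*.*?' + re.escape(needle) + r'.*?\*/', re.DOTALL)
    return pattern.sub("", haystack)

text = "/* note (any text) here */\nsome code\n"
print(find_and_remove_literal(text, "(any text)"), end='')
```

Without escaping, (any text) is read as a capture group and also matches "any text" without the parentheses, which is why the unescaped form can remove comments it should not.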