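I have a large text file. I want to read a few lines from it and write them to another text file as a single line. For example, I want to start reading line by line at a given start word and stop at a closing parenthesis that stands on its own. So if my start word is "CAR", I want to read from there until I reach a parenthesis followed by a line break. The start word and the end marker should both be kept.
What is the best way to achieve this? I have tried pattern matching and tried to avoid regular expressions, but I don't think that is possible.
Code: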
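    import re

    def extract_blocks():
        array = []
        # Context managers close both files automatically.
        with open('text.txt', 'r') as infile, open('temp2.txt', 'w') as outfile:
            data = infile.read()  # read the whole file once
            # Non-greedy match from CAR up to a ")" followed by a newline or end of input.
            for x in re.findall(r'CAR(.*?)\)(?:\n|$)', data, re.DOTALL):
                array.append(x)
                outfile.write(x)
        return array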
What the text might look like:
    ( CAR: *random info*
    *random info* - could be many lines of this
    )
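Answer 0 (score: 1)
Regular expressions are perfectly capable of solving this kind of problem. They only stop being usable when the pattern involves recursion, for example extracting the contents of nested parentheses: ((text1)(text2)).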
You can use the following regular expression:
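A minimal sketch of one such expression, assuming each block starts at the word CAR and ends at a ")" that sits alone on its line. The pattern and the join_blocks helper are illustrative assumptions, not the expression from the original answer:

```python
import re

# Assumed block shape: starts at the word CAR, ends at a ")" alone on its line.
# DOTALL lets "." cross line breaks; MULTILINE makes ^ and $ match at each line.
BLOCK = re.compile(r"CAR.*?^[ \t]*\)[ \t]*$", re.DOTALL | re.MULTILINE)

def join_blocks(text):
    """Return each CAR block as one line, keeping the start word and the ')'."""
    # str.split() with no argument splits on any whitespace run, so line breaks
    # and repeated spaces all collapse to single spaces when re-joined.
    return [" ".join(m.group(0).split()) for m in BLOCK.finditer(text)]

sample = "( CAR: a\nb - more\n)\n( CAR: c\n)\n"
print(join_blocks(sample))
```

The non-greedy .*? stops at the first lone ")", so a file with several blocks gives one match per block rather than a single match spanning all of them.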
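Answer 1 (score: 1)
We can use the regular expression pattern (CAR.*)\) with the flags gms to match the text you are interested in. In Python, re.findall already returns every match and re.DOTALL plays the role of the s flag.
Then we only need to remove the line breaks from the resulting matches and write them to a file.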
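    import re

    # .* is greedy: with re.DOTALL, several CAR blocks collapse into one match
    # that runs from the first CAR to the last ")".
    with open("text.txt", 'r') as f:
        matches = re.findall(r"(CAR.*)\)", f.read(), re.DOTALL)

    # Write each match on one line, replacing its line breaks with spaces.
    with open("output.txt", 'w') as f:
        for match in matches:
            f.write(" ".join(match.split('\n')))
            f.write('\n')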
The output file looks like this:
    CAR: *random info* *random info* - could be many lines of this
Edit: updated the code to put a line break between matches in the output file.
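Since the question mentions a large file and a preference for avoiding regular expressions, here is a hedged sketch of a line-by-line alternative that never loads the whole file into memory. It assumes a block starts on the line containing the start word and ends at a line holding only ")". The names collect_blocks and write_blocks are hypothetical helpers, not part of either answer:

```python
def collect_blocks(lines, start_word="CAR"):
    """Yield each block as one line, from start_word through the lone ')'."""
    parts = None  # None means we are currently outside a block
    for line in lines:
        stripped = line.strip()
        if parts is None:
            idx = line.find(start_word)
            if idx == -1:
                continue  # not inside a block and no start word: skip the line
            parts = [line[idx:].strip()]  # keep the start word, drop what precedes it
        elif stripped == ")":
            parts.append(stripped)  # keep the closing parenthesis
            yield " ".join(parts)
            parts = None
        elif stripped:
            parts.append(stripped)  # ordinary content line inside a block

def write_blocks(in_path, out_path):
    """Stream in_path line by line and write one joined block per output line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for block in collect_blocks(src):
            dst.write(block + "\n")
```

Calling write_blocks("text.txt", "output.txt") would process the file one line at a time, so memory use depends on the size of a single block rather than the size of the file.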