Here is my code:
with open(root_dir+"/trials/classify/training_queries.txt","r") as f:
    queries = f.readlines()
    #queries = f.read()
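The code above reads the file line by line and gives me a result for each line.
I want a result for the whole file content (reading the entire paragraph at once). Which function does that?
I thought queries = f.read() would help, but it ends up processing the text character by character.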
Update
Sample input:
Hell, the Orioles' Opening Day game could easily be the largest in history
if we had a stadium with 80,000 seats. But unfortunely the Yards (a
definitely excellent ballpark) only holds like 45,000 with 275 SRO spots.
Ticket sales for the entire year is moving fast. Bleacher seats are almost
gone for every game this year. Athist does not believe in any religion whether hinduis islam or chirstianism
Output scenario:
With readlines(), the file is processed line by line.
What I want is to process the entire file content at once.
Code snippet:
if __name__ == '__main__':
    #CallDomainDetection().callDomainDetection(sys.argv[1])
    root_dir = os.getcwd()
    query_no = 1
    with open(root_dir+"/trials/classify/training_queries.txt","r") as f:
        #queries = f.readlines() # this processes line in files
        queries = f.read() # now it consider each character.
        for qu in queries:
            CallDomainDetection().callDomainDetection(qu)
            if query_no == 40:
                break
            query_no += 1
Answer 0: (score: 3)
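f.read() is what you want. You may need to split the result on double newlines to break it into paragraphs: split('\n\n'). What you describe sounds like you are iterating over the string itself, which iterates character by character.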
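A minimal sketch of the split suggested above, assuming paragraphs are separated by a blank line. The helper name split_paragraphs and the sample text are made up for illustration; in practice the text would come from f.read().

```python
def split_paragraphs(text):
    """Split text on blank lines and drop empty chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

# Hypothetical sample standing in for the result of f.read().
sample = "First paragraph, line one.\nFirst paragraph, line two.\n\nSecond paragraph.\n"

paras = split_paragraphs(sample)
for p in paras:
    print(repr(p))
```

Note that a plain split('\n\n') leaves empty strings behind when paragraphs are separated by more than one blank line, which is why the sketch filters them out.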
Answer 1: (score: 2)
queries = f.read() reads the entire file into the string queries. You only get individual characters when you iterate over that string (as in for c in queries:).
Run
with open(root_dir+"/trials/classify/training_queries.txt","r") as f:
    queries = f.read()
    print(queries)
and you will see that queries is a single string.
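To make the difference concrete, here is a small sketch. The function classify is a hypothetical stand-in for CallDomainDetection().callDomainDetection, and the text is invented; the point is only that the whole string is passed in one call instead of being looped over.

```python
# Hypothetical text standing in for the result of f.read().
text = "Ticket sales are moving fast.\nBleacher seats are almost gone.\n"

# Iterating over a str yields one-character strings.
chars = [c for c in text]

def classify(query):
    # Hypothetical stand-in for the real classifier call;
    # here it just counts whitespace-separated words.
    return len(query.split())

# Pass the whole string in a single call, with no for loop over it.
word_count = classify(text)
print(chars[:6], word_count)
```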
Answer 2: (score: 1)
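You have to define a "paragraph" as the string formed by joining a non-empty sequence of non-separator lines, separated from any adjacent paragraphs by non-empty sequences of separator lines.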
import os

def paragraphs(lines, is_separator=str.isspace, joiner=''.join):
    # Collect consecutive non-separator lines and yield them joined.
    paragraph = []
    for line in lines:
        if is_separator(line):
            if paragraph:
                yield joiner(paragraph)
                paragraph = []
        else:
            paragraph.append(line)
    # Emit the last paragraph if the input does not end with a separator.
    if paragraph:
        yield joiner(paragraph)

if __name__ == '__main__':
    root_dir = os.getcwd()
    with open(root_dir+"/trials/classify/training_queries.txt","r") as f:
        queries = f.readlines()
    for p in paragraphs(queries):
        print(repr(p))
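Since the generator only iterates over lines, it should also accept an open file object directly, which avoids loading every line with readlines() first. A sketch under that assumption, using io.StringIO as a stand-in for the real file; the sample text is invented.

```python
import io

def paragraphs(lines, is_separator=str.isspace, joiner=''.join):
    # Same generator as above, repeated so this example runs on its own.
    paragraph = []
    for line in lines:
        if is_separator(line):
            if paragraph:
                yield joiner(paragraph)
                paragraph = []
        else:
            paragraph.append(line)
    if paragraph:
        yield joiner(paragraph)

# io.StringIO iterates line by line, just like a file opened in text mode.
fake_file = io.StringIO("line one\nline two\n\n\nline three\n")

result = list(paragraphs(fake_file))
print(result)
```

Each yielded paragraph keeps its internal newlines, so it can be handed to the classifier as one string.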