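I get the following output:
sports (6 spaces) mourinho keen to tie up long-term de gea deal
opinion (5 spaces) the reality of north korea as a nuclear power
When I write the .txt file, how can I make these become sports (1 space) ... and opinion (1 space) ...?
Here is my code: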
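import os
import pandas as pd
import pandas.io.sql as pdsql  # assumed origin of the pdsql alias

# conn is an existing database connection created elsewhere
the_frame = pdsql.read_sql_query("SELECT category, title FROM training;", conn)
pd.set_option('display.max_colwidth', None)  # None disables truncation (-1 is deprecated)
print(the_frame)
the_frame = the_frame.replace(r'\s+', ' ', regex=True)  # tried to remove multiple spaces
base_filename = 'Values.txt'
with open(os.path.join(base_filename), 'w') as outfile:
    df = pd.DataFrame(the_frame)
    df.to_string(outfile, index=False, header=False)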
Answer 0 (score: 1)
I think your solution is fine; it only needs to be simplified.
I also tested it with multiple tabs, and it works well too.
the_frame = pdsql.read_sql_query("SELECT category, title FROM training;", conn)
# raw string so that \s reaches the regex engine unchanged
the_frame = the_frame.replace(r'\s+', ' ', regex=True)
base_filename = 'Values.txt'
the_frame.to_csv(base_filename, index=False, header=False)

Example:

the_frame = pd.DataFrame({
    'A': ['sports      mourinho keen to tie up long-term de gea deal',
          'opinion     the reality of north korea as a nuclear power'],
    'B': list(range(2))
})
print(the_frame)
                                                   A  B
0  sports      mourinho keen to tie up long-term ...  0
1  opinion     the reality of north korea as a nu...  1

the_frame = the_frame.replace(r'\s+', ' ', regex=True)
print(the_frame)
                                                   A  B
0  sports mourinho keen to tie up long-term de ge...  0
1  opinion the reality of north korea as a nuclea...  1
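
For anyone who wants to check the tab case as well, here is a minimal self-contained sketch of the same replace call. The sample strings and the names frame and cleaned are made up for illustration; they are not from the question's database.

```python
import pandas as pd

# Hypothetical sample data: a run of spaces in one row, two tabs in the other.
frame = pd.DataFrame({
    "A": ["sports      mourinho keen", "opinion\t\tthe reality"],
    "B": [0, 1],
})

# regex=True makes replace() treat the pattern as a regular expression.
# Every run of whitespace becomes one space; the integer column B is untouched.
cleaned = frame.replace(r"\s+", " ", regex=True)

print(cleaned["A"].tolist())
```

Note that this only cleans whitespace stored inside the values. It does not affect any padding that a writer such as to_string adds between columns.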
EDIT: I think you need to join the two columns with a space and write the output to the file without the sep parameter.
the_frame = pd.DataFrame({
    'category': {0: 'sports', 1: 'sports', 2: 'opinion', 3: 'opinion', 4: 'opinion'},
    'title': {0: 'mourinho keen to tie up long-term de gea deal',
              1: 'suarez fires barcelona nine clear in sociedad fightback',
              2: 'the reality of north korea as a nuclear power',
              3: 'the real fire fury',
              4: 'opposition and dr mahathir'}
})
print(the_frame)
   category                                              title
0    sports      mourinho keen to tie up long-term de gea deal
1    sports  suarez fires barcelona nine clear in sociedad ...
2   opinion      the reality of north korea as a nuclear power
3   opinion                                 the real fire fury
4   opinion                         opposition and dr mahathir

the_frame = the_frame['category'] + ' ' + the_frame['title']
print(the_frame)
0    sports mourinho keen to tie up long-term de ge...
1    sports suarez fires barcelona nine clear in so...
2    opinion the reality of north korea as a nuclea...
3                           opinion the real fire fury
4                   opinion opposition and dr mahathir
dtype: object

base_filename = 'Values.txt'
the_frame.to_csv(base_filename, index=False, header=False)
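
The extra spaces in the question's file most likely come from to_string padding each column so the output lines up, rather than from the data itself, which is why joining the columns first helps. One caveat with to_csv: it applies CSV quoting, so a title that contains a comma would be written wrapped in double quotes. A minimal sketch of an alternative that writes the joined lines directly is below. The temporary directory and the names lines, out_path and written are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

frame = pd.DataFrame({
    "category": ["sports", "opinion"],
    "title": ["mourinho keen to tie up long-term de gea deal",
              "the reality of north korea as a nuclear power"],
})

# Join the two columns with exactly one space; the result is a Series of strings.
lines = frame["category"] + " " + frame["title"]

# Hypothetical output location: a temporary directory, not the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "Values.txt")

# Plain file write: no column padding and no CSV quoting.
with open(out_path, "w", encoding="utf-8") as outfile:
    outfile.write("\n".join(lines) + "\n")

# Read the file back to confirm what was written.
with open(out_path, encoding="utf-8") as infile:
    written = infile.read().splitlines()

print(written)
```

If none of the titles can contain commas or quotes, the to_csv call above gives the same file with less code.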
Answer 1 (score: 0)
You can try the .str accessor instead of
the_frame = the_frame.replace('\s+', ' ', regex=True)
# .str exists only on a Series, so apply it to one column at a time
the_frame['title'] = the_frame['title'].str.replace(r'\s+', ' ', regex=True)  # collapses runs of whitespace
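
A small sketch of the difference, using made-up sample values: the .str accessor belongs to a Series, so calling it on a whole DataFrame fails, while calling it on a single column works.

```python
import pandas as pd

# Hypothetical sample data with repeated spaces and a tab in the titles.
frame = pd.DataFrame({
    "category": ["sports", "opinion"],
    "title": ["mourinho   keen", "the \t reality"],
})

# .str is a Series accessor; a DataFrame does not have one.
print(hasattr(frame, "str"))  # False

# Clean a single column through the accessor.
frame["title"] = frame["title"].str.replace(r"\s+", " ", regex=True)
print(frame["title"].tolist())
```

To clean every string column in one call, DataFrame.replace with regex=True, as in the first answer, is the shorter route.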