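I have a text file with more than a thousand lines, and for a certain process I need to separate the words with commas. I would like help developing this algorithm in Python, since I am just starting out with the language.
Input (ENTRADA):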
input phrase of the file to exemplify
Output (SAÍDA):
input, phrase, of, the, file, to, exemplify
I was thinking of something like this:
import pandas as pd
sampletxt = pd.read_csv('teste.csv' , header = None)
output = sampletxt.replace(" ", ", ")
print(output)
Answer 0 (score: 3)
# read the file; "with" closes it automatically when the block ends
with open("Dogs.txt", "r") as dog_file:
    dogs = dog_file.readlines()
# strip the surrounding spaces and newline characters from each line
content = [x.strip() for x in dogs]
data = input("Enter a name: ")
# compare against the stripped list; entries in dogs still end with "\n"
if data in content:
    print("Success")
else:
    print("Sorry that didn't work")
Answer 1 (score: 3)
Your line is probably just a string, so you can use:
line.replace(" ",", ")
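Answer 2 (score: 1)
To keep the complexity down, replace the spaces with commas directly instead of iterating over the phrase multiple times.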
the_list = entrada.replace(' ', ', ')
Answer 3 (score: 1)
First, you need to read your input one line at a time. Then you just use str.replace():
sampletxt = "input phrase of the file to exemplify"
output = sampletxt.replace(" ", ", ")
And you're done.
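The two steps above can be sketched for a whole file as follows. This is a minimal sketch, not the answerer's code: the function name `commas_per_line` is hypothetical, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io


def commas_per_line(lines):
    """Yield each line with every single space replaced by ', '."""
    for line in lines:
        # Drop the trailing newline, then replace each space.
        yield line.rstrip("\n").replace(" ", ", ")


# io.StringIO is an assumption for illustration; with a file on disk,
# the object returned by open(...) could be passed instead.
sample = io.StringIO("input phrase of the file to exemplify\nsecond line here\n")
result = list(commas_per_line(sample))
print(result[0])  # input, phrase, of, the, file, to, exemplify
```

Iterating over the file object reads one line at a time, so the whole file never has to be held in memory.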
Answer 4 (score: 1)
Based on the code sample you added, the problem you are trying to solve is how to replace ' ' with ', ' in every row of a pandas dataframe.
Here is one way:
import pandas as pd
sampletxt = pd.read_csv('teste.csv' , header = None)
output = sampletxt.replace(r'\s+', ', ', regex=True)
print(output)
Example:
In [24]: l
Out[24]:
['input phrase of the file to exemplify',
'input phrase of the file to exemplify 2',
'input phrase of the file to exemplify 4']
In [25]: sampletxt = pd.DataFrame(l)
In [26]: sampletxt
Out[26]:
0
0 input phrase of the file to exemplify
1 input phrase of the file to exemplify 2
2 input phrase of the file to exemplify 4
In [27]: output = sampletxt.replace(r'\s+', ', ', regex=True)
In [28]: output
Out[28]:
0
0 input, phrase, of, the, file, to, exemplify
1 input, phrase, of, the, file, to, exemplify, 2
2 input, phrase, of, the, file, to, exemplify, 4
OLD answer
You can also use re.sub(..), as shown below:
In [3]: import re
In [4]: st = "input phrase of the file to exemplify"
In [5]: re.sub(' ',', ', st)
Out[5]: 'input, phrase, of, the, file, to, exemplify'
re.sub(...) is slower than str.replace(..):
In [6]: timeit re.sub(' ',', ', st)
100000 loops, best of 3: 1.74 µs per loop
In [7]: timeit st.replace(' ',', ')
1000000 loops, best of 3: 257 ns per loop
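The comparison can also be reproduced outside IPython with the standard `timeit` module. This is only a sketch: the loop count `n` is an arbitrary choice, the lambdas add a small call overhead to both measurements, and the absolute numbers will differ by machine and Python version.

```python
import re
import timeit

st = "input phrase of the file to exemplify"

# Both calls produce the same string when words are separated by single spaces.
via_re = re.sub(" ", ", ", st)
via_replace = st.replace(" ", ", ")

# timeit.timeit runs the callable `number` times and returns the total seconds.
n = 100_000
t_re = timeit.timeit(lambda: re.sub(" ", ", ", st), number=n)
t_replace = timeit.timeit(lambda: st.replace(" ", ", "), number=n)

print(f"re.sub:      {t_re / n * 1e9:.0f} ns per call")
print(f"str.replace: {t_replace / n * 1e9:.0f} ns per call")
```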
If two words are separated by more than one space, every answer based on str.replace(' ', ', ') produces the wrong output. For example:
In [15]: st
Out[15]: 'input phrase of the file to  exemplify'
In [16]: re.sub(' ',', ', st)
Out[16]: 'input, phrase, of, the, file, to, , exemplify'
In [17]: st.replace(' ',', ')
Out[17]: 'input, phrase, of, the, file, to, , exemplify'
To fix this, you need to use a regular expression (regex) that matches one or more whitespace characters, as follows:
In [22]: st
Out[22]: 'input phrase of the file to  exemplify'
In [23]: re.sub(r'\s+', ', ', st)
Out[23]: 'input, phrase, of, the, file, to, exemplify'
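When the same substitution is applied to lines read from a file, the trailing newline also matches `\s+` and would become a stray `, ` at the end. Here is a minimal sketch that strips each line first. The helper name `comma_separate` is hypothetical, and the `split`/`join` variant is an alternative that does not come from the answer itself.

```python
import re


def comma_separate(line):
    """Join the words of `line` with ', ', treating each whitespace run as one separator."""
    # strip() first so leading/trailing whitespace does not turn into ', '.
    return re.sub(r"\s+", ", ", line.strip())


st = "input phrase of the file to  exemplify\n"
print(comma_separate(st))  # input, phrase, of, the, file, to, exemplify

# str.split() with no arguments also splits on whitespace runs, so this
# gives the same result without a regular expression.
assert ", ".join(st.split()) == comma_separate(st)
```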