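I cleaned a CSV file containing 50,000 lines of text and tokenized every line. However, on each line the words are split into separate columns:
202MAY || DEFEATED || LORDS || PEERS || BACK || NEW || LEVESON
But I want each row joined into a single string instead of being split by token:
202MAY DEFEATED LORDS PEERS BACK NEW LEVESON
Every column of every row holds one word.
The number of words differs from row to row, so the number of columns differs too. How can I solve this?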
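Answer 0 (score: 0)
# avoid naming the variable `str`, which would shadow the built-in type
text = "202MAY || DEFEATED || LORDS || PEERS || BACK || NEW || LEVESON"
# removing " ||" leaves the single space that followed each delimiter
print(text.replace(' ||', ''))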
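Answer 1 (score: 0)
Do you want something like this?
some_text = "202MAY || DEFEATED || LORDS || PEERS || BACK || NEW || LEVESON".split("||")
# strip the padding around each word, then join with a single space
print(" ".join(word.strip() for word in some_text))
# expected output:
# 202MAY DEFEATED LORDS PEERS BACK NEW LEVESON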
Answer 2 (score: 0)
import re
text = "202MAY||DEFEATED||LORDS||PEERS||BACK||NEW||LEVESON"
combined_text = re.sub(r"\|\|", " ", text)
print(combined_text)
There are several ways to do this. The code above uses a regular expression to replace every "||" with a space (" "). The output will be: 202MAY DEFEATED LORDS PEERS BACK NEW LEVESON.
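If the spacing around the "||" separator is not consistent across rows, the same regex idea can absorb the surrounding whitespace as well. This is only a minimal sketch, assuming the rows look like the sample in the question; `join_tokens` is a hypothetical helper name, not something from the answers above.

```python
import re

def join_tokens(line, sep=" "):
    """Split on '||' (with any surrounding whitespace) and rejoin with sep."""
    # \s*\|\|\s* matches the delimiter plus any whitespace on either side
    parts = re.split(r"\s*\|\|\s*", line.strip())
    # drop empty pieces left by a leading, trailing, or doubled delimiter
    return sep.join(p for p in parts if p)

print(join_tokens("202MAY || DEFEATED ||LORDS||  PEERS || BACK || NEW || LEVESON"))
```

Because the split is per line, it does not matter that each row has a different number of words.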
Answer 3 (score: 0)
from django import forms
class CustomWidgetForm(forms.Form):
    SOME_CHOICES = (
        (1, "Havana"),
        (2, "New York"),
        (3, "Changai"),
        (4, "London"),
    )
    multiselect = forms.MultipleChoiceField(choices=SOME_CHOICES, widget=forms.CheckboxSelectMultiple())
So, instead of reading the file as structured content, simply read it as a plain text file and replace the commas with nothing (an empty string).
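The plain-text approach could look roughly like the sketch below. It assumes the fields contain no quoted commas (otherwise Python's `csv` module would be the safer choice), and `join_csv_lines` and its `sep` parameter are hypothetical names introduced here for illustration. The default `sep=" "` keeps the words separated as in the question's desired output; passing `sep=""` replaces the commas with nothing, as this answer literally describes.

```python
def join_csv_lines(path, sep=" "):
    """Read a CSV file as plain text and join each row's fields with sep."""
    joined = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # treat the row as a plain string, so ragged rows need no special handling
            words = [w.strip() for w in line.rstrip("\n").split(",")]
            # skip empty fields, e.g. trailing commas padding out shorter rows
            joined.append(sep.join(w for w in words if w))
    return joined
```

Calling `join_csv_lines("cleaned.csv")` (the file name is a placeholder) would return one joined string per row, whatever the number of columns in that row.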