I am trying to replace a substring in a text file in which every line is wrapped in double quotes (" line"), like this:
" @user can't wait! love the orsen welles/citizen kane teaser reference in the promo vid! positive"
This code does not work:
    import re

    with open(in_file, "r") as infile, open("Output.txt", "w") as outfile:
        for line in infile:
            #line = re.sub("@user", " ", line)
            line = line.replace("@user"," ")
            outfile.write(line)
The Output.txt file contains exactly the same lines as the input file. What is wrong?
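As a sanity check, the replacement can be tested in isolation on the sample line above. This is only a minimal sketch: the names `sample` and `cleaned` are illustrative and do not appear in the original code, and it assumes the line contains the plain ASCII text `@user`.

```python
# The sample line from the question, including its surrounding double quotes.
sample = "\" @user can't wait! love the orsen welles/citizen kane teaser reference in the promo vid! positive\""

# str.replace returns a new string; the original string is left unchanged.
cleaned = sample.replace("@user", " ")

# repr() shows the exact characters, including any that are normally invisible.
print(repr(cleaned))

# If "@user" is really present in the input, it is gone from the result.
print("@user" in sample, "@user" in cleaned)
```

If this behaves as expected but the file output is still unchanged, the lines read from the file probably differ from what is shown here, or a different file is being inspected.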
*Update 2*
Link to the archive that the text file comes from:
http://www.mpi-inf.mpg.de/~smukherjee/data/twitter-data.tar.gz
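To get this file, just save 'Manually-Annotated-Tweets.csv' as a text file.
*Update*
The input and output file names are different. The file is read without problems. I tried several different output file names.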
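Since the file itself reads fine, one more thing that might be worth checking is whether the lines actually contain the literal substring `@user`. The sketch below is an assumption-laden illustration, not part of the original code: `count_lines_containing` is a hypothetical helper, and an in-memory `io.StringIO` stands in for the real input file.

```python
import io


def count_lines_containing(lines, needle):
    """Return how many lines contain the exact substring `needle`."""
    return sum(1 for line in lines if needle in line)


# Stand-in for the real file; on real data, read the lines from open(in_file, "r").
fake_file = io.StringIO(
    '" @user can\'t wait! positive"\n'
    '" no mention here negative"\n'
)
lines = fake_file.readlines()

matches = count_lines_containing(lines, "@user")
print(matches, "of", len(lines), "lines contain '@user'")

# repr() makes invisible or look-alike characters visible.
print(repr(lines[0]))
```

A count of zero on the real file would suggest the text is not what it appears to be (for example, a different encoding or look-alike characters), rather than a problem with `replace` itself.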
Configuration:
Server Information:
You are using Jupyter notebook.
The version of the notebook server is 4.3.1 and is running on:
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)]
Current Kernel Information:
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.