On standard input, I provide the following file:
#123 595739778 "neutral" Won the match #getin
#164 595730008 "neutral" Good girl
The data#2 file looks like this:
labels 1 0 -1
-1 0.272653 0.139626 0.587721
1 0.0977782 0.0748234 0.827398
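I want to check the label in the data#2 file: if it is -1, replace the sentiment with negative; if it is 1, with positive; and if it is 0, with neutral.
Here are my problems:
If I do it as shown below (note the print statement):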
if binary == "-1":
    senti = str.replace(senti.strip('"'),"negative")
elif binary == "1":
    senti = str.replace(senti.strip('"'),"positive")
elif binary == "0":
    senti = str.replace(senti.strip('"'),"neutral")
print id, "\t", num, "\t", senti, "\t", sent
But if I do this (note the print), it does not go into the 'if conditions':
if binary == "-1":
    senti = str.replace(senti.strip('"'),"negative")
elif binary == "1":
    senti = str.replace(senti.strip('"'),"positive")
elif binary == "0":
    senti = str.replace(senti.strip('"'),"neutral")
print id, "\t", num, "\t", senti, "\t", sent
How should I print it? The output I get:
#123 595739778 "neutral" Won the match #getin
#164 595730008 "neutral" Good girl
Expected output (replace should just substitute negative, positive, or neutral according to the data#2 file):
#123 595739778 negative Won the match #getin
#164 595730008 positive Good girl
Error:
Traceback (most recent call last):
  File "./combine.py", line 17, in <module>
    senti = str.replace(senti.strip('"'),"negative")
TypeError: replace() takes at least 2 arguments (1 given)
Here is my code:
for line in sys.stdin:
    (id,num,senti,sent) = re.split("\t+",line.strip())
    tweet = re.split("\s+", sent.strip().lower())
    f = open("data#2.txt","r")
    for line1 in f:
        (binary,rest,rest1,test2) = re.split("\s", line1.strip())
        if binary == "-1":
            senti = str.replace(senti.strip('"'),"negative")
        elif binary == "1":
            senti = str.replace(senti.strip('"'),"positive")
        elif binary == "0":
            senti = str.replace(senti.strip('"'),"neutral")
        print id, "\t", num, "\t", senti, "\t", sent
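Answer 0 (score: 3)
You are actually missing an argument to replace. It takes both the substring to find and its replacement, and since it is a method of the string itself, you can call it either way: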
In [72]: str.replace('one','o','1')
Out[72]: '1ne'
or
In [73]: 'one'.replace('o','1')
Out[73]: '1ne'
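In your code, you probably want something like:
if binary == "-1":
    # replace the existing label text (e.g. neutral) with the new one
    senti = senti.strip('"').replace("neutral","negative")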
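To see why the original call fails, here is a small sketch. It is written for Python 3, unlike the Python 2 snippets in this thread, and the sample value is only illustrative. When replace is called through the class, the first argument stands in for the string itself, so "negative" ends up being the only real argument.

```python
senti = '"neutral"'  # illustrative value, with the quotes as in the input file

# Called through the class: the stripped string plays the role of `self`,
# so "negative" is taken as `old` and `new` is missing.
try:
    str.replace(senti.strip('"'), "negative")
    failed = False
except TypeError:
    failed = True

# Called on the instance with both `old` and `new` supplied.
fixed = senti.strip('"').replace("neutral", "negative")
print(failed, fixed)
```

Since the label is being overwritten wholesale anyway, plain assignment (senti = "negative") would do the same job without replace.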
To skip the first line of the data#2 file, one option is:
f = open("data#2.txt","r")
for line1 in f.readlines()[1:]: # skip the first line
    #rest of your code here
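Another option, sketched here for Python 3, is to consume the header with next() so the whole file is not read into a list first. The io.StringIO object is only a stand-in for the opened data#2.txt, with contents copied from the question.

```python
import io

# Stand-in for open("data#2.txt"); contents copied from the question.
f = io.StringIO(
    "labels 1 0 -1\n"
    "-1 0.272653 0.139626 0.587721\n"
    "1 0.0977782 0.0748234 0.827398\n"
)

# Consume the header line; the None default avoids StopIteration on an empty file.
next(f, None)

# The first whitespace-separated field of each remaining line is the label.
labels = [line.split()[0] for line in f if line.strip()]
print(labels)
```

This matters mostly for large files; for a small label file, slicing readlines() as above is just as reasonable.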
Edit: after our chat conversation, I think what you want is more like the following:
import re
import sys

f = open("data#2.txt","r")
datalines = f.readlines()[1:] # skip the header line
count = 0
for line in sys.stdin:
    if count == len(datalines): break # kill the loop if we've reached the end
    (tweetid,num,senti,tweets) = re.split(r"\t+",line.strip())
    tweet = re.split(r"\s+", tweets.strip().lower())
    # grab the right index from our list
    (binary,rest,rest1,test2) = re.split(r"\s", datalines[count].strip())
    if binary == "-1":
        sentiment = "negative"
    elif binary == "1":
        sentiment = "positive"
    elif binary == "0":
        sentiment = "neutral"
    print tweetid, "\t", num, "\t", sentiment, "\t", tweets
    count += 1 # add to our counter
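The same pairing can also be sketched with zip() and a dictionary lookup instead of a manual counter. This is a Python 3 variant under a few assumptions: the helper name relabel and the LABELS table are made up for illustration, the tweet fields are tab-separated as in the code above, and the sample lines are copied from the question.

```python
import re

# Hypothetical lookup table: label from the data#2 file -> sentiment word.
LABELS = {"-1": "negative", "1": "positive", "0": "neutral"}

def relabel(tweet_lines, data_lines):
    """Pair each tweet line with its data line and swap in the label word."""
    data = iter(data_lines)
    next(data, None)  # skip the header line of the data#2 file
    rows = []
    # zip stops at the shorter input, which replaces the manual counter check.
    for tweet_line, data_line in zip(tweet_lines, data):
        tweetid, num, senti, text = re.split(r"\t+", tweet_line.strip())
        binary = data_line.split()[0]
        # Fall back to the original label if the data line has an unknown value.
        sentiment = LABELS.get(binary, senti.strip('"'))
        rows.append("\t".join([tweetid, num, sentiment, text]))
    return rows

tweets = [
    '#123\t595739778\t"neutral"\tWon the match #getin\n',
    '#164\t595730008\t"neutral"\tGood girl\n',
]
data = [
    "labels 1 0 -1\n",
    "-1 0.272653 0.139626 0.587721\n",
    "1 0.0977782 0.0748234 0.827398\n",
]
rows = relabel(tweets, data)
for row in rows:
    print(row)
```

In a real script, tweet_lines would be sys.stdin and data_lines the opened data#2.txt. Joining with "\t" also avoids the extra spaces that print with commas puts around each tab.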