Python: replacing a word based on a condition

Time: 2013-12-10 02:12:58

Tags: python regex file file-io

On standard input, I provide the following file:

    #123     595739778       "neutral"       Won the match #getin
    #164     595730008       "neutral"      Good girl

The data #2 file looks like this:

    labels 1 0 -1
    -1 0.272653 0.139626 0.587721
    1 0.0977782 0.0748234 0.827398

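I want to use the first column of the data #2 file to set the label: if it is -1, replace the label with negative; if it is 1, with positive; and if it is 0, with neutral.

Here are my problems: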

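  1. I need to start reading the data #2 file from line 2.
  2. I am having trouble with the replacement. I want to replace as shown below, but it raises an error saying it expects 1 more argument, even though I already pass 2.
  3. If I do it as follows (note the print statement):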

    if binary == "-1":
      senti = str.replace(senti.strip('"'),"negative")
    elif binary == "1":
      senti = str.replace(senti.strip('"'),"positive")
    elif binary == "0":
      senti = str.replace(senti.strip('"'),"neutral")
    print id, "\t", num, "\t", senti, "\t", sent
    

    But if I do it like this (note the print), it never enters the 'if' conditions:

    if binary == "-1":
       senti = str.replace(senti.strip('"'),"negative")
    elif binary == "1":
       senti = str.replace(senti.strip('"'),"positive")
    elif binary == "0":
       senti = str.replace(senti.strip('"'),"neutral")
    

    print id, "\t", num, "\t", senti, "\t", sent

  4. How do I print this?

     The output I get:

        #123     595739778       "neutral"       Won the match #getin
        #164     595730008       "neutral"      Good girl

     Expected output (the replacement should just substitute negative, positive, or neutral according to the data #2 file):
    
        #123     595739778       negative       Won the match #getin
        #164     595730008       positive       Good girl
    

    Error:

     Traceback (most recent call last):
       File "./combine.py", line 17, in <module>
         senti = str.replace(senti.strip('"'),"negative")
     TypeError: replace() takes at least 2 arguments (1 given)
    

    Here is my code:

    for line in sys.stdin:
        (id,num,senti,sent) = re.split("\t+",line.strip())
        tweet = re.split("\s+", sent.strip().lower())
        f = open("data#2.txt","r")
        for line1 in f:
           (binary,rest,rest1,test2) = re.split("\s", line1.strip())
           if binary == "-1":
              senti = str.replace(senti.strip('"'),"negative")
           elif binary == "1":
              senti = str.replace(senti.strip('"'),"positive")
           elif binary == "0":
              senti = str.replace(senti.strip('"'),"neutral")
           print id, "\t", num, "\t", senti, "\t", sent
    

1 answer:

Answer 0 (score: 3)

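You are actually missing an argument to replace. Because it is a method of the string itself, you can call it either on the `str` class (passing the string as the first argument) or directly on the string: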

    In [72]: str.replace('one','o','1')
    Out[72]: '1ne'

    In [73]: 'one'.replace('o','1')
    Out[73]: '1ne'
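
To make the cause of the `TypeError` in the question concrete, here is a minimal sketch. It assumes Python 3 (the question uses Python 2 print statements, and the exact error wording differs between versions); the variable names `raised` and `fixed` are only illustrative.

```python
senti = '"neutral"'

# Called through the class, the first argument is taken as the string itself,
# so only one real argument ("negative") reaches replace().
try:
    str.replace(senti.strip('"'), "negative")
    raised = False
except TypeError:
    raised = True

# Called on the instance, both the old and the new substring are supplied.
fixed = senti.strip('"').replace("neutral", "negative")
```

Here `raised` ends up `True`, which matches the traceback in the question, and `fixed` holds the relabelled word.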

In your code, you probably want something like:

    if binary == "-1":
        senti = senti.strip('"').replace("-1","negative")

To skip the first line of the data #2 file, one option is:

    f = open("data#2.txt","r")
    for line1 in f.readlines()[1:]: # skip the first line
        #rest of your code here
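
Another way to skip the header, sketched here under the assumption of Python 3, is to consume one line with `next()` before looping, which avoids reading the whole file into a list. The `io.StringIO` object is only a stand-in for `open("data#2.txt")` so the snippet runs on its own; its contents are copied from the question.

```python
import io

# Stand-in for open("data#2.txt"); contents copied from the question.
data = io.StringIO(
    "labels 1 0 -1\n"
    "-1 0.272653 0.139626 0.587721\n"
    "1 0.0977782 0.0748234 0.827398\n"
)

# Consume the header line; the None default avoids StopIteration on an empty file.
next(data, None)

# Collect the first column of every remaining non-blank line.
labels = [line.split()[0] for line in data if line.strip()]
```

After this runs, `labels` contains only the values from the data rows, not the `labels` header token.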

Edit: after the chat conversation, I think what you want is more like the following:

    f = open("data#2.txt","r")
    datalines = f.readlines()[1:]

    count = 0

    for line in sys.stdin:
        if count == len(datalines): break # kill the loop if we've reached the end
        (tweetid,num,senti,tweets) = re.split("\t+",line.strip())
        tweet = re.split("\s+", tweets.strip().lower())
        # grab the right index from our list
        (binary,rest,rest1,test2) = re.split("\s", datalines[count].strip())
        if binary == "-1":
            sentiment = "negative"
        elif binary == "1":
            sentiment = "positive"
        elif binary == "0":
            sentiment = "neutral"
        print tweetid, "\t", num, "\t", sentiment, "\t", tweets
        count += 1 # add to our counter
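
The same line-by-line pairing can also be sketched with `zip` and a dictionary lookup instead of a manual counter and an if/elif chain. This is only one possible arrangement, written for Python 3; the function name `relabel` and the `SENTIMENT` mapping are hypothetical names introduced for illustration, and the sample lines are copied from the question.

```python
import re

# Hypothetical mapping from the first column of data #2 to a label word.
SENTIMENT = {"-1": "negative", "0": "neutral", "1": "positive"}

def relabel(tweet_lines, label_lines):
    """Pair each tweet line with the label line at the same position."""
    labels = iter(label_lines)
    next(labels, None)  # skip the "labels 1 0 -1" header
    out = []
    # zip stops at the shorter input, so no explicit counter is needed.
    for tweet_line, label_line in zip(tweet_lines, labels):
        tweet_id, num, _old, text = re.split(r"\t+", tweet_line.strip())
        binary = label_line.split()[0]
        out.append("\t".join([tweet_id, num, SENTIMENT[binary], text]))
    return out

tweets = [
    '#123\t595739778\t"neutral"\tWon the match #getin\n',
    '#164\t595730008\t"neutral"\tGood girl\n',
]
labels = [
    "labels 1 0 -1\n",
    "-1 0.272653 0.139626 0.587721\n",
    "1 0.0977782 0.0748234 0.827398\n",
]
result = relabel(tweets, labels)
```

With real input, the two arguments could be `sys.stdin` and an open file handle for the data #2 file, since both are iterables of lines. An unknown value in the first column would raise a `KeyError`, which may or may not be the behaviour you want.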