Question

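I wrote this code to replace URLs with the titles of the pages they point to. It does replace each URL with its title as intended, but it prints the title on the next line.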

twfile.txt contains the following lines:

link1 http://t.co/HvKkwR1c
no link line

The output in tw2file.txt:

link1
Instagram
no link line

But I want the output in this form:

link1 Instagram
no link line

What should I do?

My code:

from bs4 import BeautifulSoup
import urllib

output = open('tw2file.txt','w')

with open('twfile.txt','r') as inputf:
    for line in inputf:
        try:
            list1 = line.split(' ')
            for i in range(len(list1)):

                if "http" in list1[i]:
                    ##print list1[i]
                    response = urllib.urlopen(list1[i])
                    html = response.read()
                    soup = BeautifulSoup(html)
                    list1[i] = soup.html.head.title
                    ##print list1[i]


                    list1[i] = ''.join(ch for ch in list1[i])
                else:
                    list1[i] = ''.join(ch for ch in list1[i])
            line = ' '.join(list1)
            print line
            output.write(line)
        except:
            pass


inputf.close()
output.close()

Answer 1

Regarding what is written to the file:

fileobject = open("bar", 'w' )
fileobject.write("Hello, World\n") # newline is inserted by '\n'
fileobject.close()

Regarding the console output:

Change print line to print line, (note the trailing comma).

Python writes a '\n' character at the end, unless the print statement ends with a comma.
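The thread's code is Python 2, where print is a statement. As a point of comparison, here is a minimal sketch of the same two behaviors in Python 3 syntax, using in-memory buffers instead of real files. It assumes nothing beyond the standard library; the sample strings are taken from the question.

```python
import io

# write() emits exactly the string it is given; no newline is appended.
buf = io.StringIO()
buf.write("link1 Instagram")
buf.write("no link line")
unterminated = buf.getvalue()  # both pieces run together on one line

# Adding '\n' explicitly keeps each record on its own line.
buf = io.StringIO()
for record in ["link1 Instagram", "no link line"]:
    buf.write(record + "\n")
terminated = buf.getvalue()

# Python 3 counterpart of the Python 2 `print line,` trick:
# end="" suppresses the newline that print() would otherwise add.
out = io.StringIO()
print("link1 Instagram\n", end="", file=out)
printed = out.getvalue()
```

In short, write() never adds a line ending on its own, while print adds one unless told otherwise.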

Answer 2

Try this code (see the three lines marked # here):

from bs4 import BeautifulSoup
import urllib

with open('twfile.txt','r') as inputf, open('tw2file.txt','w') as output:
    for line in inputf:
        try:
            list1 = line.split(' ')
            for i in range(len(list1)):
                if "http" in list1[i]:
                    response = urllib.urlopen(list1[i])
                    html = response.read()
                    soup = BeautifulSoup(html)
                    list1[i] = soup.html.head.title
                    list1[i] = ''.join(ch for ch in list1[i]).strip() # here
                else:
                    list1[i] = ''.join(ch for ch in list1[i]).strip() # here
            line = ' '.join(list1)
            print line
            output.write('{}\n'.format(line))  # here
        except:
            pass
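
The effect of the three marked lines can be checked without network access. The following is only a sketch in Python 3 syntax, not the answerer's code: replace_urls and fake_title are hypothetical names, and fake_title stands in for the urllib and BeautifulSoup lookup. It assumes the fetched title text carries surrounding newlines, which would explain the title landing on its own line in the question's output.

```python
def fake_title(url):
    # Hypothetical stand-in for the urllib + BeautifulSoup lookup. The
    # surrounding newlines are an assumption about the page's <title> text.
    return "\nInstagram\n"


def replace_urls(line, get_title=fake_title):
    """Return the line with each URL token replaced by its stripped title."""
    tokens = line.split(' ')
    for i, token in enumerate(tokens):
        if "http" in token:
            # The last token still carries the line's trailing '\n', so strip
            # both the URL and the title that comes back.
            tokens[i] = get_title(token.strip()).strip()
        else:
            tokens[i] = token.strip()
    return ' '.join(tokens)


lines = ["link1 http://t.co/HvKkwR1c\n", "no link line\n"]
# strip() removed each line's own newline, so add exactly one back per line.
result = ''.join('{}\n'.format(replace_urls(line)) for line in lines)
print(result, end='')
```

Because split(' ') leaves the trailing newline attached to the last token, stripping every token and then writing one explicit '\n' per line gives each input line exactly one output line.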

By the way, since you are using Python 2.7.x+, I put both open calls in a single with statement. The explicit close calls are also unnecessary.
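
A small sketch of that point, in Python 3 syntax: one with statement manages both files and closes them when the block exits. The temporary directory is an assumption made so the example does not touch real files; the file names mirror the ones in the question.

```python
import os
import tempfile

# Temporary paths so the sketch does not touch real files.
tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "twfile.txt")
dst_path = os.path.join(tmpdir, "tw2file.txt")

with open(src_path, "w") as f:
    f.write("no link line\n")

# One with statement manages both files.
with open(src_path, "r") as inputf, open(dst_path, "w") as output:
    for line in inputf:
        output.write(line)

# Both files were closed when the block exited; no close() call is needed.
both_closed = inputf.closed and output.closed
```

The files are closed even if an exception is raised inside the block, which the manual close() calls in the question's code do not guarantee.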

Python: replace URL with title
