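I started learning Python a little over a week ago, and I've been working on a small project along the way.
I wrote a script that scrapes certain content (post titles, URLs) from a website and then publishes it to WordPress through the WP API.
Everything was going smoothly until I hit a snag I can't figure out.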
import urllib
import urllib.request
from bs4 import BeautifulSoup
import io
import time
import re
from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc import *
from wordpress_xmlrpc.methods import media, posts
from wordpress_xmlrpc.methods.posts import GetPosts, NewPost

modified_lines = []

#Function pulls all the post Titles and saves it in a txt file called "posts"
def get_posts(pageurl):
    SoupRequest(pageurl)
    with io.open('posts.txt', 'w', encoding='utf8') as logfile:
        for posts in soup.findAll('header',{"class":"entry-header"}):
            whatamidoing = (posts.find('a').text)
            ithinkiknow = str(whatamidoing)
            logfile.write(ithinkiknow +"\n")

get_posts("link")
with open('posts.txt') as f:
    scrubbed = list(map(replace_symbols, f.readlines()))

#Function that gets the link of each post on wordpress homepage
def get_posturl(pageurl):
    SoupRequest(pageurl)
    links = soup.findAll('h2',{"class":"entry-title"})
    for l in links:
        global postlink
        postlink = l.find('a')['href']
        modified_line = postlink
        modified_lines.append(modified_line)

get_posturl('link')

#Function to pull Iframes from post
def get_iframe(pageurl):
    for url in pageurl:
        file_name = url.replace('https://', '').replace('.', '_').replace('/','_')
        SoupRequest(url)
        global iframexx
        iframexx = (soup.find_all('iframe'))

get_iframe(modified_lines)

#Function to Post to Wordpress
def post_to_wordpress(title, allVideoLinks):
    tags = []
    wp = Client(wordpressxml, 'USERNAME', 'Password')
    wp.call(GetPosts())
    title = str(title)
    content = str(allVideoLinks)
    post = WordPressPost()
    post.title = title
    post.content = content
    post.post_type = "post"
    post.post_status = 'publish'
    post.id = wp.call(NewPost(post))
    wp.call(posts.EditPost(post.id, post))

for i in scrubbed:
    post_to_wordpress(i,iframexx)
#Successfully makes posts containing all the different titles....however, each post has the same last iframe link???
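I've put the relevant parts of the code above, with a comment on each function describing what it does or is supposed to do. It does loop and create a post for every title in "scrubbed", but the iframe links don't behave the same way. Instead, every post ends up with the same iframe content (the last item in the list).
Why does this happen? I'd be forever grateful to anyone who can help.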
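One likely explanation, sketched under assumptions: get_iframe assigns the global iframexx again on every pass through its loop, so by the time the posting loop runs, only the result for the last URL is left. The sketch below reproduces that pattern without any network access. FAKE_PAGES, find_iframes, get_iframe_overwriting and get_iframes_per_post are hypothetical stand-ins that are not in the original script; find_iframes takes the place of SoupRequest(url) followed by soup.find_all('iframe').

```python
# Hypothetical stand-in for the scraped pages: URL -> iframes found on it.
FAKE_PAGES = {
    "https://example.com/post-1": ["<iframe src='video-1'></iframe>"],
    "https://example.com/post-2": ["<iframe src='video-2'></iframe>"],
    "https://example.com/post-3": ["<iframe src='video-3'></iframe>"],
}


def find_iframes(url):
    """Stand-in for SoupRequest(url) followed by soup.find_all('iframe')."""
    return FAKE_PAGES[url]


def get_iframe_overwriting(urls):
    """Mirrors the original loop: each pass replaces the previous result."""
    latest = None
    for url in urls:
        latest = find_iframes(url)
    return latest


def get_iframes_per_post(urls):
    """Keeps one entry per URL instead of only the last one."""
    return [find_iframes(url) for url in urls]


urls = list(FAKE_PAGES)
last_only = get_iframe_overwriting(urls)
per_post = get_iframes_per_post(urls)

print(last_only)      # only the iframes of the last URL survive
print(len(per_post))  # one list of iframes per URL
```

Returning a list from the function, rather than writing to a global, also makes it easier to see how many results were actually collected.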
Edit: to clarify further, what I want is to create multiple posts, each with a title from (scrubbed) and content from (iframexx):
*** POST1 title: title1
POST2 title: title2
POST3 title: title3 url: content3 ***
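A minimal sketch of the pairing described above, assuming the titles and the per-post contents were collected in the same order and have the same length. The sample values and the pair_posts helper are hypothetical; in the real script the loop body would call post_to_wordpress(title, content) instead of print.

```python
# Hypothetical sample data standing in for scrubbed and the per-post iframes.
titles = ["title1", "title2", "title3"]
contents = ["content1", "content2", "content3"]


def pair_posts(titles, contents):
    """Pair each title with the content at the same position."""
    if len(titles) != len(contents):
        raise ValueError("titles and contents must have the same length")
    return list(zip(titles, contents))


pairs = pair_posts(titles, contents)
for title, content in pairs:
    # In the real script: post_to_wordpress(title, content)
    print(f"title: {title} url: {content}")
```

The length check is there because zip silently stops at the shorter input, which would hide a missing title or a page without iframes.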
Also, as a side note, why do my iframes print out with [ and ] around them?
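On the brackets: find_all returns a list-like result, and post_to_wordpress calls str() on it, so the text probably includes the list's own [ and ]. The sketch below uses a plain Python list of strings as a stand-in for that result (an assumption made so it runs without BeautifulSoup) and shows one way to join the items into markup instead.

```python
# Plain strings standing in for the tags returned by soup.find_all('iframe').
iframes = [
    '<iframe src="video-1"></iframe>',
    '<iframe src="video-2"></iframe>',
]

# str() on the whole list keeps the list punctuation: brackets, quotes, commas.
as_list_text = str(iframes)

# Converting each item and joining them yields only the markup itself.
as_markup = "\n".join(str(tag) for tag in iframes)

print(as_list_text)
print(as_markup)
```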