I have been trying to get these two scripts to run as a single script. I am trying to run it in a Windows 7 environment.
# Beginning of Program One
import webbrowser
import time
tme = "Start time - " + time.strftime("%A, %d %b %Y %I:%M:%S")
print(tme)
new = 2 # open in a new tab, if possible
# open a public URL, in this case, the webbrowser docs
url = "http://www.foxsports.com/nba"
webbrowser.open(url,new=new)
# END OF PROGRAM 1
# BEGINNING OF PROGRAM 2
import win32gui
import win32ui
import win32con
import win32api
import os
import re
import xlrd
import win32com.client
import sys
import subprocess
import time
# WindowMgr - allows me to send control to the Chrome Webpage
class WindowMgr:
    """Encapsulates some calls to the winapi for window management"""

    def __init__(self):
        """Constructor"""
        self._handle = None

    def find_window(self, class_name, window_name=None):
        """Find a window by its class_name"""
        self._handle = win32gui.FindWindow(class_name, window_name)

    def _window_enum_callback(self, hwnd, wildcard):
        """Pass to win32gui.EnumWindows() to check all the opened windows"""
        if re.match(wildcard, str(win32gui.GetWindowText(hwnd))) is not None:
            self._handle = hwnd

    def find_window_wildcard(self, wildcard):
        """Find a window whose title matches the wildcard regex"""
        self._handle = None
        win32gui.EnumWindows(self._window_enum_callback, wildcard)

    def set_foreground(self):
        """Put the window in the foreground"""
        win32gui.SetForegroundWindow(self._handle)
# Activate the webpage
w = WindowMgr()
w.find_window_wildcard(".*Videos*")
w.set_foreground()
shell = win32com.client.Dispatch("WScript.Shell")
shell.SendKeys("^a") # CTRL+A
shell.SendKeys("^c") # CTRL+C
try:
    # Python 2
    import Tkinter as tk
except ImportError:
    # Python 3
    import tkinter as tk

root = tk.Tk()
clip_text = root.clipboard_get()
print(clip_text)
# END OF PROGRAM2
Answer 0 (score: 1)
Could you explain what you are hoping to achieve with this program?
From the code, I can tell that you want to open http://www.foxsports.com/nba in a web browser tab, then set focus to that browser window, press CTRL+A to select all of the text on the page, and CTRL+C to copy that text to the clipboard. Finally, you read back what you copied and print it out.
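If you do want to keep the select-all/copy approach and simply merge the two programs into one script, note that webbrowser.open() returns immediately, so the key presses can fire before the page has finished loading. Here is a rough sketch of the merged script; the sleep times and the window-title pattern are only guesses that you will have to tune for your machine:

import re
import time
import webbrowser
import win32gui
import win32com.client

class WindowMgr:
    """Trimmed-down copy of the WindowMgr class from Program 2."""
    def __init__(self):
        self._handle = None
    def _window_enum_callback(self, hwnd, wildcard):
        if re.match(wildcard, str(win32gui.GetWindowText(hwnd))) is not None:
            self._handle = hwnd
    def find_window_wildcard(self, wildcard):
        self._handle = None
        win32gui.EnumWindows(self._window_enum_callback, wildcard)
    def set_foreground(self):
        win32gui.SetForegroundWindow(self._handle)

webbrowser.open("http://www.foxsports.com/nba", new=2)
time.sleep(10)                     # guess: give the browser time to open and load the page

w = WindowMgr()
w.find_window_wildcard(".*NBA.*")  # guess: adjust so it matches your browser window's title
w.set_foreground()

shell = win32com.client.Dispatch("WScript.Shell")
shell.SendKeys("^a")               # CTRL+A, select everything on the page
time.sleep(1)
shell.SendKeys("^c")               # CTRL+C, copy the selection to the clipboard
time.sleep(1)

try:
    import Tkinter as tk           # Python 2
except ImportError:
    import tkinter as tk           # Python 3

root = tk.Tk()
root.withdraw()                    # read the clipboard without showing a Tk window
print(root.clipboard_get())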
If my assumption is correct and, for whatever reason, you only want the text of the web page, then I would suggest using urllib to fetch the page source and BeautifulSoup to parse it.
An example:
try:
    from urllib import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3
from bs4 import BeautifulSoup, SoupStrainer

source = urlopen("http://www.foxsports.com/nba").read()
soup = BeautifulSoup(source, "html.parser", parse_only=SoupStrainer("a"))  # "a" is the anchor tag in <a href=""></a>
links_list = soup.findAll("a")
for link in links_list[:10]:
    print(link.text)
This finds all of the links on the page and prints only the text associated with the first 10. You can use BeautifulSoup to adjust which data you parse (based on tags).
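For instance, here is a minimal sketch of pulling something other than links; the "h1"/"h2" tags below are only placeholders, so inspect the page yourself to see which tags actually hold the text you care about:

from bs4 import BeautifulSoup
try:
    from urllib import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3

source = urlopen("http://www.foxsports.com/nba").read()
soup = BeautifulSoup(source, "html.parser")

# Headline-style elements instead of anchors; "h1"/"h2" are just examples.
for heading in soup.find_all(["h1", "h2"])[:10]:
    print(heading.get_text(strip=True))

# Or grab all of the visible text on the page at once.
print(soup.get_text(separator=" ", strip=True)[:500])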