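I can't get two programs to run together

Time: 2017-06-24 21:32:38

Tags: python tkinter web-scraping

I have been struggling to get these two scripts to run as a single script. I am trying to run it in a Windows 7 environment.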

# Beginning of Program One

import webbrowser
import time

tme = "Start time - " + time.strftime("%A, %d %b %Y %I:%M %S")
print(tme)

new = 2 # open in a new tab, if possible

# open a public URL, in this case the Fox Sports NBA page
url = "http://www.foxsports.com/nba"
webbrowser.open(url,new=new)


# END OF PROGRAM 1

# BEGINNING OF PROGRAM 2
import win32gui
import win32ui
import win32con
import win32api
import os
import re
import xlrd
import win32com.client
import sys
import subprocess
import time


# WindowMgr - allows me to send control to the Chrome Webpage
class WindowMgr:
    """Encapsulates some calls to the winapi for window management"""
    def __init__ (self):
        """Constructor"""
        self._handle = None

    def find_window(self, class_name, window_name = None):
        """find a window by its class_name"""
        self._handle = win32gui.FindWindow(class_name, window_name)

    def _window_enum_callback(self, hwnd, wildcard):
        '''Pass to win32gui.EnumWindows() to check all the opened windows'''
        if re.match(wildcard, str(win32gui.GetWindowText(hwnd))) != None:
            self._handle = hwnd

    def find_window_wildcard(self, wildcard):
        self._handle = None
        win32gui.EnumWindows(self._window_enum_callback, wildcard)

    def set_foreground(self):
        """put the window in the foreground"""
        win32gui.SetForegroundWindow(self._handle)


# Activate the webpage
w = WindowMgr()
w.find_window_wildcard(".*Videos*")
w.set_foreground()

shell = win32com.client.Dispatch("WScript.Shell")
shell.SendKeys("^a") # CTRL+A
shell.SendKeys("^c") # CTRL+C

try:
    # Python2
    import Tkinter as tk
except ImportError:
    # Python3
    import tkinter as tk
root = tk.Tk()
clip_text = root.clipboard_get()
print (clip_text)

# END OF PROGRAM2

1 Answer:

Answer 0 (score: 1)

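Could you explain what you hope to achieve with this program? From the code, I can tell that you want to open http://www.foxsports.com/nba in a web browser tab, then set focus to that browser, press CTRL+A to select all the text on the page, and press CTRL+C to copy that text to the clipboard.
Finally, you read back what you copied and print it.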
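If you do keep the browser-based steps described above, one possible reason the combined script fails is timing. This is an assumption, since the question does not include an error message: `webbrowser.open` usually returns before the page has loaded, so the window title may not match yet when the second part looks for it. A minimal sketch of a polling helper follows; `wait_until` and its parameters are hypothetical names introduced here for illustration.

```python
import time


def wait_until(predicate, timeout=15.0, interval=0.5):
    """Call predicate() repeatedly until it returns a truthy value.

    Returns that value, or None if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage with the question's WindowMgr (Windows only):
#   def find_handle():
#       w.find_window_wildcard(".*Videos*")
#       return w._handle
#   if wait_until(find_handle) is not None:
#       w.set_foreground()
```

With something like this, the window search is retried for a few seconds instead of running once immediately after the browser is launched.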

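If my assumption is correct and, for some reason, you only want the text of the web page, then I suggest using urllib to fetch the page source and BeautifulSoup to parse it.

An example: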

# Python 3; on Python 2 use `import urllib` and `urllib.urlopen` instead
from urllib.request import urlopen
from bs4 import BeautifulSoup, SoupStrainer

source = urlopen("http://www.foxsports.com/nba").read()
# Parse only the anchor tags (<a href="">...</a>)
soup = BeautifulSoup(source, "html.parser", parse_only=SoupStrainer("a"))
links_list = soup.find_all("a")
for link in links_list[:10]:
    print(link.text)

This finds all the links on the page and prints the text of only the first 10. You can use BeautifulSoup to adjust which data gets parsed (based on tags).
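To make the idea of tag-based filtering concrete, here is a rough sketch that uses only the standard library's `html.parser` instead of BeautifulSoup, so it runs without installing anything. It is not the BeautifulSoup API; `TagTextCollector` and `text_by_tag` are hypothetical names made up for this illustration, and the HTML string is invented sample input.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text found inside each outermost occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while the parser is inside the target tag
        self.texts = []     # one entry per outermost matching element
        self._parts = []    # text fragments of the element being read

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self._parts = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self._parts).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._parts.append(data)


def text_by_tag(html, tag):
    """Return the text of every element named `tag` in the HTML string."""
    collector = TagTextCollector(tag)
    collector.feed(html)
    collector.close()
    return collector.texts


# Invented sample input, standing in for a downloaded page source
sample = '<h1>NBA</h1><a href="/x">Scores</a><p>News <a href="/y">Videos</a></p>'
print(text_by_tag(sample, "a"))
print(text_by_tag(sample, "p"))
```

Changing the tag name passed to `text_by_tag` selects a different part of the page, which is the same adjustment the answer describes making with BeautifulSoup.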