Script prints args before executing and waits for me to press [Enter] before terminating

Time: 2016-01-21 22:59:35

Tags: python python-2.7

I have the following draft of a scraper:

from lxml import html
import requests
import sys

requestedURL = sys.argv[1]
page = requests.get(requestedURL)
tree = html.fromstring(page.content)

passage = ''
for tr in tree.cssselect("div [class='passage-content passage-class-0']"):
    for each in tr:
        for e in each:
            for x in e:
                if x.text_content() == 'Footnotes:' or x.text_content() == 'Cross references:': 
                    passage += '\n'
                    passage = passage.lstrip('\n')
                    sys.stdout.write(passage)
                    sys.exit(0)
                if not x.text_content()[0].isdigit():
                    passage += '\n\n'+x.text_content()+'\n\n'
                else:
                    passage += x.text_content()
            passage = passage.replace('\n\n\n', '\n\n')

When I run it, I do get the output I want, but I also get two unwanted effects:

  • the args are printed
  • the script doesn't actually finish until I press Enter

Example:

python bg_scrape.py https://www.biblegateway.com/passage/?search=John+3%3A1&version=ESV
[1] 48648

John 3:1

New International Version (NIV)

Jesus Teaches Nicodemus

3 Now there was a Pharisee, a man named Nicodemus who was a member of the Jewish ruling council.

// this line doesn't show up until I hit enter
[1]+  Done  python bg_scrape.py https://www.biblegateway.com/passage/?search=John+3%3A1

It's worth noting that this only started happening once I began reading requestedURL from sys.argv instead of using a static string in the code.

1 answer:

Answer 0: (score: 1)

It's probably the "&" in the command-line argument. Try putting the argument in double quotes: python bg_scrape.py "https://www.biblegateway.com/passage/?search=John+3%3A1&version=ESV"
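To see what the quotes change from the script's side, here is a minimal sketch that parses the query string of the URL the script would receive in each case. The helper name `query_params` is made up for this illustration, and the import fallback is only there so the snippet runs on both Python 2.7 and Python 3.

```python
try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs  # Python 2.7


def query_params(url):
    """Return the URL's query string as a dict of lists (hypothetical helper)."""
    return parse_qs(urlparse(url).query)


# What sys.argv[1] holds when the argument is quoted on the command line.
quoted = "https://www.biblegateway.com/passage/?search=John+3%3A1&version=ESV"
# What sys.argv[1] holds when it is not quoted: the shell stops at the "&".
unquoted = "https://www.biblegateway.com/passage/?search=John+3%3A1"

print(sorted(query_params(quoted)))    # ['search', 'version']
print(sorted(query_params(unquoted)))  # ['search']
```

The missing `version` parameter would be consistent with the sample output in the question showing NIV although ESV was requested, though that depends on how the site chooses its default translation.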

Basically, what's happening is that your shell is actually running two things:

  • python bg_scrape.py https://www.biblegateway.com/passage/?search=John+3%3A1 as a background process
  • then version=ESV, which assigns a shell variable
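The split described above can be approximated with Python's `shlex` module. This is only a rough sketch: `shlex` is not a real shell, the helper name `shell_like_tokens` is invented here, and combining `punctuation_chars` with `whitespace_split` assumes Python 3.8 or later (so it will not run on the Python 2.7 used in the question).

```python
import shlex


def shell_like_tokens(command):
    """Split a command line roughly the way a POSIX shell would (hypothetical helper)."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # only whitespace and shell operators end a word
    return list(lexer)


url = "https://www.biblegateway.com/passage/?search=John+3%3A1&version=ESV"

# Unquoted: "&" comes out as its own token, i.e. a control operator.
print(shell_like_tokens("python bg_scrape.py " + url))

# Quoted: the whole URL, "&" included, stays one argument.
print(shell_like_tokens('python bg_scrape.py "' + url + '"'))
```

In the unquoted case the token list ends with a separate "&" followed by version=ESV, which matches the two commands listed above.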

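When you press Enter, the shell simply reports on any background processes that have finished (in this case, the one you just started). The script has already exited by then; the "Done" line only appears the next time the shell prints a prompt.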