I'm not unfamiliar with programming in general, but I am new to larger programs where I need to create/import my own modules. I've done it before in C, but that was years ago... and this is Python.
I'm looking for guidance on organization. I finally figured out HOW to import a .py file into my project (and what it should look like) and how to add the path to the Windows variables, but now I'm curious whether I'm doing things 'correctly', or what the best practices are. Below I've listed a bunch of links I've already read that didn't answer my question, but I figured I'd try to make this post a one-stop shop, since I can see this has been a hot topic over the years.
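For context, here is a minimal sketch of the layout I mean (the stevens_tools name matches the imports further down; the other folder and path names are only placeholders):

# my_modules/                    <- the folder added to the Windows path variable
#     stevens_tools/
#         __init__.py            <- empty; just marks the folder as a package
#         scraper_tools.py
#         get_all_tags.py
#     test_tag_counter.py

# The same thing can be done per script instead of through the Windows
# variables; the path below is only a placeholder for wherever my_modules is:
import sys
sys.path.append(r'C:\my_modules')

from stevens_tools import scraper_tools   # now resolvable from anywhere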
I'm trying to make a multi-purpose module full of scraping functions, so that in a test file I can do what I need with just ONE line, i.e. pass in a URL and get back a sorted list of every HTML tag on the page and its frequency. (This is just an experiment while trying to learn about organization and external files.) It's been painful, because if something goes wrong I have to change several different files.
The error I'm getting is: "request = scraper_tools.get_request(url, data=None, headers=scraper.reg_header) NameError: name 'scraper' is not defined"
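For reference, the traceback mentions the name scraper while the file below imports the module as scraper_tools; whichever name the module is bound to at import time is the one that has to appear in front of reg_header. A minimal sketch of the two spellings (the alias form is purely illustrative, not something from my files):

url = 'https://www.quackit.com/html/tags'

# plain import: the full module name is the prefix
from stevens_tools import scraper_tools
request = scraper_tools.get_request(url, data=None, headers=scraper_tools.reg_header)

# aliased import: the alias becomes the prefix instead
from stevens_tools import scraper_tools as scraper
request = scraper.get_request(url, data=None, headers=scraper.reg_header)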
Am I doing something wrong, and is there a better way? (I'm assuming there is) :)
My code looks like this:
scraper_tools.py
#!my_modules/python
# Filename: scraper_tools.py

import os
import requests
import bs4 as bs

phone_header = {'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 9_2 like mac OS X)'}
reg_header = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; rv:52.0) Gecko/20100101 Firefox/52.0'}


def make_soup(request, parser):
    # Make soup from the response body using the given parser
    return bs.BeautifulSoup(request.text, parser)


def get_request(url, data, headers, **kwargs):
    # Fetch the URL; only send custom headers when some were supplied
    try:
        if headers:
            return requests.get(url, data=data, headers=headers, **kwargs)
        return requests.get(url, data=data, **kwargs)
    except Exception as e:
        print(e)


def get_all_items(soup, tag):
    # Return every occurrence of the given tag in the soup
    return soup.find_all(tag)


def open_file_write(path, filename):
    # Open a file for writing inside the given directory
    save_path = path
    return open(os.path.join(save_path, filename), 'w')


def get_all_links(soup):
    # Collect every absolute (http/https) link on the page
    href_tags = soup.find_all(href=True)
    link_list = []
    for tag in href_tags:
        if 'http' in tag['href'][0:4]:
            link_list.append(tag['href'])
    return link_list
get_all_tags.py
'''
Author: Steven Smith
Email: StevenSmithCIS@gmail.com
Date: 9/1/2017
Description: This file uses an online resource website to dynamically get all
common HTML tags in a list, which is then used to count the elements inside a
specific web page (and therefore know something about the quantity of each
particular tag).
'''
from stevens_tools import scraper_tools
import operator
import requests

html_tag_website_url = 'https://www.quackit.com/html/tags'
soup = None
tag_qnty_dict = None
request_object = None


def get_html_tags():
    # Get all the HTML tag names currently listed in the soup
    all_tags = []
    ul_lists = soup.find_all('ul', {'class': 'col-3 taglist'})
    for li in ul_lists:
        for item in li.find_all('a'):
            all_tags.append(item.text)
    return all_tags


def get_all_tags_from(url):
    # Returns a dictionary of all tags found in the passed-in URL,
    # mapping each tag to its quantity
    global soup, tag_qnty_dict
    request = scraper_tools.get_request(url, data=None, headers=scraper_tools.reg_header)
    soup = scraper_tools.make_soup(request, 'lxml')
    tag_qnty_dict = {}
    tags = get_html_tags()
    if tags:
        for tag in tags:
            # If there is more than 0 items, add it to the dictionary
            item_qnty = len(scraper_tools.get_all_items(soup, tag))
            if item_qnty > 0:
                tag_qnty_dict.update({tag: item_qnty})
    return tag_qnty_dict


def sort_items(reverse):
    # Sorts items in the tag dictionary by quantity, largest first
    # when reverse is True
    return sorted(tag_qnty_dict.items(), key=operator.itemgetter(1), reverse=reverse)


def print_all(tag_dict):
    # Print each tag and its quantity, largest first
    for item in sorted(tag_dict.items(), key=operator.itemgetter(1), reverse=True):
        print('Tag = ' + item[0] + " Quantity: = " + str(item[1]))
test_tag_counter.py
from stevens_tools import get_all_tags
get_all_tags.print_all(get_all_tags.get_all_tags_from('https://www.goodreads.com/list/tag/best'))
^^^^^^^^^^^^^^^^^^ These names aren't too crazy, but... they're descriptive! lol
Other threads I've already been through:

Python Packages and Modules (importing modules/packages in Python)
http://mikegrouchy.com/blog/2012/05/be-pythonic-init__py.html (using __init__.py as the module/package identifier)
create Python package and import modules (importing each file vs. importing once)
Why installing package and module not same in Python? (import version issues, Python 3.4 vs. 2.x)
What's the difference between a Python module and a Python package? (<- see the name lol)
What's the difference between "package" and "module" (<- see the name)
Remove package and module name from sphinx function (removing the module name)
importing package and modules from another directory in python (<- uses sys)
Best practices when importing in IPython
http://docs.python-guide.org/en/latest/writing/structure/#modules
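Since a couple of the links above are about __init__.py, here is my rough sketch of what that file would hold for a layout like mine (the re-export lines are optional; an empty file is enough to mark the package):

# stevens_tools/__init__.py
# Marks the folder as a package; the two imports below are optional and only
# let callers write `import stevens_tools` and reach the submodules as
# attributes, e.g. stevens_tools.scraper_tools.reg_header
from . import scraper_tools
from . import get_all_tags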