Question

I wrote a small middleware to track user activity:

class AccessLogs(object):

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)

        if "/media/" not in request.path:
            try:
                ActivityLog(user=request.user, pageURL=request.path).save()
            except Exception as e:
                print(e)

        return response

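Is there any way to get the page title with this middleware approach? I have looked at a lot of things here, such as TemplateView and custom responses, but nothing seems to work. Is there a class or function that can retrieve the title of the visited page? Any help would be greatly appreciated.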

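EDIT: What I am looking for is a way to get the title of the page the user just visited, so that I can store it in the database along with the other information in this middleware.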

Answer 1

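Yes, although not all responses are HTML responses, and not all HTML responses have a title. But we can make a best effort to get the title out of the response.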

For this, we can use an HTML scraper such as beautifulsoup4 [PyPI]. You may need to install it first:

pip install beautifulsoup4 lxml

We can then aim to get the title from a response with:

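from bs4 import BeautifulSoup

def get_response_title(response):
    try:
        # Parse the rendered body and read the text of the first <title> tag.
        soup = BeautifulSoup(response.content, 'lxml')
        return soup.find('title').getText()
    except AttributeError:
        # No <title> tag (find() returned None), or the response has no .content.
        return None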

You can then use this in your middleware, for example:

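class AccessLogs(object):

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        if '/media/' not in request.path:
            try:
                title = get_response_title(response)
                ActivityLog(user=request.user, title=title, pageURL=request.path).save()
            except Exception as e:
                print(e)
        # The middleware must hand the response back to Django.
        return response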
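
Since not every response is HTML, one option is to look at the Content-Type header before parsing anything. The helper below is only a sketch: the name looks_like_html is made up for this example, and the set of media types it accepts is an assumption you may want to adjust.

```python
def looks_like_html(content_type):
    """Return True when a Content-Type header value denotes an HTML document."""
    if not content_type:
        # Missing or empty header: treat it as "not HTML" and skip parsing.
        return False
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")
```

Inside the middleware this could gate the call, e.g. only run get_response_title(response) when looks_like_html(response.get('Content-Type')) is true, so that JSON, file downloads and similar responses are never handed to the parser.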

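That being said, as @IainShelvington says, this will slow down processing, because we inspect the response on every request. Some web development frameworks, such as Yesod [yesodweb.com], set the title as a variable that is passed in the handler, which makes it easier to detect.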
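
Something similar can be approximated in Django without parsing HTML, under some assumptions: the views return a TemplateResponse (class-based views such as TemplateView do, the render() shortcut does not), and they put the title into the template context under a key of your choosing. Here "page_title" is a hypothetical key, TitleFromContext is a made-up name, and record() is a placeholder for the ActivityLog(...).save() call. The sketch relies on Django's process_template_response middleware hook, which runs before the template is rendered.

```python
class TitleFromContext:
    """Middleware sketch: read the title from the template context, not the HTML."""

    TITLE_KEY = "page_title"  # hypothetical context key; the views must set it

    def __init__(self, get_response):
        self.get_response = get_response
        self.seen = []  # in-memory stand-in for the database, for illustration only

    def __call__(self, request):
        response = self.get_response(request)
        # Filled in by process_template_response, if Django called it for this request.
        title = getattr(request, "page_title", None)
        self.record(request, title)
        return response

    def process_template_response(self, request, response):
        # context_data may be None when the view passed no context.
        context = getattr(response, "context_data", None) or {}
        request.page_title = context.get(self.TITLE_KEY)
        return response

    def record(self, request, title):
        # Placeholder: in the real middleware this is where
        # ActivityLog(user=request.user, title=title, pageURL=request.path).save() would go.
        self.seen.append(title)
```

For responses that never pass through process_template_response the title simply stays None, so this can be combined with the HTML-parsing fallback if needed.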

Django: get the page title from middleware
