Question

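How can I move the last part of the code shown below onto another line, or otherwise change it (with less code, if possible), to get the result I want? I have typed in the following code: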

import pandas as pd
import requests
from bs4 import BeautifulSoup
res = requests.get("http://web.archive.org/web/20070826230746/http://www.bbmf.co.uk/july07.html")
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')[0]

df = pd.read_html(str(table))
df = df[1]
df = df.rename(columns=df.iloc[0])
df = df.iloc[2:]
df.head(15)

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H') | (df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') | df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'SS')] 
Southport

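What I want is for the following data to be shown: displays only, and only Dakota, Spitfire and Hurricane, or Dakota and Spitfire, or Dakota and two Spitfires, if they appear in the schedule in the data table. That is the entire code. The line starting with Southport = is the one that needs editing: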

When I run the code I get the following traceback, which I think is caused by the line of code being too long:

File "<ipython-input-1-518a9f1c8e98>", line 23
    Southport
            ^
SyntaxError: invalid syntax

I am running the code in Jupyter Notebook.

Answer 1

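I think this is a typical copy/paste error. You just need to remove the (df[ that follows the first (df['Hurricane'] == 'H') | and the second df[ that follows the next |. After that there should at least be no more syntax error.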
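To confirm that the problem is unbalanced brackets rather than line length, the brackets can simply be counted. The sketch below copies the line from the question into a string, counts opening versus closing brackets, and uses the built-in compile() to check the syntax before and after the two removals described above. The helper name compiles is made up for this illustration, and nothing is executed, so no DataFrame is needed.

```python
# The one-line expression from the question, split only for readability;
# the pieces are joined back into the original text.
line = (
    "Southport = df[df['Location'].str.contains('- Display')"
    " & (df['Lancaster'] == '') & (df['Dakota'] == 'D')"
    " & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H')"
    " | (df[df['Location'].str.contains('- Display')"
    " & (df['Lancaster'] == '') & (df['Dakota'] == 'D')"
    " & (df['Spitfire'] == 'S')"
    " | df[df['Location'].str.contains('- Display')"
    " & (df['Lancaster'] == '') & (df['Dakota'] == 'D')"
    " & (df['Spitfire'] == 'SS')]"
)

# Opening minus closing counts. A naive count is enough here because
# none of the string literals in the line contain brackets.
pairs = {"[": "]", "(": ")"}
unclosed = {o: line.count(o) - line.count(c) for o, c in pairs.items()}
print(unclosed)  # {'[': 2, '(': 1}


def compiles(source):
    """Return True if the source parses; the code is never run."""
    try:
        compile(source, "<cell>", "exec")
        return True
    except SyntaxError:
        return False


# Drop the stray "(df[" and "df[" that follow the two | operators.
fixed = line.replace("| (df[df[", "| df[").replace("| df[df[", "| df[")

print(compiles(line + "\nSouthport"))   # False
print(compiles(fixed + "\nSouthport"))  # True
```

Because the brackets are still open at the end of the long line, Python keeps reading into the next line, which is probably why the traceback points at Southport instead of at the expression itself.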

However, the logic is more verbose than it needs to be, because df['Location'].str.contains('- Display'), (df['Lancaster'] == '') and (df['Dakota'] == 'D') are part of every one of the boolean terms separated by |.

Furthermore, df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') is a superset of df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H'), which means that once you have the former, the latter contributes no additional rows and can be left out entirely.
So everything reduces to

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') | df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'SS')]

which can be written more briefly as

Southport = df[(df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D')) & ((df['Spitfire'] == 'S') | (df['Spitfire'] == 'SS'))]

because A & B & C | A & B & D is the same as A & B & (C | D).
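Both simplification steps can be checked by brute force over every combination of truth values. This is a minimal sketch in plain Python; the names a, b, c, d, s and h are placeholders for the individual conditions and do not come from the original code.

```python
from itertools import product

# Absorption: (s & h) | s always equals s, so the branch that also
# requires Hurricane == 'H' adds no rows beyond the plain 'S' branch.
absorption_holds = all(
    ((s & h) | s) == s
    for s, h in product([False, True], repeat=2)
)

# Factoring: a & b & c | a & b & d equals a & b & (c | d),
# because & binds more tightly than |.
factoring_holds = all(
    (a & b & c | a & b & d) == (a & b & (c | d))
    for a, b, c, d in product([False, True], repeat=4)
)

print(absorption_holds, factoring_holds)  # True True
```

The same identities carry over element-wise to boolean pandas Series, assuming the masks contain only True and False.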

If I'm not mistaken, a pattern like == 'S' or == 'SS' is better expressed with pandas' isin:

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & df['Spitfire'].isin(['S', 'SS'])]
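As a quick check that isin gives the same mask as the two chained comparisons, here is a small sketch on a made-up Series. The values are invented for illustration and are not taken from the scraped table.

```python
import pandas as pd

# Invented sample values standing in for the 'Spitfire' column.
spitfire = pd.Series(["S", "SS", "", "SSS", "S"], name="Spitfire")

# Two explicit comparisons joined with | ...
either = (spitfire == "S") | (spitfire == "SS")

# ... and the equivalent membership test.
member = spitfire.isin(["S", "SS"])

print(member.tolist())        # [True, True, False, False, True]
print(either.equals(member))  # True
```

isin also scales better if more accepted values are added later, since only the list changes.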

Answer 2

You have a ridiculously long line that would benefit from being split up:

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H') | (df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') | df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'SS')]

Let's do that, and maybe we can spot the error:

Southport = df[
    df['Location'].str.contains('- Display') & 
    (df['Lancaster'] == '') & 
    (df['Dakota'] == 'D') & 
    (df['Spitfire'] == 'S') & 
    (df['Hurricane'] == 'H') | 
    (df[
        df['Location'].str.contains('- Display') & 
        (df['Lancaster'] == '') & 
        (df['Dakota'] == 'D') & 
        (df['Spitfire'] == 'S') | 
        df[
            df['Location'].str.contains('- Display') & 
            (df['Lancaster'] == '') & 
            (df['Dakota'] == 'D') & 
            (df['Spitfire'] == 'SS')]

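What stands out to me is that the brackets and parentheses are not properly matched: there are more opening ones (two opening square brackets and one opening parenthesis, to be precise) than closing ones. That is consistent with you getting a SyntaxError. So let's try to fix this: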

Southport = df[
        # Each comparison is parenthesised because & binds more tightly than ==.
        # The three masks are OR-ed inside a single df[...], so | combines
        # boolean masks rather than DataFrames.
        (
            df['Location'].str.contains('- Display') &
            (df['Lancaster'] == '') &
            (df['Dakota'] == 'D') &
            (df['Spitfire'] == 'S') &
            (df['Hurricane'] == 'H')
        ) | (
            df['Location'].str.contains('- Display') &
            (df['Lancaster'] == '') &
            (df['Dakota'] == 'D') &
            (df['Spitfire'] == 'S')
        ) | (
            df['Location'].str.contains('- Display') &
            (df['Lancaster'] == '') &
            (df['Dakota'] == 'D') &
            (df['Spitfire'] == 'SS')
        )
    ]

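I hope this is closer to what you were trying to write (though I'm not sure what exactly you were trying to write in the first place). Notice how spacing the code out like this makes it much easier to see how the conditions line up, which comparisons are nested inside which others, and so on.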
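One thing to watch when laying out conditions like these: each comparison needs its own parentheses, because & binds more tightly than ==. The sketch below uses the standard ast module to show how Python groups the two spellings. The names a and b are hypothetical stand-ins for columns such as df['Lancaster'] and df['Dakota'].

```python
import ast

# Parse both spellings without evaluating them.
unparenthesised = ast.parse("a == '' & b == 'D'", mode="eval").body
parenthesised = ast.parse("(a == '') & (b == 'D')", mode="eval").body

# Without parentheses, & is applied first, so the expression is the
# chained comparison  a == ('' & b) == 'D'.
print(type(unparenthesised).__name__)                 # Compare
print(type(unparenthesised.comparators[0]).__name__)  # BinOp

# With parentheses, two separate comparisons are combined by &.
print(type(parenthesised).__name__)                   # BinOp
print(type(parenthesised.left).__name__)              # Compare
```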

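A general rule of thumb, which your particular IDE may or may not yell at you about, is to keep line lengths below a certain threshold (e.g. 120 characters). Sometimes a single expression spans a whole line and goes past that limit because of long variable names, and that's fine. In general, though, the rule exists to encourage exactly this: splitting lines up in a way that makes the code clearer to read.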
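Another way to stay under a line-length limit is to give each condition a name and combine the names at the end. This is only a sketch: the DataFrame toy and its rows are made up, and only the column names follow the question.

```python
import pandas as pd

# Made-up rows that mimic the column layout of the scraped table.
toy = pd.DataFrame({
    "Location": ["Town A - Display", "Town B - Display",
                 "Town C - Flypast", "Town D - Display"],
    "Lancaster": ["", "", "", "L"],
    "Dakota": ["D", "D", "D", "D"],
    "Spitfire": ["S", "SS", "S", "S"],
    "Hurricane": ["H", "", "", "H"],
})

# One short, named boolean mask per condition.
is_display = toy["Location"].str.contains("- Display")
no_lancaster = toy["Lancaster"] == ""
has_dakota = toy["Dakota"] == "D"
one_or_two_spitfires = toy["Spitfire"].isin(["S", "SS"])

# The final filter now fits comfortably on one line.
southport = toy[is_display & no_lancaster & has_dakota & one_or_two_spitfires]

print(southport["Location"].tolist())  # ['Town A - Display', 'Town B - Display']
```

Named masks also make it easy to inspect each condition on its own when a filter returns unexpected rows.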
