Question

I want to apply filters to a spreadsheet using Python. Which module is more useful for this, Pandas or something else?

Answer 1

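Filtering within your Pandas dataframe can be done with loc (among other methods). I think what you are looking for is a way to export the dataframe to Excel and apply the filter in Excel.

XlsxWriter (written by John McNamara) covers nearly all xlsx/pandas use cases and is well documented here --> https://xlsxwriter.readthedocs.io/.

Autofilter is one of the options :) https://xlsxwriter.readthedocs.io/worksheet.html?highlight=auto%20filter#worksheet-autofilter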

Answer 2

I'm not sure I understand your question. Perhaps pandas and qgrid might help you.

Answer 3

Simple filtering in pandas can be done with the DataFrame .loc method.

In [4]: data = ({'name': ['Joe', 'Bob', 'Alice', 'Susan'],
    ...: 'dept': ['Marketing', 'IT', 'Marketing', 'Sales']})

In [5]: employees = pd.DataFrame(data)

In [6]: employees
Out[6]:
    name       dept
0    Joe  Marketing
1    Bob         IT
2  Alice  Marketing
3  Susan      Sales

In [7]: marketing = employees.loc[employees['dept'] == 'Marketing']

In [8]: marketing
Out[8]:
    name       dept
0    Joe  Marketing
2  Alice  Marketing

You can also use .isin together with .loc to select multiple values in the same column.

In [9]: marketing_it = employees.loc[employees['dept'].isin(['Marketing', 'IT'])]

In [10]: marketing_it
Out[10]:
    name       dept
0    Joe  Marketing
1    Bob         IT
2  Alice  Marketing
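The same .isin mask can be inverted with ~ to exclude values instead of selecting them. This is a minimal sketch that rebuilds the employees frame from above so it runs on its own; the variable name not_marketing_it is only illustrative.

```python
import pandas as pd

# Same sample data as in the session above.
employees = pd.DataFrame({
    'name': ['Joe', 'Bob', 'Alice', 'Susan'],
    'dept': ['Marketing', 'IT', 'Marketing', 'Sales'],
})

# Boolean mask: True where dept is Marketing or IT.
in_marketing_it = employees['dept'].isin(['Marketing', 'IT'])

# ~ flips the mask, so .loc keeps every row NOT in those departments.
not_marketing_it = employees.loc[~in_marketing_it]
print(not_marketing_it)
```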

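You can also pass multiple conditions to .loc, combined with and (&) or or (|), to select on values from multiple columns. Each condition needs its own parentheses.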

In [11]: joe = employees.loc[(employees['dept'] == 'Marketing') & (employees['name'] == 'Joe')]

In [12]: joe
Out[12]:
  name       dept
0  Joe  Marketing
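Only the & case is shown above, so here is a small sketch of the | case, plus the same "and" filter written with DataFrame.query as one possible alternative to .loc. The frame is rebuilt so the snippet runs on its own, and the variable names are illustrative.

```python
import pandas as pd

# Same sample data as in the session above.
employees = pd.DataFrame({
    'name': ['Joe', 'Bob', 'Alice', 'Susan'],
    'dept': ['Marketing', 'IT', 'Marketing', 'Sales'],
})

# "or" across two columns: anyone in Sales, or anyone named Bob.
sales_or_bob = employees.loc[
    (employees['dept'] == 'Sales') | (employees['name'] == 'Bob')
]

# The Marketing-and-Joe filter again, expressed as a query string.
joe = employees.query("dept == 'Marketing' and name == 'Joe'")

print(sales_or_bob)
print(joe)
```

Which form to use is mostly a matter of taste: .loc with masks keeps everything in plain Python expressions, while query can read more naturally when there are many conditions.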

Answer 4

Here is an example of adding an autofilter to a worksheet exported from Pandas using XlsxWriter:

import pandas as pd

# Create a Pandas dataframe by reading some data from a space-separated file.
df = pd.read_csv('autofilter_data.txt', sep=r'\s+')

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_autofilter.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object. We also turn off the
# index column at the left of the output dataframe.
df.to_excel(writer, sheet_name='Sheet1', index=False)

# Get the xlsxwriter workbook and worksheet objects.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

# Get the dimensions of the dataframe.
(max_row, max_col) = df.shape

# Make the columns wider for clarity.
worksheet.set_column(0,  max_col - 1, 12)

# Set the autofilter.
worksheet.autofilter(0, 0, max_row, max_col - 1)

# Add an optional filter criteria. The placeholder "Region" in the filter
# is ignored and can be any string that adds clarity to the expression.
worksheet.filter_column(0, 'Region == East')

# It isn't enough to just apply the criteria. The rows that don't match
# must also be hidden. We use Pandas to figure out which rows to hide.
for row_num in (df.index[(df['Region'] != 'East')].tolist()):
    worksheet.set_row(row_num + 1, options={'hidden': True})

# Close the Pandas Excel writer and output the Excel file.
# (ExcelWriter.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()

The data used in this example is available here.
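Because autofilter_data.txt is not reproduced in this answer, the row-hiding step can be tried on a small stand-in frame. In the sketch below only the 'Region' column name comes from the example; the 'Volume' column, the values, and the custom index are made up for illustration. The loop in the example uses df.index labels as row offsets, which works because read_csv returns a default 0..n-1 index. Computing positions from the mask, as assumed here, should also hold up if the dataframe has some other index.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; only the 'Region' column name is from the example.
df = pd.DataFrame(
    {
        'Region': ['East', 'West', 'East', 'South'],
        'Volume': [100, 200, 300, 400],
    },
    index=[10, 11, 12, 13],  # deliberately not 0..n-1
)

# True for rows that do NOT match the filter and so should be hidden.
mask = df['Region'] != 'East'

# 0-based positions of those rows, independent of the index labels.
positions = np.flatnonzero(mask.to_numpy())

# +1 skips the header row that to_excel writes at worksheet row 0.
rows_to_hide = [int(pos) + 1 for pos in positions]
print(rows_to_hide)
```

Each entry in rows_to_hide could then be passed to worksheet.set_row(row, options={'hidden': True}) exactly as in the loop above.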
