I want to apply filters to a spreadsheet using Python. Which module is more useful for this: Pandas or something else?
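Answer 0 (score: 1)
Filtering within a Pandas dataframe can be done with loc (among other methods). I think what you are looking for is a way to export the dataframe to Excel and apply the filter in Excel.
XlsxWriter (written by John McNamara) covers almost every xlsx/pandas use case and is well documented here --> https://xlsxwriter.readthedocs.io/
Autofilter is one of its options :) https://xlsxwriter.readthedocs.io/worksheet.html?highlight=auto%20filter#worksheet-autofilter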
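As a rough illustration of the autofilter option mentioned above, here is a minimal sketch that uses XlsxWriter on its own, without pandas. The sample rows and the in-memory buffer are assumptions made for this example; a file name could be passed to Workbook() instead.

```python
import io
import xlsxwriter

# Hypothetical sample rows; the first row is the header.
rows = [
    ["name", "dept"],
    ["Joe", "Marketing"],
    ["Bob", "IT"],
    ["Alice", "Marketing"],
    ["Susan", "Sales"],
]

# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.BytesIO()
workbook = xlsxwriter.Workbook(buffer, {"in_memory": True})
worksheet = workbook.add_worksheet("Sheet1")

for row_num, row in enumerate(rows):
    worksheet.write_row(row_num, 0, row)

# Add filter drop-downs to the header row, covering the whole data range.
# Arguments are (first_row, first_col, last_row, last_col), zero-indexed.
last_row = len(rows) - 1
last_col = len(rows[0]) - 1
worksheet.autofilter(0, 0, last_row, last_col)

workbook.close()
xlsx_bytes = buffer.getvalue()
```

This only adds the filter drop-downs. No rows are hidden until a filter condition is applied, either by the user in Excel or with filter_column() plus hidden rows as in the pandas example further down.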
Answer 1 (score: 0)
I'm not sure I understand your question. Perhaps pandas and qgrid could help you.
Answer 2 (score: 0)
Simple filtering in pandas can be done with the .loc DataFrame method.
In [3]: import pandas as pd
In [4]: data = ({'name': ['Joe', 'Bob', 'Alice', 'Susan'],
   ...:          'dept': ['Marketing', 'IT', 'Marketing', 'Sales']})
In [5]: employees = pd.DataFrame(data)
In [6]: employees
Out[6]:
    name       dept
0    Joe  Marketing
1    Bob         IT
2  Alice  Marketing
3  Susan      Sales
In [7]: marketing = employees.loc[employees['dept'] == 'Marketing']
In [8]: marketing
Out[8]:
    name       dept
0    Joe  Marketing
2  Alice  Marketing
You can also use .isin() together with .loc to select multiple values in the same column:
In [9]: marketing_it = employees.loc[employees['dept'].isin(['Marketing', 'IT'])]
In [10]: marketing_it
Out[10]:
    name       dept
0    Joe  Marketing
1    Bob         IT
2  Alice  Marketing
You can also pass multiple conditions to .loc, combined with the and (&) or or (|) operators, to select values based on several columns:
In [11]: joe = employees.loc[(employees['dept'] == 'Marketing') & (employees['name'] == 'Joe')]
In [12]: joe
Out[12]:
   name       dept
0   Joe  Marketing
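The session above only demonstrates the & operator. As a small sketch, the same employees data can be filtered with | (or) and ~ (not). The variable names sales_or_joe and not_marketing are made up for this example.

```python
import pandas as pd

employees = pd.DataFrame({
    "name": ["Joe", "Bob", "Alice", "Susan"],
    "dept": ["Marketing", "IT", "Marketing", "Sales"],
})

# OR: rows where the department is Sales or the name is Joe.
sales_or_joe = employees.loc[
    (employees["dept"] == "Sales") | (employees["name"] == "Joe")
]

# NOT: invert a boolean mask with ~ to exclude Marketing.
not_marketing = employees.loc[~(employees["dept"] == "Marketing")]

print(sales_or_joe["name"].tolist())   # ['Joe', 'Susan']
print(not_marketing["name"].tolist())  # ['Bob', 'Susan']
```

Each comparison needs its own parentheses because & and | bind more tightly than == in Python.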
Answer 3 (score: 0)
Here is an example that uses XlsxWriter to add an autofilter to a worksheet exported from Pandas:
import pandas as pd
# Create a Pandas dataframe by reading some data from a space-separated file.
df = pd.read_csv('autofilter_data.txt', sep=r'\s+')
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_autofilter.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object. We also turn off the
# index column at the left of the output dataframe.
df.to_excel(writer, sheet_name='Sheet1', index=False)
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Get the dimensions of the dataframe.
(max_row, max_col) = df.shape
# Make the columns wider for clarity.
worksheet.set_column(0, max_col - 1, 12)
# Set the autofilter.
worksheet.autofilter(0, 0, max_row, max_col - 1)
# Add an optional filter criteria. The placeholder "Region" in the filter
# is ignored and can be any string that adds clarity to the expression.
worksheet.filter_column(0, 'Region == East')
# It isn't enough to just apply the criteria. The rows that don't match
# must also be hidden. We use Pandas to figure out which rows to hide.
for row_num in (df.index[(df['Region'] != 'East')].tolist()):
    worksheet.set_row(row_num + 1, options={'hidden': True})
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()
The data used in this example is here.
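Since the data file is not reproduced here, the following is a self-contained sketch of the same approach with a small made-up dataframe standing in for autofilter_data.txt. The column values are invented for illustration, and the workbook is written to an in-memory buffer rather than to pandas_autofilter.xlsx.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of autofilter_data.txt.
df = pd.DataFrame({
    "Region": ["East", "East", "South", "North", "West", "East"],
    "Item": ["Apple", "Pear", "Pear", "Apple", "Grape", "Grape"],
    "Volume": [9000, 5000, 9000, 2000, 7000, 8000],
})

buffer = io.BytesIO()

# The context manager closes the writer, which also writes the workbook.
with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)
    worksheet = writer.sheets["Sheet1"]

    # Cover the header row plus all data rows with the autofilter.
    max_row, max_col = df.shape
    worksheet.autofilter(0, 0, max_row, max_col - 1)
    worksheet.filter_column(0, "Region == East")

    # Worksheet row 0 is the header, so dataframe row i is worksheet row i + 1.
    hidden_rows = (df.index[df["Region"] != "East"] + 1).tolist()
    for row_num in hidden_rows:
        worksheet.set_row(row_num, options={"hidden": True})

print(hidden_rows)  # [3, 4, 5]
```

The + 1 offset relies on the dataframe having the default 0-based integer index and on index=False being passed to to_excel(); with a different index, the row positions would have to be computed another way.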