Question

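I don't know if this is possible; I haven't seen it done anywhere online. In Excel, I have crosstab data formatted by location/city, all in the same spreadsheet and running to thousands of rows. Below is a simple example.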

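I'd like to run a Python Excel parser that takes this formatted data and un-formats it into a raw data layout so that it can be loaded into a database table. Is this possible? The desired result would look like this.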

Answer 1

Pandas has a method for reading Excel files, which is neat because you get a DataFrame out of it, and that may make scanning and custom parsing easier.

import pandas as pd

# Placeholders: substitute your own workbook path, sheet, and column
file_path = "data.xlsx"
sheet_name = 0   # first sheet
column_name = 0  # with header=None, columns are labeled 0, 1, 2, ...

# Read the Excel file
xl = pd.ExcelFile(file_path)
# Parse the desired sheet; header=None keeps every sheet row as data,
# so the first table title is not consumed as column labels
df = xl.parse(sheet_name, header=None)

# Row indices of every table title
tbl_title = []

# Scan one column to find the rows where each table's header starts
for i, n in enumerate(df.loc[:, column_name]):
    if n == 'P':  # the first column of your table header, used as the cue
        tbl_title.append(i - 1)  # row just above the header: Frisco, Dallas, etc.

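Once you have the indices of all the table titles, you can write another table-reader function that walks the DataFrame over the specific rows of each table.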
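
A minimal sketch of such a reader function, under some assumptions that the question does not confirm: the sheet was read with header=None, each table has its title (the city) in the row directly above a header row whose first cell is 'P', and tables are separated by blank rows. The function name unstack_blocks, the "location" column, and the sample columns Q and R are hypothetical, made up for illustration.

```python
import pandas as pd


def unstack_blocks(df, cue="P", key_col=0):
    """Flatten stacked per-city tables into one long DataFrame.

    Assumes df was read with header=None, so row positions match the sheet,
    and that every header row has a title row directly above it.
    """
    # Positions of every header row (first cell equals the cue)
    header_rows = [i for i, v in enumerate(df.iloc[:, key_col]) if v == cue]
    frames = []
    for n, h in enumerate(header_rows):
        # A block stops just before the next table's title row, or at the end
        end = header_rows[n + 1] - 1 if n + 1 < len(header_rows) else len(df)
        title = df.iloc[h - 1, key_col]  # e.g. "Frisco"
        # Data rows of this table, with fully blank separator rows dropped
        block = df.iloc[h + 1:end].dropna(how="all").copy()
        block.columns = df.iloc[h].tolist()  # header row becomes column names
        block.insert(0, "location", title)   # hypothetical column name
        frames.append(block)
    return pd.concat(frames, ignore_index=True)


# Made-up sample standing in for the parsed sheet
raw = pd.DataFrame([
    ["Frisco", None, None],
    ["P", "Q", "R"],
    [1, 2, 3],
    [4, 5, 6],
    [None, None, None],
    ["Dallas", None, None],
    ["P", "Q", "R"],
    [7, 8, 9],
])
flat = unstack_blocks(raw)
print(flat)
```

The resulting frame has one row per original data row, with the city repeated in its own column, which is the shape a database table generally expects. If your tables have different widths or no blank separators, the slicing logic would need adjusting.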

