Question

假设我有一个这样的Excel工作表，

如果我以大熊猫格式读取此文件，则可以获取Column1，Column2，Column3作为标题。

但是，我想知道/创建一个输出，可能是像这样的字典

{Column1: 'A', Column2: 'B', Column3: 'C'}

原因是我从主映射文件中又有了一个字典（已经有手动完成的每一列的引用），它具有对每个Column的所有引用，

{Column1: 'A', Column2: 'B', Column3: 'C', Column4: 'D'}

这样，我可以交叉检查键和值，然后如果有任何不匹配，我可以识别出那些不匹配。在将文件读入熊猫时，如何获得原始列名，例如A的{{1}}等？有什么想法吗？

Answer 1

您可以将html { position: relative; min-height: 100%; } body { /* Margin bottom by footer height */ margin-bottom: 300px; } .footer { position: absolute; width: 100%; height: 300px; } /* Taller footer on small screens */ @media (max-width: 34em) { body { margin-bottom: 500px; } .footer { height: 500px; } } footer { padding-top:30px; padding-bottom:20px; background-color: #2F4454; color:#bbb; font: 400 13px/1.2em 'Open Sans',sans-serif; } footer a { color: #999; text-decoration:none; } footer a:hover, footer a:focus { color: #aaa; text-decoration:none; border-bottom:1px dotted #999; } footer .form-control { background-color: #1f2022; box-shadow: 0 1px 0 0 rgba(255, 255, 255, 0.1); border: none; resize: none; color: #d1d2d2; padding: 0.7em 1em; } .form-control { font-size: 0.8em; }与dict一起使用，以将列名映射为字母。假设您最多有26列。

zip

对于26列以上的内容，您可以调整Repeating letters like excel columns?中可用的from string import ascii_uppercase df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=['Column1', 'Column2', 'Column3']) d = dict(zip(df.columns, ascii_uppercase)) print(d) {'Column1': 'A', 'Column2': 'B', 'Column3': 'C'}解决方案

Answer 2

您可以使用Panadas重命名方法来使用现有的映射字典替换数据框列名称：

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html

import pandas as pd

df = pd.DataFrame({'Column1': [1, 2], 'Column2': [3, 4], 'Column3': [5, 6]})

existing_mapping = {'Column1': 'A', 'Column2': 'B', 'Column3': 'C', 'Column4': 'D'}

df = df.rename(columns=existing_mapping)

查找Excel / CSV文件的实际列引用-Python Pandas

2 个答案: