Question

The gist of this post is that I have "23" in my raw data, and I want "23" in my result (not "23.0"). Here is how I tried to handle it with pandas.

My Excel sheet has a coded Region column:

23
11
27
(blank)
25

Initially, I created a DataFrame, and pandas set the dtype of Region to float64:

import pandas as pd
filepath = 'data_file.xlsx'
df = pd.read_excel(filepath, sheet_name=0, header=0)
df

23.0
11.0
27.0
NaN
25.0

If I use fillna() to replace NaN with blanks, pandas converts the dtype to object, which seems to eliminate the decimals.

df.fillna('', inplace=True)
df

23
11
27
(blank)
25

Except that I still get decimals when I convert the DataFrame to a dict:

data = df.to_dict('records')
data

[{'region': 23.0,},
 {'region': 27.0,},
 {'region': 11.0,},
 {'region': '',},
 {'region': 25.0,}]

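Is there a way to create the dict without the decimal places? By the way, I'm writing a generic utility, so I won't always know the column names and/or value types, which means I'm looking for a generic solution (rather than handling Region explicitly).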

Any help is much appreciated, thanks!

Answer 1

The problem is that after fillna(''), your underlying values are still float, even though the column dtype is object:

import numpy as np
import pandas as pd

s = pd.Series([23., 11., 27., np.nan, 25.])

s.fillna('').iloc[0]

23.0

Instead, apply a format to each value, then replace the resulting 'nan' strings:

s.apply('{:0.0f}'.format).replace('nan', '').to_dict()

{0: '23', 1: '11', 2: '27', 3: '', 4: '25'}
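
One caveat with the format-based approach is worth keeping in mind for a generic utility. The sketch below uses plain Python (no pandas) and made-up sample values to show that '{:0.0f}' rounds every float to zero decimal places, so genuinely fractional values are altered as well.

```python
# Format spec used in the answer above: fixed-point, zero decimal places.
fmt = '{:0.0f}'.format

# Illustrative values only, not taken from the question's data.
values = [23.0, 2.7, float('nan')]
formatted = [fmt(v) for v in values]

# Whole-number floats lose the ".0", the fractional value 2.7 is rounded,
# and NaN becomes the string 'nan' (which is why the answer replaces it).
print(formatted)  # ['23', '3', 'nan']
```

If the data can contain real fractions, a type-aware conversion may be a safer fit than blanket formatting.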

Answer 2

Use a custom function that converts values to integers where possible and keeps strings as strings:

import pprint

import pandas as pd

def func(x):
    try:
        return int(x)
    except ValueError:
        return x

df = pd.DataFrame({'region': [1, 2, 3, float('nan')],
                   'col2': ['a', 'b', 'c', float('nan')]})
df.fillna('', inplace=True)
pprint.pprint(df.applymap(func).to_dict('records'))

Output:

[{'col2': 'a', 'region': 1},
 {'col2': 'b', 'region': 2},
 {'col2': 'c', 'region': 3},
 {'col2': '', 'region': ''}]

A variant that also keeps floats as floats:

import pprint

import pandas as pd

def func(x):
    try:
        if int(x) == x:
            return int(x)
        else:
            return x
    except ValueError:
        return x

df = pd.DataFrame({'region1': [1, 2, 3, float('nan')],
                   'region2': [1.5, 2.7, 3, float('nan')],
                   'region3': ['a', 'b', 'c', float('nan')]})
df.fillna('', inplace=True)
pprint.pprint(df.applymap(func).to_dict('records'))

Output:

[{'region1': 1, 'region2': 1.5, 'region3': 'a'},
 {'region1': 2, 'region2': 2.7, 'region3': 'b'},
 {'region1': 3, 'region2': 3, 'region3': 'c'},
 {'region1': '', 'region2': '', 'region3': ''}]
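
The same idea can also be applied after the conversion, directly to the list of row dicts, which avoids touching the DataFrame and works without knowing the column names. This is only a minimal sketch: clean_value and clean_records are hypothetical helper names, and the records list is made-up data shaped like the output of to_dict('records').

```python
import math


def clean_value(value):
    """Return int for whole-number floats, '' for NaN, anything else unchanged."""
    if isinstance(value, float):
        if math.isnan(value):
            return ''
        if value.is_integer():
            return int(value)
    return value


def clean_records(records):
    """Apply clean_value to every value in a list of row dicts."""
    return [{key: clean_value(val) for key, val in row.items()}
            for row in records]


# Hypothetical records, shaped like the output of df.to_dict('records').
records = [{'region': 23.0, 'name': 'a', 'score': 1.5},
           {'region': float('nan'), 'name': 'b', 'score': 2.0}]

cleaned = clean_records(records)
print(cleaned)
# [{'region': 23, 'name': 'a', 'score': 1.5}, {'region': '', 'name': 'b', 'score': 2}]
```

Fractional floats such as 1.5 pass through unchanged, and strings are never inspected, so no try/except is needed here.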

Answer 3

You can add dtype=str:

import pandas as pd

filepath = 'data_file.xlsx'
df = pd.read_excel(filepath, sheet_name=0, header=0, dtype=str)

How to remove decimals from Pandas to_dict() output

3 answers: