Question

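I don't understand why my pandas DataFrame rounds the values in a column that I create by dividing two other columns. I want the numbers in the new column to have two decimal places, but the values are rounded. I checked the dtypes of the columns and both are "float64".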

import os
import re

import pandas as pd
import numpy as np


# CURRENT DIRECTORY 
cd = os.path.dirname(os.getcwd())

# concatenate csv files
dfList = []

for root, dirs, files in os.walk(cd):
    for fname in files:
        if re.match("output_contigs_SCMgenes.csv", fname):
            frame = pd.read_csv(os.path.join(root, fname))
            dfList.append(frame)    

df = pd.concat(dfList)

# replace NaN in the SCM column with 0 (assign back rather than relying on chained inplace)
df['SCM'] = df['SCM'].fillna(0)

#add column with genes/SCM
df['genes/SCM'] = df['genes']/df['SCM']

The output looks like this:

    genome  contig  genes  SCM  genes/SCM
0    20900      48      1    0        inf
1    20900      37    130  103          1
2    20900      35      1    1          1
3    20900       1     79   66          1
4    20900      66      5    3          2

But I want my last column to contain unrounded values with at least 2 decimal places.

Answer 1

I can reproduce this behavior by setting pd.options.display.precision to 0:

In [4]: df['genes/SCM'] = df['genes']/df['SCM']

In [5]: df
Out[5]:
   genome  contig  genes  SCM  genes/SCM
0   20900      48      1    0        inf
1   20900      37    130  103   1.262136
2   20900      35      1    1   1.000000
3   20900       1     79   66   1.196970
4   20900      66      5    3   1.666667

In [6]: pd.options.display.precision = 0

In [7]: df
Out[7]:
   genome  contig  genes  SCM  genes/SCM
0   20900      48      1    0        inf
1   20900      37    130  103          1
2   20900      35      1    1          1
3   20900       1     79   66          1
4   20900      66      5    3          2

Check your pandas and NumPy options.
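To make the point concrete, here is a minimal sketch (with a small made-up frame, not the asker's CSV data) suggesting that display.precision only changes how the frame is rendered, while the stored float64 values stay untouched. It uses pd.option_context so the setting is restored afterwards.

```python
import pandas as pd

# Small illustrative frame; the numbers mirror the question's sample rows
df = pd.DataFrame({"genes": [130, 1, 79, 5], "SCM": [103, 1, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# Render the same frame under two different display precisions
with pd.option_context("display.precision", 0):
    rounded_view = df.to_string()

with pd.option_context("display.precision", 2):
    two_decimal_view = df.to_string()

# The underlying value is the full-precision quotient either way
stored_value = df.loc[0, "genes/SCM"]

print(rounded_view)
print(two_decimal_view)
print(stored_value)
```

If the rendered output looks rounded but stored_value does not, the display option is the likely culprit rather than the division.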

Answer 2

Try df['genes/SCM'] = (df['genes'] / df['SCM']).round(2)
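One thing to watch with this approach: a method call binds more tightly than /, so the parentheses decide whether the quotient or only the divisor gets rounded. A small sketch with made-up values (the Series names are just for illustration):

```python
import pandas as pd

genes = pd.Series([130, 79, 5])
scm = pd.Series([103, 66, 3])

# Without parentheses, .round(2) applies to scm only, so the quotient is not rounded
without_parens = genes / scm.round(2)

# With parentheses, the quotient itself is rounded to 2 decimal places
with_parens = (genes / scm).round(2)

print(without_parens)
print(with_parens)
```

Note that this changes the stored values, which is different from changing only how they are displayed.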

Answer 3

I can't be sure since I can't reproduce it, but you could try:

from __future__ import division

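at the very top of your script. (This import only has an effect on Python 2; on Python 3, / is already true division.)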
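For what it's worth, a quick check on Python 3 (a sketch with made-up integer columns) suggests that dividing two pandas Series with / gives a float64 result, while // is the operator that floors:

```python
import pandas as pd

genes = pd.Series([130, 79, 5])
scm = pd.Series([103, 66, 3])

ratio = genes / scm         # true division, result dtype is float64
floor_ratio = genes // scm  # floor division, result stays integer

print(ratio.dtype, ratio.tolist())
print(floor_ratio.dtype, floor_ratio.tolist())
```

So if the columns really are numeric and / is used, integer division is unlikely to be the cause.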

Answer 4

To round to a desired number of digits after the decimal point, for example the 2 decimal places asked for in the question:

df.round({'genes/SCM': 2})

For multiple columns:

df.round({'col1_name': 1, 'col2_name': 2})

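Also, check that the precision is not set to 0; pd.set_option('display.precision', 5) can be used to set it appropriately. The 5 here is just an example of the number of digits wanted after the decimal point.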
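Keep in mind that df.round returns a new DataFrame instead of modifying the original, so the result has to be assigned. A minimal sketch on a made-up frame (not the asker's data):

```python
import pandas as pd

df = pd.DataFrame({"genes": [130, 79, 5], "SCM": [103, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# round() leaves df alone and returns a rounded copy
rounded = df.round({"genes/SCM": 2})

print(df["genes/SCM"].tolist())       # still full precision
print(rounded["genes/SCM"].tolist())  # rounded to 2 decimal places
```

Use df = df.round({...}) if the rounded values should replace the originals.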

Answer 5

I ran into a similar problem. If you are reading the data from a CSV file, use the option float_precision='round_trip', as in:

pd.read_csv(resultant_file, sep='\t',float_precision='round_trip')

This preserves the full precision of your values; without this option, the parser trades some precision for speed. See @MarkDickinson's comment.

And if the issue is about how the DataFrame is displayed in a Jupyter notebook, set display.precision as follows:

pd.set_option("display.precision", 20)
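As a rough illustration of the read_csv option, the sketch below parses the same in-memory CSV text twice (the text and column name x are made up) and compares the round_trip result against Python's own float() parsing. Any difference from the default parser, if there is one, would be in the last few bits.

```python
import io

import pandas as pd

# Hypothetical CSV content, held in memory instead of a file
csv_text = "x\n0.1\n1.2345678901234567\n2.675\n"

default_df = pd.read_csv(io.StringIO(csv_text))
round_trip_df = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")

# Reference values parsed by Python's float()
python_floats = [float(s) for s in csv_text.split()[1:]]

print(round_trip_df["x"].tolist() == python_floats)
print((default_df["x"] - round_trip_df["x"]).abs().max())
```

Either way, this concerns tiny parsing differences, not the whole-number rounding shown in the question, which looks like a display setting.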

Why does a Python pandas DataFrame round my values?