Question

I am looking for an efficient Python implementation of a function that takes a decimal-formatted string, e.g.

2.05000
200
0.012

and returns a tuple of two integers representing the significand and exponent of the input in base-10 floating-point format, e.g.

(205,-2)
(2,2)
(12,-3)

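A list comprehension would be a nice bonus.

I have a gut feeling that an efficient (and possibly Pythonic) way of doing this exists, but it eludes me...

Solution applied to pandas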

import pandas as pd
import numpy as np
ser1 = pd.Series(['2.05000', '- 2.05000', '00 205', '-205', '-0', '-0.0', '0.00205', '0', np.nan])

# remove all white spaces
ser1 = ser1.str.replace(' ', '')
parts = ser1.str.split('.').apply(pd.Series)

# strip leading zeros (even those after a minus sign)
# note: .iloc is used here because the .ix indexer was removed in pandas 1.0
parts.iloc[:,0] = '-'*parts.iloc[:,0].str.startswith('-') + parts.iloc[:,0].str.lstrip('-').str.lstrip('0')

parts.iloc[:,1] = parts.iloc[:,1].fillna('')        # fill non-existent decimal places
exponents = -parts.iloc[:,1].str.len()
parts.iloc[:,0] += parts.iloc[:,1]                  # append decimal places to the digits before the decimal point

parts.iloc[:,1] = parts.iloc[:,0].str.rstrip('0')   # strip trailing zeros

exponents += parts.iloc[:,0].str.len() - parts.iloc[:,1].str.len()

# an empty string or a lone minus sign means the value was zero
mask = (parts.iloc[:,1] == '') | (parts.iloc[:,1] == '-')
parts.iloc[mask.values, 1] = '0'
significands = parts.iloc[:,1].astype(float)

df2 = pd.DataFrame({'exponent': exponents, 'significand': significands})
df2

Input:

0      2.05000
1    - 2.05000
2       00 205
3         -205
4           -0
5         -0.0
6      0.00205
7            0
8          NaN
dtype: object

Output:

   exponent  significand
0        -2          205
1        -2         -205
2         0          205
3         0         -205
4         0            0
5         0            0
6        -5          205
7         0            0
8       NaN          NaN

[9 rows x 2 columns]

Answer 1

Take a look at decimal.Decimal:

>>> from decimal import Decimal
>>> s = '2.05000'
>>> x = Decimal(s)
>>> x
Decimal('2.05000')
>>> x.as_tuple()
DecimalTuple(sign=0, digits=(2, 0, 5, 0, 0, 0), exponent=-5)

That is almost what you need; just convert the DecimalTuple into your desired format, for example:

>>> t = Decimal('2.05000').as_tuple()
>>> (''.join(str(x) for i,x in enumerate(t.digits) if any(t.digits[i:])),
... t.exponent + sum(1 for i,x in enumerate(t.digits) if not 
... any (t.digits[i:])))
('205', -2)

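This is just a sketch, but it satisfies your three test cases.

You may want to call .normalize() on your Decimal before processing .as_tuple() (thanks @georg); this takes care of the trailing zeros. That way, you don't need to do as much formatting: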

>>> Decimal('2.05000').normalize().as_tuple()
DecimalTuple(sign=0, digits=(2, 0, 5), exponent=-2)

So your function can be written as:

>>> def decimal_str_to_sci_tuple(s):
...  t = Decimal(s).normalize().as_tuple()
...  return (int(''.join(map(str,t.digits))), t.exponent)
... 
>>> decimal_str_to_sci_tuple('2.05000')
(205, -2)
>>> decimal_str_to_sci_tuple('200')
(2, 2)
>>> decimal_str_to_sci_tuple('0.012')
(12, -3)

(Be sure to take t.sign into account if you need to support negative numbers.)
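The remark about t.sign can be sketched as follows. This is a minimal sketch, assuming the input is a finite decimal string; the name decimal_str_to_signed_sci_tuple is hypothetical and not part of the original answer. Note that normalize() rounds to the current context precision (28 significant digits by default), so longer inputs would be rounded.

```python
from decimal import Decimal

def decimal_str_to_signed_sci_tuple(s):
    # Hypothetical helper: same idea as decimal_str_to_sci_tuple,
    # but the sign flag from the DecimalTuple is applied to the significand.
    t = Decimal(s).normalize().as_tuple()
    significand = int(''.join(map(str, t.digits)))
    if t.sign:  # sign is 1 for negative values, 0 otherwise
        significand = -significand
    return (significand, t.exponent)

# The list comprehension asked for in the question:
results = [decimal_str_to_signed_sci_tuple(s)
           for s in ['2.05000', '-2.05000', '200', '0.012']]
print(results)
```

Negative zero such as '-0.0' comes out as (0, 0), since an integer significand cannot carry the sign of zero.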

Answer 2

If you are looking for scientific notation, you can use Decimal and format:

from decimal import Decimal

numbers = ['2.05000','200','0.01','111']
print(["{:.2E}".format(Decimal(n)) for n in numbers])

Output:

['2.05E+0', '2.00E+2', '1.00E-2', '1.11E+2']

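If instead you are looking to:

1. get the digits without the zeros on the right
2. get the scientific notation up to the rightmost non-zero digit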

from decimal import Decimal

numbers = ['2.05000', '200', '0.01', '111']
# strip trailing zeros, but only from strings that contain a decimal point
numbers = [n.rstrip('0') if '.' in n else n for n in numbers]
for n in numbers:
    if '.' in n:
        num, dec = n.split('.')
        tenthNumber = len(dec)           # number of decimal places left
        print((Decimal(num + dec), -1 * tenthNumber))
    elif n.endswith('0'):
        # count the trailing zeros of an integer string
        tenthNumber = 0
        for ch in reversed(n):
            if ch == '0':
                tenthNumber += 1
            else:
                break
        print((n[:len(n) - tenthNumber], str(tenthNumber)))
    else:
        print((n, 0))

Output:

(Decimal('205'), -2)
('2', '2')
(Decimal('1'), -2)
('111', 0)

Answer 3

Here is a straightforward string-processing solution.

def sig_exp(num_str):
    parts = num_str.split('.', 2)
    decimal = parts[1] if len(parts) > 1 else ''
    exp = -len(decimal)
    digits = parts[0].lstrip('0') + decimal
    trimmed = digits.rstrip('0')
    exp += len(digits) - len(trimmed)
    sig = int(trimmed) if trimmed else 0
    return sig, exp

>>> for x in ['2.05000', '200', '0.012', '0.0']:
...     print(sig_exp(x))
...

(205, -2)
(2, 2)
(12, -3)
(0, 0)

I leave handling negative numbers as an exercise for the reader.
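One possible take on that exercise is sketched below. It assumes the sign, if any, is a single leading '+' or '-' with no embedded whitespace; the name sig_exp_signed is hypothetical and not part of the original answer.

```python
def sig_exp_signed(num_str):
    # Hypothetical variant of sig_exp that accepts one leading sign character.
    num_str = num_str.strip()
    sign = -1 if num_str.startswith('-') else 1
    unsigned = num_str[1:] if num_str[:1] in ('+', '-') else num_str

    # From here on, the logic mirrors sig_exp above.
    parts = unsigned.split('.', 1)
    decimal = parts[1] if len(parts) > 1 else ''
    exp = -len(decimal)
    digits = parts[0].lstrip('0') + decimal
    trimmed = digits.rstrip('0')
    exp += len(digits) - len(trimmed)
    sig = int(trimmed) if trimmed else 0
    return sign * sig, exp

for x in ['-2.05000', '-200', '+0.012', '-0.0']:
    print(sig_exp_signed(x))
```

As with the Decimal-based approach, '-0.0' yields (0, 0) because the integer significand cannot represent a negative zero.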

Answer 4

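Here is an approach that uses venpa's format string (so all credit goes to him) and starts from numbers instead of strings. If you can afford to round the significand (e.g. after 2 decimal digits), you can simply write: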

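def scd_exp(scnum):
    # split on 'e' rather than slicing by position, so that negative
    # numbers and three-digit exponents are parsed correctly as well
    significand, exponent = "{:.2e}".format(scnum).split('e')
    return (float(significand), int(exponent))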


numbers = [2.05, 205, 0.0001576, 111]
for number in numbers:
    print(scd_exp(number))

which gives

(2.05, 0)
(2.05, 2)
(1.58, -4)
(1.11, 2)

If you want to set the rounding of the significand yourself on each function call (e.g. to 6 decimal digits), you can write

def scd_exp(scnum, roundafter):
    # the nested field fills in the precision, e.g. "{:.6e}" for roundafter=6
    significand, exponent = "{:.{}e}".format(scnum, roundafter).split('e')
    return (float(significand), int(exponent))


numbers = [2.05, 205, 0.000157595678, 111]
for number in numbers:
    print(scd_exp(number, 6))

which returns

(2.05, 0)
(2.05, 2)
(1.575957, -4)
(1.11, 2)
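If an integer significand is preferred when starting from a number rather than a string, the format-string idea can be combined with the Decimal approach from Answer 1. The following is only a sketch under the assumption that the input is a finite int or float; the name float_to_sci_tuple is hypothetical, and it relies on repr() producing the shortest string that round-trips the float.

```python
from decimal import Decimal

def float_to_sci_tuple(x):
    # Hypothetical helper: go through repr() so that Decimal sees the
    # short decimal form (e.g. '2.05'), not the exact binary expansion.
    t = Decimal(repr(x)).normalize().as_tuple()
    significand = int(''.join(map(str, t.digits)))
    if t.sign:
        significand = -significand
    return (significand, t.exponent)

numbers = [2.05, 205, 0.0001576, 200.0]
for number in numbers:
    print(float_to_sci_tuple(number))
```

Unlike scd_exp, this does not round to a fixed number of digits, so the significand is not normalized to a single leading digit.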

Extracting the base-10 significand and exponent from a decimal-formatted string
