How do I parse all the values in a DataFrame column?

Time: 2016-05-20 16:09:02

Tags: python pandas

The DataFrame df has a column named Amount:

import pandas as pd
df = pd.DataFrame(['$3,000,000.00','$3,000.00', '$200.5', '$5.5'], columns = ['Amount'])

DF:

 ID | Amount
 0  | $3,000,000.00
 1  | $3,000.00
 2  | $200.5
 3  | $5.5

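I want to parse every value in the Amount column, extract the amount as a number, and drop the decimal part. The final result should be a DataFrame like this: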

 ID | Amount
 0  | 3000000
 1  | 3000
 2  | 200
 3  | 5

How can I do this?

3 answers:

Answer 0: (score: 7)

You can use str.replace to strip the dollar sign and commas, then cast twice with astype (first to float, then to int):

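# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default
df['Amount'] = df['Amount'].str.replace(r'[\$,]', '', regex=True).astype(float).astype(int)
print(df)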
    Amount
0  3000000
1     3000
2      200
3        5
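
The double cast matters because a string such as '200.5' cannot be converted to int directly; going through float first and then to int truncates the fractional part. A minimal sketch of that behaviour, assuming a pandas version that accepts the regex keyword (the names cleaned, direct_cast_ok and as_int are illustrative, not from the answer):

```python
import pandas as pd

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'],
                  columns=['Amount'])

# Strip the dollar sign and the thousands separators.
cleaned = df['Amount'].str.replace(r'[$,]', '', regex=True)

# Casting the strings straight to int fails because of the decimal point.
try:
    cleaned.astype(int)
    direct_cast_ok = True
except (ValueError, TypeError):
    direct_cast_ok = False

# Going through float first works; float -> int truncates toward zero.
as_int = cleaned.astype(float).astype(int)
print(as_int.tolist())
```

Note that the cast truncates rather than rounds, which matches the "ignore the decimal part" requirement in the question.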

Answer 1: (score: 3)

You need to use the map function on the column and assign the result back to the same column:

import locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' )

df.Amount = df.Amount.map(lambda s: int(locale.atof(s[1:])))

PS: This uses the code from "How do I use Python to convert a string to a number if it has commas in it as thousands separators?" to convert a string containing thousands separators to an int.
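
One caveat with the locale approach is that locale.setlocale raises locale.Error when the requested locale is not installed, and under the default C locale locale.atof does not treat commas as thousands separators. A hedged sketch that falls back to plain string handling in that case (to_int_amount is a hypothetical helper, not part of the answer):

```python
import locale


def to_int_amount(s):
    """Hypothetical helper: turn a string like '$3,000.00' into an int."""
    text = s.lstrip('$')
    try:
        # Works when the active numeric locale uses ',' as thousands separator.
        return int(locale.atof(text))
    except ValueError:
        # Fallback for locales (such as C) that do not strip the commas.
        return int(float(text.replace(',', '')))


try:
    locale.setlocale(locale.LC_NUMERIC, 'en_US.UTF-8')
except locale.Error:
    pass  # locale not installed; to_int_amount uses its fallback path

amounts = [to_int_amount(s)
           for s in ['$3,000,000.00', '$3,000.00', '$200.5', '$5.5']]
print(amounts)
```

With a DataFrame, the same helper could be passed to df.Amount.map in place of the lambda.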

Answer 2: (score: 1)

Code -

import pandas as pd

def format_amount(x):
    x = x[1:].split('.')[0]
    return int(''.join(x.split(',')))

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'], columns=['Amount'])

df['Amount'] = df['Amount'].apply(format_amount)

print(df)

Output -

    Amount
0  3000000
1     3000
2      200
3        5
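
The same steps as format_amount (drop the leading '$', keep the part before the '.', remove the commas) can also be written with pandas' vectorized string methods instead of apply. This is only a sketch of an equivalent formulation, not something the answer proposed; the name result is illustrative:

```python
import pandas as pd

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'],
                  columns=['Amount'])

result = (
    df['Amount']
    .str.lstrip('$')        # drop the leading dollar sign
    .str.split('.')         # single-character pattern is treated literally
    .str[0]                 # keep only the integer part
    .str.replace(',', '')   # remove thousands separators
    .astype(int)
)
print(result.tolist())
```

Because it never goes through float, this variant avoids any floating-point rounding for very large amounts.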