The DataFrame df has a column named Amount:
import pandas as pd
df = pd.DataFrame(['$3,000,000.00','$3,000.00', '$200.5', '$5.5'], columns = ['Amount'])
df:
ID | Amount
0 | $3,000,000.00
1 | $3,000.00
2 | $200.5
3 | $5.5
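I want to parse every value in the Amount column, extract the amount as a number, and drop the fractional part. The final result should be a DataFrame like this: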
ID | Amount
0 | 3000000
1 | 3000
2 | 200
3 | 5
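How can I do this?
Answer 0 (score: 7)
You can use str.replace to strip the dollar sign and commas, followed by a double cast with astype (first to float, then to int):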
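# regex=True is required in pandas 2.0 and later, where str.replace defaults to literal matching
df['Amount'] = df.Amount.str.replace(r'[\$,]', '', regex=True).astype(float).astype(int)
print(df)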
Amount
0 3000000
1 3000
2 200
3 5
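The reason for the double cast can be sketched as follows: a string such as '200.5' is not a valid integer literal, so it has to go through float first, and the float-to-int cast then truncates the fractional part. This is only an illustration of the steps above; the names cleaned, direct_cast_ok, and result are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'], columns=['Amount'])

# Strip the dollar sign and the thousands separators.
# regex=True is passed explicitly because the pattern is a character class.
cleaned = df['Amount'].str.replace(r'[$,]', '', regex=True)

# Plain Python shows why a direct cast to int does not work:
# int() rejects strings that contain a decimal point.
try:
    int('200.5')
    direct_cast_ok = True
except ValueError:
    direct_cast_ok = False

# Casting to float first and then to int truncates the fractional part.
result = cleaned.astype(float).astype(int)
print(result.tolist())
```

Note that the cast truncates rather than rounds, so '$200.5' becomes 200, which matches the output requested in the question.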
Answer 1 (score: 3)
You need to use the map function on the column and assign the result back to the same column:
import locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' )
df.Amount = df.Amount.map(lambda s: int(locale.atof(s[1:])))
PS: This uses the code from "How do I use Python to convert a string to a number if it has commas in it as thousands separators?" to convert a string containing thousands separators to an int.
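One caveat: locale.setlocale raises locale.Error when the en_US.UTF-8 locale is not installed on the system. A minimal sketch of the same map approach without the locale dependency might look like this, assuming every value starts with a single '$' and uses ',' as the thousands separator. parse_amount is a hypothetical helper name, not part of the original answer.

```python
import pandas as pd

def parse_amount(s):
    # Hypothetical helper: drop the leading '$', remove the commas,
    # then go through float so that the fractional part is truncated.
    return int(float(s[1:].replace(',', '')))

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'], columns=['Amount'])

# Apply the helper to each value and assign the result back to the same column.
df['Amount'] = df['Amount'].map(parse_amount)
print(df)
```

This swaps locale.atof for plain string handling, so it only suits input in exactly this format.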
Answer 2 (score: 1)
Code:
import pandas as pd

def format_amount(x):
    # Drop the leading '$' and keep only the part before the decimal point
    x = x[1:].split('.')[0]
    # Remove the thousands separators and convert to int
    return int(''.join(x.split(',')))

df = pd.DataFrame(['$3,000,000.00', '$3,000.00', '$200.5', '$5.5'], columns=['Amount'])
df['Amount'] = df['Amount'].apply(format_amount)
print(df)
Output:
Amount
0 3000000
1 3000
2 200
3 5