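Can someone explain how pandas handles the following three numbers? I am trying to load the values below and represent them correctly.
As trailing zeros are removed, the value pandas parses also changes. The trailing digits seem to affect the precision of the parsed value. Because the number of non-zero digits after the decimal point is not known in advance, simply truncating each value after the nth digit is not feasible.
Is there something in pandas that controls this behaviour? I have tried the 'c' engine, but the output is the same.
The data is being read from a text file.
Thanks.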
Loading sample_1.txt
Row : Raw Value : Pandas Value
0 : 954081199.495100000000000 : 954081199.4950998
1 : 954081199.49510000000000 : 954081199.4950998
2 : 954081199.4951000000000 : 954081199.4950998
3 : 954081199.495100000000 : 954081199.4951
4 : 954081199.49510000000 : 954081199.4951
5 : 954081199.4951000000 : 954081199.4951
6 : 954081199.495100000 : 954081199.4951
7 : 954081199.49510000 : 954081199.4951
8 : 954081199.4951000 : 954081199.4951
9 : 954081199.495100 : 954081199.4951
10 : 954081199.49510 : 954081199.4951
11 : 954081199.4951 : 954081199.4951
12 : 9449546861.291050000000000 : 9449546861.291044
13 : 9449546861.29105000000000 : 9449546861.291044
14 : 9449546861.2910500000000 : 9449546861.291046
15 : 9449546861.291050000000 : 9449546861.291046
16 : 9449546861.29105000000 : 9449546861.291048
17 : 9449546861.2910500000 : 9449546861.291048
18 : 9449546861.291050000 : 9449546861.291048
19 : 9449546861.29105000 : 9449546861.291048
20 : 9449546861.2910500 : 9449546861.29105
21 : 9449546861.291050 : 9449546861.29105
22 : 9449546861.29105 : 9449546861.29105
23 : 9752031802.626950000000000 : 9752031802.626955
24 : 9752031802.62695000000000 : 9752031802.626955
25 : 9752031802.6269500000000 : 9752031802.626951
26 : 9752031802.626950000000 : 9752031802.626951
27 : 9752031802.62695000000 : 9752031802.626951
28 : 9752031802.6269500000 : 9752031802.626951
29 : 9752031802.626950000 : 9752031802.626951
30 : 9752031802.62695000 : 9752031802.626951
31 : 9752031802.6269500 : 9752031802.62695
32 : 9752031802.626950 : 9752031802.62695
33 : 9752031802.62695 : 9752031802.62695
Done
Code that produces the output above:
#!/usr/bin/env python3
import pandas


def main():
    file_name = 'sample_1.txt'
    print('Loading ' + file_name)
    # Parse the pipe-delimited file; column 2 holds the floating-point values.
    content_df = pandas.read_csv(file_name, delimiter='|', header=None, engine='python',
                                 skipinitialspace=True, skiprows=0, skipfooter=0)
    num_rows = content_df.values.shape[0]
    # Re-read the file as raw text so the original strings can be printed alongside.
    with open(file_name, 'r') as f:
        lines_list = f.read().split('\n')
    rowcount = 0
    print('Row : Raw Value' + ' '*22 + ': Pandas Value')
    while rowcount < num_rows:
        value_list = lines_list[rowcount].split('|')
        print('{0:5d} : {1} : {2}'.format(rowcount, value_list[2].ljust(30, ' '), content_df.iloc[rowcount, 2]))
        # print('row: ' + str(content_df.iloc[rowcount, 1]) + ': ' + str(content_df.iloc[rowcount, 2]) + ': ' + str(value_list[2]))
        rowcount = rowcount + 1
    print('Done')


if __name__ == '__main__':
    main()
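As a point of comparison (not part of the original script), the sketch below checks the expectation behind the question: that trailing zeros should not change the parsed value. It uses only Python's built-in float() on raw strings copied from the table above. The names RAW_PAIRS and same_float are made up for illustration.

```python
# Raw strings from the table above: each number with and without trailing zeros.
RAW_PAIRS = [
    ('954081199.495100000000000', '954081199.4951'),
    ('9449546861.291050000000000', '9449546861.29105'),
    ('9752031802.626950000000000', '9752031802.62695'),
]


def same_float(padded, trimmed):
    """Return True if both strings convert to the same Python float."""
    return float(padded) == float(trimmed)


results = [same_float(padded, trimmed) for padded, trimmed in RAW_PAIRS]

for (padded, trimmed), same in zip(RAW_PAIRS, results):
    print('{0} vs {1} : {2}'.format(padded.ljust(30), trimmed.ljust(20), same))
```

If every pair compares equal here, the differences in the table come from how the values were parsed during loading, not from the values themselves.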
Answer 0 (score: 0)
This can be configured by using the 'c' engine with the float_precision option set to 'high': float_precision='high'.
Thanks a lot.
Loading sample_1.txt
Row : Raw Value : Pandas Value
0 : 954081199.495100000000000 : 954081199.4951
1 : 954081199.49510000000000 : 954081199.4951
2 : 954081199.4951000000000 : 954081199.4951
3 : 954081199.495100000000 : 954081199.4951
4 : 954081199.49510000000 : 954081199.4951
5 : 954081199.4951000000 : 954081199.4951
6 : 954081199.495100000 : 954081199.4951
7 : 954081199.49510000 : 954081199.4951
8 : 954081199.4951000 : 954081199.4951
9 : 954081199.495100 : 954081199.4951
10 : 954081199.49510 : 954081199.4951
11 : 954081199.4951 : 954081199.4951
12 : 9449546861.291050000000000 : 9449546861.29105
13 : 9449546861.29105000000000 : 9449546861.29105
14 : 9449546861.2910500000000 : 9449546861.29105
15 : 9449546861.291050000000 : 9449546861.29105
16 : 9449546861.29105000000 : 9449546861.29105
17 : 9449546861.2910500000 : 9449546861.29105
18 : 9449546861.291050000 : 9449546861.29105
19 : 9449546861.29105000 : 9449546861.29105
20 : 9449546861.2910500 : 9449546861.29105
21 : 9449546861.291050 : 9449546861.29105
22 : 9449546861.29105 : 9449546861.29105
23 : 9752031802.626950000000000 : 9752031802.626951
24 : 9752031802.62695000000000 : 9752031802.626951
25 : 9752031802.6269500000000 : 9752031802.626951
26 : 9752031802.626950000000 : 9752031802.626951
27 : 9752031802.62695000000 : 9752031802.626951
28 : 9752031802.6269500000 : 9752031802.626951
29 : 9752031802.626950000 : 9752031802.626951
30 : 9752031802.62695000 : 9752031802.626951
31 : 9752031802.6269500 : 9752031802.626951
32 : 9752031802.626950 : 9752031802.62695
33 : 9752031802.62695 : 9752031802.62695
Done
Modified code:
#!/usr/bin/env python3
import pandas


def main():
    # Display option only; it does not change how values are parsed.
    # The full option name is used because the short form 'precision' is ambiguous in newer pandas.
    pandas.set_option('display.precision', 10)
    file_name = 'sample_1.txt'
    print('Loading ' + file_name)
    # float_precision selects the float converter used by the C engine.
    content_df = pandas.read_csv(file_name, delimiter='|', header=None, engine='c',
                                 skipinitialspace=True, skiprows=0,
                                 float_precision='high')
    num_rows = content_df.values.shape[0]
    # Re-read the file as raw text so the original strings can be printed alongside.
    with open(file_name, 'r') as f:
        lines_list = f.read().split('\n')
    rowcount = 0
    print('Row : Raw Value' + ' '*22 + ': Pandas Value')
    while rowcount < num_rows:
        value_list = lines_list[rowcount].split('|')
        print('{0:5d} : {1} : {2}'.format(rowcount, value_list[2].ljust(30, ' '), content_df.iloc[rowcount, 2]))
        # print('row: ' + str(content_df.iloc[rowcount, 1]) + ': ' + str(content_df.iloc[rowcount, 2]) + ': ' + str(value_list[2]))
        rowcount = rowcount + 1
    print('Done')


if __name__ == '__main__':
    main()
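In the output above, rows 23 to 31 still print 9752031802.626951 with float_precision='high'. A minimal sketch for comparing 'high' with 'round_trip', another value pandas documents for float_precision, is shown below. It reads from an in-memory string instead of sample_1.txt. The contents of the first two columns ('a' and 'b') and the names RAW_VALUES, SAMPLE_TEXT and load are assumptions for illustration, since the actual file is not shown.

```python
import io

import pandas

# Hypothetical rows in the same pipe-delimited layout as sample_1.txt;
# only the third column (index 2) matters here.
RAW_VALUES = [
    '9752031802.626950000000000',
    '9752031802.6269500',
    '9752031802.62695',
]
SAMPLE_TEXT = '\n'.join('a|b|' + value for value in RAW_VALUES)


def load(float_precision):
    """Parse SAMPLE_TEXT with the C engine and the given float converter."""
    content_df = pandas.read_csv(
        io.StringIO(SAMPLE_TEXT),
        delimiter='|',
        header=None,
        engine='c',
        float_precision=float_precision,
    )
    return content_df.iloc[:, 2].tolist()


high_values = load('high')
round_trip_values = load('round_trip')

print('Raw Value' + ' ' * 22 + ': high : round_trip')
for raw, high, round_trip in zip(RAW_VALUES, high_values, round_trip_values):
    print('{0} : {1!r} : {2!r}'.format(raw.ljust(30), high, round_trip))
```

With 'round_trip' the parsed values should match what Python's own float() returns for the same strings, usually at some cost in parsing speed. Whether 'high' shows a difference may depend on the pandas version in use.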