Preserving the original values in a pandas DataFrame

Time: 2016-12-21 13:23:41

Tags: python django csv pandas

I have a CSV file in the following format:

A, -0.1234540756893158
B, 0.123450496711731
C, 0.12345994493484497
D, -0.12345484461784363
E, 12344656.0
F, -1234648.0
G, 12342316.0
H, 12552.37109375
I, 16247.228515625
J, -12.123796875
K, 1081104201
L, 123

I read it with:

df = pd.read_csv('/output.csv', header=None, names=['c1','c2'])

Then I select the rows at the indices of interest and save them to a CSV file:

my_list = [0,1,2,3,4,5,6,7,8,9,10,11]
df[df.index.isin(my_list)].to_csv(thefile2, sep=',', header=None, index = False)

But when I check the contents of "thefile2", I get output like this:

A,-0.123454075689
B,0.123450496712
C,0.123459944935
D,-0.123454844618
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.3710938
I,16247.2285156
J,-12.123797
K,1081104201.0
L,123.0

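As you can see, the values for A, B, C, D, H, I and J have been rounded to fewer digits, and K and L have gained a trailing ".0" in the output file. My question is: how do I get the original values in the second column?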

1 answer:

Answer 0 (score: 1)

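Use the parameter dtype=str in read_csv to read all values as strings, so the second column is never parsed as float and is written back exactly as it was read: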

pd.read_csv('/output.csv', header=None, names=['c1','c2'], dtype=str)

Sample:

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions

temp=u"""A,-0.1234540756893158
B,0.123450496711731
C,0.12345994493484497
D,-0.12345484461784363
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.37109375
I,16247.228515625
J,-12.123796875
K,1081104201
L,123"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=None, names=['c1','c2'], dtype=str)
print (df)
   c1                    c2
0   A   -0.1234540756893158
1   B     0.123450496711731
2   C   0.12345994493484497
3   D  -0.12345484461784363
4   E            12344656.0
5   F            -1234648.0
6   G            12342316.0
7   H        12552.37109375
8   I       16247.228515625
9   J         -12.123796875
10  K            1081104201
11  L                   123

print (type(df.loc[0, 'c2']))
<class 'str'>
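
To check that this also fixes the file written by to_csv, here is a minimal round-trip sketch. It assumes Python 3 and uses a shortened, hypothetical sample in place of '/output.csv' and an in-memory buffer in place of thefile2. The file in the question has a space after each comma, so the sketch also passes skipinitialspace=True; without it the strings would keep that leading space.

```python
import io

import pandas as pd

# Hypothetical stand-in for '/output.csv' (note the space after each comma).
raw = (
    "A, -0.1234540756893158\n"
    "H, 12552.37109375\n"
    "K, 1081104201\n"
    "L, 123\n"
)

# dtype=str keeps every field as text, so nothing is parsed as float.
# skipinitialspace=True drops the blank that follows each comma.
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=["c1", "c2"],
    dtype=str,
    skipinitialspace=True,
)

my_list = [0, 1, 3]  # hypothetical row selection, as in the question

# Write the selected rows to an in-memory buffer instead of a real file.
out = io.StringIO()
df[df.index.isin(my_list)].to_csv(out, sep=",", header=False, index=False)

result = out.getvalue()
print(result)
```

The selected rows should come out with exactly the digits that went in. The trade-off is that c2 now holds strings, so any arithmetic needs an explicit conversion first, for example with pd.to_numeric. If only one column must be preserved, dtype={'c2': str} limits the effect to that column.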