I have a csv file in the following format:
A, -0.1234540756893158
B, 0.123450496711731
C, 0.12345994493484497
D, -0.12345484461784363
E, 12344656.0
F, -1234648.0
G, 12342316.0
H, 12552.37109375
I, 16247.228515625
J, -12.123796875
K, 1081104201
L, 123
I am reading it with:
df = pd.read_csv('/output.csv', header=None, names=['c1','c2'])
Then I select the indices of interest as follows and save them to a csv:
my_list = [0,1,2,3,4,5,6,7,8,9,10,11]
df[df.index.isin(my_list)].to_csv(thefile2, sep=',', header=None, index = False)
But when I check the contents of "thefile2", I get output like this:
A,-0.123454075689
B,0.123450496712
C,0.123459944935
D,-0.123454844618
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.3710938
I,16247.2285156
J,-12.123797
K,1081104201.0
L,123.0
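As you can see, the values for A, B, C, D, H, I and J have been rounded, and K and L have a trailing .0 in the output file. My question is: how can I get the original values in the second column?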
Answer 0 (score: 1)
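Use the parameter dtype=str in read_csv to read all values as strings, so the text is kept exactly as written instead of being parsed to float: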
pd.read_csv('/output.csv', header=None, names=['c1','c2'], dtype=str)
Sample:
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas releases
temp=u"""A,-0.1234540756893158
B,0.123450496711731
C,0.12345994493484497
D,-0.12345484461784363
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.37109375
I,16247.228515625
J,-12.123796875
K,1081104201
L,123"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=None, names=['c1','c2'], dtype=str)
print (df)
    c1                    c2
0    A   -0.1234540756893158
1    B     0.123450496711731
2    C   0.12345994493484497
3    D  -0.12345484461784363
4    E            12344656.0
5    F            -1234648.0
6    G            12342316.0
7    H        12552.37109375
8    I       16247.228515625
9    J         -12.123796875
10   K            1081104201
11   L                   123
print (type(df.loc[0, 'c2']))
<class 'str'>
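The sample above only shows the read side. As a rough sketch of the full round trip from the question (read, filter by index, write back), something like the following should work. The `raw` data is a shortened, hypothetical excerpt of the question's file, and `roundtrip` is just a helper name made up for this illustration. Note that the question's file has a space after each comma; with dtype=str that space would be kept as part of the value, so this sketch assumes you want it stripped and passes skipinitialspace=True.

```python
import io
import pandas as pd

# Hypothetical excerpt in the question's layout (note the space after each comma).
raw = "A, -0.1234540756893158\nH, 12552.37109375\nK, 1081104201\nL, 123\n"


def roundtrip(text, keep):
    """Read both columns as text, keep rows whose index is in `keep`, return CSV text."""
    df = pd.read_csv(
        io.StringIO(text),
        header=None,
        names=['c1', 'c2'],
        dtype=str,              # no float parsing, so digits are not altered
        skipinitialspace=True,  # drop the blank that follows each comma
    )
    out = io.StringIO()
    df[df.index.isin(keep)].to_csv(out, sep=',', header=False, index=False)
    return out.getvalue()


result = roundtrip(raw, [0, 1, 2, 3])
print(result)
```

Because the second column never becomes a float, the written values match the input text character for character. The trade-off is that c2 holds strings, so any arithmetic on it needs an explicit conversion first.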