Adding array results to a CSV file

Time: 2013-09-03 20:53:02

Tags: python csv numpy pandas

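I am trying to implement this algorithm: draw a random number (tir) between 0 and 1; if tir < pred then Xestime2 = 1, otherwise Xestime2 = 0. I want to apply this algorithm to df['X3'], but every value in the X3 column comes out as 0, so there must be an error in my code. My code: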

df = pd.read_csv(FNAME3, header=None)
print df[:15]
df['X2'] = df['X1'].round()
print df[:15]
s = StringIO()
df.to_csv("C:/Users/lenovo/Desktop/Nouveau dossier (2)/Resultats2.csv", header=None, index=False)
#print(s.getvalue())

##########################################""""""""
for row in df['X1']:
    x = np.random.randint(0,2,10)
    for row1 in x:
        if row1 < row:
            df['X3']=0
        else:
            df['X3']=1
        #print df[:15]
df.to_csv("C:/Users/lenovo/Desktop/Nouveau dossier (2)/Resultats2.csv", header=None, index=False)

3 Answers:

Answer 0 (score: 1)

Since you tagged this pandas, I'll use read_csv:

In [1]: df = pd.read_csv('foo.csv', header=None)

In [2]: df
Out[2]: 
   0         1
0  0  0.487130
1  0  0.248932
2  0  0.248932
3  1  0.405285
4  1  0.405285
5  1  0.405285
6  1  0.405285

Then you can round the column (to the nearest integer):

In [3]: df[2] = df[1].round()

In [4]: df
Out[4]: 
   0         1  2
0  0  0.487130  0
1  0  0.248932  0
2  0  0.248932  0
3  1  0.405285  0
4  1  0.405285  0
5  1  0.405285  0
6  1  0.405285  0

If any of the values were above one half, they would be rounded to 1.
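To see that case, you can round a series that contains a value above 0.5. This is only a small sketch written for Python 3; the value 0.75 is made up for illustration and is not in the original data.

```python
import pandas as pd

# The first two values come from the example data; 0.75 is a made-up
# value added only to show a case that rounds up to 1.
values = pd.Series([0.487130, 0.248932, 0.75])
rounded = values.round()

print(rounded.tolist())
```

Note that the result stays a float column; add `.astype(int)` if you want integers in the output.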

Since you asked how to send it to a StringIO, it works the same way as with a file:

In [11]: s = StringIO()

In [12]: df.to_csv(s, header=None, index=False)
# alternatively write to file with df.to_csv('foo.csv', header=None, index=False)

In [13]: print(s.getvalue())
0.0,0.4871303471776849,0.0
0.0,0.2489319061991417,0.0
0.0,0.2489319061991417,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
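The session above is in Python 2 style. On Python 3, `StringIO` is imported from the `io` module, and the same pattern should carry over. A minimal sketch, assuming a small hand-built frame in place of `foo.csv`:

```python
from io import StringIO

import pandas as pd

# Hand-built stand-in for the data read from foo.csv (assumption).
df = pd.DataFrame({0: [0.0, 1.0], 1: [0.487130, 0.405285]})
df[2] = df[1].round()

# to_csv accepts any text buffer in place of a file path.
buf = StringIO()
df.to_csv(buf, header=False, index=False)

csv_text = buf.getvalue()
print(csv_text)
```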

Answer 1 (score: 1)

First, don't use genfromtxt if you are using pandas; read_csv is much more flexible.

from cStringIO import StringIO
from pandas import read_csv

sio = StringIO('''0.000000000000000000e+00,4.871303471776848859e-01
0.000000000000000000e+00,2.489319061991416837e-01
0.000000000000000000e+00,2.489319061991416837e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01''')

df = read_csv(sio, header=None, index_col=None)
df['Xestime'] = (df[1] > 0.5).astype(int)
df.to_csv('foo_with_Xestime.csv', index=False, header=False)

cat foo_with_Xestime.csv

0.0,0.4871303471776849,0
0.0,0.2489319061991417,0
0.0,0.2489319061991417,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0

Answer 2 (score: 0)

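You have to create a new file. If the input data was read without errors (FNAME3 is a 2-column matrix) and the dimensions match (Xestime has exactly the same number of rows as FNAME3), then the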

NEW_MATRIX=np.column_stack((FNAME3, Xestime))

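solution should work. However, the third column is not written to the file; it only exists in the NEW_MATRIX variable. You have to write it to a new file, and then you are done: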

np.savetxt('foo2.csv', NEW_MATRIX, delimiter=',')  # savetxt separates values with spaces unless delimiter is set
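Put together, the two steps could look like the sketch below. It assumes the data is already loaded as a NumPy array (a file path such as `FNAME3` in the question would have to be read first); the names `data`, `xestime` and `out_path` are placeholders, and the file goes to a temporary directory only to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

# Placeholder stand-ins (assumptions): a 2-column array and a third
# column with the same number of rows.
data = np.array([[0.0, 0.487130],
                 [1.0, 0.405285]])
xestime = np.array([0, 0])

# Append xestime as a third column; the result has shape (2, 3).
new_matrix = np.column_stack((data, xestime))

# Write comma-separated values to a new file.
out_path = os.path.join(tempfile.mkdtemp(), "foo2.csv")
np.savetxt(out_path, new_matrix, delimiter=",")

# Read the file back to check what was written.
reloaded = np.loadtxt(out_path, delimiter=",")
```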

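By the way, inside the if condition you overwrite your Xestime variable on every loop iteration. Try using Xestime.append(5) to add new items to a list instead of assigning a value to the variable.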
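The same applies to the loop in the question: `df['X3'] = 0` assigns a scalar to the whole column, so only the last assignment survives, and `np.random.randint(0, 2, 10)` draws integers 0 or 1 rather than floats between 0 and 1. A minimal sketch of the algorithm as described (one uniform draw `tir` per row, 1 if `tir < pred`), assuming `X1` holds the `pred` values; the sample data and the seed are made up:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

# Made-up pred values; 0.0 and 1.0 are included as edge cases.
df = pd.DataFrame({"X1": [0.487130, 0.248932, 0.405285, 0.0, 1.0]})

# Loop version: collect one result per row with append, then assign once.
x3 = []
for pred in df["X1"]:
    tir = rng.random()  # uniform float in [0, 1)
    x3.append(1 if tir < pred else 0)
df["X3"] = x3

# Vectorized version: draw one value per row and compare element-wise.
tir_all = rng.random(len(df))
df["X3_vec"] = (tir_all < df["X1"]).astype(int)

print(df)
```

The vectorized form avoids the Python loop entirely, which is usually the more idiomatic choice in pandas.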