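I am trying to implement this algorithm: draw a random number (tir) between 0 and 1; if tir < pred, then Xestime2 = 1, otherwise Xestime2 = 0. I want to apply this algorithm to fill df['X3'], but every value in the X3 column comes out as 0, which suggests there is an error in my code. My code: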
df = pd.read_csv(FNAME3, header=None)
print df[:15]
df['X2'] = df['X1'].round()
print df[:15]
s = StringIO()
df.to_csv("C:/Users/lenovo/Desktop/Nouveau dossier (2)/Resultats2.csv", header=None, index=False)
#print(s.getvalue())
##########################################""""""""
for row in df['X1']:
    x = np.random.randint(0,2,10)
    for row1 in x:
        if row1 < row:
            df['X3']=0
        else:
            df['X3']=1
#print df[:15]
df.to_csv("C:/Users/lenovo/Desktop/Nouveau dossier (2)/Resultats2.csv", header=None, index=False)
Answer 0 (score: 1)
Since you've tagged this pandas, I would use read_csv:
In [1]: df = pd.read_csv('foo.csv', header=None)
In [2]: df
Out[2]:
   0         1
0  0  0.487130
1  0  0.248932
2  0  0.248932
3  1  0.405285
4  1  0.405285
5  1  0.405285
6  1  0.405285
Then you can round the column (to the nearest integer):
In [3]: df[2] = df[1].round()
In [4]: df
Out[4]:
   0         1  2
0  0  0.487130  0
1  0  0.248932  0
2  0  0.248932  0
3  1  0.405285  0
4  1  0.405285  0
5  1  0.405285  0
6  1  0.405285  0
If any of the values were above one half, they would be rounded to 1.
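A small sketch of that rounding behaviour, using made-up values (the name `pred` and its numbers are illustrative, not taken from the question's file). It also shows that an exact 0.5 rounds down to 0 here, because the rounding goes to the nearest even value, so a strict `> 0.5` test gives the same labels for this data.

```python
import pandas as pd

# Hypothetical probabilities, invented for this sketch.
pred = pd.Series([0.487130, 0.248932, 0.5, 0.73])

# Series.round() rounds to the nearest integer; an exact 0.5 goes to the
# nearest even value, so it becomes 0.0 here.
rounded = pred.round()

# An explicit strict threshold gives the same 0/1 labels for this data.
thresholded = (pred > 0.5).astype(int)

print(rounded.tolist())
print(thresholded.tolist())
```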
Since you asked how to send it to a StringIO, it works the same way as with a file:
In [11]: s = StringIO()
In [12]: df.to_csv(s, header=None, index=False)
# alternatively write to file with df.to_csv('foo.csv', header=None, index=False)
In [13]: print(s.getvalue())
0.0,0.4871303471776849,0.0
0.0,0.2489319061991417,0.0
0.0,0.2489319061991417,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
1.0,0.4052854182229446,0.0
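For readers on Python 3, the same idea can be sketched with `io.StringIO`. The two-row frame below is invented for illustration, and the read-back step is an addition to show that the buffer behaves like a file.

```python
from io import StringIO

import pandas as pd

# Illustrative two-row frame; the values are invented for this sketch.
df = pd.DataFrame({0: [0.0, 1.0], 1: [0.25, 0.75]})

buf = StringIO()
df.to_csv(buf, header=False, index=False)  # same call as for a file path

text = buf.getvalue()
print(text)

# The text can be wrapped in a new buffer and read back like a file.
round_trip = pd.read_csv(StringIO(text), header=None)
```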
Answer 1 (score: 1)
First of all, if you are using pandas, don't use genfromtxt; read_csv is much more flexible.
from cStringIO import StringIO
from pandas import read_csv
sio = StringIO('''0.000000000000000000e+00,4.871303471776848859e-01
0.000000000000000000e+00,2.489319061991416837e-01
0.000000000000000000e+00,2.489319061991416837e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01
1.000000000000000000e+00,4.052854182229445601e-01''')
df = read_csv(sio, header=None, index_col=None)
df['Xestime'] = (df[1] > 0.5).astype(int)
df.to_csv('foo_with_Xestime.csv', index=False, header=False)
Output of cat foo_with_Xestime.csv:
0.0,0.4871303471776849,0
0.0,0.2489319061991417,0
0.0,0.2489319061991417,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0
1.0,0.4052854182229446,0
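The code above applies a fixed 0.5 threshold. The question describes a different rule, a random draw per row compared against the predicted probability. A minimal vectorized sketch of that rule follows, assuming column 1 holds the probabilities; the data, the seed, and the `Xestime2` column name are illustrative choices.

```python
import numpy as np
import pandas as pd

# Illustrative predicted probabilities; column 1 plays the role of "pred".
df = pd.DataFrame({0: [0.0, 0.0, 1.0, 1.0], 1: [0.0, 0.25, 0.75, 1.0]})

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

# One uniform draw in [0, 1) per row, the question's "tir".
tir = rng.random(len(df))

# 1 where the draw falls below the row's probability, else 0.
df['Xestime2'] = (tir < df[1]).astype(int)
print(df)
```

Each row gets its own draw, so a row with probability 0.25 ends up as 1 about a quarter of the time, while rows with probability 0 or 1 are always 0 or 1 respectively.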
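Answer 2 (score: 0)
You have to create a new file. If the input data was read without errors (FNAME3 is a 2-column matrix) and the dimensions match (Xestime has exactly as many rows as FNAME3), then the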
NEW_MATRIX=np.column_stack((FNAME3, Xestime))
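solution should work. However, the third column is not written to the file; it only exists in the NEW_MATRIX variable. You have to write it to a new file, and then you are done: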
np.savetxt('foo2.csv', NEW_MATRIX, delimiter=',')  # delimiter=',' so the output is comma-separated
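As a rough end-to-end sketch of those two steps, assuming the matrix is already loaded as a NumPy array: `data` and `xestime` below are illustrative stand-ins with invented values, and the file goes to a temporary directory so nothing is overwritten.

```python
import os
import tempfile

import numpy as np

# Illustrative stand-ins: `data` for the loaded 2-column matrix,
# `xestime` for the per-row estimates (one value per row of `data`).
data = np.array([[0.0, 0.487], [1.0, 0.405]])
xestime = np.array([0, 0])

new_matrix = np.column_stack((data, xestime))  # shape (2, 3)

# Write to a temporary file; delimiter=',' produces comma-separated output.
path = os.path.join(tempfile.mkdtemp(), 'foo2.csv')
np.savetxt(path, new_matrix, delimiter=',')

# Read the file back to confirm the third column was written.
reloaded = np.loadtxt(path, delimiter=',')
```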
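By the way, you overwrite your Xestime variable on every pass through the loop, inside the if condition. Instead of assigning a value to the variable, try adding a new item to the list, for example with Xestime.append(5).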
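The append pattern can be sketched as follows, applied to the random-draw rule from the question. The probabilities and the seed are invented for illustration. The same overwrite problem affects an assignment such as `df['X3'] = 0` inside a loop, since it sets the whole column at once and only the last iteration's value survives.

```python
import random

# Illustrative probabilities standing in for the question's X1 column.
pred_values = [0.0, 0.3, 0.8, 1.0]

random.seed(0)  # seeded only so the sketch is repeatable

xestime = []  # start empty, then grow by one item per row
for pred in pred_values:
    tir = random.random()  # uniform draw in [0, 1)
    xestime.append(1 if tir < pred else 0)

print(xestime)
```

Once the loop finishes, the list has one entry per row and can be assigned to a column in a single step.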