我有一个1620行7列的数据集。对于我的完整数据集,我想保留从第1行开始的每第5行值,并删除其他行。所以我想用Python将1st,6th,11th,16th ...存储在我的csv文件中完整数据集的行上。
我已经做到了,并将输出存储到一个csv文件中,但是我在输出的csv中没有获得行和列标签。我想在输出csv中获得行和列标签。
对此的修改: 我想将行标签设置为1,2,3,4,5..so而不是1,6,11,16,21 ...
数据集:
Serial,A,B,C,D,E,F
1,3.8,-5.9,-1.7,-1.4,8.3,-3.1
2,-5.4,-71.3,83.4,50.7,-1.3,88.4
3,3.0,5.3,1.4,5.7,6.6,2.3
4,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,0.0
6,-1.4,2.8,3.5,-6.7,-2.3,1.4
7,88.2,-0.1,-10.7,-36.2,88.1,-1.7
8,3.7,7.0,1.1,2.2,5.9,3.6
9,0.0,0.0,0.0,0.0,0.0,0.0
10,0.0,0.0,0.0,0.0,0.0,0.0
11,3.0,-1.6,-1.7,-4.2,7.8,-9.7
12,-7.1,-48.8,85.7,46.0,-2.8,-80.8
13,2.2,8.5,1.3,9.3,6.1,7.0
14,0.0,0.0,0.0,0.0,0.0,0.0
15,0.0,0.0,0.0,0.0,0.0,0.0
16,3.7,-6.2,-5.1,-2.5,0.0,-1.1
17,0.0,-60.3,88.8,45.1,0.0,90.0
18,2.9,9.3,3.9,8.3,6.9,8.6
19,0.0,0.0,0.0,0.0,0.0,0.0
20,0.0,0.0,0.0,0.0,0.0,0.0
21,3.7,-3.1,-8.3,-1.1,8.7,-3.3
import pandas as pd
import numpy as np
#importing straintest dataset with pandas
dataset=pd.read_csv('ABC.csv')
dataset = dataset.set_index('Serial')
X =dataset.iloc[::5, :].values
np.savetxt('Output.csv', X, delimiter= ',')
print("::::\n",X)
实际输出:
3.8,-5.9,-1.7,-1.4,8.3,-3.1
-1.4,2.8,3.5,-6.7,-2.3,1.4
3,-1.6,-1.7,-4.2,7.8,-9.7
3.7,-6.2,-5.1,-2.5,0,-1.1
3.7,-3.1,-8.3,-1.1,8.7,-3.3
预期输出:
Serial,A,B,C,D,E,F
1,3.8,-5.9,-1.7,-1.4,8.3,-3.1
2,-1.4,2.8,3.5,-6.7,-2.3,1.4
3,3,-1.6,-1.7,-4.2,7.8,-9.7
4,3.7,-6.2,-5.1,-2.5,0,-1.1
5,3.7,-3.1,-8.3,-1.1,8.7,-3.3
答案 0 :(得分:2)
如果需要列名,则可以使用pandas方法DataFrame.to_csv
:
#remove .values
X = dataset.iloc[::5]
X.to_csv('Output.csv')
或者:
dataset=pd.read_csv('ABC.csv')
X = dataset.iloc[::5]
X.to_csv('Output.csv', index=False)
编辑:您可以创建从0
开始的默认索引,必要时可以从1
开始添加rename
:
X = dataset.iloc[::5].reset_index(drop=True).rename(index = lambda x: x + 1)
X.to_csv('Output.csv')
或者:
dataset=pd.read_csv('ABC.csv')
X = dataset.iloc[::5].copy()
X['Serial'] = np.arange(1, len(X) + 1)
X.to_csv('Output.csv', index=False)