I am trying to create a new DataFrame from some data in a CSV file.
My data has the following format:
1, 81.99525117808678
2, 78.79210736916842
3, 69.33703048261454
4, 53.12612416937101
5, 48.8442549498639
6, 48.8442549498639
7, 38.96011640562207
8, 33.66251691693962
9, 29.202159649144907
10, 27.77726568480279
1, 81.99525117808678
2, 78.79210736916842
3, 69.33703048261454
4, 53.12612416937101
5, 48.8442549498639
6, 48.8442549498639
7, 38.96011640562207
8, 33.66251691693962
9, 29.202159649144907
10, 27.77726568480279
The first number is the index and the second number is the value. I want to create a new column for each unique run. For example:
Index: Run 1: Run 2:
1, 81.99525117808678, 81.99525117808678
2, 78.79210736916842, 78.79210736916842
3, 69.33703048261454, 69.33703048261454
4, 53.12612416937101, 53.12612416937101
5, 48.8442549498639, 48.8442549498639
6, 48.8442549498639, 48.8442549498639
7, 38.96011640562207, 38.96011640562207
8, 33.66251691693962, 33.66251691693962
9, 29.202159649144907, 29.202159649144907
10, 27.77726568480279, 27.77726568480279
So far, I have the following:
df = pd.read_csv(path, header=None, names=['Generation', 'Fitness'], index_col=0)
This produces the following result:
0
1 81.995251
2 78.792107
3 69.337030
4 53.126124
5 48.844255
6 48.844255
7 38.960116
8 33.662517
9 29.202160
10 27.777266
1 81.995251
2 78.792107
3 69.337030
4 53.126124
5 48.844255
6 48.844255
7 38.960116
8 33.662517
9 29.202160
10 27.777266
Answer 0 (score: 2)
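You can pass chunksize=10 to read_csv to get a reader that yields the file 10 rows at a time (see the docs for details), and then concatenate the chunks along the column axis: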
reader = pd.read_csv('data.csv', sep=',', chunksize=10,
index_col=0, header=None, names=['Generation', 'Fitness'])
my_df = pd.concat((chunk for chunk in reader), axis=1)
>>> my_df
Fitness Fitness
Generation
1 81.995251 81.995251
2 78.792107 78.792107
3 69.337030 69.337030
4 53.126124 53.126124
5 48.844255 48.844255
6 48.844255 48.844255
7 38.960116 38.960116
8 33.662517 33.662517
9 29.202160 29.202160
10 27.777266 27.777266
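To try the chunked approach without a file on disk, here is a minimal self-contained sketch. It assumes every run has exactly the same number of rows; the in-memory `csv_text`, the `run_length` variable, and the shortened three-generation sample are illustrative and not part of the original data file.

```python
import io

import pandas as pd

# Hypothetical sample: two runs of three generations each,
# using the first three rows of the question's data.
csv_text = (
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
)

run_length = 3  # assumed number of rows per run; must match the data

# chunksize makes read_csv return an iterator of DataFrames,
# each holding run_length rows indexed by Generation.
reader = pd.read_csv(
    io.StringIO(csv_text),
    chunksize=run_length,
    index_col=0,
    header=None,
    names=['Generation', 'Fitness'],
)

# Each chunk shares the same Generation index, so concatenating
# along axis=1 lines the runs up side by side.
demo_df = pd.concat(list(reader), axis=1)
print(demo_df)
```

If the runs have different lengths, the chunks no longer line up with the run boundaries, so this only works when the run length is known and constant.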
If you need column names, you can rename them with a list comprehension:
# python 3.6 or above
my_df.columns = [f'Run {i}' for i, _ in enumerate(my_df.columns,1)]
# Or:
my_df.columns = ['Run {}'.format(i) for i, _ in enumerate(my_df.columns,1)]
# Or:
my_df.columns = range(1, len(my_df.columns) + 1)
my_df = my_df.add_prefix('Run ')
>>> my_df
Run 1 Run 2
Generation
1 81.995251 81.995251
2 78.792107 78.792107
3 69.337030 69.337030
4 53.126124 53.126124
5 48.844255 48.844255
6 48.844255 48.844255
7 38.960116 38.960116
8 33.662517 33.662517
9 29.202160 29.202160
10 27.777266 27.777266
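If the number of rows per run is not known in advance, one possible alternative (not from the answer above, just a sketch) is to number each repeated Generation value with groupby().cumcount() and then pivot. It assumes each Generation value appears at most once per run and that runs are stored one after another; the `csv_text` sample and the `wide` name are illustrative.

```python
import io

import pandas as pd

# Hypothetical sample: two runs of three generations each.
csv_text = (
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=['Generation', 'Fitness'],
)

# The n-th time a Generation value appears, it belongs to run n.
df['Run'] = df.groupby('Generation').cumcount() + 1

# One row per Generation, one column per run.
wide = df.pivot(index='Generation', columns='Run', values='Fitness')
wide = wide.add_prefix('Run ')
wide.columns.name = None  # drop the leftover 'Run' axis label
print(wide)
```

Runs that stop early simply get NaN in the missing generations instead of shifting later runs out of alignment.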