Randomizing 2 CSV files

Time: 2019-06-04 06:13:16

Tags: python csv random shuffle

I want to use Python to shuffle two CSV files at the same time, keeping a one-to-one correspondence between their rows.

File1.csv    File2.csv
1.           A
2.           B
3.           C
4.           D
5.           E

The output would be, for example:

File1.csv    File2.csv
4.           D
1.           A
3.           C
5.           E
2.           B

2 answers:

Answer 0 (score: 0)

Try numpy.random.shuffle, which shuffles a sequence in place. For example:

import numpy as np

letters = ["A","B","C","D","E"]
numbers = [1,2,3,4,5,6]
np.random.shuffle(letters)
print(letters)
np.random.shuffle(numbers)
print(numbers)

Output (the order varies from run to run):

['A', 'C', 'B', 'E', 'D']
[2, 6, 4, 1, 5, 3]
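
Note that the two shuffle calls above reorder each list independently, so the pairing between the lists is lost. A minimal sketch of one way to keep the pairs together, assuming both lists have the same length, is to draw a single random order of indices and apply it to both lists. This sketch uses the standard-library random module rather than NumPy, and shuffle_together is a hypothetical helper name, not part of any library.

```python
import random


def shuffle_together(first, second, seed=None):
    """Return shuffled copies of two equal-length lists, keeping pairs aligned.

    Hypothetical helper for illustration only.
    """
    if len(first) != len(second):
        raise ValueError("both lists must have the same length")
    rng = random.Random(seed)        # seed is optional, for repeatable runs
    order = list(range(len(first)))  # one index order shared by both lists
    rng.shuffle(order)
    return [first[i] for i in order], [second[i] for i in order]


numbers = [1, 2, 3, 4, 5]
letters = ["A", "B", "C", "D", "E"]
shuffled_numbers, shuffled_letters = shuffle_together(numbers, letters)
print(shuffled_numbers)
print(shuffled_letters)
```

With NumPy, np.random.permutation(len(numbers)) would give an equivalent shared index order.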

Answer 1 (score: 0)

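Because CSV files are static flat files, you cannot shuffle them in place. You need to read both files into pandas DataFrames, shuffle both in the same order, and then write them back to CSV. Here is the code: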

import pandas as pd

df1 = pd.read_csv('datafile1.csv')
df2 = pd.read_csv('datafile2.csv')

# Reset the index to row numbers so that both DataFrames have an identical index
df1.reset_index(inplace=True)
df2.reset_index(inplace=True)

# Shuffle the rows.
# frac is the fraction of rows to return; frac=1 returns all rows in random order.
df1 = df1.sample(frac=1)
# Reorder df2 by the shuffled index of df1 so both end up in the same order.
df2 = df2.loc[df1.index]

# Put back the original index
df1.set_index('index',drop=True, inplace=True)
df2.set_index('index',drop=True, inplace=True)

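# Write back to the original files.
# index=False keeps the index from being written as an extra column.
df1.to_csv('datafile1.csv', index=False)
df2.to_csv('datafile2.csv', index=False)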
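
If pandas is not available, the same read, shuffle, write steps can be sketched with the standard-library csv module. This is only a sketch under stated assumptions: both files have the same number of rows and no header row, and the names read_rows, write_rows, and shuffle_csv_pair are hypothetical. It writes to separate output paths rather than overwriting the inputs.

```python
import csv
import random


def read_rows(path):
    """Read every row of a CSV file into a list of lists."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


def write_rows(path, rows):
    """Write a list of rows to a CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def shuffle_csv_pair(in1, in2, out1, out2, seed=None):
    """Shuffle two CSV files with one shared row order (hypothetical helper).

    Assumes both files have the same number of rows and no header row.
    Returns the index order that was applied to both files.
    """
    rows1, rows2 = read_rows(in1), read_rows(in2)
    if len(rows1) != len(rows2):
        raise ValueError("both files must have the same number of rows")
    order = list(range(len(rows1)))     # one order shared by both files
    random.Random(seed).shuffle(order)  # seed is optional, for repeatable runs
    write_rows(out1, [rows1[i] for i in order])
    write_rows(out2, [rows2[i] for i in order])
    return order
```

A call might look like shuffle_csv_pair('File1.csv', 'File2.csv', 'File1_shuffled.csv', 'File2_shuffled.csv'), where the output file names are placeholders.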