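I have a dataframe in which each row's values have been joined into a single string, separated by commas.
Row1 foo,bar,test,case
Row2 base,ball,basket,foot
The goal is to shuffle/randomize the values within each field while preserving the row order (do not shuffle the column; the index must be kept). I would like to get back something like this:
Row1 test,foo,case,bar
Row2 ball,foot,base,basket
Solution: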
import numpy as np
import pandas as pd

# Original_DF holds our CSV-loaded data - the DF contains multiple columns of data attached to primary
# Each 'Data_List' field is one long string with a comma separating words, so it is split into a list first
data_list = [e for e in Original_DF['Data_List']]
shuffled = []  # plain list of results (DataFrame.append no longer exists in pandas 2.x)
for i in range(len(data_list)):
    myList = np.random.permutation(data_list[i].split(","))  # shuffled copy of this row's words
    myString = ",".join(myList)
    shuffled.append(myString)
Original_DF['Data_List2'] = shuffled  # attach the newly shuffled strings to the original df, row by row
Answer 0: (score: 0)
You can use numpy.random.permutation to randomly permute a list; it returns a shuffled copy and leaves the input unchanged.
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.permutation.html
Example:
import numpy.random
mydata = "foo,bar,baz,bat"
print(numpy.random.permutation(mydata.split(",")))
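To apply this to one comma-separated field in a repeatable way, the call can be wrapped in a small helper. This is only a sketch: the name shuffle_fields and the seed value are hypothetical, and it uses NumPy's newer Generator API (numpy.random.default_rng) in place of the legacy numpy.random.permutation function shown above.

```python
import numpy as np

def shuffle_fields(text, rng, sep=","):
    """Return `text` with its `sep`-separated fields in random order."""
    fields = text.split(sep)
    # Generator.permutation returns a shuffled copy as a NumPy array of strings
    return sep.join(rng.permutation(fields))

# The seed is an arbitrary choice, used here only to make runs repeatable
rng = np.random.default_rng(seed=0)
print(shuffle_fields("foo,bar,baz,bat", rng))
```

Passing the generator in explicitly keeps the randomness under the caller's control, which helps when the shuffle has to be reproduced later.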
Answer 1: (score: 0)
Another approach, using pandas functionality (example):
[code not preserved]
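A pandas-based version might look roughly like the following. It is a sketch under assumptions, not necessarily the code this answer used: the sample frame and the Data_List / Data_List2 column names are taken from the question, and Series.sample(frac=1) is one possible way to shuffle with pandas itself.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout described in the question
df = pd.DataFrame(
    {"Data_List": ["foo,bar,test,case", "base,ball,basket,foot"]},
    index=["Row1", "Row2"],
)

shuffled = []
for value in df["Data_List"]:
    # Series.sample(frac=1) returns every element, in random order
    items = pd.Series(value.split(",")).sample(frac=1)
    shuffled.append(",".join(items))

# Assigning a plain list is positional, so the original index and row order are kept
df["Data_List2"] = shuffled
print(df)
```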
Perhaps there is a more elegant way to do this with apply that avoids the for loop.
PS: Alternatively, adapting the other answer by @Simon Crane:
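One way the apply idea could be written is sketched below, assuming the column names from the question; the helper shuffle_row and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the layout described in the question
df = pd.DataFrame(
    {"Data_List": ["foo,bar,test,case", "base,ball,basket,foot"]},
    index=["Row1", "Row2"],
)

rng = np.random.default_rng()

def shuffle_row(text):
    # Split the field, shuffle a copy of the words, and join them back together
    return ",".join(rng.permutation(text.split(",")))

# Series.apply calls shuffle_row once per row and preserves the index
df["Data_List2"] = df["Data_List"].apply(shuffle_row)
print(df)
```

Series.apply still calls the Python function once per row, so this is tidier than the explicit loop rather than fundamentally faster.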
[code not preserved]