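I have a code sample below. The code works perfectly, but my problem is that it isn't clean and takes far too many lines. I'm sure it could be shortened with a function or a for loop, but I don't know how to do that. The snippets are 90% identical; only the variable names change. I've included just 2 snippets here, but my actual code consists of 5.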
#KFOLD-1
all_fold_X_1 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain1:
        all_fold_X_1 = all_fold_X_1.append({index:i}, ignore_index=True)
X_train1 = count_vect.fit_transform(all_fold_X_1[0].values)
Y_train1 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain1]
Y_train1 = np.asarray(Y_train1)
#KFOLD-2
all_fold_X_2 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain2:
        all_fold_X_2 = all_fold_X_2.append({index:i}, ignore_index=True)
X_train2 = count_vect.fit_transform(all_fold_X_2[0].values)
Y_train2 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain2]
Y_train2 = np.asarray(Y_train2)
Answer 0 (score: 1)
Since no complete example was provided, I'm making some assumptions. Perhaps something along these lines:
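# count_vect, np, pd and the kFoldsTrainN index collections are assumed to be
# defined as in the question.
# Note: DataFrame.append was removed in pandas 2.0; this mirrors the question's code.
def train(dataVar, dfNew, foldIndices):
    ret = {}
    index = 0
    # Collect the sentences whose row position belongs to this fold
    for k, i in enumerate(dfNew['Sentence_txt'].values):
        if k in foldIndices:
            dataVar = dataVar.append({index:i}, ignore_index=True)
    ret['x'] = count_vect.fit_transform(dataVar[0].values)
    ret['y'] = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in foldIndices]
    ret['y'] = np.asarray(ret['y'])
    return ret
#KFOLD-1
kfold1 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain1)
#KFOLD-2
kfold2 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain2)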
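You'll probably get the idea. You may not need the second parameter in the function, depending on whether the variable 'dfNew' is global. I'm also far from a Python expert! ;)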
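To go one step further and cover all 5 folds without repeating the call, the fold index collections could be put in a list and processed in a for loop. Below is a minimal sketch of that idea, not a drop-in replacement. It runs on plain Python lists so it works without pandas or scikit-learn. The names `build_fold`, `toy_vectorize`, `fold_index_lists` and the toy data are all hypothetical. In the question's setting, `texts` would presumably be `dfNew['Sentence_txt'].values`, `labels` would be `dfNew['Sentence_Polarity'].values`, and `vectorize` would be `count_vect.fit_transform`.

```python
def build_fold(texts, labels, fold_indices, vectorize):
    """Keep the rows whose position is in fold_indices, then vectorize the texts."""
    wanted = set(fold_indices)  # a set makes each membership test cheap
    fold_texts = [t for k, t in enumerate(texts) if k in wanted]
    fold_labels = [y for k, y in enumerate(labels) if k in wanted]
    return vectorize(fold_texts), fold_labels


# Toy stand-in data (hypothetical, not taken from the question).
texts = ["good movie", "bad movie", "great plot", "awful acting", "fine"]
labels = [1, 0, 1, 0, 1]

# Plays the role of [kFoldsTrain1, kFoldsTrain2, ...] from the question.
fold_index_lists = [[0, 1, 2], [2, 3, 4]]


def toy_vectorize(sentences):
    """Stand-in vectorizer (assumption): number of words per sentence."""
    return [len(s.split()) for s in sentences]


# One loop replaces the copy-pasted blocks; results are keyed by fold number.
folds = {}
for n, indices in enumerate(fold_index_lists, start=1):
    x, y = build_fold(texts, labels, indices, toy_vectorize)
    folds[n] = {"x": x, "y": y}
```

After the loop, `folds[1]["x"]` and `folds[1]["y"]` take the place of `X_train1` and `Y_train1`, and so on for each fold. Keeping the results in a dict keyed by fold number avoids needing a separately named variable per fold, which is what forced the copy-pasting in the first place.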