How to write a method or for loop for very similar code snippets

Time: 2018-11-24 18:38:39

Tags: python python-3.x for-loop machine-learning methods

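I have a code sample below. The code works perfectly, but my problem is that it is not clean and takes far too many lines. I believe it could be shortened with a method or a for loop, but I don't know how to do that. The snippets are 90% identical; only the variable names change. I've included only 2 snippets here, but my code actually consists of 5.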

#KFOLD-1

all_fold_X_1 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain1:
        all_fold_X_1 = all_fold_X_1.append({index:i}, ignore_index=True)

X_train1 = count_vect.fit_transform(all_fold_X_1[0].values)

Y_train1 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain1]
Y_train1 = np.asarray(Y_train1)

#KFOLD-2

all_fold_X_2 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain2:
        all_fold_X_2 = all_fold_X_2.append({index:i}, ignore_index=True)

X_train2 = count_vect.fit_transform(all_fold_X_2[0].values)

Y_train2 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain2]
Y_train2 = np.asarray(Y_train2)

1 Answer:

Answer 0 (score: 1)

Since no complete example was provided, I'm making some assumptions. Perhaps something along these lines:

def train(dataVar, dfNew, foldIndices):
    # foldIndices: the row positions that belong to this fold (e.g. kFoldsTrain1)
    ret = {}
    index = 0
    for k, i in enumerate(dfNew['Sentence_txt'].values):
        if k in foldIndices:
            # Note: DataFrame.append was removed in pandas 2.0
            dataVar = dataVar.append({index:i}, ignore_index=True)

    ret['x'] = count_vect.fit_transform(dataVar[0].values)
    ret['y'] = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in foldIndices]
    ret['y'] = np.asarray(ret['y'])

    return ret

#KFOLD-1
kfold1 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain1)

#KFOLD-2
kfold2 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain2)

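You probably get the idea. You may not need the second parameter of the function, depending on whether the variable 'dfNew' is global. I'm also far from a Python expert! ;)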
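
The same idea can be pushed one step further by keeping the fold index collections in a list and looping over them, so the numbered variables disappear entirely. The following is only a minimal sketch under some assumptions: `build_fold`, its `vectorize` parameter, `all_folds`, and the toy data are hypothetical names introduced here and are not part of the question. In the question's setting you would presumably pass `dfNew['Sentence_txt'].values`, `dfNew['Sentence_Polarity'].values`, and `count_vect.fit_transform`, and wrap the labels with `np.asarray` if an array is needed.

```python
def build_fold(texts, labels, fold_positions, vectorize):
    """Return the training inputs and labels for one fold.

    texts, labels: parallel sequences of sentences and their polarities.
    fold_positions: row positions that belong to this fold.
    vectorize: callable applied to the selected texts (hypothetical
        parameter; in the question this role is played by
        count_vect.fit_transform).
    """
    wanted = set(fold_positions)  # set lookup avoids rescanning a list per row
    # Rows are kept in their original order, as in the question's loops.
    fold_texts = [t for k, t in enumerate(texts) if k in wanted]
    fold_labels = [y for k, y in enumerate(labels) if k in wanted]
    return {"x": vectorize(fold_texts), "y": fold_labels}


# Toy stand-ins for the question's data, only to make the sketch runnable.
texts = ["good film", "bad film", "great cast", "dull plot"]
labels = [1, 0, 1, 0]
all_folds = [[0, 1, 2], [1, 2, 3]]  # stand-ins for kFoldsTrain1, kFoldsTrain2, ...

# One loop replaces the copy-pasted #KFOLD-1 ... #KFOLD-5 sections.
# `list` is used as a trivial vectorizer so the example needs no libraries.
results = [build_fold(texts, labels, fold, vectorize=list) for fold in all_folds]
```

With this layout, `results[0]` plays the role of `X_train1`/`Y_train1`, `results[1]` the role of `X_train2`/`Y_train2`, and adding a fifth fold only means adding one more entry to `all_folds`.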