Question

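While working through a predictive modeling exercise, I could not understand what the "flag" is used for. I searched Google but could not find a good explanation.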

import pandas as pd

train = pd.read_csv('C:/Users/Analytics Vidhya/Desktop/challenge/Train.csv')
test = pd.read_csv('C:/Users/Analytics Vidhya/Desktop/challenge/Test.csv')
train['Type'] = 'Train'  # Create a flag for the Train and Test data sets
test['Type'] = 'Test'
fullData = pd.concat([train, test], axis=0)  # Combine the Train and Test data sets

Can you explain what a flag means in Python pandas, and why it matters? Thanks.

Answer 1

I think it will be easier and quicker to show this with an example:

In [102]: train = pd.DataFrame(np.random.randint(0, 5, (5, 3)), columns=list('abc'))

In [103]: test = pd.DataFrame(np.random.randint(0, 5, (3, 3)), columns=list('abc'))

In [104]: train
Out[104]:
   a  b  c
0  3  4  0
1  0  0  1
2  2  4  1
3  4  2  0
4  2  4  0

In [105]: test
Out[105]:
   a  b  c
0  1  0  3
1  3  3  0
2  4  4  3

Let's add a Type column to each DF:

In [106]: train['Type'] = 'Train'

In [107]: test['Type'] = 'Test'

Now let's concatenate the two DFs vertically. The Type column records which DF each row came from, so the two sets can still be told apart:

In [108]: fullData = pd.concat([train,test], axis=0)

In [109]: fullData
Out[109]:
   a  b  c   Type
0  3  4  0  Train
1  0  0  1  Train
2  2  4  1  Train
3  4  2  0  Train
4  2  4  0  Train
0  1  0  3   Test
1  3  3  0   Test
2  4  4  3   Test
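
To make the purpose of the flag concrete, here is a minimal sketch of the usual follow-up step: filtering on Type to get the two sets back after working on the combined frame. The a_plus_b column is a hypothetical stand-in for whatever shared preprocessing you apply, and the names train_out and test_out are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in frames shaped like the ones above (the values are arbitrary).
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.integers(0, 5, (5, 3)), columns=list("abc"))
test = pd.DataFrame(rng.integers(0, 5, (3, 3)), columns=list("abc"))

# Flag each frame, then stack them vertically.
train["Type"] = "Train"
test["Type"] = "Test"
fullData = pd.concat([train, test], axis=0)

# Hypothetical shared preprocessing, applied once to the combined data.
fullData["a_plus_b"] = fullData["a"] + fullData["b"]

# Use the flag to recover each set, then drop the helper column.
train_out = fullData[fullData["Type"] == "Train"].drop(columns="Type")
test_out = fullData[fullData["Type"] == "Test"].drop(columns="Type")

print(train_out.shape, test_out.shape)
```

Note that the index labels repeat after pd.concat (0 to 4, then 0 to 2), so the index alone cannot tell you where a row came from. That is why the Type column is used for the split.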
