Combine 2 columns based on the value in one of the columns in Python

Time: 2018-02-03 04:21:36

Tags: python dataframe

I want to combine some of the values in one column with the contents of another column in Python. The column holds the values "yes", "No" and "years". I only want to move all the "years" values into the other column. My data looks like this:

My data:

+--------------------+---------------+
| Husband_education  | Husband black |
+--------------------+---------------+
| less than 12       | years         |
+--------------------+---------------+
| 12 -15 years       | No            |
+--------------------+---------------+
| 12-15 years        | yes           |
+--------------------+---------------+

Desired output:

+--------------------+---------------+
| Husband_education  | Husband black |
+--------------------+---------------+
| less than 12 years |     ---       |
+--------------------+---------------+
| 12 -15 years       | No            |
+--------------------+---------------+
| 12-15 years        | yes           |
+--------------------+---------------+

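I want every value equal to "years" moved into the first column, while the "yes" and "No" values stay in the second column. I have 3,772 rows.

My code looks like this. Any ideas?

for row in df['Husband_Black']:
    if 'years' in row:
        df['Husband_Education'] = df['Husband_Education'].astype(str) + df['Husband_Black']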

2 answers:

Answer 0: (score: 0)

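"I only want to move all the 'years' values into the other column. My data looks like this": what I understand from this is that only the "years" part should be moved to the other column. There is a crude but easy way to extract the year values: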

# Read your data line by line.
column = {}
counter = 1
with open("yourdatafile", "r") as data:
    for line in data:
        # Split the line on the word 'years'; a line such as
        # 'less than 12 years' becomes ['less than 12 ', ''].
        parts = line.rstrip("\n").split("years")
        # Record the stripped pieces of this line under its line number.
        column[counter] = [part.strip() for part in parts]
        counter = counter + 1

Answer 1: (score: 0)

Problem solved. I was able to use np and handle the whole DataFrame in one line instead of iterating:

    import numpy as np
    df['Husband_Education'] = np.where(df['Husband_Black']=='years',  df['Husband_Education'].map(str) + ' ' +'years', df['Husband_Education'])
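
For anyone who wants to try this end to end, here is a minimal sketch on the three sample rows from the question. It also clears the stray "years" from the second column, which the one-liner does not do. The `misplaced` name is only illustrative, and the "---" placeholder is an assumption taken from the desired-output table; use whatever value suits your data.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's table; the real frame has 3,772 rows.
df = pd.DataFrame({
    "Husband_Education": ["less than 12", "12 -15 years", "12-15 years"],
    "Husband_Black": ["years", "No", "yes"],
})

# Boolean mask for rows where the word 'years' landed in the wrong column.
misplaced = df["Husband_Black"] == "years"

# Append ' years' to the education value on those rows only.
df["Husband_Education"] = np.where(
    misplaced,
    df["Husband_Education"].astype(str) + " years",
    df["Husband_Education"],
)

# Replace the stray 'years' with a placeholder ('---' is an assumption
# based on the desired-output table in the question).
df.loc[misplaced, "Husband_Black"] = "---"

print(df)
```

The mask is computed once, before either column changes, so both assignments act on the same rows.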