Split a column with pandas and fill another column with the extracted values

Time: 2016-03-14 16:42:46

Tags: python string pandas split

The class_name column contains both the course name and the cohort number. I want to split it into two columns (name, cohort number).

FROM:

| class_name |

| introduction to programming 1th |
| introduction to programming 2th |
| introduction to programming 3th |
| introduction to programming 4th |
| algorithms and data structure 1th |
| algorithms and data structure 2th |
| object-oriented programming |
| database systems |

(I know it should read 1st, 2nd, 3rd, but the strings are in my language, where the same character is used after every number.)

TO:

| class_name | class_cohort |

| introduction to programming | 1 |
| introduction to programming | 2 |
| introduction to programming | 3 |
| introduction to programming | 4 |
| algorithms and data structure | 1 |
| algorithms and data structure | 2 |
| object-oriented programming | 1 |
| database systems | 1 |

Here is the code I have been working on:

import pandas as pd

course_count = 100
df = pd.read_csv("course.csv", nrows=course_count)

cols_interest=['class_name', 'class_department', 'class_type', 'student_target', 'student_enrolled']

df = df[cols_interest]
df.insert(1, 'class_cohort', 0)

# this is how I extract the numbers
df['class_name'].str.extract(r'(\d)', expand=False).head()

# but I cannot figure out a way to copy those values into column 'class_cohort' which I filled with 0's.

# once I figure that out, I plan to discard the last digits
df['class_name'] = df['class_name'].map(lambda x: str(x)[:-1])

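I briefly looked into a solution where I would put a comma before 1th, 2th, 3th and then split the column using the comma as the delimiter, but I could not find a way to replace \s1th -> ,1th for every number.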
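The comma idea described above can be made to work with a regex replacement that uses a capture group. The following is only a minimal sketch under stated assumptions: the sample frame is hypothetical, every row is assumed to end in a space, one or more digits, and "th", and the "th" is dropped during the replacement rather than kept.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's class_name column.
df = pd.DataFrame({
    "class_name": [
        "introduction to programming 1th",
        "introduction to programming 2th",
        "algorithms and data structure 1th",
    ]
})

# Replace the trailing " <digits>th" with ",<digits>" via a backreference.
with_comma = df["class_name"].str.replace(r"\s(\d+)th$", r",\1", regex=True)

# Split once from the right on the comma, producing two columns (0 and 1).
parts = with_comma.str.rsplit(",", n=1, expand=True)
df["class_name"] = parts[0]
df["class_cohort"] = parts[1].astype(int)

print(df)
```

Rows without the suffix would contain no comma and end up with a missing cohort, so this sketch only fits data where every value carries the suffix.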

1 Answer:

Answer 0: (score: 1)

You can use indexing by position (this assumes every value ends in a single-digit suffix such as " 1th"):

df['class_cohort'] = df['class_name'].str[-3:-2]
df['class_name'] = df['class_name'].str[:-4]
print(df)
   class_name class_cohort
0       cs101            1
1       cs101            2
2       cs101            3
3       cs101            4
4  algorithms            1
5  algorithms            2

Or use str.extract:

# expand=False returns a Series for a single capture group
df['class_cohort'] = df['class_name'].str.extract(r'(\d)', expand=False)
df['class_name'] = df['class_name'].str[:-4]
print(df)
                      class_name class_cohort
0    introduction to programming            1
1    introduction to programming            2
2    introduction to programming            3
3    introduction to programming            4
4  algorithms and data structure            1
5  algorithms and data structure            2
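Both approaches above cut a fixed four characters from every value, which would mangle rows such as "object-oriented programming" that carry no suffix. A hedged sketch of one way to cover those rows follows. The sample frame is hypothetical, and defaulting a missing suffix to cohort 1 is an assumption taken from the question's TO table.

```python
import pandas as pd

# Hypothetical sample data, including rows without a cohort suffix.
df = pd.DataFrame({
    "class_name": [
        "introduction to programming 1th",
        "introduction to programming 2th",
        "algorithms and data structure 1th",
        "object-oriented programming",
        "database systems",
    ]
})

# Capture the digits of a trailing " <digits>th"; rows without it yield NaN.
cohort = df["class_name"].str.extract(r"\s(\d+)th$", expand=False)

# Assumption: a missing suffix means cohort 1, as in the question's TO table.
df["class_cohort"] = cohort.fillna("1").astype(int)

# Remove the suffix only where it is actually present.
df["class_name"] = df["class_name"].str.replace(r"\s\d+th$", "", regex=True)

print(df)
```

Anchoring the pattern with `$` and requiring the leading whitespace keeps digits elsewhere in a course name from being picked up, and `\d+` also covers cohorts of 10 or more.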