I have a DataFrame df, as shown below:
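I want to create a new DataFrame with the names in one column and the numbers from between the parentheses in the other. The input data is: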
import pandas as pd

data = {'A': ['Jason (121439)', 'Molly (194439)', 'Tina (114439)', 'Jake (127859)', 'Amy (122579)'],
        'B': ['Bob (127439)', 'Mark (136489)', 'Tyler (121443)', 'John (126259)', 'Anna(174439)'],
        'C': ['Jay (121596)', 'Ben (12589)', 'Toom (123586)', 'Josh (174859)', 'Al(121659)'],
        'D': ['Paul (123839)', 'Aaron (124159)', 'Steve (161899)', 'Vince (179839)', 'Ron (128379)']}
df = pd.DataFrame(data)
The result I am after would look like this (shown for the names in column A):
data2 = {'Name': ['Jason ', 'Molly ', 'Tina ', 'Jake ', 'Amy '],
         'ID#': ['121439', '194439', '114439', '127859', '122579']}
result = pd.DataFrame(data2)
I tried different approaches, but none of them worked:
1) Looping over the columns and splitting each cell at the opening parenthesis:
List_name=pd.DataFrame()
List_id=pd.DataFrame()
List_both=pd.DataFrame(columns=["Name","ID"])
for i in df.columns:
    left=df[i].str.split("(",1).str[0]
    right=df[i].str.split("(",1).str[1]
    List_name=List_name.append(left)
    List_id=List_id.append(right)
List_both=pd.concat([List_name,List_id], axis=1)
List_both
2) Applying a function to all cells:
Names = lambda x: x.str.split("(",1).str[0]
IDS = lambda x: x.str.split("(",1).str[1]
But I would like to know how to store the output in a DataFrame that looks like ...
Answer 0 (score: 2)
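You can use stack to flatten all four columns into a single Series, then str.extract with named capture groups to split each string into a Name column and an ID column.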
(df.stack()
   .str.strip()
   .str.extract(r'(?P<Name>.*?)\s*\((?P<ID>.*?)\)$')
   .reset_index(drop=True))
     Name      ID
0   Jason  121439
1     Bob  127439
2     Jay  121596
3    Paul  123839
4   Molly  194439
5    Mark  136489
6     Ben   12589
7   Aaron  124159
8    Tina  114439
9   Tyler  121443
10   Toom  123586
11  Steve  161899
12   Jake  127859
13   John  126259
14   Josh  174859
15  Vince  179839
16    Amy  122579
17   Anna  174439
18     Al  121659
19    Ron  128379
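For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same stack plus str.extract idea, using the sample data from the question. The `unmatched` check and the conversion of `ID` to integers are additions for illustration and are not part of the original answer; they assume every cell follows the "name (digits)" pattern.

```python
import pandas as pd

# Same sample data as in the question.
data = {
    'A': ['Jason (121439)', 'Molly (194439)', 'Tina (114439)', 'Jake (127859)', 'Amy (122579)'],
    'B': ['Bob (127439)', 'Mark (136489)', 'Tyler (121443)', 'John (126259)', 'Anna(174439)'],
    'C': ['Jay (121596)', 'Ben (12589)', 'Toom (123586)', 'Josh (174859)', 'Al(121659)'],
    'D': ['Paul (123839)', 'Aaron (124159)', 'Steve (161899)', 'Vince (179839)', 'Ron (128379)'],
}
df = pd.DataFrame(data)

# Flatten every cell into one Series, then split each string with a regex.
# The named groups (Name, ID) become the column names of the result.
result = (
    df.stack()
      .str.strip()
      .str.extract(r'(?P<Name>.*?)\s*\((?P<ID>.*?)\)$')
      .reset_index(drop=True)
)

# Cells that do not match the pattern come back as NaN in both columns,
# so count them before converting the IDs from strings to integers.
unmatched = int(result['Name'].isna().sum())
result['ID'] = result['ID'].astype(int)

print(unmatched)
print(result.head())
```

The `\s*` before the opening parenthesis is what lets entries such as 'Anna(174439)' and 'Al(121659)', which have no space before the parenthesis, match the same pattern as the rest.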