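Pandas newbie here. What's the best way to strip each team's record out of the Team column and put it into a new column? Thanks in advance!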
Rank Team
0 1 LA Rams (5-0)
1 2 New Orleans (4-1)
2 3 New England (3-2)
3 4 Kansas City (5-0)
4 5 Pittsburgh (2-2-1)
5 6 Baltimore (3-2)
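Answer 0 (score: 0)
Interesting question.
Unfortunately, Series.str.extract will easily grab the record but will not remove it from the original column. The regex here is deliberately naive: it matches anything inside (...), in case some teams have more complex records: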
# Raw string avoids invalid-escape warnings; expand=False returns a Series
df['Record'] = df['Team'].str.extract(r'(\(.*?\))', expand=False)
print(df)
#    Rank                Team   Record
# 0     1       LA Rams (5-0)    (5-0)
# 1     2   New Orleans (4-1)    (4-1)
# 2     3   New England (3-2)    (3-2)
# 3     4   Kansas City (5-0)    (5-0)
# 4     5  Pittsburgh (2-2-1)  (2-2-1)
# 5     6     Baltimore (3-2)    (3-2)
So this calls for writing our own function:
import re

record_regex = re.compile(r'(\(.*?\))')
records = []

def extract_and_remove_record(x):
    # Collect the first "(...)" group, then return the name without it
    record = record_regex.findall(x)[0]
    records.append(record)
    # strip() drops the space left behind before the parenthesis
    return record_regex.sub('', x).strip()

df['Team'] = df['Team'].apply(extract_and_remove_record)
df['Record'] = records
print(df)
#    Rank         Team   Record
# 0     1      LA Rams    (5-0)
# 1     2  New Orleans    (4-1)
# 2     3  New England    (3-2)
# 3     4  Kansas City    (5-0)
# 4     5   Pittsburgh  (2-2-1)
# 5     6    Baltimore    (3-2)
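For comparison, the same result can probably be had without a custom function by pairing Series.str.extract with Series.str.replace. This is only a sketch, not part of the answer above: the three-row sample frame and the `pattern` variable are made up for illustration, and it assumes every Team value contains exactly one parenthesised record.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's layout
df = pd.DataFrame({
    "Rank": [1, 2, 5],
    "Team": ["LA Rams (5-0)", "New Orleans (4-1)", "Pittsburgh (2-2-1)"],
})

# Optional leading whitespace, then the parenthesised record as the only group
pattern = r"\s*(\(.*?\))"

# Pull the record into its own column (expand=False returns a Series)
df["Record"] = df["Team"].str.extract(pattern, expand=False)

# Remove the matched text from the team name
df["Team"] = df["Team"].str.replace(pattern, "", regex=True).str.strip()

print(df)
```

Both calls are vectorized string methods, so there is no module-level list to keep in sync with the DataFrame's rows.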
Answer 1 (score: 0)
Another approach that doesn't involve regex tricks.
df[['Team Name', 'Team Records']] = df.Team.apply(lambda x: pd.Series(x.rstrip(')').split(' (')))
df.drop('Team', axis=1, inplace=True)
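A variation on the same split idea, sketched with the vectorized .str accessor instead of apply. The sample frame and the `parts` variable are hypothetical, and it assumes each value ends with a single " (record)" suffix. Note that, as in the answer above, the parentheses are dropped from the record.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's layout
df = pd.DataFrame({
    "Rank": [1, 2, 5],
    "Team": ["LA Rams (5-0)", "New Orleans (4-1)", "Pittsburgh (2-2-1)"],
})

# Drop the trailing ")" and split once from the right on the literal " ("
parts = df["Team"].str.rstrip(")").str.rsplit(" (", n=1, expand=True)

df["Team Name"] = parts[0]
df["Team Records"] = parts[1]
df = df.drop(columns="Team")

print(df)
```

Splitting from the right with n=1 means a team name that itself contains " (" would still be cut only at the last occurrence.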