Question

I have a df of high schools. I am trying to remove the common endings from the school names.

in[1]:df
out[2]:
     time    school
1    09:00   Brown Academy
2    10:00   Covfefe High School
3    11:00   Bradley High
4    12:00   Johnson Prep

school_endings = ['Academy', 'Prep', 'High', 'High School']

Desired:

out[3]:
     time    school
1    09:00   Brown
2    10:00   Covfefe
3    11:00   Bradley
4    12:00   Johnson

Answer 1

Use split and keep the first word (this assumes the school name itself is a single word):

df.school = df.school.str.split(' ').str[0]

    school  time
0   Brown   09:00
1   Covfefe 10:00
2   Bradley 11:00
3   Johnson 12:00

Answer 2

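endings = ['Academy', 'Prep', 'High', 'High School']

# Sort longest first so that 'High School' is removed before 'High'
endings = sorted(endings, key=len, reverse=True)

# Each ending is treated as a regex and replaced with ''; strip() drops the leftover space
df.assign(school=df.school.replace(endings, '', regex=True).str.strip())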

    time   school
1  09:00    Brown
2  10:00  Covfefe
3  11:00  Bradley
4  12:00  Johnson

Answer 3

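Use the rstrip() method to remove unwanted characters from the end of the original string. Note that rstrip() removes any trailing characters that appear in its argument rather than an exact suffix, so it happens to work for this name but can remove too much for others. E.g.:

mystring = "Brown Academy"

mystring.rstrip("Academy") -> will give you the output: 'Brown ' (with a trailing space, which a further .rstrip() removes)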
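
Because rstrip() works on a set of characters, a suffix-aware helper may be safer. The sketch below is only an illustration: `strip_ending` is a hypothetical helper name, and it assumes each ending is separated from the name by a single space. On Python 3.9+ the built-in `str.removesuffix` could be used for the same purpose.

```python
def strip_ending(name, endings):
    """Remove one known ending (longest match first) from the end of name."""
    for ending in sorted(endings, key=len, reverse=True):
        suffix = ' ' + ending  # assumption: a single space precedes the ending
        if name.endswith(suffix):
            return name[:-len(suffix)]
    return name  # no known ending found, leave the name unchanged


endings = ['Academy', 'Prep', 'High', 'High School']

# rstrip() strips characters, not a suffix, so it can eat into the name itself
print(repr("Emma Academy".rstrip(" Academy")))         # 'E'
print(repr(strip_ending("Emma Academy", endings)))     # 'Emma'
print(repr(strip_ending("Covfefe High School", endings)))  # 'Covfefe'
```

Checking the longest ending first keeps 'High School' from being shortened to 'School' leftovers by the shorter 'High' entry.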

Answer 4

I would probably use a regex replacement.
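
A minimal sketch of a regex replacement for this question, assuming the endings list from the question and that an ending should only be removed when it sits at the very end of the name. The pattern and the names `pattern` and `remove_ending` are illustrative, not the answerer's original code.

```python
import re

endings = ['Academy', 'Prep', 'High', 'High School']

# Build one alternation, longest ending first, with each ending escaped
alternatives = '|'.join(
    re.escape(e) for e in sorted(endings, key=len, reverse=True)
)

# Match the preceding whitespace plus one ending, anchored at the end of the string
pattern = rf'\s+(?:{alternatives})$'


def remove_ending(name):
    """Strip one trailing school ending, if present."""
    return re.sub(pattern, '', name)


names = ['Brown Academy', 'Covfefe High School', 'Bradley High', 'Johnson Prep']
cleaned = [remove_ending(n) for n in names]
print(cleaned)  # ['Brown', 'Covfefe', 'Bradley', 'Johnson']

# Anchoring with $ leaves a name such as 'Highland Prep' as 'Highland'
print(remove_ending('Highland Prep'))  # Highland
```

With pandas, the same pattern string could presumably be applied to the column as `df['school'].str.replace(pattern, '', regex=True)`; that call is not executed in the sketch above.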


Removing substrings in pandas, Python
