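I want to add a new column containing a URL built from a base/template form, with certain values inserted into it based on the information in each row.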
import operator
import pandas as pd

base_link = "https://www.vectorbase.org/Glossina_fuscipes/Location/View?r=%(scaffold)s:%(start)s-%(end)s"

# simplify getting column data from data_frame
start = operator.attrgetter('start')
end = operator.attrgetter('end')
scaffold = operator.attrgetter('seqname')

def get_links_to_genome_browser(data_frame):
    # one copy of the template per row
    base_links = pd.Series([base_link] * len(data_frame.index))
    links = base_links % {"scaffold": scaffold(data_frame), "start": start(data_frame), "end": end(data_frame)}
    return links
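Answer 0: (score: 2)
So I'm answering my own question, but I finally figured it out, so I want to close it out and record the solution.
The solution is to use data_frame.apply(), but to change the indexing syntax inside the get_links_to_genome_browser function to Series indexing syntax rather than DataFrame indexing syntax.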
def get_links_to_genome_browser(series):
    # series is a single row; index it by column label (.ix was removed in pandas 1.0)
    link = base_link % {"scaffold": series['seqname'], "start": series['start'], "end": series['end']}
    return link
Then call it like this:
df.apply(get_links_to_genome_browser, axis=1)
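For anyone who wants to try this end to end, here is a minimal self-contained sketch of the apply approach. The sample rows (scaffold names and coordinates) are made up for illustration; only the column names and the URL template come from the question. It uses plain label indexing on the row and assigns the result to a new `url` column, which is an assumption about what the new column should be called.

```python
import pandas as pd

base_link = "https://www.vectorbase.org/Glossina_fuscipes/Location/View?r=%(scaffold)s:%(start)s-%(end)s"

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "seqname": ["Scaffold1", "Scaffold2"],
    "start": [100, 2500],
    "end": [200, 3000],
})

def get_links_to_genome_browser(row):
    # With axis=1, apply() passes each row as a Series indexed by column name.
    return base_link % {
        "scaffold": row["seqname"],
        "start": row["start"],
        "end": row["end"],
    }

# Store the per-row result as a new column ("url" is an assumed name).
df["url"] = df.apply(get_links_to_genome_browser, axis=1)
```

Because `axis=1` hands the function one row at a time, the `%` formatting operates on an ordinary string and an ordinary dict, which is why this works where formatting a whole Series against a dict does not.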
Answer 1: (score: 0)
I think I understand what you're asking. Let me know if not.
base_link = "https://www.vectorbase.org/Glossina_fuscipes/Location/View?r=%(scaffold)s:%(start)s-%(end)s"
Then you can do something like this:
data_frame['url'] = base_link + data_frame['start'] + data_frame['end'] + etc...
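Note that the line above is only an outline. Concatenating directly would leave the `%(...)s` placeholders in the result, and integer columns have to be converted to strings before they can be joined to text. A rough sketch of a vectorized version, assuming a hypothetical `url_prefix` (the template from the question with the placeholders stripped off) and made-up sample rows:

```python
import pandas as pd

# Hypothetical prefix: the question's base_link without the %(...)s placeholders.
url_prefix = "https://www.vectorbase.org/Glossina_fuscipes/Location/View?r="

# Made-up sample data; only the column names come from the question.
data_frame = pd.DataFrame({
    "seqname": ["Scaffold1", "Scaffold2"],
    "start": [100, 2500],
    "end": [200, 3000],
})

# Build the URL column-wise; numeric columns are cast to str first.
data_frame["url"] = (
    url_prefix
    + data_frame["seqname"]
    + ":"
    + data_frame["start"].astype(str)
    + "-"
    + data_frame["end"].astype(str)
)
```

This avoids calling a Python function once per row, at the cost of spelling out the separators by hand instead of reusing the template string.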