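I am downloading star catalogs from Vizier (using astroquery). The catalogs in question do not include star names, so I get the names from SIMBAD (also using astroquery) by querying all SIMBAD stars within 1 arcsec of each star in my Vizier catalog.
I then need to match the two sets by ra/dec coordinates. However, both the Vizier and the SIMBAD coordinates may be slightly inaccurate, so I cannot do an exact match.
My current solution is to specify a tolerance and, for each Vizier star, call the function below to loop through the SIMBAD stars, testing whether the coordinates agree within that tolerance. As a double check, because stars can be very close together, I also check that the stars' magnitudes agree to within 0.1 mag.
This all works, but for a Vizier catalog of about 2,000 stars and a SIMBAD dataset of similar size, it takes over 2 minutes to run. I am looking for ideas to speed up this step.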
def get_simbad_name(self, vizier_star, simbad_stars, tolerance):
    """
    Searches simbad_stars to find the SIMBAD name of the star
    referenced in vizier_star.

    A match is deemed to exist if a star in simbad_stars has both
    ra and dec +/- tolerance of the target vizier_star and if their V
    magnitudes, rounded to one decimal place, also match.

    Parameters
    ==========
    vizier_star : astropy.table.Row
        Row of results from Vizier query, corresponding to a star in a
        Vizier catalog. Columns of interest to this function are:
            '_RAJ2000' : float [Right ascension in decimal degrees]
            '_DEJ2000' : float [Declination in decimal degrees]
            'Vmag' : float [V magnitude (to 3 decimal places)]
    simbad_stars : list of dict
        List of star data derived from a SIMBAD query. Keys of interest
        to this function are:
            'ra' : float [Right ascension in decimal degrees (ICRS/J2000)]
            'dec' : float [Declination in decimal degrees (ICRS/J2000)]
            'Vmag' : float [V magnitude (to 3 decimal places)]
            'name' : str [SIMBAD primary id of star]
    tolerance : float
        The tolerance, in degrees, to be used in determining whether
        the ra/dec coordinates match.

    Returns
    =======
    name : str
        If match then returns the SIMBAD name. If no match returns
        an empty string.

    Notes
    =====
    simbad_stars are not all guaranteed to have Vmag. Any that don't are
    ignored.
    """
    for item in simbad_stars:
        # Skip SIMBAD entries that have no V magnitude.
        try:
            approx_Vmag = round(item['Vmag'], 1)
        except KeyError:
            continue
        # Box test on ra/dec plus equality of the rounded V magnitudes.
        if ((vizier_star['_RAJ2000'] > item['ra'] - tolerance) and
                (vizier_star['_RAJ2000'] < item['ra'] + tolerance) and
                (vizier_star['_DEJ2000'] > item['dec'] - tolerance) and
                (vizier_star['_DEJ2000'] < item['dec'] + tolerance) and
                (round(vizier_star['Vmag'], 1) == approx_Vmag)):
            return item['name']
    return ''
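Some further thoughts after the comments:
The match success rate is very high (about 99%), so in almost all cases the loop exits early and does not have to iterate over all of simbad_stars.
I could improve things further by pre-sorting simbad_stars by ra and using a binary chop to get the index at which to start the loop.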
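Answer 0 (score: 1)
The question seems to have been closed because of the way it was asked, but there are two useful answers:
(1) For positional cross-matching, see https://docs.astropy.org/en/stable/coordinates/matchsep.html
(2) For the general case of what you are doing here, you should use vectorized operations rather than looping over the sources.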
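Answer 1 (score: 0)
By pre-sorting simbad_stars and using bisect_left and bisect_right to define the start and end indices within it, I managed to get a 20x speed-up.
If anyone is interested, I can post the code (it is considerably longer than the original because it is a more general solution that uses a custom class).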
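The code for this answer was not posted, so the following is only a sketch of how the idea might look, not the answerer's actual implementation. The names build_index and get_simbad_name_sorted are invented here. It sorts the SIMBAD stars by ra once, then uses bisect to restrict each search to the stars whose ra lies strictly inside the tolerance window, applying the same dec and rounded-Vmag tests as the original function. Like the original, it ignores the ra wrap-around at 0/360 degrees, and if several stars qualify it returns the first one in ra order, which may differ from the original list order.

```python
from bisect import bisect_left, bisect_right


def build_index(simbad_stars):
    """Sort the SIMBAD stars by ra once and keep the ra values in a parallel list."""
    ordered = sorted(simbad_stars, key=lambda star: star['ra'])
    ras = [star['ra'] for star in ordered]
    return ordered, ras


def get_simbad_name_sorted(vizier_star, ordered, ras, tolerance):
    """Apply the original matching test, but only inside the ra window."""
    ra = vizier_star['_RAJ2000']
    dec = vizier_star['_DEJ2000']
    vmag = round(vizier_star['Vmag'], 1)

    lo = bisect_right(ras, ra - tolerance)  # first index with ra > ra - tolerance
    hi = bisect_left(ras, ra + tolerance)   # first index with ra >= ra + tolerance

    for item in ordered[lo:hi]:
        # SIMBAD entries without a V magnitude are ignored.
        if 'Vmag' not in item:
            continue
        if (abs(item['dec'] - dec) < tolerance
                and round(item['Vmag'], 1) == vmag):
            return item['name']
    return ''
```

build_index is called once per SIMBAD dataset; each lookup then costs two binary searches plus a scan of the handful of stars in the window.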