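I have the following problem:
I have a list of tuples representing packages and their versions (some packages have no version specified, which is fine), as shown below: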
('lib32c-dev', '', '', '')
('libc6-i386', '2.4', '', '')
('lib32c-dev', '', '', '')
('libc6-i386', '1.06', '', '')
('libc6-i386', '2.4', '', '')
('lib32c-dev', '', '', '')
('libc6-i386', '2.16', '', '')
('libc6-dev', '', '', '')
('', '', 'libc-dev', '')
('libc6-dev', '', '', '')
('', '', 'libc-dev', '')
('libncurses5-dev', '5.9+20150516-2ubuntu1', '', '')
('libc6-dev-x32', '', '', '')
('libc6-x32', '2.16', '', '')
('libncursesw5-dev', '5.9+20150516-2ubuntu1', '', '')
('libc6-dev-x32', '', '', '')
('libc6-x32', '2.16', '', '')
('libc6-dev-x32', '', '', '')
('libc6-x32', '2.16', '', '')
('libncurses5-dev', '5.9+20150516-2ubuntu1', '', '')
('libncursesw5-dev', '5.9+20150516-2ubuntu1', '', '')
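As you can see, some packages appear in the list more than once, with different versions.
I need to parse the list of tuples so that I end up with the latest version of each package before converting the list to a dictionary.
PS: The positions of the package name and its version are not fixed. However, the version always comes right after the package name, so can we say the version is always at position 1 or 3?
Thanks for your help.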
Answer 0 (score: 0)
You should actually turn it into a dictionary first:
data = {}
for value in my_list:
    data2 = iter(value)
    # find the first non-empty entry in our subtuple; that is our package name
    name = next(d for d in data2 if d)
    version = next(data2, '')  # the version is whatever immediately follows the package name
    data.setdefault(name, []).append(version)
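That gets you 90% of the way there, although it depends on the package name being the first element... which apparently is not always the case...
Here is one way to get a version number out of a string: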
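import re

def float_version_from_string(version_string):
    try:
        # raw string with an escaped dot, so "." only matches a literal dot
        return float(re.findall(r"\d+\.?\d*", version_string)[0])
    except (IndexError, ValueError):
        # no leading number found (for example an empty version string)
        return -1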
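To finish the job, the per-package lists still have to be reduced to a single version each. A minimal sketch, assuming a `data` dictionary shaped like the one built above and reusing the float-based helper; `latest_versions` is a hypothetical name, and the regex here escapes the dot, which is my reading of the intent:

```python
import re

def float_version_from_string(version_string):
    # Leading "major.minor" number as a float, or -1 if there is none.
    try:
        return float(re.findall(r"\d+\.?\d*", version_string)[0])
    except (IndexError, ValueError):
        return -1

def latest_versions(data, key=float_version_from_string):
    # Hypothetical helper: for each package, keep the version ranked highest by `key`.
    return {name: max(versions, key=key) for name, versions in data.items()}

data = {
    'lib32c-dev': ['', '', ''],
    'libc6-x32': ['2.16', '2.16', '2.16'],
    'libncurses5-dev': ['5.9+20150516-2ubuntu1', '5.9+20150516-2ubuntu1'],
}
result = latest_versions(data)
```

Because the key is a float, 2.16 ranks below 2.4, so for a list such as ['2.4', '1.06', '2.16'] this sketch picks '2.4'. Passing a different `key` is the way around that.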
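Answer 1 (score: 0)
The tricky part is finding a comparison function that reliably determines which version is newer. For example, we want 2.16 to be treated as newer than 2.4, and a naive string comparison does not do that. Worse, a float comparison is not merely insufficient: it raises ValueError when the version cannot be converted to a float.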
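These failure modes are easy to reproduce. A small sketch using only built-ins; the variable names are purely illustrative:

```python
# Plain string comparison goes character by character, so '4' beats '1'.
string_says_newer = '2.16' > '2.4'

# Float comparison reads 2.16 as "two point one six", which is less than 2.4.
float_says_newer = float('2.16') > float('2.4')

# Real-world version strings are often not valid floats at all.
try:
    float('5.9+20150516-2ubuntu1')
    float_parse_failed = False
except ValueError:
    float_parse_failed = True
```

In both comparisons 2.16 loses to 2.4, and the third version string cannot be parsed as a float at all.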
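The kind of ordering we want is often called "natural sort" or "human sort", and there are several solutions in this question.
An implementation that compares two values (rather than sorting a list) might look like this: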
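import re

def tryint(s):
    try:
        return int(s)
    except ValueError:
        # not a run of digits: keep the text chunk as it is
        return s

def getchunks(s):
    # split into alternating text and digit chunks, e.g. '2.16' -> ['', 2, '.', 16, '']
    return [tryint(c) for c in re.split('([0-9]+)', s)]

def compare_strings(s1, s2):
    # True if s1 is strictly newer than s2
    return getchunks(s1) > getchunks(s2)

# 2.4 < 2.16
# 2.4 < 2.4.1
# a_1 < a_2
# and so on...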
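To see why the chunked comparison works, it helps to look at what `getchunks` actually returns. The sketch below repeats the two helpers so that it runs on its own (catching only `ValueError` in `tryint`) and also uses `getchunks` as a sort key, which is one way to apply the same idea to a whole list:

```python
import re

def tryint(s):
    try:
        return int(s)
    except ValueError:
        return s

def getchunks(s):
    # Splitting on a captured group keeps the digit runs, so text and
    # numbers alternate: text at even positions, integers at odd ones.
    return [tryint(c) for c in re.split('([0-9]+)', s)]

chunks = getchunks('2.16')  # ['', 2, '.', 16, '']

# The same function works as a sort key for a whole list of versions.
newest_first = sorted(['2.4', '1.06', '2.16', ''], key=getchunks, reverse=True)
```

Since every chunk list starts with a text chunk and then alternates, two lists always hold the same type at the same position, so the comparison never has to order an integer against a string.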
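compare_strings can then be used in a relatively simple algorithm that relies on defaultdict to keep track of the libraries already seen. This assumes the list of tuples is stored in lib_tuples.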
from collections import defaultdict

lib_ver_dict = defaultdict(str)
for lib_tuple in lib_tuples:
    generator = (string for string in lib_tuple if string)
    lib, ver = next(generator), next(generator, '')
    if compare_strings(ver, lib_ver_dict[lib]):
        lib_ver_dict[lib] = ver
The final result is:
'lib32c-dev': ''
'libc6-x32': '2.16'
'libc6-i386': '2.16'
'libncurses5-dev': '5.9+20150516-2ubuntu1'
'libc6-dev': ''
'libc-dev': ''
'libncursesw5-dev': '5.9+20150516-2ubuntu1'
'libc6-dev-x32': ''
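Note that compare_strings does not follow decimal ordering (for example, 2.001 == 2.1); handling that detail would make the code messier (and it probably does not matter). Also, if you do not want the comparison to be case-sensitive, you can update the tryint function to use s.lower() in its last line.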
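Both remarks can be checked with a small variant of the helpers. This is only a sketch: `tryint_ci` and `getchunks_ci` are hypothetical names for the case-insensitive versions, and '1.0RC1' is a made-up version string:

```python
import re

def tryint_ci(s):
    # Hypothetical case-insensitive variant of tryint: digit runs become
    # integers, everything else is lower-cased before comparison.
    try:
        return int(s)
    except ValueError:
        return s.lower()

def getchunks_ci(s):
    return [tryint_ci(c) for c in re.split('([0-9]+)', s)]

# int('001') == int('1'), so the two versions produce identical chunks.
same_chunks = getchunks_ci('2.001') == getchunks_ci('2.1')

# Upper- and lower-case suffixes now compare as equal.
same_case = getchunks_ci('1.0RC1') == getchunks_ci('1.0rc1')
```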
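Edit: Your solution should work, but I generally recommend not modifying a dictionary while iterating over it. Also, zipping keys and values seems to be reliable, but it is easier to call items. Finally, the line del list_versions[:] is pointless: it empties a list that is replaced by max_version a few lines later anyway. You can rewrite your function more concisely: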
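import apt_pkg
from functools import cmp_to_key

apt_pkg.init_system()  # python-apt must be initialised before versions can be compared

def compare_packages(package_dictionary):
    new_dictionary = {}
    for package, versions in package_dictionary.items():
        # max() takes a key function, so the comparison function is wrapped
        version = max(versions, key=cmp_to_key(apt_pkg.version_compare))
        new_dictionary[package] = version or 'Not Specified'
    return new_dictionary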
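`cmp_to_key` works with any comparison function that returns a negative number, zero, or a positive number, so the same pattern can be tried on a machine without python-apt. In the sketch below, `simple_version_compare` is a hypothetical stand-in that only looks at runs of digits; it does not implement Debian's version ordering and is not a replacement for `apt_pkg.version_compare`:

```python
import re
from functools import cmp_to_key

def simple_version_compare(a, b):
    # Hypothetical stand-in for apt_pkg.version_compare: compares only the
    # runs of digits in each string and ignores everything else.
    # Returns a positive number, zero, or a negative number.
    na = [int(n) for n in re.findall(r'\d+', a)]
    nb = [int(n) for n in re.findall(r'\d+', b)]
    return (na > nb) - (na < nb)

def newest(versions, compare=simple_version_compare):
    # max() needs a key function, so the comparator is wrapped.
    return max(versions, key=cmp_to_key(compare))

package_dictionary = {
    'libc6-i386': ['2.4', '1.06', '2.4', '2.16'],
    'lib32c-dev': ['', '', ''],
}
result = {name: newest(versions) or 'Not Specified'
          for name, versions in package_dictionary.items()}
```

Building a new dictionary, as here, avoids changing the input while it is being iterated over.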
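Answer 2 (score: -1)
This is just a dummy implementation written on the fly. It is untested, and it only works if the first element of each tuple is the package name and the second element is its version. It may not give you the exact solution, but it should help you get there.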
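my_list_of_tuples = [...]  # some list
my_new_list = []  # one (package_name, version) pair per package
for package_name, version in (t[:2] for t in my_list_of_tuples):
    for i, (seen_name, seen_version) in enumerate(my_new_list):
        if seen_name == package_name:
            # float() only handles plain numeric versions; '' is treated as 0
            if float(version or 0) > float(seen_version or 0):
                my_new_list[i] = (package_name, version)
            break
    else:
        # first time this package is seen
        my_new_list.append((package_name, version))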
Answer 3 (score: -1)
You can iterate over the list and put a package into the dict if and only if a newer version of it is not already there:
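def version_as_list(s):
    """Convert a string representing a version to a list of integers
    for comparison purposes."""
    # skip empty parts so that '' becomes []; only purely numeric parts are supported
    return [int(i) for i in s.split('.') if i]

data = {}
for name, version, _, _ in my_list:  # my_list: the list of tuples (name assumed)
    if name not in data or version_as_list(data[name]) < version_as_list(version):
        data[name] = version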
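Splitting on dots and calling `int` on each part fails on versions such as '5.9+20150516-2ubuntu1'. A more tolerant sketch, assuming it is acceptable to compare only the runs of digits (a simplification, not Debian's ordering rules); `version_as_list_tolerant` and `packages` are hypothetical names:

```python
import re

def version_as_list_tolerant(s):
    # Every run of digits becomes one integer; '' yields an empty list.
    return [int(n) for n in re.findall(r'\d+', s)]

packages = [
    ('libc6-i386', '2.4', '', ''),
    ('libc6-i386', '1.06', '', ''),
    ('libc6-i386', '2.16', '', ''),
    ('lib32c-dev', '', '', ''),
    ('libncurses5-dev', '5.9+20150516-2ubuntu1', '', ''),
]

data = {}
for name, version, _, _ in packages:
    # Store the first version seen, then replace it only with a newer one.
    if name not in data or version_as_list_tolerant(data[name]) < version_as_list_tolerant(version):
        data[name] = version
```

Like the loop it is based on, this still assumes the package name sits in the first position of each tuple.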
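Answer 4 (score: -1)
This uses a lot of Python built-in and library code. It looks like a long solution, but it really is not; that is only because of the documentation I interleaved. The code itself is just 7 lines.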
import re, itertools

pkgs = [('libc', '', '', ''), ... ] # your list of tuples

# a function to extract a version number from a string
rxVSN = re.compile(r'^(?P<vsn>\d+(\.\d+)?)')
def version(s):
    mo = rxVSN.match(s)
    return float(mo.group('vsn')) if mo is not None else 0.0

# step one: sort the list of tuples by package name and reverse version
# uses built-in sorted() function
# https://docs.python.org/2/library/functions.html#sorted
pkgs = sorted(pkgs, key=lambda tup: (tup[0], -version(tup[1])))

# Now we can use the itertools.groupby() function to group the
# tuples by package name. Then we take the first element of each
# group because that is the one with the highest version number
# (because that's how we sorted them ...)
# https://docs.python.org/2/library/itertools.html#itertools.groupby
for (pkg, versions) in itertools.groupby(pkgs, key=lambda tup: tup[0]):
    print("%s : %s" % (pkg, next(versions)))
Output:
: ('', '', 'libc-dev', '')
lib32c-dev : ('lib32c-dev', '', '', '')
libc6-dev : ('libc6-dev', '', '', '')
libc6-dev-x32 : ('libc6-dev-x32', '', '', '')
libc6-i386 : ('libc6-i386', '2.4', '', '')
libc6-x32 : ('libc6-x32', '2.16', '', '')
libncurses5-dev : ('libncurses5-dev', '5.9+20150516-2ubuntu1', '', '')
libncursesw5-dev : ('libncursesw5-dev', '5.9+20150516-2ubuntu1', '', '')
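The output above shows two side effects of this approach: libc-dev ends up under an empty package name because its tuple starts with an empty field, and libc6-i386 resolves to 2.4 because the versions are compared as floats. A sketch of the same sort-then-group idea that normalises each tuple first and sorts on integer tuples instead; `name_and_version` and `version_key` are hypothetical helpers, and comparing only runs of digits is a simplification:

```python
import itertools
import re

def name_and_version(tup):
    # First non-empty field is the name; the field right after it is the version.
    fields = iter(tup)
    name = next(f for f in fields if f)
    return name, next(fields, '')

def version_key(version):
    # Runs of digits as integers, so (2, 16) sorts above (2, 4).
    return tuple(int(n) for n in re.findall(r'\d+', version))

pkgs = [
    ('libc6-i386', '2.4', '', ''),
    ('libc6-i386', '1.06', '', ''),
    ('', '', 'libc-dev', ''),
    ('libc6-i386', '2.16', '', ''),
]

pairs = sorted((name_and_version(t) for t in pkgs),
               key=lambda p: (p[0], version_key(p[1])))

# Each group is sorted by ascending version, so its last element is the newest.
latest = {name: list(group)[-1][1]
          for name, group in itertools.groupby(pairs, key=lambda p: p[0])}
```

`groupby` only merges adjacent items, which is why the list has to be sorted by package name before grouping.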
Answer 5 (score: -3)
I found the ideal solution. I used:
apt_pkg.version_compare(a,b).
Thanks, everyone.
The function:
import apt_pkg  # assumes apt_pkg.init_system() has already been called

def comparePackages(package_dictionary):
    # loop over the keys and values of package_dictionary
    for package_name, list_versions in list(package_dictionary.items()):
        # newest version seen so far in this sublist
        max_version = ''
        for version in list_versions:
            # passing plain str values was the only way it worked
            candidate = str(version)
            # a result > 0 means candidate is newer than max_version
            if max_version == '' or apt_pkg.version_compare(candidate, max_version) > 0:
                max_version = candidate
        if max_version == '':
            max_version = 'Not Specified'
        # replace the list of versions with the newest one
        package_dictionary[package_name] = max_version
Output:
lib32c-dev : Not Specified
libc6-x32 : 2.16
libc6-i386 : 2.16
libncurses5-dev : 5.9+20150516-2ubuntu1
libc6-dev : Not Specified
libc-dev : Not Specified
libncursesw5-dev : 5.9+20150516-2ubuntu1
libc6-dev-x32 : Not Specified