I have a list of items with their scores. How can I return a list containing the highest-scoring item from each subfolder?
From:
[('../dir_a/1.png', 5.14),
('../dir_a/2.png', 5.15),
('../dir_b/3.png', 4.19),
('../dir_b/4.png', 3.81)]
To:
[('../dir_a/2.png', 5.15),
('../dir_b/3.png', 4.19)]
Answer 0 (score: 0)
import os

lst = [('../dir_a/1.png', 5.14), ('../dir_a/2.png', 5.15), ('../dir_b/3.png', 4.19), ('../dir_b/4.png', 3.81)]
result = {}
for path, score in lst:
    # Use the name of the containing directory as the key
    base_dir = os.path.basename(os.path.dirname(path))
    # Store the entry if the directory is new or the score beats the current best
    if base_dir not in result or score > result[base_dir][1]:
        result[base_dir] = (path, score)
print(list(result.values()))  # Best (path, score) pair per directory
Output:
[('../dir_a/2.png', 5.15), ('../dir_b/3.png', 4.19)]
Answer 1 (score: 0)
Here is an approach using pandas:
import pandas as pd

# Create example data frame
df = pd.DataFrame([('../dir_a/1.png', 5.14),
                   ('../dir_a/2.png', 5.15),
                   ('../dir_b/3.png', 4.19),
                   ('../dir_b/4.png', 3.81)], columns=['path', 'score'])
# Split the file path on '/' and append the parts as columns 0, 1, 2 next to the original columns
df = pd.concat([df.path.str.split('/', expand=True), df], axis=1)
# Group the rows by directory name (column 1) and find the max score
df.groupby(1)['score'].max().reset_index()
       1  score
0  dir_a   5.15
1  dir_b   4.19
You can then convert the values back to a list if needed.
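The groupby above returns only the maximum score per directory, not the path it belongs to. A minimal sketch of one way to recover the full rows and convert them back to a list of tuples, assuming every path has the form '../dir/file' so the directory is the second '/'-separated component (the column name `folder` and the variable `best_rows` are made up for this example):

```python
import pandas as pd

df = pd.DataFrame(
    [('../dir_a/1.png', 5.14), ('../dir_a/2.png', 5.15),
     ('../dir_b/3.png', 4.19), ('../dir_b/4.png', 3.81)],
    columns=['path', 'score'],
)

# Assumption: the directory name is the second '/'-separated component
df['folder'] = df['path'].str.split('/').str[1]

# idxmax returns, for each folder, the row label of its highest score
best = df.loc[df.groupby('folder')['score'].idxmax(), ['path', 'score']]

# Convert the selected rows back to plain (path, score) tuples
best_rows = list(best.itertuples(index=False, name=None))
print(best_rows)
```

If two files in a folder tie for the highest score, `idxmax` keeps the first one it encounters.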
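Answer 2 (score: 0)
If you only need the maximum values, you can use the directories as dictionary keys and keep only the highest score seen for each one.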
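from pathlib import PurePath

# lst is the list of (path, score) tuples from the question
max_dict = {}
for path, val in lst:
    parent = PurePath(path).parent  # Directory containing the file
    # Keep the larger of the stored maximum and the current score
    max_dict[parent] = max(max_dict.get(parent, float('-inf')), val)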
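The loop above keeps only the scores, so the matching file names are lost. If the (path, score) pairs from the question are wanted, a possible variation is to store the whole tuple and compare on its score. This is only a sketch; the names `best` and `result` are chosen for the example:

```python
from pathlib import PurePath

lst = [('../dir_a/1.png', 5.14), ('../dir_a/2.png', 5.15),
       ('../dir_b/3.png', 4.19), ('../dir_b/4.png', 3.81)]

best = {}
for item in lst:
    parent = PurePath(item[0]).parent  # Directory containing the file
    # Compare the stored tuple (or the item itself, if the directory is new)
    # with the current item, using the score as the comparison key
    best[parent] = max(best.get(parent, item), item, key=lambda t: t[1])

# Dictionaries preserve insertion order (Python 3.7+), so directories
# come out in the order they were first seen
result = list(best.values())
print(result)
```

On ties, `max` returns the first of the equal candidates, so the earliest file in the list wins.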
Answer 3 (score: 0)
Here is a one-liner:
a = [('../dir_a/1.png', 5.14), ('../dir_a/2.png', 5.15), ('../dir_b/3.png', 4.19), ('../dir_b/4.png', 3.81)]
# For each directory name (second path component), take the item with the highest score; sorted() makes the order deterministic
f = [max((v for v in a if v[0].split('/')[1] == k), key=lambda v: v[1]) for k in sorted({v[0].split('/')[1] for v in a})]
Output:
[('../dir_a/2.png', 5.15), ('../dir_b/3.png', 4.19)]
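The one-liner above rescans the whole list for every folder. For longer lists, one alternative is to sort by folder once and let `itertools.groupby` hand over each folder's items in turn. This is a sketch under the same assumption that the folder is the second '/'-separated path component; the helper `folder` is a name invented for the example:

```python
from itertools import groupby

a = [('../dir_a/1.png', 5.14), ('../dir_a/2.png', 5.15),
     ('../dir_b/3.png', 4.19), ('../dir_b/4.png', 3.81)]

def folder(item):
    """Return the folder name of a (path, score) item (hypothetical helper)."""
    return item[0].split('/')[1]

# groupby only merges adjacent items, so sort by folder first
f = [max(group, key=lambda item: item[1])
     for _, group in groupby(sorted(a, key=folder), key=folder)]
print(f)
```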