Question

This is the code I wrote to read a file of movie titles and their ratings. I need to read the file and sort its entries by rating. I am using Python.

This is what the file looks like:

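Harry Potter and the Prisoner of Azkaban, 7.8 Lord of the Rings: The Two Towers, 8.7 Spider Man, 7.3 Alice in Wonderland, 6.5 The Good Dinosaur, 6.7 Kung Fu Panda, 7.6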

filename =("movie_ratings.txt")
def ratings_sort(array):
with open (filename) as f:
    for pair in f:
        title.append(pair.strip())
    for index in f:
        value = array[index]
        i = index-1
    while i>=0:
        if value < array[i]:
            array[i+1]=array[i]
            array[i]=value
            i = i-1
        else:
            break

title  = list ()
rating = list('.')
filename =("movie_ratings.txt")
with open (filename) as f:
for pair in f:
    title.append(pair.strip())

title.sort()

ratings_sort = sorted(title, key=lambda rating:rating[2])    




print ("Old List :\n",title)
print('\n')
print("New List :\n" ,ratings_sort)

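These are my results:

Old List : ['Alice in Wonderland, 6.5', 'Harry Potter and the Prisoner of Azkaban, 7.8', 'Kung Fu Panda, 7.6', 'Lord of the Rings: The Two Towers, 8.7', 'Spider Man, 7.3', 'The Good Dinosaur, 6.7']

New List : ['The Good Dinosaur, 6.7', 'Alice in Wonderland, 6.5', 'Spider Man, 7.3', 'Kung Fu Panda, 7.6', 'Harry Potter and the Prisoner of Azkaban, 7.8', 'Lord of the Rings: The Two Towers, 8.7']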

Answer 1

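The problem is that the "for x in file" loop reads lines from the file, so the title array contains the lines of the file as strings. The key argument of sorted therefore receives these strings and returns the third character of each one (rating[2]). Notice that the "New List" is indeed sorted by third character: e, i, i, n, r, r. To fix this, you can parse the lines of the file into tuples of the form (title, rating) and store those in the array. Sorting by rating is then as simple as returning the rating from the tuple in the key argument to sorted.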
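As a rough illustration of the paragraph above, the sketch below first reproduces the third-character ordering and then sorts parsed (title, rating) tuples instead. The sample lines are taken from the question; `parse_line` is a hypothetical helper name, and splitting at the last comma is an assumption about the file format (one "title, rating" entry per line).

```python
# Sample lines, assuming one "title, rating" entry per line.
lines = [
    "Harry Potter and the Prisoner of Azkaban, 7.8",
    "Lord of the Rings: The Two Towers, 8.7",
    "Spider Man, 7.3",
    "Alice in Wonderland, 6.5",
    "The Good Dinosaur, 6.7",
    "Kung Fu Panda, 7.6",
]

# Indexing a string with [2] gives its third character, not the rating.
third_chars = [line[2] for line in sorted(lines, key=lambda line: line[2])]


def parse_line(line):
    """Hypothetical helper: split 'title, rating' at the last comma."""
    title, _, rating = line.rpartition(",")
    return title.strip(), float(rating)


pairs = [parse_line(line) for line in lines]

# Sort on the second tuple element, which is the numeric rating.
by_rating = sorted(pairs, key=lambda pair: pair[1])

print(third_chars)
print(by_rating)
```

Converting the rating with float() matters here: left as strings, the ratings would be compared character by character rather than numerically.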

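However, it seems to me that you want to implement the sorting yourself rather than use the built-in sorted. It looks like you were going for an implementation of insertion sort, and the indentation got messed up when you posted it here. The function has the same problem with parsing the lines of the file, and in the second loop you need to iterate over the numeric indices of array rather than over the lines of f. The logic can also be improved slightly by moving the if condition right into the while condition, and by assigning the value being compared only to its final position instead of swapping at every step.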

from collections import namedtuple

def ratings_sort(movies):
    for index in range(1, len(movies)):
        movie = movies[index]
        i = index-1
        while i>=0 and movie.rating < movies[i].rating:
            movies[i+1] = movies[i]
            i -= 1
        movies[i+1] = movie

filename = "movie_ratings.txt"
Movie = namedtuple("Movie", "title rating")
movies = list()
with open(filename) as f:
    for line in f:
        part = line.partition(",") # gives a tuple: ("movie title", ",", "rating")
        movies.append(Movie(title=part[0].strip(), rating=float(part[2])))

print("Old List:\n", movies, "\n")

# Sort using sorted
sorted_movies = sorted(movies, key=lambda movie:movie.rating)

# Sort using ratings_sort (modifies movies array unlike sorted)
ratings_sort(movies)

print("New List (using sorted):\n", sorted_movies, "\n")
print("New List (using ratings_sort):\n", movies, "\n")

Note that I renamed some variables and used a namedtuple for clarity. I also moved the file reading out of ratings_sort so that I could compare its result with that of sorted.

Answer 2

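Let's work through your problem step by step.

Your problem has two parts:

First, getting the data out of the file in the right format
Second, sorting it by rating

For the first part, I tried two approaches.

The first approach uses a hand-written generator.

First, let's open the file: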

with open('dsda') as f:
    data=[line.strip().split() for line in f if line!='\n'][0]

I needed an isdigit that works for floats, but str.isdigit only recognizes integers, so I came up with something like this:

def isfloat(point):
    try:
        float(point)
        return True
    except ValueError:
        return False
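
To make the difference concrete, here is a small sketch that compares str.isdigit() with the float()-based check on a few sample words. The helper is repeated so the snippet runs on its own, and the sample words are chosen purely for illustration.

```python
def isfloat(point):
    """Same idea as the helper above: True if float() accepts the text."""
    try:
        float(point)
        return True
    except ValueError:
        return False


# Sample words chosen for illustration only.
samples = ["7", "7.8", "Panda", ",", "Infinity"]

# Map each word to (result of isdigit, result of isfloat).
checks = {word: (word.isdigit(), isfloat(word)) for word in samples}

for word, (digit_only, floatable) in checks.items():
    print(f"{word!r}: isdigit={digit_only}, isfloat={floatable}")
```

One caveat with this check: float() also accepts words such as "Infinity" and "nan", as well as plain integers, so a title containing one of those as a separate word would be mistaken for a rating.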

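Now let's use the generator approach to get the data into the right form:

def generator_approach(data_):
    storage = []
    for word in data_:
        storage.append(word)
        # A word that parses as a float is the rating, which ends the current entry.
        if isfloat(word):
            yield storage
            storage = []


closure_ = generator_approach(data)
print(list(closure_))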

Output:

[['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7'], ['Spider', 'Man', ',', '7.3'], ['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Kung', 'Fu', 'Panda', ',', '7.6']]

Now let's try the second approach, which uses a regular expression:

import re
pattern=r'\w.+?[0-9.]+'

data_r = []
with open('dsda') as f:
    for line in f:
        # Collect matches from every line instead of overwriting the previous result.
        data_r.extend(line1.split() for line1 in re.findall(pattern, line))

Output:

[['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7'], ['Spider', 'Man', ',', '7.3'], ['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Kung', 'Fu', 'Panda', ',', '7.6']]

As you can see, both approaches give the same output, so sorting the entries by rating is now straightforward:

print(sorted(data_r,key=lambda x:float(x[-1])))

Output:

[['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Spider', 'Man', ',', '7.3'], ['Kung', 'Fu', 'Panda', ',', '7.6'], ['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7']]
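
If readable titles are wanted rather than token lists, the tokens can be joined back together. The following is only a sketch: `to_record` is a hypothetical helper, and it assumes every token list ends with a standalone ',' token followed by the rating, as in the output shown above.

```python
# Two token lists in the same shape as the output above.
rows = [
    ['Spider', 'Man', ',', '7.3'],
    ['Alice', 'in', 'Wonderland', ',', '6.5'],
]


def to_record(tokens):
    """Hypothetical helper: tokens[-1] is the rating, tokens[-2] the comma."""
    title = " ".join(tokens[:-2])
    return title, float(tokens[-1])


# Build (title, rating) tuples and sort them by the numeric rating.
records = sorted((to_record(row) for row in rows), key=lambda rec: rec[1])
print(records)
```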

Sorting a file using movie ratings
