I'm new to Python and have a CSV file of names and scores, like this:
public class ComboBoxItemWrapper<T>
{
    public T Value { get; set; }
    public string Text { get; set; }
}
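I need to know how to read this file. Entries that share a name must be shown together, for example Andrew,10 and Andrew,11 as Andrew,10,11. I also need to be able to sort by name, by highest score, or by average score. If possible, I would also like to use only the last 3 entries for each name. This is the code I tried to use to read the file and sort by name: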
#example strings:
nextline1 = "DD:MM:YYYY INFO - 'WeeklyMedal: Hole = 1; Par = 4; Index = 2; Distance = 459; Score = { Player1 = 4 };"
nextline2 = "DD:MM:YYYY INFO - 'WeeklyMedal: Hole = 1; Par = 4; Index = 2; Distance = 459; Score = { Player1 = 4; Player2 = 6; Player3 = 4 };"
import re
lineRegexp = re.compile(r'.+\'WeeklyMedal:(.+)\'?') #this regexp returns WeeklyMedal record.
weeklyMedalRegexp = re.compile(r'(\w+) = (\{.+\}|\w+)') #this regexp parses WeeklyMedal
#helper recursive function to process WeeklyMedal record. returns dictionary
parseWeeklyMedal = lambda r, info: { k: (int(v) if v.isdigit() else parseWeeklyMedal(r, v)) for (k, v) in r.findall(info)}
parsedLines = []
for line in [nextline1, nextline2]:
    info = lineRegexp.search(line)
    if info:
        #process WeeklyMedal record
        parsedLines.append(parseWeeklyMedal(weeklyMedalRegexp, info.group(0)))
        #or do something with parsed dictionary in place
# do something here with entire result, print for example
print(parsedLines)
Answer 0 (score: 0)
pandas is very nice for this:
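import pandas as pd
df = pd.read_csv("<pathToFileIN>", index_col=None, header=None)
df.columns = ["name", "x"]
# Join the last three scores of each name into one comma-separated string.
# Working on the grouped result directly keeps each name aligned with its own scores.
out = df.groupby("name")["x"].apply(lambda s: ",".join(str(v) for v in s.values[-3:])).reset_index()
# DataFrame.sort was removed from pandas; sort_values is its replacement.
out.sort_values("name", inplace=True)
out.to_csv("<pathToFileOUT>", index=False, sep=";")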
Answer 1 (score: 0)
To merge the scores, use collections.defaultdict:
import collections

# Reader is assumed to be the csv.reader opened on the input file.
scores_by_name = collections.defaultdict(list)
for row in Reader:
    name = row[0]
    score = int(row[1])
    scores_by_name[name].append(score)
To keep only the last three scores, slice off the last 3 items:
scores_by_name = {name: scores[-3:] for name, scores in scores_by_name.items()}
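As an alternative to slicing afterwards, the "last three" rule can be enforced while reading by storing each name's scores in a `collections.deque` with `maxlen=3`. This is only a sketch and not part of the original answer; the `rows` list is hypothetical data standing in for what the CSV reader would yield.

```python
from collections import defaultdict, deque

# Hypothetical rows standing in for what csv.reader would yield.
rows = [["Andrew", "10"], ["Andrew", "11"], ["Beth", "7"],
        ["Andrew", "3"], ["Andrew", "8"]]

# Each name maps to a deque that discards its oldest score once it holds three.
last_three = defaultdict(lambda: deque(maxlen=3))
for name, score in rows:
    last_three[name].append(int(score))

# Convert back to plain lists for sorting or printing.
recent_scores = {name: list(scores) for name, scores in last_three.items()}
print(recent_scores)
```

The trade-off is that older scores are never kept in memory, which only matters for large files.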
To iterate in alphabetical order:
for name, scores in sorted(scores_by_name.items()):
    ... # whatever
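To iterate by highest score:
# This sorts in ascending order; pass reverse=True to list the highest score first.
for name, scores in sorted(scores_by_name.items(), key=(lambda item: max(item[1]))):
    ...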
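The question also asks for sorting by average score, which can be handled the same way with a different sort key. Below is a minimal end-to-end sketch that puts the pieces together. The `SAMPLE` data and the `load_scores` helper are assumptions made for illustration; with a real file you would pass the opened file object instead of `io.StringIO(SAMPLE)`.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data standing in for the CSV file from the question.
SAMPLE = "Andrew,10\nBeth,7\nAndrew,11\nBeth,9\nAndrew,3\nAndrew,8\n"

def load_scores(lines, keep=3):
    """Group scores by name, keeping only the last `keep` entries per name."""
    scores_by_name = defaultdict(list)
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        scores_by_name[row[0]].append(int(row[1]))
    return {name: scores[-keep:] for name, scores in scores_by_name.items()}

scores = load_scores(io.StringIO(SAMPLE))

# Three orderings: by name, by highest score, by average score (best first).
by_name = sorted(scores.items())
by_highest = sorted(scores.items(), key=lambda item: max(item[1]), reverse=True)
by_average = sorted(scores.items(), key=lambda item: mean(item[1]), reverse=True)

# Print each name followed by its scores, e.g. "Andrew,11,3,8".
for name, vals in by_name:
    print(",".join([name] + [str(v) for v in vals]))
```

Here `reverse=True` puts the best result first; drop it if ascending order is wanted.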