I am writing a program that needs to read a csv/text file of soccer scores like the following:
Lions 3, Snakes 3
Tarantulas 1, FC Awesome 0
Lions 1, FC Awesome 1
Tarantulas 3, Snakes 1
Lions 4, Grouches 0
If the teams draw, each team gets 1 point; if one team wins, it gets 3 points.
Ideally, the output should look like this:
1. Tarantulas, 6 pts
2. Lions, 5 pts
3. FC Awesome, 1 pt
3. Snakes, 1 pt
4. Grouches, 0 pts
Here is my code so far:
import pandas as pd
data = pd.read_csv("sample_input.csv", header=None, names=['left_team', 'right_team'])
data_dict = data.to_dict(orient='list')
def splitter(row):
    left_team, right_team = row.split(',')
    return {
        'left_team': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right_team': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }
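My question is: how do I get at the data in the DataFrame so I can compare the values? I also tried to code a solution without pandas, but I am struggling with that. Any help would be greatly appreciated. Thanks!
Here is the other solution I tried: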
from collections import defaultdict
import csv
reader = csv.DictReader(open('sample_input.csv', 'r'))
dict_list = []
for line in reader:
    dict_list.append(line)
data_list = [splitter(row) for row in reader]
def splitter(row):
    left_team, right_team = row.split(',')
    return {
        'left_team': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right_team': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }
data_dicts = [splitter(row) for row in reader]
team_scores = defaultdict(int)
for game in data_dicts:
    if game['left_score'] == game['right_score']:
        team_scores[game['left']] += 1
        team_scores[game['right']] += 1
    elif game['left_score'] > game['right_score']:
        team_scores[game['left']] += 3
    else:
        team_scores[game['right']] += 3
teams_sorted = sorted(team_scores.items(), key=lambda team: team[1], reverse=True)
for line in teams_sorted:
    print(line)
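Answer 0 (score: 0)
Here is a simple solution. The first step is to clean the data, then assign points to each team for every match. Finally, you add up all the points for each team, regardless of whether the team appears on the left or on the right.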
import pandas as pd
import numpy as np
# Create DataFrame from your input
df = pd.read_clipboard(sep=', ', header=None)
df.columns=['l_team', 'r_team']
# Clean the data, separating teams and their score.
# Raw strings keep \s and \d from being treated as string escapes.
df[['l_team', 'l_score']] = df.l_team.str.extract(r'(.*)\s(\d+)')
df[['r_team', 'r_score']] = df.r_team.str.extract(r'(.*)\s(\d+)')
df[['l_score', 'r_score']] = df[['l_score', 'r_score']].astype('int')
Now df looks like this:
l_team r_team l_score r_score
0 Lions Snakes 3 3
1 Tarantulas FC Awesome 1 0
2 Lions FC Awesome 1 1
3 Tarantulas Snakes 3 1
4 Lions Grouches 4 0
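The pattern `(.*)\s(\d+)` works because the greedy `(.*)` takes everything up to the last whitespace that is followed by digits, so multi-word names such as FC Awesome stay in one piece. A minimal sketch of the same idea with the built-in `re` module; the helper name `split_team_score` is hypothetical and only there for illustration:

```python
import re

# Same pattern as in the str.extract calls, written as a raw string.
PATTERN = re.compile(r'(.*)\s(\d+)')

def split_team_score(text):
    """Split 'FC Awesome 0' into ('FC Awesome', 0). Hypothetical helper."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    name, score = match.groups()
    return name, int(score)

print(split_team_score('FC Awesome 0'))
print(split_team_score('Lions 12'))
```

Unlike slicing the last two characters, this also copes with scores of any number of digits.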
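Determine how many points the left and right team earned in each match, then sum them per team. We use Series.add so that the two sums are aligned on the index, which after the groupby is just the team name.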
df['l_pts'] = np.select([df.l_score > df.r_score, df.l_score == df.r_score], [3, 1], 0)
df['r_pts'] = np.select([df.r_score > df.l_score, df.r_score == df.l_score], [3, 1], 0)
scores = (df.groupby('l_team').l_pts.sum()
            .add(df.groupby('r_team').r_pts.sum(), fill_value=0)
            .astype('int')
            .sort_values(ascending=False))
Output of scores:
Tarantulas 6
Lions 5
Snakes 1
FC Awesome 1
Grouches 0
dtype: int32
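The `fill_value=0` argument matters because a team may appear on only one side: in the sample, Tarantulas never plays on the right and Snakes never plays on the left. A small sketch with hand-built Series (the per-side totals are copied from the sample data) compares it with a plain `+`:

```python
import pandas as pd

# Per-side point totals for the sample data, typed in by hand.
left = pd.Series({'Lions': 5, 'Tarantulas': 6})
right = pd.Series({'FC Awesome': 1, 'Grouches': 0, 'Snakes': 1})

# Plain addition aligns on the index; every label here is missing
# on one side, so every result is NaN.
plain = left + right

# fill_value=0 treats a missing side as zero points instead.
filled = left.add(right, fill_value=0).astype('int')

print(plain)
print(filled)
```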
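To get close to your output format (tied teams do not share a rank here, and "pt" is not singularized), you can do the following: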
pd.Series(scores.index+', '+scores.values.astype('str')+' pts', index=np.arange(1,len(scores)+1,1))
#1 Tarantulas, 6 pts
#2 Lions, 5 pts
#3 Snakes, 1 pts
#4 FC Awesome, 1 pts
#5 Grouches, 0 pts
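If tied teams should share a rank and a single point should print as "pt", as in the question's desired output, one possible approach is a dense rank. This is only a sketch: `scores` is rebuilt by hand so the block runs on its own, the `place` column is a name chosen for illustration, and the alphabetical order within a tie is an assumption read off the question's example.

```python
import pandas as pd

# Totals from the sample data, rebuilt by hand so the sketch is self-contained.
scores = pd.Series({'Tarantulas': 6, 'Lions': 5, 'Snakes': 1,
                    'FC Awesome': 1, 'Grouches': 0})

table = scores.rename_axis('team').reset_index(name='pts')
# Assumption: tied teams are listed alphabetically.
table = table.sort_values(['pts', 'team'], ascending=[False, True])
# method='dense' gives tied teams the same rank and leaves no gap afterwards.
table['place'] = table['pts'].rank(method='dense', ascending=False).astype(int)

lines = [
    f"{row.place}. {row.team}, {row.pts} {'pt' if row.pts == 1 else 'pts'}"
    for row in table.itertuples(index=False)
]
print('\n'.join(lines))
```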
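Answer 1 (score: 0)
There is no magic here. Just define a function that converts scores into points, apply it, unpivot the left and right columns, group by team, and sum the points. There may be more elegant solutions.
Prepare the data using your function:
import pandas as pd
data = '''Lions 3, Snakes 3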
Tarantulas 1, FC Awesome 0
Lions 1, FC Awesome 1
Tarantulas 3, Snakes 1
Lions 4, Grouches 0'''
def splitter(row):
    left_team, right_team = row.split(',')
    return {
        'left_team': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right_team': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }
data = pd.DataFrame(splitter(row) for row in data.split("\n"))
print(data)
Out:
left_score left_team right_score right_team
0 3 Lions 3 Snakes
1 1 Tarantulas 0 FC Awesome
2 1 Lions 1 FC Awesome
3 3 Tarantulas 1 Snakes
4 4 Lions 0 Grouches
Add points columns for both teams based on the scores:
def points(left_score, right_score):
    win_points = 3
    draw_points = 1
    lose_points = 0
    if left_score < right_score:
        return pd.Series({'left_points': lose_points, 'right_points': win_points})
    elif left_score > right_score:
        return pd.Series({'left_points': win_points, 'right_points': lose_points})
    else:
        return pd.Series({'left_points': draw_points, 'right_points': draw_points})
data = data.merge(
    data[['left_score', 'right_score']].apply(lambda row: points(*row), axis=1),
    left_index=True, right_index=True
)
print(data)
Out:
left_score left_team right_score right_team left_points right_points
0 3 Lions 3 Snakes 1 1
1 1 Tarantulas 0 FC Awesome 3 0
2 1 Lions 1 FC Awesome 1 1
3 3 Tarantulas 1 Snakes 3 0
4 4 Lions 0 Grouches 3 0
Unpivot:
data = pd.concat([
    data[['left_team', 'left_points']]
        .rename({'left_team': 'team', 'left_points': 'points'}, axis=1),
    data[['right_team', 'right_points']]
        .rename({'right_team': 'team', 'right_points': 'points'}, axis=1)
])
print(data)
Out:
team points
0 Lions 1
1 Tarantulas 3
2 Lions 1
3 Tarantulas 3
4 Lions 3
0 Snakes 1
1 FC Awesome 0
2 FC Awesome 1
3 Snakes 0
4 Grouches 0
Group by team to get the final result:
result = data.groupby("team")["points"].sum()
print(result)
Out:
team
FC Awesome 1
Grouches 0
Lions 5
Snakes 1
Tarantulas 6
Name: points, dtype: int64
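For the pandas-free route attempted in the question, the same tally can be sketched with `collections.defaultdict`. The names `parse_side`, `tally`, and `SAMPLE` are hypothetical and not taken from the question. Compared with the second attempt there, this sketch reads plain lines instead of using `csv.DictReader`, defines the parser before calling it, splits on the last space instead of slicing two characters, and also records losing teams so that a team with 0 points still shows up.

```python
from collections import defaultdict

# Sample data from the question, inlined so the sketch runs on its own.
SAMPLE = """Lions 3, Snakes 3
Tarantulas 1, FC Awesome 0
Lions 1, FC Awesome 1
Tarantulas 3, Snakes 1
Lions 4, Grouches 0"""

def parse_side(side):
    # Split on the last space so multi-word names and multi-digit scores work.
    name, score = side.strip().rsplit(' ', 1)
    return name, int(score)

def tally(lines):
    totals = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        left, right = line.split(',')
        l_team, l_score = parse_side(left)
        r_team, r_score = parse_side(right)
        if l_score == r_score:
            totals[l_team] += 1
            totals[r_team] += 1
        elif l_score > r_score:
            totals[l_team] += 3
            totals[r_team] += 0  # register the loser with 0 points
        else:
            totals[r_team] += 3
            totals[l_team] += 0  # register the loser with 0 points
    return dict(totals)

totals = tally(SAMPLE.splitlines())
# Highest points first; ties broken alphabetically (an assumption).
ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
for team, pts in ordered:
    print(team, pts)
```

To read from a file instead, an open file object can be passed directly, for example `with open('sample_input.csv') as f: totals = tally(f)`.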