I am reading data from a text file in the following format:
1 1 5
1 3 3
1 5 4
2 1 5
2 4 3
3 1 2
3 3 4
3 4 3
3 5 4
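The first column is the coachId, the second column is the playerId, and the last column is the score that the coach gave to that player. Say there are 3 coaches and 5 players; the data we are given is incomplete. We basically have to implement a recommender system and generate the missing score for every coach and player. I have already finished that part. Now I want to generate an output file with the missing scores filled in. This is my logic: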
import numpy as np

data = np.loadtxt('player.txt')
coaches = data.T[0]
players = data.T[1]
scores = data.T[2]
a = 0
total = 3 * 5  # total fields to fill is number of players times number of coaches
while a < total:
    b = 0
    while b < 3:  # for each coach
        # check if score was given
        # if score is given, don't do anything
        # if score is not given, get new score and write it to file
        b += 1
    a += 1
If I have many coaches and players, I feel this approach could take a long time. Is there a better way?
Answer 0 (score: 0)
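You split values that belong together into three separate lists, which makes them harder to access. Also, if you want to extend the file, you don't need the score values that already exist, only the information about which coach and player combinations already exist. Those combinations can be stored in a set so that membership can be tested efficiently.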
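To illustrate the point about keeping values together and testing membership with a set, here is a minimal sketch. The records are the first three rows of the sample data in the question; the helper name is_scored is hypothetical and only used for illustration.

```python
# Each record keeps its values together as one (coach_id, player_id, score) tuple.
# These are the first three rows of the sample data in the question.
records = [(1, 1, 5), (1, 3, 3), (1, 5, 4)]

# Only the (coach, player) pairs are needed to decide what is missing.
already_scored = {(coach_id, player_id) for coach_id, player_id, _ in records}


def is_scored(coach_id, player_id):
    """Return True if this coach already scored this player (hypothetical helper)."""
    return (coach_id, player_id) in already_scored
```

A set lookup takes roughly constant time on average, so the check does not get slower as the number of existing records grows.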
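The outer loop seems to run until the counter a reaches the total number of records? The inner loop is executed for every record and for every coach, so with three coaches it runs three times the total number of records. That doesn't make much sense.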
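The difference in work can be counted directly. The sketch below assumes that each counter in the question's loops advances by one per pass, and compares that with a single pass over every coach and player combination; the variable names nested_iterations and pair_iterations are made up for this illustration.

```python
from itertools import product

coach_count, player_count = 3, 5
total = coach_count * player_count

# Loop structure from the question, with each counter advanced by one per pass.
nested_iterations = 0
a = 0
while a < total:
    b = 0
    while b < coach_count:
        nested_iterations += 1
        b += 1
    a += 1

# One pass over every (coach, player) combination, with IDs starting at 1.
pairs = product(range(1, coach_count + 1), range(1, player_count + 1))
pair_iterations = sum(1 for _ in pairs)
```

With 3 coaches and 5 players the nested loops run their body 45 times, while there are only 15 combinations to check.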
Here is an approach in which get_score_somehow() still needs to be filled in:
#!/usr/bin/env python
# coding: utf8
from __future__ import absolute_import, division, print_function
from itertools import product


def get_score_somehow(coach_id, player_id):
    # Placeholder: plug in the recommender's prediction here.
    raise NotImplementedError


def main():
    filename = 'test.txt'
    coach_count = 3
    player_count = 5
    # Remember which (coach, player) combinations already have a score.
    already_scored = set()
    with open(filename) as lines:
        for line in lines:
            coach_id, player_id, _ = map(int, line.split())
            already_scored.add((coach_id, player_id))
    # Open in append mode so the existing records are kept.
    with open(filename, 'a') as score_file:
        # The IDs in the file start at 1, not 0.
        for coach_id, player_id in product(
            range(1, coach_count + 1), range(1, player_count + 1)
        ):
            if (coach_id, player_id) not in already_scored:
                score = get_score_somehow(coach_id, player_id)
                record = [coach_id, player_id, score]
                score_file.write(' '.join(map(str, record)) + '\n')


if __name__ == '__main__':
    main()
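Since the question already loads the file with NumPy, a boolean matrix is another possible way to find the missing combinations. This is only a sketch and not part of the answer above: it assumes the IDs run from 1 to the number of coaches and players without gaps, and the function name missing_pairs is hypothetical. The array below repeats the sample data from the question; np.loadtxt('player.txt') should produce an array of the same shape.

```python
import numpy as np


def missing_pairs(data, coach_count, player_count):
    """Return the 1-based (coach_id, player_id) pairs that have no score yet."""
    scored = np.zeros((coach_count, player_count), dtype=bool)
    # Assumption: IDs are 1..coach_count and 1..player_count, so subtract 1 to index.
    coach_idx = data[:, 0].astype(int) - 1
    player_idx = data[:, 1].astype(int) - 1
    scored[coach_idx, player_idx] = True
    # np.argwhere lists the zero-based indices of the unscored cells, row by row.
    return [(int(c) + 1, int(p) + 1) for c, p in np.argwhere(~scored)]


# Sample data from the question.
data = np.array([
    [1, 1, 5], [1, 3, 3], [1, 5, 4],
    [2, 1, 5], [2, 4, 3],
    [3, 1, 2], [3, 3, 4], [3, 4, 3], [3, 5, 4],
], dtype=float)

missing = missing_pairs(data, coach_count=3, player_count=5)
```

The matrix needs one cell per coach and player combination, so for very large and sparse data the set-based approach above may use less memory.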