I have a program that runs over a CSV file and produces output like this:
724, 2
724, 1
725, 3
725, 3
726, 1
726, 0
I would like to modify the script with some simple math so that it renders this output instead:
724, 1.5
725, 3
726, 0.5
The script I am currently using is here:
lines=open("1.txt",'r').read().splitlines()
for l in lines:
    data = l.split('"Overall evaluation:')
    if len(data) == 2:
        print(data[0] + ", " + data[1])
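How can I add a simple averaging and slicing operation to this pipeline?
I think I need to create some temporary variable, but should it live outside the loop that iterates over the lines?
Maybe something like this: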
lines=open("EasyChairData.csv",'r').read().splitlines()
for l in lines:
    data = l.split('"Overall evaluation:')
    submission_number_repo = data[0]
    if len(data) == 2:
        print(data[0] + ", " + data[1])
    if submission_number_repo != data[0]:
        submission_number_repo = data[0]
Edit:
The function is just a simple average.
Answer 0 (score: 1)
You can use a dictionary that maps each key to a running total and a count, and then print it out:
totals = {}  # key -> (running total, count); avoids shadowing the built-in map
lines=open("1.txt",'r').read().splitlines()
for l in lines:
    data = l.split('"Overall evaluation:')
    if len(data) == 2:
        if data[0] not in totals:
            totals[data[0]] = (0,0)
        totals[data[0]] = (totals[data[0]][0]+int(data[1]) , totals[data[0]][1]+1)
for x, y in totals.items():
    # float() keeps the division from truncating under Python 2
    print(str(x) + ", " + str(float(y[0])/y[1]))
Answer 1 (score: 1)
I would just store a list of values per key, then take the average after reading the file.
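lines=open("1.txt",'r').read().splitlines()
results = {}
for l in lines:
    data = l.split('"Overall evaluation:')
    if len(data) == 2:
        # store floats so that sum() works and the division does not truncate
        if data[0] in results:
            results[data[0]].append(float(data[1]))
        else:
            results[data[0]] = [float(data[1])]
# items() works on both Python 2 and Python 3 (iteritems() is Python 2 only)
for k,v in results.items():
    print("{} , {}".format(k, sum(v)/len(v) ))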
Answer 2 (score: 1)
(edited to avoid storing the values)
I love defaultdict:
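The code that accompanied this answer is not available here. As a rough sketch only (not the answerer's original code), a defaultdict-based version that keeps a running total and count per key, rather than storing every value, might look like the following. The function name average_by_key, the SEPARATOR constant, and the sample lines are hypothetical; the real layout of the input file is not shown in the question.

```python
from collections import defaultdict

SEPARATOR = '"Overall evaluation:'  # same marker the question's script splits on


def average_by_key(lines):
    """Return {key: mean score}, keeping only a running [total, count] per key."""
    # Missing keys start out as total 0.0 and count 0, so no membership test is needed.
    totals = defaultdict(lambda: [0.0, 0])
    for line in lines:
        data = line.split(SEPARATOR)
        if len(data) == 2:
            entry = totals[data[0]]
            entry[0] += float(data[1])  # float() tolerates surrounding whitespace
            entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical input lines, invented to match the output shown in the question.
sample = [
    '724"Overall evaluation: 2',
    '724"Overall evaluation: 1',
    'a line without the marker',
    '725"Overall evaluation: 3',
    '725"Overall evaluation: 3',
    '726"Overall evaluation: 1',
    '726"Overall evaluation: 0',
]

for key, mean in average_by_key(sample).items():
    print("{}, {}".format(key, mean))
```

To run it against a real file, pass the open file object instead of the sample list, for example average_by_key(open("1.txt")).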
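Answer 3 (score: 1)
A simple way is to keep a state that stores the current number, the current sum, and the item count, and to print it only when the current number changes (do not forget to print the last state!). This assumes that lines sharing a number are adjacent in the file. The code could be: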
lines=open("1.txt",'r') # .read().splitlines() is unnecessary and only forces a full load into memory
state = [None]  # becomes [current number, current sum, item count]
for l in lines:
    data = l.split('"Overall evaluation:')
    if len(data) == 2:
        if data[0] != state[0]:
            # the number changed: print the finished group, then start a new one
            if state[0] is not None:
                average = state[1]/state[2]
                print(state[0] + ", " + str(average))
            state = [data[0], 0., 0]
        state[1] += float(data[1])
        state[2] += 1
# print the last group; use state[0], since data may hold a non-matching final line
if state[0] is not None:
    average = state[1]/state[2]
    print(state[0] + ", " + str(average))
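The same "print when the number changes" idea can also be sketched with itertools.groupby from the standard library, which groups consecutive items that share a key. This is an alternative formulation, not part of the answer above, and it relies on the same assumption that lines with the same number are adjacent. The name streaming_averages and the sample lines are hypothetical.

```python
from itertools import groupby

SEPARATOR = '"Overall evaluation:'  # same marker the question's script splits on


def streaming_averages(lines):
    """Yield (key, mean) for each run of consecutive lines that share a key."""
    # Keep only the lines that split into exactly [key, score].
    split_lines = (line.split(SEPARATOR) for line in lines)
    pairs = (parts for parts in split_lines if len(parts) == 2)
    for key, group in groupby(pairs, key=lambda parts: parts[0]):
        # Only the scores of the current run are held in memory.
        scores = [float(parts[1]) for parts in group]
        yield key, sum(scores) / len(scores)


# Hypothetical input lines, invented to match the output shown in the question.
sample = [
    '724"Overall evaluation: 2',
    '724"Overall evaluation: 1',
    '725"Overall evaluation: 3',
    '725"Overall evaluation: 3',
    '726"Overall evaluation: 1',
    '726"Overall evaluation: 0',
]

for key, mean in streaming_averages(sample):
    print("{}, {}".format(key, mean))
```

If the same number can reappear later in the file, each run is averaged separately, so a dictionary-based approach is the safer choice for unsorted input.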