I have an array like this:
array = [['page', 'pageviews'],
['page1', '65'],
['page2', '44'],
['page1', '40']]
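How can I make a script iterate over the rows, create a single row for 'page1', and add its two values '65' and '40' together?
Answer 0 (score: 3)
With pandas (you said in the comments that you can use it), this becomes very simple: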
import pandas as pd
df = pd.DataFrame(array[1:], columns=array[0])
df['pageviews'] = pd.to_numeric(df.pageviews)
summed = df.groupby('page').pageviews.sum()
This produces the following pandas Series:
page
page1 105
page2 44
You can easily index it by page name:
summed['page1']
# 105
Answer 1 (score: 1)
At its core, this is a grouping problem. Use a defaultdict:
from collections import defaultdict
sums = defaultdict(int)
for page, views in array[1:]:
    sums[page] += int(views)
# result: defaultdict(<class 'int'>, {'page1': 105, 'page2': 44})
If you want the result in the same format as the input (a list of lists), convert the dict to a list with a list comprehension.
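The conversion back to a list of lists can be sketched roughly as follows. This is a minimal sketch, assuming `array` is the input from the question; the name `result` is introduced here for illustration and is not part of the original answer.

```python
from collections import defaultdict

# Input from the question: header row followed by [page, views] rows
array = [['page', 'pageviews'],
         ['page1', '65'],
         ['page2', '44'],
         ['page1', '40']]

# Sum the views per page, skipping the header row
sums = defaultdict(int)
for page, views in array[1:]:
    sums[page] += int(views)

# Header row first, then one [page, total] row per key.
# `result` is a hypothetical name used only in this sketch.
result = [array[0]] + [[page, total] for page, total in sums.items()]
print(result)
# [['page', 'pageviews'], ['page1', 105], ['page2', 44]]
```

Since dicts preserve insertion order in Python 3.7+, the pages appear in the order they were first seen in the input.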
Answer 2 (score: 1)
Here is a solution using pandas:
import pandas as pd
# read list of lists into pandas dataframe
df = pd.DataFrame(array[1:], columns=array[0])
# convert views from string to integer
df['pageviews'] = df['pageviews'].astype(int)
# group by page, sum pageviews, create list from results
lst = df.groupby('page')['pageviews'].sum()\
.reset_index().values.tolist()
# add headers
res = [array[0]] + lst
print(res)
[['page', 'pageviews'],
['page1', 105],
['page2', 44]]
Answer 3 (score: 0)
You need to sort it first; after that you can use itertools.groupby:
from itertools import groupby
array = [
['page', 'pageviews'],
['page1', '65'],
['page2', '44'],
['page1', '40']
]
# sort it on the first element of each item
array = sorted(array, key = lambda x: x[0])
# keys of interest
keys = ['page1', 'page2']
for k, v in groupby(array, key = lambda x: x[0]):
    if k in keys:
        s = sum([int(x[1]) for x in v])
        print("Key: {}, Sum: {}".format(k, s))
This produces
Key: page1, Sum: 105
Key: page2, Sum: 44
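If the page names are not known in advance, the same sort-then-group idea can be applied without a hard-coded key list. The following is only a sketch under that assumption: it skips the header row with `array[1:]` instead of filtering by `keys`, and the names `rows` and `totals` are introduced here for illustration.

```python
from itertools import groupby
from operator import itemgetter

array = [
    ['page', 'pageviews'],
    ['page1', '65'],
    ['page2', '44'],
    ['page1', '40']
]

# Skip the header row, then sort by page name so equal keys are adjacent
# (groupby only groups consecutive items with the same key)
rows = sorted(array[1:], key=itemgetter(0))

# Build a {page: total_views} dict from the grouped rows
totals = {page: sum(int(views) for _, views in group)
          for page, group in groupby(rows, key=itemgetter(0))}
print(totals)
# {'page1': 105, 'page2': 44}
```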