How to add a row's data to another row when both rows have the same first value

Time: 2018-06-02 13:27:25

Tags: python python-3.x list pandas

I have an array like this:

array = [['page', 'pageviews'],
         ['page1', '65'],
         ['page2', '44'],
         ['page1', '40']]

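How can I make the script iterate over the rows and create a single row for 'page1', with the two values '65' and '40' added together?

4 answers:

Answer 0 (score: 3)

With pandas (you said in the comments that you can use it), this becomes very simple: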

import pandas as pd

df = pd.DataFrame(array[1:], columns=array[0])
df['pageviews'] = pd.to_numeric(df.pageviews)
summed = df.groupby('page').pageviews.sum()

This produces the following pandas Series:

page
page1    105
page2     44

You can easily index it by page name:

summed['page1']
# 105

Answer 1 (score: 1)

At its core, this is a grouping problem. Grouping is easy with a defaultdict:

from collections import defaultdict

sums = defaultdict(int)
for page, views in array[1:]:
    sums[page] += int(views)

# result: defaultdict(<class 'int'>, {'page1': 105, 'page2': 44})

If you want the result in the same format as the input (a list of lists), convert the dict to a list with a list comprehension.
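
The conversion back to a list of lists might look like the sketch below. The variable name `result` and the choice to put the header row back in front are assumptions for illustration, not part of the original answer. It relies on dicts preserving insertion order (Python 3.7+), so pages appear in the order they were first seen.

```python
from collections import defaultdict

array = [['page', 'pageviews'],
         ['page1', '65'],
         ['page2', '44'],
         ['page1', '40']]

# Sum the pageviews per page, skipping the header row
sums = defaultdict(int)
for page, views in array[1:]:
    sums[page] += int(views)

# Rebuild a list of lists: the header row, then one [page, total] row per page
# ('result' is a hypothetical name chosen for this sketch)
result = [array[0]] + [[page, total] for page, total in sums.items()]

print(result)
```

Note that the totals are integers here, whereas the input values were strings; wrap `total` in `str()` if the original string format is needed.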

Answer 2 (score: 1)

Here is a solution using pandas:

import pandas as pd

# read list of lists into pandas dataframe
df = pd.DataFrame(array[1:], columns=array[0])

# convert views from string to integer
df['pageviews'] = df['pageviews'].astype(int)

# group by page, sum pageviews, create list from results
lst = df.groupby('page')['pageviews'].sum()\
        .reset_index().values.tolist()

# add headers
res = [array[0]] + lst

print(res)

[['page', 'pageviews'],
 ['page1', 105],
 ['page2', 44]]

Answer 3 (score: 0)

You need to sort it first; after that you can use itertools.groupby, which only groups consecutive items with the same key.

from itertools import groupby

array = [ 
    ['page', 'pageviews'],
    ['page1', '65'],
    ['page2', '44'],
    ['page1', '40']
]

# sort it on the first element of each item
array = sorted(array, key=lambda x: x[0])

# keys of interest (filtering on these also skips the header row 'page')
keys = ['page1', 'page2']

for k, v in groupby(array, key=lambda x: x[0]):
    if k in keys:
        s = sum(int(x[1]) for x in v)
        print("Key: {}, Sum: {}".format(k, s))

This produces

Key: page1, Sum: 105
Key: page2, Sum: 44
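
As a variation on the approach above, here is a minimal sketch that drops the hard-coded list of keys by separating the header row before sorting, and that also shows why the sort matters. The names `header`, `rows`, `unsorted_keys`, and `result` are assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter

array = [
    ['page', 'pageviews'],
    ['page1', '65'],
    ['page2', '44'],
    ['page1', '40'],
]

# Split off the header so it is never treated as data
header, rows = array[0], array[1:]

# Without sorting, groupby starts a new group each time the key changes,
# so 'page1' shows up twice
unsorted_keys = [key for key, _ in groupby(rows, key=itemgetter(0))]

# After sorting by page name, equal keys are adjacent and form one group each
rows_sorted = sorted(rows, key=itemgetter(0))
totals = [
    [page, sum(int(views) for _, views in group)]
    for page, group in groupby(rows_sorted, key=itemgetter(0))
]

result = [header] + totals
print(unsorted_keys)
print(result)
```

Sorting costs O(n log n), so for large inputs a dict-based accumulation may be preferable; the groupby version is mainly convenient when the data is already sorted.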