Question

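For example, column C has 1000 cells, most of them filled with "1", but with a few "2"s scattered among them. I am trying to find out how many "2"s there are and print that number.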

import openpyxl

wb = openpyxl.load_workbook('TestBook')
ws = wb.get_sheet_by_name('Sheet1')

for cell in ws['C']:
    print(cell.value)

How do I iterate over the column and pull out the number of 2s?

Answer 1

As @K.Marker pointed out, you can query the count of a particular value in a row or column with:

[c.value for c in ws['C']].count(2)

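But what if you don't know the values in advance, and/or want to see the distribution of values in a particular row? You can use a Counter, which behaves like a dict.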

In [446]: from collections import Counter

In [449]: counter = Counter([c.value for c in ws[3]])

In [451]: counter
Out[451]: Counter({1: 17, 2: 5})

In [452]: for k, v in counter.items():
     ...:     print('{0} occurs {1} time(s)'.format(k, v))
     ...:
1 occurs 17 time(s)
2 occurs 5 time(s)
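
Building on the Counter idea, here is a minimal sketch of how the rare values could be picked out automatically. The helper name `rare_values` and the 5% cutoff `max_share` are assumptions made for illustration, not part of the original answer, and the data is made up to match the question. With a real sheet you would pass something like `[c.value for c in ws['C']]` instead.

```python
from collections import Counter

def rare_values(values, max_share=0.05):
    """Return {value: count} for values making up at most max_share of the data."""
    counter = Counter(values)
    total = sum(counter.values())
    # Keep only the values whose share of all entries is at or below the cutoff.
    return {v: n for v, n in counter.items() if n / total <= max_share}

# Made-up data shaped like the question: 995 ones and 5 twos.
data = [1] * 995 + [2] * 5
print(rare_values(data))  # {2: 5}
```

The cutoff is a judgment call: a value that makes up 20% of a row would not be reported with the default used here.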

Answer 2

[c.value for c in ws['C']].count(2)

The list comprehension builds a list of the cell values across the whole of column C and counts the 2s in it.
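
If building the intermediate list is a concern, the same count can probably be done with a generator expression. This is only a sketch: `count_value` is a hypothetical helper, and `FakeCell` is a stand-in for openpyxl cells so that the snippet runs without a workbook. With a real sheet you would call `count_value(ws['C'], 2)`.

```python
from collections import namedtuple

# Hypothetical stand-in for an openpyxl cell: only the .value attribute matters here.
FakeCell = namedtuple("FakeCell", "value")

def count_value(cells, target):
    """Count cells whose .value equals target, without building a list first."""
    return sum(1 for cell in cells if cell.value == target)

# Made-up column: mostly 1s with a few 2s mixed in.
column = [FakeCell(v) for v in [1, 1, 2, 1, 2, 1, 1, 2, 1, 1]]
print(count_value(column, 2))  # 3
```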

Answer 3

Is the number you are looking for always 2?

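count = 0
# load one row as a tuple of cells; wantedRowNumber is a 0-based index here
row = list(worksheet.rows)[wantedRowNumber]

# iterate over the cells and increase the count;
# compare the cell's value, not the cell object itself
for cell in row:
    if cell.value == 2:
        count += 1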

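Now, this only works for the value "2" and will not find other outliers. To find outliers in general, you first have to decide on a threshold. In this example I will use the mean, although you will need to work out the best test for getting an outlier threshold from your own data. Don't worry, statistics is fun!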

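import numpy as np

count = 0
# load one row as a tuple of cells; wantedRowNumber is a 0-based index here
row = list(worksheet.rows)[wantedRowNumber]
# pull out the cell values (assumes every cell in the row holds a number)
values = [cell.value for cell in row]

# calculate the average
# using numpy
NPavg = np.mean(values)

# without numpy
# in Python 2, cast to float, otherwise integer division truncates the result
avg = sum(values) / float(len(values))

# iterate over the values and increase the count
for v in values:
    # of course use your own threshold,
    # determined appropriately, instead of the average
    if v > NPavg:
        count += 1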
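
As one possible alternative to the plain average, a common rule of thumb is to flag values more than a few standard deviations above the mean. The sketch below uses only the standard library; the helper name `count_above_threshold` and the default `k=3.0` are assumptions chosen for illustration, and the right cutoff depends on your data. It also skips `None`, which is what openpyxl returns for empty cells.

```python
from statistics import mean, pstdev

def count_above_threshold(values, k=3.0):
    """Count numeric values greater than mean + k * standard deviation."""
    # Empty cells come back as None in openpyxl, so keep only real numbers.
    numbers = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if not numbers:
        return 0
    threshold = mean(numbers) + k * pstdev(numbers)
    return sum(1 for v in numbers if v > threshold)

# Made-up data shaped like the question, plus one empty cell.
data = [1] * 995 + [2] * 5 + [None]
print(count_above_threshold(data))  # 5
```

With a real sheet, `values` could be built as `[cell.value for cell in row]`.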

Finding outliers in an Excel row

3 answers: