Question

I have the following data:

  1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1
  0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1
  0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1
  -1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1
  -1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1

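The data above is in a file. I want to count the occurrences of 1, 0, and -1, but only in the first column. I am reading the file from standard input, and the only way I could think of is this: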

  cnt = 0
  cnt1 = 0
  cnt2 = 0
  for line in sys.stdin:
      (t1, <having 15 different variables as that many columns are in files>) = re.split("\s+", line.strip())
      if re.match("+1", t1):
         cnt = cnt + 1
      if re.match("-1", t1):
         cnt1 = cnt1 + 1
      if re.match("0", t1):
         cnt2 = cnt2 + 1

How can I make this better, especially the part with the 15 different variables, since this is the only place I would ever use them?

Answer 1

If you only want the first column, split off only the first column, and use a dictionary to store the count for each value.

import sys

count = dict()
for line in sys.stdin:
    # Split once on any whitespace; only the first field is needed.
    t1 = line.split(None, 1)[0]
    try:
        count[t1] += 1
    except KeyError:
        count[t1] = 1
for item in count:
    print('%s occurs %i times' % (item, count[item]))
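For what it is worth, the same idea can be wrapped in a small function and checked against the sample rows from the question. This is only a sketch: the name count_first_column and the SAMPLE list are made up for illustration, and dict.get stands in for the try/except.

```python
def count_first_column(lines):
    """Count how often each first-column value occurs (hypothetical helper)."""
    counts = {}
    for line in lines:
        fields = line.split(None, 1)  # split once, on any run of whitespace
        if not fields:                # skip blank lines
            continue
        key = fields[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


# The five rows from the question, used here as stand-in input.
SAMPLE = [
    "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1\n",
    "0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1\n",
    "0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1\n",
    "-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1\n",
    "-1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1\n",
]

print(count_first_column(SAMPLE))
```

Since the function accepts any iterable of lines, passing sys.stdin instead of SAMPLE should work the same way. Note that the keys are strings, not integers.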

Answer 2

Use collections.Counter:

from collections import Counter
with open('abc.txt') as f:
    c = Counter(int(line.split(None, 1)[0]) for line in f)
    print(c)

Output:

Counter({0: 2, -1: 2, 1: 1})

Here str.split(None, 1) splits the line only once:

>>> s = "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1"                                                
>>> s.split(None, 1)
['1', '3 4 2 6 7 8 8 93 23 45 2 0 0 0 1']
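Passing None as the separator also matters when a line starts with spaces or uses tabs. A quick sketch of the difference, using a made-up line (not from the question's file):

```python
line = "  -1\t3 4 2\n"  # hypothetical line with leading spaces and a tab

# An explicit ' ' separator splits at the very first space character,
# so the first field comes back empty here.
by_space = line.split(' ', 1)

# A None separator skips leading whitespace and treats any run of
# spaces or tabs as a single delimiter.
by_whitespace = line.split(None, 1)

print(by_space)
print(by_whitespace)
```

With the explicit ' ' separator the first element is an empty string, while the None separator yields '-1' as expected.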

NumPy makes it even simpler:

>>> import numpy as np
>>> from collections import Counter                                                         
>>> Counter(np.loadtxt('abc.txt', usecols=(0,), dtype=int).tolist())
Counter({0: 2, -1: 2, 1: 1})

Answer 3

Tuple unpacking requires exactly as many variables as the number of parts returned by split(). Instead, you can simply take the first element of those parts:

parts = re.split(r"\s+", line.strip())
t1 = parts[0]

Or, equivalently, simply

t1 = re.split(r"\s+", line.strip())[0]
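If you are on Python 3 and still like the unpacking style, a starred target can soak up the remaining columns so that no extra variable names are needed. A small sketch, using one of the rows from the question as the example line:

```python
line = "-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1\n"

# The starred name collects every remaining field into a single list,
# so the number of columns no longer has to match a list of variables.
t1, *rest = line.split()

print(t1)         # -1
print(len(rest))  # 15
```

One caveat: on a blank line split() returns an empty list and the unpacking raises ValueError, so indexing with [0] after a check may be safer for untrusted input.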

Answer 4

import collections

def countFirstColum(fileName):
    res = collections.defaultdict(int)
    with open(fileName) as f:
        for line in f:
            key = line.split(" ")[0]
            res[key] += 1
    return res
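To see what defaultdict(int) is doing here, the sketch below feeds it the first-column values of the question's five rows directly (the list is typed in by hand for illustration). Missing keys start at 0, which is why no try/except is needed.

```python
import collections

res = collections.defaultdict(int)

# First-column values of the five sample rows, as strings.
for key in ["1", "0", "0", "-1", "-1"]:
    res[key] += 1  # a missing key is created with value 0, then incremented

print(dict(res))

# .get() reads a count without creating the key; res["+1"] would insert it.
print(res.get("+1", 0))
```

The keys are strings, so "+1" and "1" would be counted separately if both spellings appeared in the file.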

Answer 5

This is from one of my scripts that reads an infile; I checked it, and it also works with standard input as the infile:

dictionary = {}

# Count in a single pass: a stream such as standard input cannot be
# iterated a second time, so a separate initialisation loop would leave
# the counting loop with nothing to read.
for line in someInfile:
    f = line.split()  # no-argument split() also drops the trailing newline
    if f:  # skip blank lines
        dictionary[f[0]] = dictionary.get(f[0], 0) + 1

print(dictionary)
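One thing worth keeping in mind when the input is standard input rather than a file on disk: the stream can only be consumed once. The sketch below uses io.StringIO as a stand-in for sys.stdin, with three made-up lines, to show what a second loop over the same stream sees.

```python
import io

# Stand-in for sys.stdin; the contents are invented for this example.
stream = io.StringIO("1 3 4\n0 3 4\n-1 3 4\n")

first_pass = [line.split()[0] for line in stream]
second_pass = [line.split()[0] for line in stream]  # stream is already exhausted

print(first_pass)
print(second_pass)
```

The second list comes back empty, so any counting has to happen in the same pass that reads the lines (or the lines have to be stored in a list first).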

Answer 6

rows = []
for line in f:
    column = line.strip().split(" ")
    rows.append(column)

You then get a two-dimensional array (a list of lists).

Column 1:

for row in rows:
    print(row[0])

Output:
