计算所有列中数据框的多个值

时间:2018-11-09 14:45:32

标签: python pandas dataframe

如何计算整个数据帧中多个值的出现?有没有没有for循环的方法?

Ex =计数数据帧所有列中的全0和-1

我在想类似df.apply.count(0,-1)

谢谢!

3 个答案:

答案 0 :(得分:3)

简单地将years <- c(60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 62L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 63L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 65L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 67L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 68L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 69L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 70L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 74L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 76L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 77L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 78L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 79L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 81L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 94L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 95L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 96L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 97L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 99L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L) melt一起使用

value_counts

df.melt().value.value_counts()

答案 1 :(得分:2)

设置

np.random.seed([3, 1415])

df = pd.DataFrame(
    np.random.choice(range(-2, 3), size=(10, 10)),
    columns=[*'abcdefghij']
)

df

   a  b  c  d  e  f  g  h  i  j
0 -2  1  0  1  0  0  1  0  1 -2
1  0 -2 -2  2 -2  0  0 -2  2 -1
2  1  0  2  2  2  2  1  1  1  2
3  1 -1  1 -2  2  2  0  0 -2  0
4  2 -2  2 -1  2  2  0  0 -2  0
5  2  1 -1  2  0  2  0  1  2  1
6 -2  1  1 -1 -2 -2  2  1 -2  2
7 -1  1  2  0  2 -2 -2  0 -2  2
8  2  1 -2  1 -1 -1  2  1  2  1
9 -1  1  2 -2  1  0 -2 -2  1 -1

numpy.in1d

这应该很快

np.in1d(df.values, [0, -1]).sum()

29

@SandeepKadapa

np.in1d(df.values, [0, -1]).sum()

相似的性能
np.isin(df.values.ravel(),[0,-1]).sum()

29

numpy.in1dnp.count_nonzero

这应该非常快

np.count_nonzero(np.in1d(df.values, [0, -1]))

29

applymap + set.__contain__ + numpy.sum

这有点厚脸皮

df.applymap({0, -1}.__contains__).values.sum()

29

答案 2 :(得分:0)

您可以尝试以下方法:

(df == 0).sum(axis=1).sum()

这将计算框架所有列中零的数目。