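How can I count the occurrences of several values across an entire DataFrame? Is there a way to do it without a for loop?
Example: count every 0 and -1 in all columns of the DataFrame.
I was thinking of something like df.apply.count(0,-1)
Thanks!
Answer 0 (score: 3)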
Simply use melt together with value_counts:
df.melt().value.value_counts()
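The melt approach returns a count for every distinct value, so the counts for 0 and -1 still have to be picked out. A minimal sketch of that last step, assuming a small made-up frame (the column names and data here are hypothetical, not the asker's):

```python
import pandas as pd

# Hypothetical toy frame, not the data from the question
df = pd.DataFrame({"a": [0, 1, -1], "b": [-1, 0, 2], "c": [3, 0, 0]})

# melt() stacks every column into a single "value" column,
# so value_counts() tallies each distinct value over the whole frame
counts = df.melt()["value"].value_counts()

# Keep only the values of interest; fill_value=0 covers values that never occur
wanted = counts.reindex([0, -1], fill_value=0)
total = int(wanted.sum())

print(wanted.to_dict())
print(total)
```

Using reindex rather than plain indexing avoids a KeyError when one of the target values does not appear in the frame at all.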
Answer 1 (score: 2)
import numpy as np
import pandas as pd

np.random.seed([3, 1415])
df = pd.DataFrame(
np.random.choice(range(-2, 3), size=(10, 10)),
columns=[*'abcdefghij']
)
df
   a  b  c  d  e  f  g  h  i  j
0 -2  1  0  1  0  0  1  0  1 -2
1  0 -2 -2  2 -2  0  0 -2  2 -1
2  1  0  2  2  2  2  1  1  1  2
3  1 -1  1 -2  2  2  0  0 -2  0
4  2 -2  2 -1  2  2  0  0 -2  0
5  2  1 -1  2  0  2  0  1  2  1
6 -2  1  1 -1 -2 -2  2  1 -2  2
7 -1  1  2  0  2 -2 -2  0 -2  2
8  2  1 -2  1 -1 -1  2  1  2  1
9 -1  1  2 -2  1  0 -2 -2  1 -1
numpy.in1d
This should be fast.
np.in1d(df.values, [0, -1]).sum()
29
The same with numpy.isin on the flattened values:
np.isin(df.values.ravel(),[0,-1]).sum()
29
numpy.in1d with np.count_nonzero
This should be very fast.
np.count_nonzero(np.in1d(df.values, [0, -1]))
29
applymap + set.__contains__ + numpy.sum
This is a bit cheeky.
df.applymap({0, -1}.__contains__).values.sum()
29
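For anyone who wants to check the 29 without depending on the random seed, here is a sketch that rebuilds the frame from the printed table. It uses np.isin directly on the 2-D array, which as far as I know needs no ravel. Note also that np.in1d is deprecated in NumPy 2.0 in favor of np.isin, and DataFrame.applymap is deprecated in pandas 2.1 in favor of DataFrame.map, so treat those as version-dependent.

```python
import numpy as np
import pandas as pd

# Values copied from the table printed above, so no random seed is needed
rows = [
    [-2, 1, 0, 1, 0, 0, 1, 0, 1, -2],
    [0, -2, -2, 2, -2, 0, 0, -2, 2, -1],
    [1, 0, 2, 2, 2, 2, 1, 1, 1, 2],
    [1, -1, 1, -2, 2, 2, 0, 0, -2, 0],
    [2, -2, 2, -1, 2, 2, 0, 0, -2, 0],
    [2, 1, -1, 2, 0, 2, 0, 1, 2, 1],
    [-2, 1, 1, -1, -2, -2, 2, 1, -2, 2],
    [-1, 1, 2, 0, 2, -2, -2, 0, -2, 2],
    [2, 1, -2, 1, -1, -1, 2, 1, 2, 1],
    [-1, 1, 2, -2, 1, 0, -2, -2, 1, -1],
]
df = pd.DataFrame(rows, columns=list("abcdefghij"))

# np.isin returns a boolean mask with the same shape as its first argument
mask = np.isin(df.to_numpy(), [0, -1])

# Summing the mask and counting its nonzero entries give the same total
total_sum = int(mask.sum())
total_nonzero = int(np.count_nonzero(mask))

print(total_sum, total_nonzero)
```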
Answer 2 (score: 0)
You can try the following:
(df == 0).sum(axis=1).sum()
This counts the zeros across all columns of the frame. It does not count the -1 values.
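To cover both 0 and -1 in the same style, one option is DataFrame.isin, which tests each cell against a list of values. A minimal sketch, again on a made-up frame (the names x and y are hypothetical):

```python
import pandas as pd

# Hypothetical toy frame for illustration only
df = pd.DataFrame({"x": [0, 0, 5], "y": [-1, 2, 0]})

# Zeros only, as in the expression above
zeros_only = int((df == 0).sum(axis=1).sum())

# isin() marks every cell equal to any of the listed values
hits = df.isin([0, -1])
per_column = hits.sum()        # matches per column
total = int(per_column.sum())  # matches in the whole frame

print(zeros_only)
print(per_column.to_dict())
print(total)
```

The intermediate per_column series is handy if the counts are needed column by column rather than as a single number.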