How do I find the three most frequently entered values in a column?

Date: 2017-04-21 07:03:28

Tags: python pandas dataframe

I have the following dataset:

      Coefficient: 0.1, 0.2, 0.1, 0.5, 0.2, 0.3, 0.2, 0.6, 0.9, 0.8, 0.5, 0.3, 0.5,
                   0.8, 0.4, 0.1, 0.2, 0.5, 0.9, 0.7, 0.2, 0.5, 0.5, 0.2, 0.8, 0.3,
                   0.6, 0.5, 0.2, 0.2, 0.4, 0.1, 0.3, 0.9, 0.8, 0.2, 0.5, 0.6, 0.5

How can I find the three most frequently entered Coefficient values using Python?

4 Answers:

Answer 0 (score: 4)

You can use value_counts and take the first 3 index values, because value_counts sorts its output by count in descending order:

print (df['Coefficient'].value_counts())
0.2    9
0.5    9
0.3    4
0.1    4
0.8    4
0.6    3
0.9    3
0.4    2
0.7    1
Name: Coefficient, dtype: int64

print (df['Coefficient'].value_counts().index[:3])
Float64Index([0.2, 0.5, 0.3], dtype='float64')
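
In the output above, 0.3, 0.1 and 0.8 are all tied at a count of 4, so which of them lands in third place depends on how the ties happen to be ordered. A minimal sketch of one way to make that explicit, assuming the same sample data used in the other answers on this page (the variable names `counts`, `top3` and `top3_with_ties` are illustrative only): `Series.nlargest` with `keep='all'` returns every value tied with the third-highest count instead of cutting the tie off.

```python
import pandas as pd

# Sample data copied from the other answers on this page.
df = pd.DataFrame({'Coefficient': [
    0.1, 0.2, 0.1, 0.5, 0.2, 0.3, 0.2, 0.6, 0.9, 0.8, 0.5, 0.3, 0.5,
    0.8, 0.4, 0.1, 0.2, 0.5, 0.9, 0.7, 0.2, 0.5, 0.5, 0.2, 0.8, 0.3,
    0.6, 0.5, 0.2, 0.2, 0.4, 0.1, 0.3, 0.9, 0.8, 0.2, 0.5, 0.6, 0.5,
]})

# Count how often each value occurs (sorted by count, descending).
counts = df['Coefficient'].value_counts()

# Exactly three rows; a tie at the cut-off is truncated.
top3 = counts.head(3)

# keep='all' keeps every value whose count ties with the third-largest one,
# so the result may contain more than three rows.
top3_with_ties = counts.nlargest(3, keep='all')

print(top3_with_ties)
```

Whether truncating or keeping ties is the right choice depends on what "top three" should mean for your data.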

Answer 1 (score: 1)

The solutions proposed by @jezrael and @Saikat are concise and elegant.

Here is another solution using defaultdict:

from collections import defaultdict

df = dict()
df['Coefficient'] = [0.1,0.2,0.1,0.5,0.2,0.3,0.2,0.6,0.9,0.8,0.5,0.3,0.5,0.8,0.4,0.1,0.2,0.5,0.9,0.7,0.2,0.5,0.5,0.2,0.8,0.3,0.6,0.5,0.2,0.2,0.4,0.1,0.3,0.9,0.8,0.2,0.5,0.6,0.5]

d = defaultdict(int)

for i in df['Coefficient']:
    d[i] += 1

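# Print every value with its count, most frequent first;
# the first three lines of output are the answer
for w in sorted(d, key=d.get, reverse=True):
    print(w, d[w])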
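
The same counting idea can be written more compactly with `collections.Counter` from the standard library, whose `most_common(n)` method returns the `n` most frequent items directly. This is only a sketch of an alternative, not part of the original answer; the names `coefficients`, `counter` and `top_three` are illustrative.

```python
from collections import Counter

# Same sample data as above, as a plain Python list.
coefficients = [
    0.1, 0.2, 0.1, 0.5, 0.2, 0.3, 0.2, 0.6, 0.9, 0.8, 0.5, 0.3, 0.5,
    0.8, 0.4, 0.1, 0.2, 0.5, 0.9, 0.7, 0.2, 0.5, 0.5, 0.2, 0.8, 0.3,
    0.6, 0.5, 0.2, 0.2, 0.4, 0.1, 0.3, 0.9, 0.8, 0.2, 0.5, 0.6, 0.5,
]

# Counter tallies the occurrences of each distinct value.
counter = Counter(coefficients)

# most_common(3) returns a list of (value, count) pairs,
# ordered from the most frequent value downwards.
top_three = counter.most_common(3)

for value, count in top_three:
    print(value, count)
```

Note that three values share the third-highest count in this data, so `most_common(3)` returns only one of them.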

Answer 2 (score: 0)

print(pd.Series(df['Coefficient']).value_counts())

For me, the other answer threw an error. Another similar question: How to find three most entered value in a column?

Answer 3 (score: 0)

import pandas as pd

df = pd.DataFrame({'Coefficient': [0.1,0.2,0.1,0.5,0.2,0.3,0.2,0.6,0.9,0.8,0.5,0.3,0.5,0.8,0.4,0.1,0.2,
                                   0.5,0.9,0.7,0.2,0.5,0.5,0.2,0.8,0.3,0.6,0.5,0.2,0.2,0.4,0.1,0.3,0.9,
                                   0.8,0.2,0.5,0.6,0.5]})
# Count the occurrences of each value
print (df['Coefficient'].value_counts())
0.2    9
0.5    9
0.3    4
0.1    4
0.8    4
0.6    3
0.9    3
0.4    2
0.7    1
Name: Coefficient, dtype: int64

# Retain the top 3 most common
print (df['Coefficient'].value_counts().iloc[:3])
0.2    9
0.5    9
0.3    4
Name: Coefficient, dtype: int64

# Only the values of the three most common in an array
print (df['Coefficient'].value_counts().iloc[:3].index.values)
[ 0.2  0.5  0.3]