ValueError: invalid literal for long() with base 10: ' 5B'

Date: 2016-05-23 14:41:05

Tags: python python-2.7 pandas

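My understanding of this error is that there is a column of type long, but that the column contains the value ' 5B', which is not a long.

This is the line where the error occurs: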

df_Company = df1.groupby(by=['manufacturer','quality_issue'], as_index=False) ['quality_issue2'].count()

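I checked the types of all columns of the DataFrame df1, but none of them is of type long. ' 5B' is the name of a manufacturer, so I assume that within this statement the manufacturer column is suddenly being treated as a long.

Checking the types of DataFrame df1: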

print (df1.dtypes)
manufacturer                    object
yearweek                         int64
quality_issue                   object
quality_issue2                  object

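I think I have to do something with df_Company.astype(long), but I can't seem to get it to work. Does anyone know how to solve this?

Note: strangely, the same code works fine on my other computer, which has Python 3.5.1. When I run it on my current computer, which has Python 2.7.9, I get this long() error.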

1 Answer:

Answer 0: (score: 4)

The problem is a different one, see 8381, but in my pandas version 0.18.1 it works fine.
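As for the message itself: it is what Python raises when something tries to parse a string as an integer, not a statement about a column's dtype. A minimal sketch, assuming Python 3, where long() no longer exists and int() takes its place (so the wording differs slightly from the Python 2.7 message in the question):

```python
# Parsing a non-numeric string as an integer raises ValueError.
# In Python 2 the same failure from long() mentions long() instead of int().
try:
    int(' 5B')
except ValueError as exc:
    message = str(exc)

print(message)
```

So the error suggests that some code path tried to convert the group key ' 5B' to an integer, rather than that df1 holds a long column.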

I think you can change False to True and then call reset_index:

# Parentheses let the method chain span several lines
df_Company = (
    df1.groupby(by=['manufacturer','quality_issue'], as_index=True)['quality_issue2']
       .count()
       .reset_index()
)
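To try the workaround in isolation, here is a minimal self-contained sketch. The sample rows are hypothetical; only the column names and the ' 5B' manufacturer value come from the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question
df1 = pd.DataFrame({
    'manufacturer': [' 5B', ' 5B', ' 5B', 'foo'],
    'quality_issue': ['a', 'a', 'a', 'b'],
    'quality_issue2': ['x', None, 'y', 'z'],
})

# Keep the group keys in the index, count the non-missing values,
# then turn the keys back into ordinary columns.
df_Company = (
    df1.groupby(by=['manufacturer', 'quality_issue'], as_index=True)['quality_issue2']
       .count()
       .reset_index()
)
print(df_Company)
```

The result has the same three columns that as_index=False is meant to produce, with the manufacturer values left as strings.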

Difference between size and count (see differences with numeric values):

Example with string values:

import pandas as pd
import numpy as np

df1=pd.DataFrame([['foo','foo','bar','bar','bar','oats'],
                  ['foo','foo','bar','bar','bar','oats'],
                  [None,'foo','bar',None,'bar','oats']]).T
df1.columns=['manufacturer','quality_issue','quality_issue2']
print (df1)
  manufacturer quality_issue quality_issue2
0          foo           foo           None
1          foo           foo            foo
2          bar           bar            bar
3          bar           bar           None
4          bar           bar            bar
5         oats          oats           oats

# count() ignores missing values (None/NaN) in quality_issue2
df_Company = (
    df1.groupby(by=['manufacturer','quality_issue'], as_index=False)['quality_issue2']
       .count()
)
print (df_Company)

  manufacturer quality_issue  quality_issue2
0          bar           bar               2
1          foo           foo               1
2         oats          oats               1

# size() counts every row in the group, including missing values
df_Company1 = (
    df1.groupby(by=['manufacturer','quality_issue'])['quality_issue2']
       .size()
       .reset_index(name='quality_issue2')
)
print (df_Company1)

  manufacturer quality_issue  quality_issue2
0          bar           bar               3
1          foo           foo               2
2         oats          oats               1

I think you can omit ['quality_issue2']; the output is the same:

df_Company1 = (
    df1.groupby(by=['manufacturer','quality_issue'])
       .size()
       .reset_index(name='quality_issue2')
)
print (df_Company1)
  manufacturer quality_issue  quality_issue2
0          bar           bar               3
1          foo           foo               2
2         oats          oats               1
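To see both results side by side, one option is to pass both function names to agg. This is only a sketch on the same sample data as above; the name summary is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'manufacturer': ['foo', 'foo', 'bar', 'bar', 'bar', 'oats'],
    'quality_issue': ['foo', 'foo', 'bar', 'bar', 'bar', 'oats'],
    'quality_issue2': [None, 'foo', 'bar', None, 'bar', 'oats'],
})

# 'size' counts all rows per group; 'count' skips missing values
summary = (
    df.groupby(['manufacturer', 'quality_issue'])['quality_issue2']
      .agg(['size', 'count'])
      .reset_index()
)
print(summary)
```

The size and count columns differ only for the groups that contain a None in quality_issue2 (bar and foo here).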