Question

Here is my code:

import pandas as pd
import os

os.chdir('path\to\input\file')

xl_file = pd.ExcelFile('newcustomers.xlsx')
df = xl_file.parse('Customers Export 1', index_col='Domain', na_values=['NA'])

df = df[(df["Customer phone"].str.startswith("+1")) & (df["Customer phone"].str.len() == 13)]

print
print "now changing to final CSV output directory"
print

os.chdir('path\to\output\directory')

print "Current working dir : %s" % os.getcwd()

df.to_csv('newcustomers.csv')

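Basically, the column holds phone numbers, and I use this filter to drop incomplete numbers and blank entries, as well as phone numbers that don't start with +1 (the US/Canada country calling code). It worked for a week, but then I started getting this error. I did not update Python or pandas in between.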

raise AttributeError("Can only use .str accessor with string "
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

I'm using Anaconda on Windows 8.1, with the following versions:

conda update conda
    conda 3.18.8 py27_0 defaults
conda update anaconda
    anaconda 2.4.0 np110py27_0 <unknown>
conda update pandas
    pandas 0.17.1 np110py27_0 defaults

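Nothing changed between all of last week, when the code worked, and Sunday. There were no updates yesterday, no changes to the input file, nothing at all before it started getting mad at me :/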

Edit: adding df.head(2) per @WoodChopper's request

Domain        Customer Name    Customer phone
example.com   John Doe         44.xxxxxx
google.com    Jane Doe         1.xxxxxx

When I open the original XLSX file, it shows the entire phone number with the "+" sign. But the output above is all that gets returned to CMD when I use:

print df.head(2)

That just executes the xl_file variable, the df variable, and then the print statement above. I commented out the following line with #:

df = df[(df["Registrant phone"].str.startswith("+1")) & (df["Registrant phone"].str.len() == 13)]

Edit x2

Just to clarify, this is the code as it stands now:

import pandas as pd
import os

os.chdir('path\to\input\file')

df = pd.read_excel('newcustomers.xlsx', sheetname = 'Customers Export 1')

#xl_file = pd.ExcelFile('newcustomers.xlsx')
#df = xl_file.parse('Customers Export 1', index_col='Domain', na_values=['NA'], convert_float=False)
#df.drop(df.columns[[0]], axis=1, inplace=True)

#print df.head(2)
#print (df["Registrant phone"])
df = df[(df["Registrant phone"].str.startswith("+1")) & (df["Registrant phone"].str.len() == 13)]

print
print "now changing to final CSV output directory"
print

os.chdir('path\to\output\directory')

print "Current working dir : %s" % os.getcwd()

df.to_csv('newcustomers.csv')

It still returns exactly the same results. Just to make sure we're not chasing the wrong rabbit, here is the exact error (imgur).

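Could this be something outside of the code? pandas, conda, and Anaconda are all up to date. Is there another library that pandas depends on that might be out of date (although that doesn't entirely make sense, since everything worked one day and not the next)?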

Answer 1

Basically, the phone numbers are being parsed as floats, but your code needs them to be strings in order to work.
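To see why the dtype matters, here is a minimal sketch that reproduces the situation without the Excel file. The sample values and the helper name has_str_accessor are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical data: phone numbers stored as text versus as numbers.
as_text = pd.Series(["+1.2125551234", "+44.207123456"])
as_number = pd.Series([1.2125551234, 44.207123456])

print(as_text.dtype)    # a text dtype, so .str is available
print(as_number.dtype)  # float64, so .str is not available


def has_str_accessor(series):
    """Return True if the .str accessor can be used on this Series."""
    try:
        series.str
        return True
    except AttributeError:
        # pandas raises "Can only use .str accessor with string values"
        return False


print(has_str_accessor(as_text))
print(has_str_accessor(as_number))
```

Printing df["Customer phone"].dtype on the real data should tell you which of the two cases you are in.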

Set convert_float to False:

df = xl_file.parse('Customers Export 1', index_col='Domain',
                   na_values=['NA'], convert_float=False)

Update

df = pd.read_excel('file.xlsx', sheetname = 'sheet 1')
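If the column can still come back with a non-string dtype (for example when every value is blank or numeric), one defensive option is to cast it to str before filtering. This is only a sketch under that assumption: keep_us_numbers is a hypothetical helper and the sample rows are invented. Note that casting cannot restore a "+" that was already lost when the cell was read as a number.

```python
import pandas as pd


def keep_us_numbers(df, column):
    """Keep rows whose phone number starts with +1 and is 13 characters long."""
    # Cast to str so the .str accessor works regardless of the inferred dtype.
    phones = df[column].astype(str)
    # na=False makes any missing value count as "does not start with +1".
    mask = phones.str.startswith("+1", na=False) & (phones.str.len() == 13)
    return df[mask]


# Hypothetical rows: one valid number, one foreign, one blank,
# one numeric, and one that is too short.
df = pd.DataFrame(
    {"Customer phone": ["+1.2125551234", "+44.207123456", None, 1.5, "+1.21"]}
)

filtered = keep_us_numbers(df, "Customer phone")
print(filtered)
```

Only the first row should survive the filter; the original values in the frame are left untouched because the cast is applied to a temporary copy of the column.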
