AttributeError: 'str' object has no attribute 'loc'

Asked: 2018-06-07 07:20:04

Tags: python pandas

I am trying to use a for loop to bin the age values, as shown below:

for dataset in train:
    dataset.loc[(dataset['age'] > 15) & (dataset['age'] <= 25), 'age'] = 1
    dataset.loc[(dataset['age'] > 25) & (dataset['age'] <= 35), 'age'] = 2
    dataset.loc[(dataset['age'] > 35) & (dataset['age'] <= 45), 'age'] = 3
    dataset.loc[(dataset['age'] > 45) & (dataset['age'] <= 55), 'age'] = 4
    dataset.loc[dataset['age'] > 55, 'age'] = 5

I get this error:

AttributeError: 'str' object has no attribute 'loc'

I want my dataset to end up looking like this:

age (in existing dataset)          age (desired)
25                                 1
35                                 2
45                                 3
73                                 4

4 Answers:

Answer 0 (score: 2)

I think you need to drop the loop: if train is a DataFrame, iterating over it yields the column names, so dataset is a string on every pass:

import numpy as np
import pandas as pd

np.random.seed(100)
train = pd.DataFrame(np.random.randint(10, size=(3,3)), columns=['age','col1','col2'])
print (train)
   age  col1  col2
0    8     8     3
1    7     7     0
2    4     2     5

# Iterating over a DataFrame yields its column names, not rows
for dataset in train:
    print (dataset)

age
col1
col2

# Index the DataFrame directly instead of looping over it
train.loc[(train['age'] > 15) & (train['age'] <= 25), 'new'] = 1
train.loc[(train['age'] > 25) & (train['age'] <= 35), 'new'] = 2
train.loc[(train['age'] > 35) & (train['age'] <= 45), 'new'] = 3
train.loc[(train['age'] > 45) & (train['age'] <= 55), 'new'] = 4
train.loc[ train['age'] > 55, 'new'] = 5

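Better still, use pd.cut. Note that the first bin here is (0, 25], so ages of 15 and under also receive label 1, whereas the loc version above leaves them unlabeled: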

r = [0, 25, 35, 45, 55, 120]
g = [1,2,3,4,5]
train['new'] = pd.cut(train['age'], bins=r, labels=g)
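
To make the bin edges concrete, here is a minimal sketch of the same pd.cut call on made-up ages. The sample values and the variable names bins and labels are assumptions chosen for illustration; they are not from the original post. It relies on pd.cut using right-closed intervals by default, so an age that sits exactly on an edge falls into the lower bin.

```python
import pandas as pd

# Hypothetical ages, picked to sit on and around the bin edges.
train = pd.DataFrame({'age': [10, 15, 25, 26, 35, 45, 55, 73]})

bins = [0, 25, 35, 45, 55, 120]
labels = [1, 2, 3, 4, 5]

# Default intervals are right-closed: (0, 25], (25, 35], (35, 45], ...
train['new'] = pd.cut(train['age'], bins=bins, labels=labels)

print(train)
```

The resulting column has a categorical dtype. If plain integers are needed afterwards, converting with astype(int) should work as long as no age falls outside the bins.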

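Answer 1 (score: 1)

Your dataset appears to be a string, and a string has no attribute or method named loc. Check the type of dataset with

type()

or

isinstance()

and confirm that it is the data type you expect.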
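
As a rough illustration of that check, the sketch below builds a small DataFrame (the data and the name loop_items are made up for this example) and inspects what a for loop over it actually produces.

```python
import pandas as pd

# Hypothetical data, only here to have something to iterate over.
train = pd.DataFrame({'age': [22, 38, 61], 'col1': [1, 2, 3]})

# Iterating over a DataFrame yields its column labels.
loop_items = [item for item in train]
print(loop_items)            # ['age', 'col1']
print(type(loop_items[0]))   # <class 'str'>

# The frame itself is a DataFrame; the loop variable is not.
print(isinstance(train, pd.DataFrame))          # True
print(isinstance(loop_items[0], pd.DataFrame))  # False
```

Since each loop item is a str, calling .loc on it raises the AttributeError from the question.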

Answer 2 (score: 1)

Divide age into four categories:

  • Child: 0-10 years
  • Teen: 10-17 years
  • Adult: 18-65 years
  • Elderly: 65-110 years

r = [0,10,17,65, 110]
g = ['Child','Teen','Adult','Elderly']
train['AgeCtg'] = pd.cut(train['Age'], bins = r, labels = g)

This gives:

train.head(50)

[Image: the expected output]
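
Because the listed ranges share their endpoints (10 and 65 each appear twice), it may help to see where boundary ages land. The sketch below reuses the bins and labels from this answer on made-up ages; the sample data and the AgeCtgIncl column are assumptions added for illustration. With the default right-closed intervals, 10 counts as Child, 65 counts as Adult, and an age of exactly 0 falls outside the first bin unless include_lowest=True is passed.

```python
import pandas as pd

# Hypothetical ages sitting on the bin edges.
train = pd.DataFrame({'Age': [0, 5, 10, 11, 17, 18, 65, 66]})

r = [0, 10, 17, 65, 110]
g = ['Child', 'Teen', 'Adult', 'Elderly']

# Default: intervals are (0, 10], (10, 17], (17, 65], (65, 110]
train['AgeCtg'] = pd.cut(train['Age'], bins=r, labels=g)

# include_lowest=True also includes the left edge of the first interval
train['AgeCtgIncl'] = pd.cut(train['Age'], bins=r, labels=g, include_lowest=True)

print(train)
```

In the first column the row with Age 0 comes out as NaN; in the second it is labeled Child.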

Answer 3 (score: 0)

Just do this:

Train = [train]  # wrap the train DataFrame in a list

for dataset in Train:
    dataset.loc[dataset['Fare'] <= 17, 'Fare'] = 0
    dataset.loc[(dataset['Fare'] > 17) & (dataset['Fare'] <= 30), 'Fare'] = 1
    dataset.loc[(dataset['Fare'] > 30) & (dataset['Fare'] <= 100), 'Fare'] = 2
    dataset.loc[dataset['Fare'] > 100, 'Fare'] = 3
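
Looping over a list of DataFrames is mainly useful when the same binning has to be applied to more than one frame. The following is a sketch of that pattern under some assumptions: the second frame test, the sample fares, and the FareBand column name are all hypothetical, and pd.cut is swapped in for the chained loc assignments so the original Fare values are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical frames; only the 'Fare' column name comes from the answer.
train = pd.DataFrame({'Fare': [7.25, 17.0, 26.0, 71.28, 512.33]})
test = pd.DataFrame({'Fare': [8.05, 30.0, 151.55]})

# Same thresholds as the loc version: <=17, (17, 30], (30, 100], >100
edges = [-np.inf, 17, 30, 100, np.inf]
codes = [0, 1, 2, 3]

# Each loop item is a DataFrame here, so column assignment works
for dataset in [train, test]:
    dataset['FareBand'] = pd.cut(dataset['Fare'], bins=edges, labels=codes).astype(int)

print(train)
print(test)
```

Writing to a separate column is a design choice rather than a requirement; it keeps the raw fares available if the thresholds need to change later.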