for petid in X['PetID']:
    sentiment_file = datapath + '/train_sentiment/' + petid + '.json'
    if os.path.isfile(sentiment_file):
        json_data = json.loads(open(sentiment_file).read())
        X['DescriptionLanguage'] = json_data['language']
        X['DescriptionMagnitude'] = json_data['documentSentiment']['magnitude']
        X['DescriptionScore'] = json_data['documentSentiment']['score']
        # print(petid, sentiment_file,
        #       json_data['documentSentiment']['magnitude'])
    else:
        X['DescriptionLanguage'] = 'Unknown'
        X['DescriptionMagnitude'] = 0
        X['DescriptionScore'] = 0
This is what I have, but it doesn't work: every row ends up with the same values for DescriptionLanguage, DescriptionMagnitude and DescriptionScore.
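To see why this happens, here is a minimal sketch on toy data (the PetIDs and values are made up, not taken from the real dataset). Assigning a scalar to `X['col']` broadcasts it to every row, so each loop iteration overwrites the whole column and only the last value survives.

```python
import pandas as pd

# Toy frame standing in for the real data (hypothetical PetIDs).
X = pd.DataFrame({'PetID': ['a', 'b', 'c']})

for petid in X['PetID']:
    # A scalar assigned to a column is broadcast to every row,
    # so each iteration replaces the entire column.
    X['DescriptionLanguage'] = 'lang-' + petid

# Every row now holds the value from the last iteration.
print(X['DescriptionLanguage'].tolist())
```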
Answer 0 (score: 2)
You can use .loc to set a single value instead of a whole column. Here is a self-contained example:
import pandas as pd
import numpy as np
X = pd.DataFrame(np.arange(5), columns=['PetID'])
for ind, row in X.iterrows():
    petid = row['PetID']
    X.loc[ind, 'DescriptionLanguage'] = 'No description for {}'.format(petid)
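Applied to the sentiment files from the question, the same idea might look like the sketch below. The temporary directory, the single JSON file and its values are hypothetical stand-ins for `datapath + '/train_sentiment/'`, used only so the example runs on its own.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the sentiment directory: one made-up file
# for pet 'a', and deliberately no file for pet 'b'.
sentiment_dir = tempfile.mkdtemp()
with open(os.path.join(sentiment_dir, 'a.json'), 'w') as f:
    json.dump({'language': 'en',
               'documentSentiment': {'magnitude': 1.5, 'score': 0.3}}, f)

X = pd.DataFrame({'PetID': ['a', 'b']})

# Create the columns once, using the defaults from the question's else branch.
X['DescriptionLanguage'] = 'Unknown'
X['DescriptionMagnitude'] = 0.0
X['DescriptionScore'] = 0.0

for ind, petid in X['PetID'].items():
    sentiment_file = os.path.join(sentiment_dir, petid + '.json')
    if os.path.isfile(sentiment_file):
        with open(sentiment_file) as f:
            json_data = json.load(f)
        # .loc[row_label, column] writes one cell, not the whole column.
        X.loc[ind, 'DescriptionLanguage'] = json_data['language']
        X.loc[ind, 'DescriptionMagnitude'] = json_data['documentSentiment']['magnitude']
        X.loc[ind, 'DescriptionScore'] = json_data['documentSentiment']['score']

print(X)
```

Setting the defaults before the loop means rows without a file need no special handling inside it.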
Answer 1 (score: 2)
In addition to @Heikki Pulkkinen's excellent answer, you can also index into individual columns of the DataFrame, for example:
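import pandas as pd
import numpy as np
data = np.array([np.arange(10)]*4).T
X = pd.DataFrame(data, columns=["PetID", "DescriptionLanguage", "DescriptionMagnitude", "DescriptionScore"])
for i in range(len(X['PetID'])):
    # Note: this is chained assignment. It can raise SettingWithCopyWarning,
    # and with pandas Copy-on-Write enabled it does not modify X;
    # X.loc[i, 'DescriptionLanguage'] = 10*i is the more reliable form.
    X['DescriptionLanguage'][i] = 10*i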
...which results in X becoming:
   PetID  DescriptionLanguage  DescriptionMagnitude  DescriptionScore
0      0                    0                     0                 0
1      1                   10                     1                 1
2      2                   20                     2                 2
3      3                   30                     3                 3
4      4                   40                     4                 4
5      5                   50                     5                 5
6      6                   60                     6                 6
7      7                   70                     7                 7
8      8                   80                     8                 8
9      9                   90                     9                 9
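Neither answer mentions it, but a different technique is to avoid per-cell assignment altogether: collect one record per pet and join the result onto X in a single step. The sketch below assumes the JSON files have already been parsed into a dict keyed by PetID; that dict and its values are made up for illustration.

```python
import pandas as pd

X = pd.DataFrame({'PetID': ['a', 'b', 'c']})

# Hypothetical parsed sentiment data keyed by PetID; in practice this
# would be read from the JSON files. Pet 'b' has no file.
parsed = {
    'a': {'language': 'en', 'documentSentiment': {'magnitude': 1.5, 'score': 0.3}},
    'c': {'language': 'en', 'documentSentiment': {'magnitude': 0.5, 'score': -0.1}},
}

# Flatten each parsed file into one row.
records = [
    {
        'PetID': petid,
        'DescriptionLanguage': d['language'],
        'DescriptionMagnitude': d['documentSentiment']['magnitude'],
        'DescriptionScore': d['documentSentiment']['score'],
    }
    for petid, d in parsed.items()
]
sentiment = pd.DataFrame(records)

# A left merge keeps every pet; pets without a record get NaN,
# which is then replaced by the defaults from the question.
X = X.merge(sentiment, on='PetID', how='left').fillna({
    'DescriptionLanguage': 'Unknown',
    'DescriptionMagnitude': 0,
    'DescriptionScore': 0,
})

print(X)
```

This trades the row-by-row loop over X for one merge, which tends to scale better on large frames, at the cost of building an intermediate DataFrame.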