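I have .csv files with some data, read into a dictionary of DataFrames. What I want to do is iterate over a specific column (containing strings) in a DataFrame (which is itself in the dictionary) and, based on a condition, assign a specific number to that row, but in a new column.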
import os
from os import listdir
from os.path import isfile, join
import pandas as pd

### INPUT DIRECTORY
path="folder"

### READING .csv FILES TO THE "dictionary"
files=[f.split('.')[0] for f in listdir(path) if isfile(join(path, f))]
dictionary={}
for file in files:
    dictionary[file]=pd.read_csv(path+'/'+file+'.csv')

### DROPPING 2ND ROW
results={}
for df in dictionary:
    results[str(df)+'_CONSTANT_VAR'] = dictionary[df]
    results[str(df)+'_CONSTANT_VAR'] = results[str(df)+'_CONSTANT_VAR'].iloc[1:]

for df in results:
    for i in results[str(df)]['FORMATION']:
        if i=='BAL6':
            results[str(df)]['VAR'][i]=10 ### HERE I WANT TO ADD VALUE TO THE NEW COLUMN
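Unfortunately, the code puts "10" everywhere, not only in the row that meets the condition. Any idea why this happens, and how can I do it the way I want?
In addition, an error pops up: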
<input>:27: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Input data:
wellName DEPTH FORMATION depth2
well name 1000 bal0.5 123
well name 2000 bal1 124
well name 3000 bal0.6 125
well name 4000 bal2 126
well name 5000 bal0.7 127
well name 6000 bal3 128
well name 7000 bal0.8 129
well name 8000 bal4 130
well name 9000 bal0.9 131
well name 10000 bal5 132
well name 11000 bal0.10 133
well name 12000 bal6 134
well name 13000 bal0.11 135
Output I am getting:
wellName DEPTH FORMATION depth2 VAR
well name 1000 bal0.5 123 10
well name 2000 bal1 124 10
well name 3000 bal0.6 125 10
well name 4000 bal2 126 10
well name 5000 bal0.7 127 10
well name 6000 bal3 128 10
well name 7000 bal0.8 129 10
well name 8000 bal4 130 10
well name 9000 bal0.9 131 10
well name 10000 bal5 132 10
well name 11000 bal0.10 133 10
well name 12000 bal6 134 10
well name 13000 bal0.11 135 10
Output I want:
wellName DEPTH FORMATION depth2 VAR
well name 1000 bal0.5 123
well name 2000 bal1 124
well name 3000 bal0.6 125
well name 4000 bal2 126
well name 5000 bal0.7 127
well name 6000 bal3 128
well name 7000 bal0.8 129
well name 8000 bal4 130
well name 9000 bal0.9 131
well name 10000 bal5 132
well name 11000 bal0.10 133
well name 12000 bal6 134 10 ### VALUE ADDED ONLY HERE
well name 13000 bal0.11 135
Answer 0 (score: 1)
Given the DataFrame df shown in the input data, you can conditionally create a new column VAR, or assign values in an existing column VAR, with:
df.loc[(df.FORMATION == 'bal6'), 'VAR'] = 10
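To make the one-liner concrete, here is a minimal, self-contained sketch. It assumes a small stand-in DataFrame holding only the last four rows of the question's input data; the real frames would come from pd.read_csv.

```python
import pandas as pd

# Stand-in for the question's data: only the last four rows are reproduced here.
df = pd.DataFrame({
    "wellName": ["well name"] * 4,
    "DEPTH": [10000, 11000, 12000, 13000],
    "FORMATION": ["bal5", "bal0.10", "bal6", "bal0.11"],
    "depth2": [132, 133, 134, 135],
})

# The boolean mask picks the rows; 'VAR' is created on assignment,
# and rows that do not match the mask are left as NaN.
df.loc[df["FORMATION"] == "bal6", "VAR"] = 10

print(df)
```

Note that the comparison is case-sensitive: the question's code tests for 'BAL6' while the data contains 'bal6'. If case should not matter, comparing df["FORMATION"].str.upper() against "BAL6" is one option.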
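The "error" message you received is actually a warning: you are assigning a new value to a copy of the DataFrame, so the DataFrame itself may not be changed. This is called chained indexing and is explained here.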
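Applied to the dictionary of DataFrames from the question, the same idea could look roughly like the sketch below. The dictionary contents and the keys well_a and well_b are hypothetical stand-ins for the files read with pd.read_csv. Taking an explicit .copy() after .iloc[1:] is one common way to avoid the SettingWithCopyWarning, since the assignment then targets an independent DataFrame rather than a slice.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary the question builds with pd.read_csv.
dictionary = {
    "well_a": pd.DataFrame({"FORMATION": ["bal4", "bal5", "bal6"],
                            "depth2": [130, 132, 134]}),
    "well_b": pd.DataFrame({"FORMATION": ["bal1", "bal2", "bal3"],
                            "depth2": [124, 126, 128]}),
}

results = {}
for name, frame in dictionary.items():
    # .iloc[1:] drops the first data row, as in the question;
    # .copy() makes the result independent of the original frame.
    trimmed = frame.iloc[1:].copy()
    # One .loc call with a row mask and a column label, no chained indexing.
    trimmed.loc[trimmed["FORMATION"] == "bal6", "VAR"] = 10
    results[name + "_CONSTANT_VAR"] = trimmed

print(results["well_a_CONSTANT_VAR"])
```

Because each entry in results is a copy, the DataFrames in dictionary stay unchanged.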