How to create and populate a column while iterating over DataFrames stored in a dictionary

Time: 2019-08-13 11:55:21

Tags: python dataframe dictionary

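I have a dictionary that holds the data from some .csv files as DataFrames. I want to iterate over a specific column of strings in a DataFrame (which itself sits inside the dictionary) and, when a condition is met, assign a specific number to that same row in a new column.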

import os
from os import listdir
from os.path import isfile, join
import pandas as pd

### INPUT DIRECTORY
path="folder"


### READING .csv FILES TO THE "dictionary"
files=[f.split('.')[0] for f in listdir(path) if isfile(join(path, f))]
dictionary={}
for file in files:
    dictionary[file]=pd.read_csv(path+'/'+file+'.csv')

### DROPPING 2ND ROW
results={}
for df in dictionary:
    results[str(df)+'_CONSTANT_VAR'] = dictionary[df]
    results[str(df)+'_CONSTANT_VAR'] = results[str(df)+'_CONSTANT_VAR'].iloc[1:]



for df in results:
    for i in results[str(df)]['FORMATION']:
        if i=='BAL6':
            results[str(df)]['VAR'][i]=10  ### HERE I WANT TO ADD VALUE TO THE NEW COLUMN

Unfortunately, the code puts "10" in every row, not only in the row where the condition is met. Any idea why this happens, and how I can get the result I want?


In addition, the following error pops up:

<input>:27: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Input data:

wellName    DEPTH   FORMATION   depth2
well name   1000    bal0.5     123
well name   2000    bal1       124
well name   3000    bal0.6     125
well name   4000    bal2       126
well name   5000    bal0.7     127
well name   6000    bal3       128
well name   7000    bal0.8     129
well name   8000    bal4       130
well name   9000    bal0.9     131
well name   10000   bal5       132
well name   11000   bal0.10    133
well name   12000   bal6       134
well name   13000   bal0.11    135

Output I am getting:

wellName    DEPTH   FORMATION   depth2 VAR
well name   1000    bal0.5     123     10
well name   2000    bal1       124     10
well name   3000    bal0.6     125     10
well name   4000    bal2       126     10
well name   5000    bal0.7     127     10
well name   6000    bal3       128     10
well name   7000    bal0.8     129     10
well name   8000    bal4       130     10
well name   9000    bal0.9     131     10
well name   10000   bal5       132     10
well name   11000   bal0.10    133     10
well name   12000   bal6       134     10
well name   13000   bal0.11    135     10

Output I want:

wellName    DEPTH   FORMATION   depth2 VAR
well name   1000    bal0.5     123     
well name   2000    bal1       124     
well name   3000    bal0.6     125     
well name   4000    bal2       126     
well name   5000    bal0.7     127     
well name   6000    bal3       128     
well name   7000    bal0.8     129     
well name   8000    bal4       130     
well name   9000    bal0.9     131     
well name   10000   bal5       132     
well name   11000   bal0.10    133     
well name   12000   bal6       134     10   ### VALUE ADDED ONLY HERE
well name   13000   bal0.11    135     

1 answer:

Answer 0: (score: 1)

Given the DataFrame df shown under the input data, you can conditionally create the new column VAR, or assign values in an existing VAR column, with:

df.loc[(df.FORMATION == 'bal6'), 'VAR'] = 10
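Applied to the dictionary of DataFrames from the question, the same idea might look like the sketch below. The dictionary contents here are a small hypothetical stand-in for the data read from the .csv files, and the key name is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the `results` dictionary built in the question.
results = {
    "well_a_CONSTANT_VAR": pd.DataFrame(
        {
            "DEPTH": [11000, 12000, 13000],
            "FORMATION": ["bal0.10", "bal6", "bal0.11"],
        }
    ),
}

for name, frame in results.items():
    # The boolean mask picks the matching rows; .loc assigns only there.
    # If VAR does not exist yet, it is created and the other rows hold NaN.
    frame.loc[frame["FORMATION"] == "bal6", "VAR"] = 10
```

No inner loop over the column values is needed, because the comparison `frame["FORMATION"] == "bal6"` is evaluated for all rows at once.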

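The "error" message you received is actually a warning: you assigned a new value to a copy of the DataFrame, so the DataFrame itself is not changed. This is called chained indexing, and the pandas documentation covers it under "Returning a view versus a copy".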
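One way to avoid the warning, sketched here under the assumption that the sliced DataFrame should be independent of the original, is to call `.copy()` on the result of `.iloc[1:]` and then assign with a single `.loc` call. The names `raw` and `trimmed` are hypothetical.

```python
import pandas as pd

# Hypothetical example frame; in the question this would come from pd.read_csv.
raw = pd.DataFrame(
    {
        "DEPTH": [10000, 11000, 12000],
        "FORMATION": ["bal5", "bal0.10", "bal6"],
    }
)

# .copy() turns the slice into an independent DataFrame, so pandas no longer
# has to guess whether an assignment targets a view or a copy.
trimmed = raw.iloc[1:].copy()

# One .loc call with a row mask and a column label, instead of chaining
# two indexing operations such as frame['VAR'][label] = 10.
trimmed.loc[trimmed["FORMATION"] == "bal6", "VAR"] = 10
```

After this, `raw` is left untouched and only `trimmed` carries the new VAR column.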