Assigning values to a new column [Python pandas]

Time: 2017-11-07 08:13:30

Tags: python python-2.7 python-3.x pandas csv

I have a scenario where I run two functions in a script:

test.py:

import pandas as pd

def func1():
    df1=pd.read_csv('test1.csv')
    val1=df1['col1'].mean().round(2)
    return val1

def func2():
    df2=pd.read_csv('test2.csv')
    val2=df2['col1'].mean().round(2)
    return val2

def func3():
    dataf = pd.read_csv('test3.csv')
    col1=dataf['area']
    col2 = dataf['overall']
    dataf['overall']=val1 # value from val1 -> leads to error
    dataf['overall']=val2 # value from val2 -> leads to error

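Here I read the test1.csv and test2.csv files, store the mean values in the variables "val1" and "val2" respectively, and return them. I want to store these values in a new test3.csv file with two columns, and the values should be stored one after the other (appended). The code above does not solve the problem, and I could not find anything on the internet. Any help would be great.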

1 Answer:

Answer 0: (score: 2)

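You need to pass the variables to func3 as parameters. If the only difference between func1 and func2 is the file name, create a single function that takes the file name as a parameter.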

Thanks for the idea, cᴏʟᴅsᴘᴇᴇᴅ ;)

import pandas as pd

def func1(file):
    # read the CSV and return the mean of 'col1', rounded to 2 decimals
    df=pd.read_csv(file)
    val=df['col1'].mean().round(2)
    return val

a = func1('test1.csv')
b = func1('test2.csv')

def func3(val1=a, val2=b):
    dataf = pd.read_csv('test3.csv')
    col1=dataf['area']
    col2 = dataf['overall']
    dataf.iloc[::2, dataf.columns.get_loc('overall')] = val1 
    dataf.iloc[1::2, dataf.columns.get_loc('overall')] = val2
    return dataf
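
For reference, here is a minimal self-contained sketch of the same idea, with the two means passed to the function explicitly instead of being bound as default arguments. The helper names mean_of_col1 and fill_overall and the in-memory CSV contents are hypothetical. They stand in for test1.csv, test2.csv and test3.csv so the example runs without files on disk.

```python
import io
import pandas as pd

def mean_of_col1(source):
    # Read a CSV (path or file-like object) and return the mean of 'col1',
    # rounded to 2 decimals
    df = pd.read_csv(source)
    return round(df['col1'].mean(), 2)

def fill_overall(dataf, val1, val2):
    # Work on a copy so the caller's frame is left unchanged
    out = dataf.copy()
    # Assumption: cast to float so non-integer means fit the column dtype
    out['overall'] = out['overall'].astype(float)
    pos = out.columns.get_loc('overall')
    out.iloc[::2, pos] = val1    # rows 0, 2, 4, ...
    out.iloc[1::2, pos] = val2   # rows 1, 3, 5, ...
    return out

# Hypothetical stand-ins for test1.csv, test2.csv and test3.csv
csv1 = io.StringIO("col1\n1.0\n2.0\n4.0\n")
csv2 = io.StringIO("col1\n10\n20\n")
csv3 = io.StringIO("area,overall\nx,0\ny,0\nz,0\n")

a = mean_of_col1(csv1)
b = mean_of_col1(csv2)
result = fill_overall(pd.read_csv(csv3), a, b)
print(result)
```

Passing the values as arguments keeps the function independent of module-level state. To persist the result, result.to_csv('test3.csv', index=False) would write it back to disk.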

Sample:

dataf = pd.DataFrame({'overall':[1,7,8,9,4],
                      'col':list('abcde')})

print (dataf)
  col  overall
0   a        1
1   b        7
2   c        8
3   d        9
4   e        4

val1 = 20
val2 = 50

dataf.iloc[::2, dataf.columns.get_loc('overall')] = val1 
dataf.iloc[1::2, dataf.columns.get_loc('overall')] = val2
print (dataf)
  col  overall
0   a       20
1   b       50
2   c       20
3   d       50
4   e       20

A general solution for appending N values from a list: create an array with numpy.tile, then assign it to the new column:

import numpy as np

val =[1,8,4]
# repeat val more than enough times to cover all rows, then trim to len(dataf)
a = np.tile(val, int(len(dataf) / len(val))+2)[:len(dataf)]
dataf['overall'] = a
print (dataf)
  col  overall
0   a        1
1   b        8
2   c        4
3   d        1
4   e        8
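
The tile-and-trim step can also be wrapped in a small helper. This is a minimal sketch, assuming the goal is to repeat a list cyclically until it matches the length of the frame. The name cycle_to_length is hypothetical, and it uses ceiling division instead of the +2 padding above, which produces the same values after trimming.

```python
import numpy as np
import pandas as pd

def cycle_to_length(values, n):
    # Number of whole repetitions needed to cover n rows (ceiling division)
    reps = -(-n // len(values))
    # Tile the values, then trim to exactly n elements
    return np.tile(values, reps)[:n]

dataf = pd.DataFrame({'col': list('abcde')})
dataf['overall'] = cycle_to_length([1, 8, 4], len(dataf))
print(dataf)
```

With a two-element list such as [20, 50], the same helper reproduces the alternating pattern from the iloc-based sample.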