Pandas: replace old column values and plot new column values computed from equations

Time: 2018-04-23 09:25:50

Tags: python pandas csv

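In a CSV file I have three columns: z, x and y. The 'z' column is used to group the x and y columns, which are plotted with respect to 'z'. Below is the table of z, x and y.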

z	x	y
23	1,75181E-07	6,949512
23	8,88901E-07	6,963877
23	1,61279E-05	7,293052
23	5,35262E-05	8,135064
23	8,56942E-05	8,903738
23	0,000114883	9,579907
23	0,01068653	211,0798
23	0,01070811	211,3568
23	0,0107263	211,5871
23	0,01074606	211,8401
23	0,01076813	212,1311
23	0,01078525	212,3436
40	1,75181E-07	6,949513217
40	8,889E-07	6,96388319
40	1,61277E-05	7,293169621
40	5,35248E-05	8,135499439
40	0,00029527	13,63721607
40	0,000319049	14,1825142
40	0,000340228	14,69608917
40	0,014110191	252,3893548
40	0,014132366	252,5804547
40	0,014155023	252,8030254
40	0,014180293	253,0374241
40	0,014202693	253,1983821
40	0,014226167	253,4140887
40	0,014251631	253,6566835
40	0,014272699	253,8120535

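Now I need to replace the 'x' and 'y' columns with new values 'x1' and 'y1', given by the equations x1 = ln(1 + x) and y1 = y * (1 + x), and then plot x1 and y1 with respect to the same 'z' column.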

I have tried the code below. I can see my new values, but I cannot plot with them.

import csv
import os
import tkinter as tk
import sys
from tkinter import filedialog
import pandas as pd
import matplotlib.pyplot as plt
import math
from tkinter import ttk
import numpy as np

def readCSV(self):
        x=[]   # Initializing empty lists to store the 3 columns in csv
        y=[]
        z=[]
        global df
        self.filename = filedialog.askopenfilename()
        df = pd.read_csv(self.filename, error_bad_lines=False)   #Reading CSV file using pandas
        read = csv.reader(df, delimiter = ",")
        fig = plt.figure()
        ax= fig.add_subplot(111)
        df.set_index('x', inplace=True)  #Setting index
        line = df.groupby('z')['y'].plot(legend=True,ax=ax)   #grouping and plotting
        cursor = datacursor(line)
        gdf= df[df['z'] == 23]
        x=np.asarray(gdf.index.values)
        y=np.asarray(gdf['y'].values)
        x1 = np.log(1+x)
        y1 = y * (1 + x)

        
        df.set_index('x1', inplace=True)  #Setting new index
        line = df.groupby('z')['y1'].plot(legend=True,ax=ax)   #grouping and plotting for new values
        cursor = datacursor(line)
        gdf= df[df['z'] == 23]
        x1=np.asarray(gdf.index.values)
        print ("x1:",x1)
        y1=np.asarray(gdf['y1'].values)
        print ("y1:",y1)
        ax1 = ax.twinx()
        ax.grid(True)
        ax.set_ylim(0,None)
        ax.set_xlim(0,None)
        align_yaxis(ax, y.max(), ax1, 1)
        plt.show()

I get an error on the line "df.set_index('x1', inplace=True)": KeyError: 'x1'

Thanks in advance.

1 answer:

Answer 0 (score: 1)

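You have to assign 'x1' and 'y1' to the DataFrame before you can set either of them as the index. In your code they exist only as local NumPy arrays, so df has no column named 'x1', and set_index raises the KeyError.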

x=np.asarray(df.index.values)
y=np.asarray(df['y'].values)
df['x1'] = np.log(1+x)
df['y1'] = y * (1+x)
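
Putting the pieces together, here is a minimal sketch of the whole flow: read the data, add the two columns, then plot one line per 'z' group. It assumes the file is tab-separated and uses decimal commas, as the table in the question suggests; the real file may differ. The names SAMPLE, add_transformed_columns and plot_by_group are made up for this illustration, and SAMPLE holds only a few of the rows from the question.

```python
import io

import numpy as np
import pandas as pd

# A few rows from the question, assumed tab-separated with decimal commas.
SAMPLE = (
    "z\tx\ty\n"
    "23\t1,75181E-07\t6,949512\n"
    "23\t0,01068653\t211,0798\n"
    "40\t1,75181E-07\t6,949513217\n"
    "40\t0,014110191\t252,3893548\n"
)


def add_transformed_columns(df):
    """Return a copy of df with x1 = ln(1 + x) and y1 = y * (1 + x) added."""
    out = df.copy()
    out["x1"] = np.log1p(out["x"])  # equivalent to np.log(1 + x)
    out["y1"] = out["y"] * (1 + out["x"])
    return out


def plot_by_group(df, x_col="x1", y_col="y1", group_col="z"):
    """Draw one line per group on a shared axis and return that axis."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    for key, group in df.groupby(group_col):
        ax.plot(group[x_col], group[y_col], label=f"{group_col} = {key}")
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    ax.legend()
    ax.grid(True)
    return ax


# decimal="," makes pandas parse "0,0107" as the float 0.0107.
df = pd.read_csv(io.StringIO(SAMPLE), sep="\t", decimal=",")
df = add_transformed_columns(df)
print(df[["z", "x1", "y1"]])
```

To draw the figure, call plot_by_group(df) and then matplotlib's plt.show(). Keeping x1 and y1 as ordinary columns, rather than moving x1 into the index, lets groupby select both by name and avoids the KeyError altogether.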