Pandas DataFrame changes after it is assigned to a new variable

Asked: 2019-02-15 13:28:24

Tags: python pandas class dataframe

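I'm writing a program that converts certain text values in a .csv file to numeric values. I created a class and use "self.dataframe" as the variable that stores the data read with pandas. Later, the function "transform_values" uses a dictionary to convert the answers. I first assign the data from "self.dataframe" to the variable "b", but the data in "self.dataframe" still changes, not just the data stored in "b". What am I missing here?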

import os
import pandas as pd

class form():
    def __init__(self, sections, filename):  # takes a filename (string) and a list of lists with answers
        self.file = None  # empty var for file
        self.dataframe = None  # empty var for dataframe
        self.new_dataframe = None  # empty var for redone dataframe
        self.columns = None  # empty var for storing the columns
        self.sections = sections  # loads in the different sections (list of lists)
        self.filename = filename  # file is loaded based on filename argument
        self.values = {}  # empty var for dictionary of values

    def read_file(self):
        self.file = os.getcwd() + self.filename  # file needs to be in cwd
        self.dataframe = pd.read_csv(self.file)  # creates a dataframe
        self.columns = self.dataframe.columns  # stores the columns

    def create_values(self):  # creates a dictionary
        pass  # not necessary for question

    def transform_values(self, dataframe, dictionary):
        b = dataframe.values
        c = dictionary
        for p in range(len(b)):  # per person in dataframe
            for a in range(len(b[p])):  # per answer in person
                if b[p][a] in c:  # checks whether answer is in dictionary
                    b[p][a] = c[b[p][a]]  # sets value to dictionary value
        self.new_dataframe = pd.DataFrame(b, columns=self.columns)

survey = form(all_sections,"/survey.csv")
survey.read_file()
survey.create_values()
survey.transform_values(survey.dataframe,survey.values)

survey.dataframe

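The output of "survey.dataframe" originally contained only text, but now it contains the numeric values inserted by transform_values, even though I only performed the conversion on "b". What am I missing here? I want "self.new_dataframe" to hold the numeric values and keep the text in "self.dataframe". I could work around this by reading the data twice, but I suspect I'm simply making a mistake.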
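
A likely explanation, offered as a sketch and not a confirmed diagnosis: "b = dataframe.values" does not necessarily create new data. For a DataFrame whose columns all share one dtype (such as text-only survey answers), pandas may hand back an array that shares memory with the DataFrame, so writing into "b" can also write into "self.dataframe". Whether that happens depends on the pandas version and the column dtypes. The standalone function below is a hypothetical rewrite of transform_values (the sample column names "q1" and "q2" and the mapping are made up for illustration) that asks for an explicit copy first:

```python
import pandas as pd


def transform_values(dataframe, dictionary):
    """Return a new DataFrame with mapped values; the input is left untouched."""
    # to_numpy(copy=True) guarantees an independent array, whereas
    # .values may return a view onto the DataFrame's own data.
    b = dataframe.to_numpy(copy=True)
    for p in range(len(b)):  # per row (person)
        for a in range(len(b[p])):  # per answer in that row
            if b[p][a] in dictionary:
                b[p][a] = dictionary[b[p][a]]
    return pd.DataFrame(b, columns=dataframe.columns)


# Hypothetical sample data standing in for the survey file.
original = pd.DataFrame({"q1": ["yes", "no"], "q2": ["no", "maybe"]})
mapping = {"yes": 1, "no": 0}

converted = transform_values(original, mapping)
print(original)   # still text only
print(converted)  # mapped values where the dictionary had an entry
```

Inside the class, the same idea would amount to replacing "dataframe.values" with "dataframe.to_numpy(copy=True)" (or "dataframe.values.copy()"). Depending on the data, "dataframe.replace(dictionary)" may also be worth trying, since it returns a new DataFrame by default instead of modifying the original.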

0 answers:

No answers yet.