Python Pandas: Dataframe is not updating using string methods

Time: 2019-01-07 12:55:38

Tags: python pandas dataframe

I'm trying to update the strings in a .csv file that I am reading with Pandas. The .csv has a column named 'About', which contains the rows of data I want to manipulate.

I've already used the .str methods to update the column, but the changes are not reflected in the exported .csv file. Some of my code can be seen below.

import pandas as pd

df = pd.read_csv('data.csv')
df.About.str.lower() #About is the column I am trying to update
df.About.str.replace('[^a-zA-Z ]', '')
df.to_csv('newdata.csv')

3 Answers:

Answer 0: (score: 1)

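You need to assign the output back to the column, because the .str methods return a new Series instead of modifying the DataFrame in place. It is also possible to chain both operations, since they work on the same column About. And because the values are converted to lowercase first, the regex can be simplified to drop the uppercase range: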

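import pandas as pd

df = pd.read_csv('data.csv')
# regex=True is needed in pandas 2.0+, where str.replace matches literally by default
df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
df.to_csv('newdata.csv', index=False)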

Sample:

df = pd.DataFrame({'About':['AaSD14%', 'SDD Aa']})

df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
print(df)
    About
0    aasd
1  sdd aa
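To make the point about assignment concrete, here is a minimal sketch that reuses the sample data above. It shows that the .str methods leave the original column untouched until the result is assigned back. The in-memory buffer stands in for 'newdata.csv' and is only an assumption for illustration, as are the names lowered, buffer and csv_text.

```python
import io

import pandas as pd

df = pd.DataFrame({'About': ['AaSD14%', 'SDD Aa']})

# .str methods return a new Series; the DataFrame is not changed here.
lowered = df['About'].str.lower()
print(df['About'].tolist())   # still the original mixed-case strings
print(lowered.tolist())       # the lowercased copy

# Assigning the result back is what actually updates the column.
df['About'] = lowered.str.replace('[^a-z ]', '', regex=True)

# Write to an in-memory buffer instead of a real file (illustration only).
# index=False keeps the row index out of the exported CSV.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Without index=False, the exported file would gain an extra unnamed first column holding the row index.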

Answer 1: (score: 1)

import pandas as pd

columns = ['About']
data = ["ALPHA","OMEGA","ALpHOmGA"]
df = pd.DataFrame(data, columns=columns)
# Assign the result back to the column; regex=True is needed in pandas 2.0+
df.About = df.About.str.lower().str.replace('[^a-zA-Z ]', '', regex=True)
print(df)

OUTPUT:

      About
0     alpha
1     omega
2  alphomga

Answer 2: (score: 1)

Example Dataframe:

>>> df
        About
0      JOHN23
1     PINKO22
2   MERRY jen
3  Soojan San
4      Remo55

Solution (another way, using a compiled regex with flags):

>>> df.About.str.lower().str.replace(regex_pat, '', regex=True)
0          john
1         pinko
2     merry jen
3    soojan san
4          remo
Name: About, dtype: object

Explanation:

[^a-z] matches a single character not in the range between a (index 97) and z (index 122) (case sensitive).

+ is a quantifier: it matches between one and unlimited times, as many times as possible, giving back as needed (greedy).

$ asserts position at the end of a line.
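The answer never shows how regex_pat is defined. A sketch consistent with the explanation and with the output above (trailing digits removed, inner spaces kept) might look like the following. The exact pattern [^a-z]+$ and the re.IGNORECASE flag are assumptions inferred from the explanation, not something the answer confirms, and cleaned is a name introduced here.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {'About': ['JOHN23', 'PINKO22', 'MERRY jen', 'Soojan San', 'Remo55']}
)

# Hypothetical definition: one or more non-letter characters at the end
# of the string. The flag is assumed from the "with flags" wording.
regex_pat = re.compile(r'[^a-z]+$', flags=re.IGNORECASE)

# A compiled pattern has to be used with regex=True.
cleaned = df['About'].str.lower().str.replace(regex_pat, '', regex=True)
print(cleaned.tolist())
```

As in the other answers, the result still has to be assigned back (for example df['About'] = cleaned) before exporting, otherwise the DataFrame keeps its original values.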