Question

I'm trying to update the strings in a .csv file that I am reading with pandas. The file has a column named 'About' that contains the rows of data I want to manipulate.

I've already used the .str methods to update the column, but the changes are not reflected in the exported .csv file. Some of my code is shown below.

import pandas as pd

df = pd.read_csv('data.csv')
df.About.str.lower() #About is the column I am trying to update
df.About.str.replace('[^a-zA-Z ]', '')
df.to_csv('newdata.csv')

Answer 1

You need to assign the output back to the column. Both operations can also be chained, because they work on the same column About. Because the values are converted to lowercase first, the regex no longer needs to list uppercase letters. Since pandas 2.0, str.replace treats the pattern as a literal string unless regex=True is passed:

df = pd.read_csv('data.csv')
df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
df.to_csv('newdata.csv', index=False)
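
The index=False argument in the answer above is not explained, so here is a minimal sketch of what it changes. It writes to in-memory buffers (io.StringIO) instead of 'newdata.csv', which is an assumption made only to keep the example self-contained:

```python
import io

import pandas as pd

df = pd.DataFrame({'About': ['AaSD14%', 'SDD Aa']})

# Default behaviour: the row index is written as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)

# Compare the header rows of the two outputs.
header_with_index = with_index.getvalue().splitlines()[0]
header_without_index = without_index.getvalue().splitlines()[0]
print(repr(header_with_index))
print(repr(header_without_index))
```

Without index=False, reading the exported file back would give an extra unnamed column holding the old row numbers.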

Sample:

df = pd.DataFrame({'About':['AaSD14%', 'SDD Aa']})

df.About = df.About.str.lower().str.replace('[^a-z ]', '', regex=True)
print(df)
    About
0    aasd
1  sdd aa
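
To see why the code in the question had no effect, the following sketch compares calling a .str method without assignment against assigning the result back. It reuses the sample data above; the variable names unchanged and updated are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'About': ['AaSD14%', 'SDD Aa']})

# .str methods return a new Series; without assignment the result is discarded.
df['About'].str.lower()
unchanged = df['About'].tolist()

# Assigning the result back is what actually updates the column.
df['About'] = df['About'].str.lower().str.replace('[^a-z ]', '', regex=True)
updated = df['About'].tolist()

print(unchanged)
print(updated)
```

Bracket access (df['About']) is used here instead of df.About; both refer to the same column, but the bracket form also works for column names that are not valid Python identifiers.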

Answer 2

import pandas as pd

columns = ['About']
data = ["ALPHA", "OMEGA", "ALpHOmGA"]
df = pd.DataFrame(data, columns=columns)
# Assign the result back to the column; regex=True is required from pandas 2.0
df.About = df.About.str.lower().str.replace('[^a-zA-Z ]', '', regex=True)
print(df)

OUTPUT:

      About
0     alpha
1     omega
2  alphomga

Answer 3

Example DataFrame:

>>> df
        About
0      JOHN23
1     PINKO22
2   MERRY jen
3  Soojan San
4      Remo55

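Solution: another way, using a compiled regex with flags. Here regex_pat is a pattern compiled with re.compile; its definition is not shown, but the explanation below describes it as [^a-z]+ followed by $. From pandas 2.0, a compiled pattern must be passed with regex=True:

>>> df.About.str.lower().str.replace(regex_pat, '', regex=True)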
0          john
1         pinko
2     merry jen
3    soojan san
4          remo
Name: About, dtype: object

Explanation:

[^a-z]+ Match a single character not present in the list below

a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)

+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

$ asserts position at the end of a line
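
The definition of regex_pat is missing from this answer. Based only on the explanation above, a plausible reconstruction is re.compile(r'[^a-z]+$'); this is an assumption, not the answerer's confirmed code, and no flags are set because the values are lowercased first. A minimal sketch under that assumption:

```python
import re

import pandas as pd

df = pd.DataFrame(
    {'About': ['JOHN23', 'PINKO22', 'MERRY jen', 'Soojan San', 'Remo55']}
)

# Assumed pattern: one or more non-lowercase-letter characters at the end
# of the string. The original definition of regex_pat is not shown.
regex_pat = re.compile(r'[^a-z]+$')

# Lowercase first, then strip the trailing non-letter characters.
cleaned = df['About'].str.lower().str.replace(regex_pat, '', regex=True)
print(cleaned.tolist())
```

Because the pattern is anchored with $, it only strips trailing characters, which is why the space inside 'merry jen' survives. As in the other answers, the result still has to be assigned back (df['About'] = cleaned) before exporting.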
