pandas: map new values based on the first character

Date: 2016-05-03 17:52:36

Tags: python-2.7 pandas dictionary replace dataframe

Is there a way to map new values onto a DataFrame column based on the first character of each current value?

My current code:

ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('1'), 'city', ncesvars['urbantype'])
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('2'), 'suburban', ncesvars['urbantype'])
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('3'), 'town', ncesvars['urbantype'])
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('4'), 'rural', ncesvars['urbantype'])

I was thinking of using some kind of dict together with pd.replace, but I'm not sure how to combine that with .str.startswith().

2 Answers:

Answer 0: (score: 3)

Try something like this:

ncesvars['urbantype'] = ncesvars['urbantype'].replace({
    r'^1.*': 'city',
    r'^2.*': 'suburban'},
    regex=True)

Test:

In [32]: w
Out[32]:
     word
0    1_A_
1  word03
2  word02
3  word00
4    2xxx
5  word04
6  word01
7  word02
8  word04
9    3aaa

In [33]: w['word'].replace({r'^1.*': 'city', r'^2.*': 'suburban', r'^3.*': 'town'}, regex=True)
Out[33]:
0        city
1      word03
2      word02
3      word00
4    suburban
5      word04
6      word01
7      word02
8      word04
9        town
Name: word, dtype: object
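
If the prefixes already live in a plain dict, the regex keys can be generated instead of typed by hand. The following is a minimal sketch, not part of the original answer: the names prefix_labels and patterns and the sample values are assumptions made for illustration.

```python
import re

import pandas as pd

# Hypothetical prefix -> label mapping (illustrative, mirrors the question's categories).
prefix_labels = {'1': 'city', '2': 'suburban', '3': 'town', '4': 'rural'}

# Build one anchored regex per prefix; re.escape keeps special characters literal.
patterns = {'^' + re.escape(prefix) + '.*': label
            for prefix, label in prefix_labels.items()}

# Illustrative sample data, not the asker's real column.
w = pd.DataFrame({'word': ['1_A_', 'word03', '2xxx', '3aaa', '4 rural area']})

# Values that match no pattern are left unchanged.
result = w['word'].replace(patterns, regex=True)
print(result.tolist())
```

Because each pattern is anchored with ^, only values that start with one of the prefixes are rewritten.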

Answer 1: (score: 2)

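You can define a dict of your categories, slice the data with str[0:1], and call map on the result. Assign through a boolean mask that tests whether the first character of each value is in the dict keys, so that only matching rows are overwritten. Without the mask, unmatched rows would be overwritten with NaN, because the last row in the following example has no mapping:

In [16]: df = pd.DataFrame({'urbantype':['1 asdas','2 asd','3 asds','4 asdssd','5 asdas']})
    ...: df
Out[16]:
   urbantype
0    1 asdas
1      2 asd
2     3 asds
3   4 asdssd
4    5 asdas

In [18]: d = {'1':'city','2':'suburban', '3': 'town','4':'rural'}
    ...: df.loc[df['urbantype'].str[0:1].isin(d.keys()), 'urbantype'] = df['urbantype'].str[0:1].map(d)
    ...: df
Out[18]:
   urbantype
0       city
1   suburban
2       town
3      rural
4    5 asdas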
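
The NaN behaviour described in this answer can also be handled without a mask, by filling the gaps from the original column. This is a hedged alternative sketch rather than the answerer's method; the variable names mapped and result are assumptions.

```python
import pandas as pd

# Same illustrative data and mapping as in the answer.
df = pd.DataFrame({'urbantype': ['1 asdas', '2 asd', '3 asds', '4 asdssd', '5 asdas']})
d = {'1': 'city', '2': 'suburban', '3': 'town', '4': 'rural'}

# Mapping the first character alone leaves NaN wherever the dict has no key.
mapped = df['urbantype'].str[0:1].map(d)
print(mapped.isna().tolist())

# Fall back to the original value for unmapped rows instead of masking.
result = mapped.fillna(df['urbantype'])
print(result.tolist())
```

Building a new Series this way leaves df untouched until the result is assigned back, which can make the step easier to inspect.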
