I have a list of group IDs:
letters = ['A', 'A/D', 'B', 'B/D', 'C', 'C/D', 'D']
and a DataFrame of groups:
groups = pd.DataFrame({'group': ['B', 'A/D', 'D', 'D', 'A']})
I want to create a column in the DataFrame that gives the position of each group ID in the list, like this:
  group  group_idx
0     B          2
1   A/D          1
2     D          6
3     D          6
4     A          0
My current solution is:
group_to_num = {hsg: i for i, hsg in enumerate(letters)}
groups['group_idx'] = groups.applymap(lambda x: group_to_num.get(x)).max(axis=1).fillna(-1).astype(np.int32)
But it doesn't look very elegant. Is there a simpler way?
Answer 0 (score: 1)
Use map:
import pandas as pd
letters = ['A', 'A/D', 'B', 'B/D', 'C', 'C/D', 'D']
group_to_num = {hsg: i for i, hsg in enumerate(letters)}
groups = pd.DataFrame({'group': ['B', 'A/D', 'D', 'D', 'A']})
groups['group_idx'] = groups.group.map(group_to_num)
print(groups)
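One detail worth noting: map returns NaN for any label that is missing from the dictionary, which also turns the column into floats. The sketch below assumes you want the same -1 fallback as the original solution in the question; the label 'X' is a made-up value used only to trigger that case.

```python
import pandas as pd

letters = ['A', 'A/D', 'B', 'B/D', 'C', 'C/D', 'D']
group_to_num = {hsg: i for i, hsg in enumerate(letters)}

# 'X' is a hypothetical label that does not appear in `letters`
groups = pd.DataFrame({'group': ['B', 'A/D', 'X', 'D', 'A']})

# map() yields NaN for unknown labels, so fill with -1 and cast back to integers
groups['group_idx'] = groups['group'].map(group_to_num).fillna(-1).astype('int32')
print(groups)
```

If every group is guaranteed to be in the list, the fillna and astype steps can be dropped.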
Answer 1 (score: 1)
You can try building a DataFrame from the list and then merging:
groups.merge(pd.DataFrame(letters).reset_index(), left_on='group', right_on=0).\
    rename(columns={'index': 'group_idx'}).drop(columns=0)
  group  group_idx
0     B          2
1   A/D          1
2     D          6
3     D          6
4     A          0
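Keep in mind that merge defaults to an inner join, so rows whose group is not in the list are dropped from the result. As a different technique from either answer, a categorical with an explicit category order may also work: its integer codes are the positions in the list, and unknown labels get -1. This is only a sketch of that alternative, using the data from the question.

```python
import pandas as pd

letters = ['A', 'A/D', 'B', 'B/D', 'C', 'C/D', 'D']
groups = pd.DataFrame({'group': ['B', 'A/D', 'D', 'D', 'A']})

# Passing `categories=letters` fixes the order, so each code equals the
# position of the label in `letters`; labels outside the list get code -1
groups['group_idx'] = pd.Categorical(groups['group'], categories=letters).codes
print(groups)
```

The codes come back as a small integer dtype, so add an astype call if a specific width such as int32 is required.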