pandas: split strings and count values?

Time: 2018-01-29 19:26:17

Tags: python pandas

I have a pandas dataset in which one column holds comma-separated strings, e.g. 1,2,3,10:

import pandas as pd

data = [
  { 'id': 1, 'score': 9, 'topics': '11,22,30' },
  { 'id': 2, 'score': 7, 'topics': '11,18,30' },
  { 'id': 3, 'score': 6, 'topics': '1,12,30' },
  { 'id': 4, 'score': 4, 'topics': '1,18,30' }
]
df = pd.DataFrame(data)

I want the count and the mean score for each value that appears in topics. So:

topic_id,count,mean
1,2,5
11,2,8
12,1,6

And so on. How can I do this?

What I have so far:

df['topic_ids'] = df.topics.str.split()

But now I think I need to explode topic_ids, so that there is a column for every unique value in the whole set of values...?

3 answers:

Answer 0 (score: 3)

Split the strings, repeat each row's id and score once per topic, then groupby and agg:

import numpy as np

df.topics = df.topics.str.split(',')
n = df.topics.str.len()  # number of topics in each row
New_df = pd.DataFrame({'topics': np.concatenate(df.topics.values),
                       'id': df.id.repeat(n), 'score': df.score.repeat(n)})

New_df.groupby('topics').score.agg(['count', 'mean'])

Out[1256]: 
        count  mean
topics             
1           2   5.0
11          2   8.0
12          1   6.0
18          2   5.5
22          1   9.0
30          4   6.5
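
To see why the lengths line up in the approach above, here is a minimal sketch on a two-row frame. The names `small`, `lists`, `n`, `flat_topics` and `flat_scores` are made up for illustration and are not part of the answer. `Series.repeat` copies each score once per topic in its row, and `np.concatenate` flattens the per-row lists in the same row order, so the two flat arrays pair up element by element.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame: row 0 has two topics, row 1 has one.
small = pd.DataFrame({'score': [9, 7], 'topics': ['11,22', '30']})

lists = small.topics.str.split(',')   # Series of lists of strings
n = lists.str.len()                   # topics per row: 2 and 1

flat_topics = np.concatenate(lists.values)      # all topics, in row order
flat_scores = small.score.repeat(n).to_numpy()  # each score repeated n times

print(list(zip(flat_topics, flat_scores)))
```

Note that the topic values stay strings after the split, which is why the grouped output is sorted lexicographically rather than numerically.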

Answer 1 (score: 2)


Answer 2 (score: 1)

Here is one way: reindex and stack, then groupby and agg.

import pandas as pd

data = [
  { 'id': 1, 'score': 9, 'topics': '11,22,30' },
  { 'id': 2, 'score': 7, 'topics': '11,18,30' },
  { 'id': 3, 'score': 6, 'topics': '1,12,30' },
  { 'id': 4, 'score': 4, 'topics': '1,18,30' }
]
df = pd.DataFrame(data)
df.topics = df.topics.str.split(',')
df2 = pd.DataFrame(df.topics.tolist(), index=[df.id, df.score])\
                   .stack()\
                   .reset_index(name='topics')\
                   .drop(columns='level_2')

df2.groupby('topics').score.agg(['count', 'mean']).reset_index()
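
The question mentions "exploding" the topic lists. As a hedged alternative sketch, assuming pandas 0.25 or later (where `DataFrame.explode` is available), the same table can be built without the reindex and stack steps. The variable name `result` is made up here. Note also that the `str.split()` call in the question has no separator argument, so it splits on whitespace and leaves each string whole; the sketch passes `','` explicitly.

```python
import pandas as pd

data = [
    {'id': 1, 'score': 9, 'topics': '11,22,30'},
    {'id': 2, 'score': 7, 'topics': '11,18,30'},
    {'id': 3, 'score': 6, 'topics': '1,12,30'},
    {'id': 4, 'score': 4, 'topics': '1,18,30'},
]
df = pd.DataFrame(data)

result = (
    df.assign(topics=df.topics.str.split(','))  # strings -> lists
      .explode('topics')                        # one row per (id, topic) pair
      .groupby('topics')['score']
      .agg(['count', 'mean'])
      .reset_index()
)
print(result)
```

The topics are still strings at this point. If numeric ordering matters, one option is to add `.astype({'topics': int})` after the `explode` step.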