Pandas filtering: returning True/False vs. the actual value

Asked: 2018-08-23 18:15:27

Tags: python python-3.x pandas

My DataFrame:

df_all_xml_mfiles_tgther

      file_names     searching_for                                 everything
0          a.txt             where              Dave Ran Away. Where is Dave?
1          a.txt             candy                                mmmm, candy
2          b.txt              time                We are looking for the book.
3          b.txt             where                   where the red fern grows

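My question:

I am trying to filter for records whose text contains the word given as the search term. I need to go through one record at a time and return the actual record, not just the word True.

What I have tried: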

import re
import pandas as pd

search_content_array = ['where', 'candy', 'time']
file_names_only = ['a.txt', 'b.txt']


for cc in range(len(file_names_only)):
    for bb in range(len(search_content_array)):

        # str.contains returns a boolean Series, not the matching text
        regex_stuff = df_all_xml_mfiles_tgther[cc:cc+1].everything.str.contains(search_content_array[bb], flags=re.IGNORECASE, na=False, regex=True)

        if not regex_stuff.empty:
            regex_stuff_new = pd.DataFrame([regex_stuff.rename(None)])
            regex_stuff_new.columns = ['everything']
            regex_stuff_new['searched_for_found'] = search_content_array[bb]
            regex_stuff_new['file_names'] = file_names_only[cc]

        regex_stuff_new = regex_stuff_new[['file_names', 'searched_for_found', 'everything']]  # This rearranges the columns

        df_regex_test = df_regex_test.append(regex_stuff_new, ignore_index=True, sort=False)

The result I get:

    file_names  searched_for_found  everything
0        a.txt               where        True
1        a.txt               candy        True
2        b.txt               where        True

The result I want:

    file_names  searched_for_found                           everything
0        a.txt               where        Dave Ran Away. Where is Dave?
1        a.txt               candy                          mmmm, candy
3        b.txt               where             where the red fern grows

How do I get the actual values returned, rather than just True/False?

3 answers:

Answer 0 (score: 4)

Use a list comprehension to do this row by row.
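A minimal sketch of what a row-by-row list comprehension could look like, not necessarily the code this answer had in mind. It assumes the DataFrame is called `df` and has the `searching_for` and `everything` columns shown in the question. Treating each search term as a literal string via `re.escape` is also an assumption. The comprehension builds one boolean per row, and indexing `df` with that mask returns the rows themselves instead of True/False.

```python
import re

import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "file_names": ["a.txt", "a.txt", "b.txt", "b.txt"],
    "searching_for": ["where", "candy", "time", "where"],
    "everything": [
        "Dave Ran Away. Where is Dave?",
        "mmmm, candy",
        "We are looking for the book.",
        "where the red fern grows",
    ],
})

# One boolean per row: does this row's text contain this row's search term?
# re.escape treats the term as a literal string (an assumption).
mask = [
    re.search(re.escape(term), text, flags=re.IGNORECASE) is not None
    for term, text in zip(df["searching_for"], df["everything"])
]

# Boolean indexing keeps the matching rows, with their original values.
result = df[mask]
print(result)
```

Because the mask only selects rows, the original index labels are preserved; call `reset_index(drop=True)` on the result if a fresh index is preferred.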

Answer 1 (score: 3)

Use replace and str.contains. PS: I think Cold's approach is cleaner.

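# Replace case-insensitive matches of the search terms with a marker string
s = df.everything.replace(regex=r'(?i)' + df.searching_for, value='OkIFINDIT')
# Keep the rows whose text now contains the marker
df[s.str.contains('OkIFINDIT')]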
Out[405]: 
  file_names searching_for                  everything
0      a.txt         where Dave Ran Away Where is Dave
1      a.txt         candy                  mmmm,candy
3      b.txt         where    where the red fern grows

Answer 2 (score: 1)

You can replace the non-matching rows with np.nan and then drop the NaN rows.

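import re
import numpy as np

# Keep a row if its search term occurs in its text (case-insensitive), otherwise return NaN
df.apply(lambda x: x if re.search(x['searching_for'], x['everything'], re.I) else np.nan, axis=1).dropna()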

 file_names searching_for                     everything
0      a.txt         where  Dave Ran Away. Where is Dave?
1      a.txt         candy                    mmmm, candy
3      b.txt         where       where the red fern grows