熊猫按功能过滤数据框行

时间:2018-07-30 08:09:18

标签: python-3.x pandas filter

我想根据行中的不同值通过更复杂的功能过滤数据框。

是否可以通过布尔函数过滤DF行,例如您可以做到的在ES6 filter function中?

极端简化的示例来说明问题:

import pandas as pd

def filter_fn(row):
    if row['Name'] == 'Alisa' and row['Age'] > 24:
        return False

    return row

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],

    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77]}

df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

df = df.apply(filter_fn, axis=1, broadcast=True)

print(df)

我发现使用apply()位的东西实际上可以通过布尔函数仅返回False / True填充的行。

我的解决方法是当函数结果为True时返回行本身,否则返回False。但这之后需要进行额外的过滤。

        Name    Age  Score
0      False  False  False
1      Bobby     24     63
2      jodha     23     55
3       jack     22     74
4      raghu     23     31
5   Cathrine     24     77
6      False  False  False
7      Bobby     24     63
8      kumar     22     42
9      Alisa     23     62
10      Alex     24     89
11  Cathrine     24     77

1 个答案:

答案 0 :(得分:2)

我认为这里的功能不是必需的,使用boolean indexing更好,并且主要更快:

!SESSION 2018-07-30 13:20:28.430 -----------------------------------------------
eclipse.buildId=4.7.0-vfinal-2017-09-29T14:34:02Z-Typesafe
java.version=9.0.4
java.vendor=Oracle Corporation
BootLoader constants: OS=macosx, ARCH=x86_64, WS=cocoa, NL=en_IN
Framework arguments:  -keyring /Users/admin/.eclipse_keyring
Command-line arguments:  -os macosx -ws cocoa -arch x86_64 -keyring /Users/admin/.eclipse_keyring

!ENTRY org.eclipse.osgi 4 0 2018-07-30 13:20:41.005
!MESSAGE Application error
!STACK 1
org.eclipse.e4.core.di.InjectionException: java.lang.NoClassDefFoundError: javax/annotation/PostConstruct
    at org.eclipse.e4.core.internal.di.InjectorImpl.internalMake(InjectorImpl.java:410)
    at org.eclipse.e4.core.internal.di.InjectorImpl.make(InjectorImpl.java:318)
    at org.eclipse.e4.core.contexts.ContextInjectionFactory.make(ContextInjectionFactory.java:162)
    at org.eclipse.e4.ui.internal.workbench.swt.E4Application.createDefaultHeadlessContext(E4Application.java:491)
    at org.eclipse.e4.ui.internal.workbench.swt.E4Application.createDefaultContext(E4Application.java:505)
    at org.eclipse.e4.ui.internal.workbench.swt.E4Application.createE4Workbench(E4Application.java:204)
    at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:614)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:336)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:594)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:148)
    at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:151)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
Caused by: java.lang.NoClassDefFoundError: javax/annotation/PostConstruct
    at org.eclipse.e4.core.internal.di.InjectorImpl.inject(InjectorImpl.java:124)
    at org.eclipse.e4.core.internal.di.InjectorImpl.internalMake(InjectorImpl.java:399)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: javax.annotation.PostConstruct cannot be found by org.eclipse.e4.core.di_1.6.100.v20170421-1418
    at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:433)
    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:395)
    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:387)
    at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:150)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
    ... 24 more

!ENTRY org.eclipse.e4.ui.workbench 4 0 2018-07-30 13:20:41.013
!MESSAGE FrameworkEvent ERROR
!STACK 0
java.lang.NoClassDefFoundError: javax/annotation/PreDestroy
    at org.eclipse.e4.core.internal.di.InjectorImpl.disposed(InjectorImpl.java:450)
    at org.eclipse.e4.core.internal.di.Requestor.disposed(Requestor.java:156)
    at org.eclipse.e4.core.internal.contexts.ContextObjectSupplier$ContextInjectionListener.update(ContextObjectSupplier.java:78)
    at org.eclipse.e4.core.internal.contexts.TrackableComputationExt.update(TrackableComputationExt.java:111)
    at org.eclipse.e4.core.internal.contexts.TrackableComputationExt.handleInvalid(TrackableComputationExt.java:74)
    at org.eclipse.e4.core.internal.contexts.EclipseContext.dispose(EclipseContext.java:178)

    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:395)
    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:387)
    at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:150)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
    ... 21 more

函数解决方案-仅需要返回布尔值,如果进行一些复杂的过滤则更好-仅需要为每行布尔值返回:

m = (df['Name'] == 'Alisa') & (df['Age'] > 24)
print(m)
0      True
1     False
2     False
3     False
4     False
5     False
6      True
7     False
8     False
9     False
10    False
11    False
dtype: bool

#invert mask by ~
df1 = df[~m]