Pandas: binning a time series by yearly percentiles

Asked: 2018-10-08 13:14:30

Tags: python pandas binning

I have the following DataFrame:

import numpy as np
import pandas as pd

date  = ['2015-02-03 23:00:00','2015-02-03 23:30:00','2015-02-04 00:00:00','2015-02-04 00:30:00','2015-02-04 01:00:00','2015-02-04 01:30:00','2015-02-04 02:00:00','2015-02-04 02:30:00','2015-02-04 03:00:00','2015-02-04 03:30:00','2015-02-04 04:00:00','2015-02-04 04:30:00','2015-02-04 05:00:00','2015-02-04 05:30:00','2015-02-04 06:00:00','2015-02-04 06:30:00','2015-02-04 07:00:00','2015-02-04 07:30:00','2015-02-04 08:00:00','2015-02-04 08:30:00','2015-02-04 09:00:00','2015-02-04 09:30:00','2015-02-04 10:00:00','2015-02-04 10:30:00','2015-02-04 11:00:00','2015-02-04 11:30:00','2015-02-04 12:00:00','2015-02-04 12:30:00','2015-02-04 13:00:00','2015-02-04 13:30:00','2015-02-04 14:00:00','2015-02-04 14:30:00','2015-02-04 15:00:00','2015-02-04 15:30:00','2015-02-04 16:00:00','2015-02-04 16:30:00','2015-02-04 17:00:00','2015-02-04 17:30:00','2015-02-04 18:00:00','2015-02-04 18:30:00','2015-02-04 19:00:00','2015-02-04 19:30:00','2015-02-04 20:00:00','2015-02-04 20:30:00','2015-02-04 21:00:00','2015-02-04 21:30:00','2015-02-04 22:00:00','2015-02-04 22:30:00','2015-02-04 23:00:00','2015-02-04 23:30:00']
value = [33.24  , 31.71  , 34.39  , 34.49  , 34.67  , 34.46  , 34.59  , 34.83  , 35.78  , 33.03  , 35.49  , 33.79  , 36.12  , 37.09  , 39.54  , 41.19  , 45.99  , 50.23  , 46.72  , 47.47  , 48.46  , 48.38  , 48.40  , 48.13  , 38.35  , 38.19  , 38.12  , 38.05  , 38.06  , 37.83  , 37.49  , 37.41 , 41.84  , 42.26 , 44.09  , 48.85  , 50.07 , 50.94  , 51.09  , 50.60  , 47.39  , 45.57  , 45.03  , 44.98  , 41.32  , 40.37  , 41.12  , 39.33  , 35.38  , 33.44  ]
df = pd.DataFrame({'value': value, 'index': date})
# The timestamps include seconds, so the format string needs %S as well.
df.index = pd.to_datetime(df['index'], format='%Y-%m-%d %H:%M:%S')
df.drop(['index'], axis=1, inplace=True)
print(df)

                     value
index                     
2015-02-03 23:00:00  33.24
2015-02-03 23:30:00  31.71
2015-02-04 00:00:00  34.39
2015-02-04 00:30:00  34.49
2015-02-04 01:00:00  34.67
2015-02-04 01:30:00  34.46

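I would like to bin the 'value' column to see whether each value is above 90% of that year's values, or falls between the 80th and 90th percentiles of that year (90th not included).

I know I can use the pandas cut function; my question is how to pass it the percentiles of each year (the variables 'PERCENTILE80_of_considered_year' and 'PERCENTILE90_of_considered_year' below):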

binned = pd.cut(x=df.value, bins=[-np.inf,PERCENTILE80_of_considered_year, PERCENTILE90_of_considered_year, np.inf], right=False, labels=['<P80', 'P80_90', '>P90'])

The expected result would look something like this (for illustration only):

                     value     bin
index
2015-02-03 23:00:00  33.24  P80_90
2015-02-03 23:30:00  31.71    <P80
2015-02-04 00:00:00  34.39  P80_90
2015-02-04 00:30:00  34.49  P80_90
2015-02-04 01:00:00  34.67    >P90
2015-02-04 01:30:00  34.46  P80_90

Does anyone know how to do this efficiently? Or is there any other efficient approach?

Many thanks,

2 answers:

Answer 0: (score: 1)

Not sure I can fully answer your question, but I would compute the percentiles as follows:

p80 = df.value.quantile(0.8)
p90 = df.value.quantile(0.9)
# right=False makes each bin closed on the left, e.g. [p80, p90)
df['binned'] = pd.cut(x=df.value, bins=[-np.inf, p80, p90, np.inf], right=False, labels=['<P80', 'P80_90', '>P90'])

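Your example only covers one year. With several years you can do the same thing, but on groups rather than on the whole df. There are many ways to do this, but one option is: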

for year in df.index.year.unique():
    mask = df.index.year == year
    yearly = df.loc[mask, 'value']  # only the values of the current year
    df.loc[mask, 'binned'] = pd.cut(x=yearly,
                                    bins=[-np.inf, yearly.quantile(0.8), yearly.quantile(0.9), np.inf],
                                    right=False, labels=['<P80', 'P80_90', '>P90'])
df.head()
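
The per-year thresholds can also be computed without an explicit loop. The following is only a sketch under assumptions: the two-year `sample` frame is invented for illustration, and `np.select` is swapped in for `pd.cut` so that the comparison against each row's own yearly threshold stays vectorized.

```python
import numpy as np
import pandas as pd

# Hypothetical two-year sample (not the data from the question).
dates = pd.date_range('2015-01-01', periods=10, freq='D').append(
    pd.date_range('2016-01-01', periods=10, freq='D'))
sample = pd.DataFrame({'value': list(range(1, 11)) + list(range(101, 111))},
                      index=dates)

# Broadcast each year's 80th/90th percentile back onto every row of that year.
by_year = sample.groupby(sample.index.year)['value']
p80 = by_year.transform(lambda s: s.quantile(0.8))
p90 = by_year.transform(lambda s: s.quantile(0.9))

# Same left-closed bins as right=False: [-inf, p80), [p80, p90), [p90, inf).
sample['binned'] = np.select(
    [sample['value'] < p80, sample['value'] < p90],
    ['<P80', 'P80_90'],
    default='>P90',
)
print(sample)
```

Note that this variant yields plain strings rather than a categorical column.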

Answer 1: (score: 1)

You can groupby the year and apply the function to each group.
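
A minimal sketch of grouping by year and applying a binning function to each group. The helper name `bin_by_percentiles` and the two-year `sample` frame are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

LABELS = ['<P80', 'P80_90', '>P90']

def bin_by_percentiles(values: pd.Series) -> pd.Series:
    """Bin one group's values against that group's own 80th/90th percentiles."""
    p80 = values.quantile(0.8)
    p90 = values.quantile(0.9)
    return pd.cut(values, bins=[-np.inf, p80, p90, np.inf],
                  right=False, labels=LABELS)

# Hypothetical two-year sample (not the data from the question).
dates = pd.date_range('2015-01-01', periods=10, freq='D').append(
    pd.date_range('2016-01-01', periods=10, freq='D'))
sample = pd.DataFrame({'value': list(range(1, 11)) + list(range(101, 111))},
                      index=dates)

# group_keys=False keeps the original datetime index on the result,
# so it aligns with the frame when assigned back as a column.
sample['bin'] = (
    sample.groupby(sample.index.year, group_keys=False)['value']
          .apply(bin_by_percentiles)
)
print(sample)
```

If a year's 80th and 90th percentiles happen to be equal, `pd.cut` raises an error about non-unique bin edges, so that case would need separate handling.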