I have a pandas column like this:
index colA
1 10.2
2 10.8
3 11.6
4 10.7
5 9.5
6 6.2
7 12.9
8 10.6
9 6.4
10 20.5
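I want to take the current row's value and look for matches among the previous rows that are close to it. For example, index 4 (10.7) would return a match count of 1 because it is close to index 2 (10.8). Similarly, index 8 (10.6) would return 2 matches because it is close to both index 2 and index 4.
For this example, using a threshold of +/- 5% would output the following: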
index colA matches
1 10.2 0
2 10.8 0
3 11.6 0
4 10.7 2
5 9.5 0
6 6.2 0
7 12.9 0
8 10.6 3
9 6.4 1
10 20.5 0
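For large dataframes, I would like to limit the search to the previous X (300?) rows rather than the entire dataframe.
Answer 0 (score: 3)
Here is a numpy solution that leverages broadcast comparison: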
df
index colA matches
0 1 10.2 0
1 2 10.8 0
2 3 11.6 0
3 4 10.7 2
4 5 9.5 0
5 6 6.2 0
6 7 12.9 0
7 8 10.6 3
8 9 6.4 1
9 10 20.5 0
Note: this is very fast, but it does not handle the 300-row limit for large dataframes.
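The code for this answer does not appear above, only its output. As a minimal sketch of what a broadcast comparison for this problem might look like (the variable names `close` and `earlier` are illustrative assumptions, not taken from the answer), one could compare every value against every other value and keep only the pairs where the other value comes from an earlier row:

```python
import numpy as np

# Sample values from the question's colA
a = np.array([10.2, 10.8, 11.6, 10.7, 9.5, 6.2, 12.9, 10.6, 6.4, 20.5])
pos = np.arange(len(a))

# close[r, c] is True when value c lies within 5% of value r (relative to r)
close = np.abs(a[:, None] - a[None, :]) <= 0.05 * a[:, None]
# earlier[r, c] is True when row c comes before row r
earlier = pos[None, :] < pos[:, None]

# Count, per row, the earlier rows that are close to it
matches = (close & earlier).sum(axis=1)
print(matches.tolist())
```

With a dataframe, the result could be stored with `df['matches'] = matches`. The comparison builds an n-by-n boolean array, which is why this approach gets expensive in memory for long columns.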
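Answer 1 (score: 3)
Use triangular indices to ensure we only look backwards. Then use np.bincount to accumulate the matches.
import numpy as np
a = df.colA.values
# Index pairs (i, j) with j < i: each row paired with every earlier row
i, j = np.tril_indices(len(a), -1)
# True where the earlier value is within 5% of the current value
mask = np.abs(a[i] - a[j]) / a[i] <= .05
df.assign(matches=np.bincount(i[mask], minlength=len(a)))
colA matches
index
1 10.2 0
2 10.8 0
3 11.6 0
4 10.7 2
5 9.5 0
6 6.2 0
7 12.9 0
8 10.6 3
9 6.4 1
10 20.5 0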
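numba
If you run into resource issues, consider using a good old-fashioned loop. However, if you have access to numba, you can speed this up considerably.
import numpy as np
from numba import njit

@njit
def counter(a):
    # One match counter per row, all starting at zero
    c = np.arange(len(a)) * 0
    for i, x in enumerate(a):
        for j, y in enumerate(a):
            # Only compare against earlier rows
            if j < i:
                if abs(x - y) / x <= .05:
                    c[i] += 1
    return c

df.assign(matches=counter(a))
colA matches
index
1 10.2 0
2 10.8 0
3 11.6 0
4 10.7 2
5 9.5 0
6 6.2 0
7 12.9 0
8 10.6 3
9 6.4 1
10 20.5 0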
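The loop above scans every earlier row. To respect the question's limit on how many previous rows are searched, the inner loop could be bounded instead. The following is a sketch under assumptions: the function name `count_recent_matches` and its `tol` and `window` parameters are hypothetical, and the function is written as plain Python so it runs without numba (decorating it with `@njit` is a possibility, not something verified here).

```python
import numpy as np

def count_recent_matches(a, tol=0.05, window=300):
    """For each position, count earlier values within `tol` (relative to the
    current value), looking back at most `window` rows."""
    counts = np.zeros(len(a), dtype=np.int64)
    for i in range(len(a)):
        x = a[i]
        # Only visit the previous `window` rows instead of all earlier rows
        for j in range(max(0, i - window), i):
            if abs(x - a[j]) / x <= tol:
                counts[i] += 1
    return counts

a = np.array([10.2, 10.8, 11.6, 10.7, 9.5, 6.2, 12.9, 10.6, 6.4, 20.5])
full = count_recent_matches(a)             # window larger than the data
short = count_recent_matches(a, window=3)  # look back only 3 rows
print(full.tolist(), short.tolist())
```

With `window=3`, index 8 (10.6) no longer sees indices 1, 2 and 4, so its count drops to 0, while the work per row is capped at `window` comparisons.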
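Answer 2 (score: 2)
Using rolling with apply:
# raw=True passes each window as a NumPy array, so x[-1] is the current row
df.colA.rolling(window=len(df),min_periods=1).apply(lambda x : sum(abs((x-x[-1])/x[-1])<0.05)-1, raw=True)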
Out[113]:
index
1 0.0
2 0.0
3 0.0
4 2.0
5 0.0
6 0.0
7 0.0
8 3.0
9 1.0
10 0.0
Name: colA, dtype: float64
If speed matters, take a look at cold's answer.
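Because `rolling` already works on a window, the same idea can probably be adapted to the question's row limit by using a fixed window instead of `len(df)`. This is a sketch, assuming a lookback of `lookback` previous rows; the helper name `count_close_to_last` and the variable `lookback` are illustrative. The window is `lookback + 1` because it includes the current row, which always matches itself and is subtracted back out.

```python
import numpy as np
import pandas as pd

def count_close_to_last(x, tol=0.05):
    # x is a NumPy array (raw=True); its last element is the current row
    last = x[-1]
    # Subtract 1 because the current row always matches itself
    return float(np.sum(np.abs((x - last) / last) < tol) - 1)

s = pd.Series(
    [10.2, 10.8, 11.6, 10.7, 9.5, 6.2, 12.9, 10.6, 6.4, 20.5],
    index=pd.RangeIndex(1, 11, name="index"),
    name="colA",
)

lookback = 3  # the question suggests something like 300 for real data
matches = (
    s.rolling(window=lookback + 1, min_periods=1)
     .apply(count_close_to_last, raw=True)
     .astype(int)
)
print(matches.tolist())
```

`min_periods=1` keeps the first rows from becoming NaN, which is what makes the final `astype(int)` safe.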