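I am a researcher using Python to work with climate model output in order to find certain types of storms. I have 8 large numpy arrays (dimensions 109574 x 52 x 57). The arrays are filled with 1s, indicating a storm on that day (the first dimension is time), and 0s, indicating no storm. The other two dimensions are latitude and longitude.
I have to eliminate back-to-back days from these arrays. For example, if there is a storm on day 1 and day 2, I want to count only 1 storm. If there is a storm on days 1, 2, and 3, I want to count only days 1 and 3, for a total of two storms; days 1-4 would give 2 storms, and so on. At the end I find the number of storms by using np.sum to count the 1s in the array along the time axis.
I am running the following code to achieve this, but the problem is that it is extremely slow. Because I will have to repeat this procedure for other datasets, I would like to know whether there is a way to speed it up. My code is below, and I am more than happy to clarify anything.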
# If there is a storm that overlaps two two-day periods, only count it once
print("Eliminating doubles...")
for i in range(52):
    for j in range(57):
        print(i,j)
        for k in range(109573):
            if((storms1[k,i,j]) == 1 and (storms1[k+1,i,j] == 1)):
                storms1[k,i,j] = 0
            if((storms2[k,i,j]) == 1 and (storms2[k+1,i,j] == 1)):
                storms2[k,i,j] = 0
            if((storms3[k,i,j]) == 1 and (storms3[k+1,i,j] == 1)):
                storms3[k,i,j] = 0
            if((storms4[k,i,j]) == 1 and (storms4[k+1,i,j] == 1)):
                storms4[k,i,j] = 0
            if((storms5[k,i,j]) == 1 and (storms5[k+1,i,j] == 1)):
                storms5[k,i,j] = 0
            if((storms6[k,i,j]) == 1 and (storms6[k+1,i,j] == 1)):
                storms6[k,i,j] = 0
            if((storms7[k,i,j]) == 1 and (storms7[k+1,i,j] == 1)):
                storms7[k,i,j] = 0
            if((storms8[k,i,j]) == 1 and (storms8[k+1,i,j] == 1)):
                storms8[k,i,j] = 0
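Before anyone suggests iterating over the arrays with a loop: I changed the variable names to simplify them for the purpose of asking this question.
Thank you for your help.
Answer 0 (score: 2)
Here is a vectorized function that can replace your innermost loop: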
def do(KK):
    # find stretches of ones
    switch_points = np.where(np.diff(np.r_[0, KK, 0]))[0]
    switch_points.shape = -1, 2
    # isolate stretches starting on odd days and create mask
    odd_starters = switch_points[switch_points[:, 0] % 2 == 1, :]
    odd_mask = np.zeros((KK.shape[0] + 1,), dtype=KK.dtype)
    odd_mask[odd_starters] = 1, -1
    odd_mask = np.add.accumulate(odd_mask[:-1])
    # apply global 1,0,1,0,1,0,... mask
    KK[1::2] = 0
    # invert stretches starting on odd days
    KK ^= odd_mask
Call it from inside the outer pair of loops (over i and j):
do(storms1[:, i, j])
do(storms2[:, i, j])
etc.
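It changes the arrays in place.
This should be much faster than the original loops (the two outer loops make no real difference).
How it works:
It finds the start and end points of each block of ones. We know that within each such block, every other one has to be zeroed. Using a global 1,0,1,0,1,0,... mask, the algorithm zeros out every other day.
This produces the correct result for blocks that start on an even day, but the complement of the desired pattern for blocks that start on an odd day.
The last step of the algorithm is to invert these odd-starting blocks.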
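To make the steps concrete, here is a minimal sketch that runs the function on a small hand-made series and then applies it to every grid cell of a few toy arrays. The test series, the random toy arrays, and the use of `reshape` in place of assigning to `.shape` are assumptions for illustration; integer arrays are assumed, since `^=` does not work on floats.

```python
import numpy as np

def do(KK):
    """Thin one time series in place, following the function in this answer."""
    # Start and (exclusive) end index of every stretch of ones.
    switch_points = np.where(np.diff(np.r_[0, KK, 0]))[0].reshape(-1, 2)
    # Stretches that start on an odd index need their pattern flipped later.
    odd_starters = switch_points[switch_points[:, 0] % 2 == 1, :]
    odd_mask = np.zeros((KK.shape[0] + 1,), dtype=KK.dtype)
    odd_mask[odd_starters] = 1, -1          # +1 at each start, -1 at each end
    odd_mask = np.add.accumulate(odd_mask[:-1])
    KK[1::2] = 0                            # global 1,0,1,0,... mask
    KK ^= odd_mask                          # invert the odd-starting stretches

# Hand-made series: runs of length 3 (odd start), 2 (even start), 4 (odd start).
KK = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0])
do(KK)
print(KK)  # [0 1 0 1 0 0 1 0 0 1 0 1 0 0]

# Toy stand-ins for storms1..storms8 (random data, much smaller than the real arrays).
rng = np.random.default_rng(0)
arrays = [rng.integers(0, 2, size=(30, 3, 4)) for _ in range(2)]
for storms in arrays:
    for i in range(storms.shape[1]):
        for j in range(storms.shape[2]):
            do(storms[:, i, j])             # a view, so storms itself is modified
counts = [storms.sum(axis=0) for storms in arrays]
```

Keeping the arrays in a list avoids repeating the call once per variable name; `counts` then holds one storm count per grid cell for each array.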
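Answer 1 (score: 2)
An example using a 1-D array that simulates the first axis. First, find the positions where groups of 1s start. Next, find the length of each group. Finally, compute the number of events according to your logic: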
import numpy
a = numpy.random.randint(0,2,20)
# Add an initial 0
a1 = numpy.r_[0, a]
# Mark the start of each group of 1's
d1 = numpy.diff(a1) > 0
# Indices of the start of groups of 1's
w1 = numpy.arange(len(d1))[d1]
# Length of each group
cs = numpy.cumsum(a)
c = numpy.diff(numpy.r_[cs[w1], cs[-1]+1])
# Apply the counting logic
storms = c - c//2
print(a)
>>> array([0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1])
print(c)
>>> array([1, 2, 4, 1, 3])
print(storms)
>>> array([1, 1, 2, 1, 2])
You can use less memory than shown here, for example by reusing variable names once their contents are no longer needed.
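The same counting rule (a group of length c contributes c - c//2 storms) can also be applied along the first axis of a whole 3-D array without looping over grid cells. The following is only a sketch of one possible way to do that, not part of the answer above: `thin_runs` is a hypothetical helper name, and it builds integer index arrays as large as the input, which may matter for arrays of the size in the question.

```python
import numpy as np

def thin_runs(storms):
    """Keep every other day of each run of 1s along axis 0 (hypothetical helper)."""
    storms = np.asarray(storms).astype(bool)
    n = storms.shape[0]
    # Time index, shaped so it broadcasts against any trailing (lat, lon) axes.
    t = np.arange(n).reshape((n,) + (1,) * (storms.ndim - 1))
    # Index of the most recent storm-free day at or before each step (-1 if none yet).
    last_clear = np.maximum.accumulate(np.where(storms, -1, t), axis=0)
    # Zero-based position of each day inside its run of storm days.
    pos = t - last_clear - 1
    # Keep storm days at even positions: the 1st, 3rd, 5th, ... day of each run.
    return storms & (pos % 2 == 0)

# The 1-D array printed in the example above.
a = np.array([0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1])
kept = thin_runs(a)
print(kept.sum())  # 7, the same total as storms.sum() for that array

# Toy 3-D array (random data; the shape is an arbitrary assumption).
rng = np.random.default_rng(1)
cube = rng.integers(0, 2, size=(50, 4, 5))
counts = thin_runs(cube).sum(axis=0)  # storms per grid cell, shape (4, 5)
```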
Answer 2 (score: 0)
So I think you want this:
For example, for a single grid cell:
storms_in[:,i,j] = [0,0,1,1,0,1,1,1,0,1,0,1,1,1,1,0]
storms_out[:,i,j]= [0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0]
This is not what your code sample is doing, but it is what you describe in your second paragraph.
To do that, you need two steps:
def storms_disc(storms):  # put the whole array here, boolean-safe
    z = np.zeros((1,) + storms.shape[1:], dtype='int8')  # zero-pads for the ends
    s = storms.astype('int8')
    changes = np.r_[s, z] - np.r_[z, s]  # find where the weather changes
    changes = ((changes[:-1] == 1) | (changes[1:] == -1)).astype('int8')  # reduce dimension
    return ((np.r_[changes, z] - np.r_[z, changes])[:-1] == 1).astype(storms.dtype)  # find the first of successive changes
This vectorizes the whole process, so you only need to call it 8 times.
The astype calls are needed because subtracting booleans raises an error, even though their values are 1 and 0.
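As a quick check, the sketch below runs storms_disc on the storms_in example and compares the result with storms_out. The function is repeated so the snippet runs on its own; the int8 padding, the tiled 3-D shape, and the extra five-day run are assumptions added for illustration.

```python
import numpy as np

def storms_disc(storms):
    """Keep the first day of each run of marked days, as in this answer."""
    z = np.zeros((1,) + storms.shape[1:], dtype='int8')   # zero padding for the ends
    s = storms.astype('int8')
    changes = np.r_[s, z] - np.r_[z, s]                    # +1 at a run start, -1 just after a run end
    # Mark the first and the last day of every run.
    changes = ((changes[:-1] == 1) | (changes[1:] == -1)).astype('int8')
    # Keep only the first of any consecutive marked days.
    return ((np.r_[changes, z] - np.r_[z, changes])[:-1] == 1).astype(storms.dtype)

storms_in = np.array([0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0])
storms_out = np.array([0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0])
result_1d = storms_disc(storms_in)
print(np.array_equal(result_1d, storms_out))  # True

# The same column copied into a small (time, lat, lon) array.
cube = np.tile(storms_in[:, None, None], (1, 2, 3))
result_3d = storms_disc(cube)

# A run of five storm days keeps only its first and last day.
five = storms_disc(np.array([1, 1, 1, 1, 1]))
print(five)  # [1 0 0 0 1]
```

Note that for runs of five or more days this keeps only the first and last day, so its count can differ from the every-other-day rule used in the other two answers; which of the two is wanted depends on how "and so on" in the question is read.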