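I recently discovered Codility and I am working through the demo training. I wrote this solution to the Genomic Range Query problem. It runs fine and uses a dynamic programming (prefix sum) approach, but it scores only 87% instead of the expected 100%.
Does anyone have any idea why?
You can find the problem in the Prefix Sums section. Just start the test to see the problem description! Codility training
Thanks!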
def solution(S, P, Q):
    # write your code in Python 2.6
    S = list(S)
    sol = [[0]*len(S),[0]*len(S),[0]*len(S),[0]*len(S)]
    mapping = {"A":1, "C":2, "G":3, "T":4}
    for i in range(0,len(S)):
        if S[i] == 'A':
            sol[0][i]+= 1
        elif S[i] == 'C':
            sol[1][i] += 1
        elif S[i] == 'G':
            sol[2][i] += 1
        elif S[i] == 'T':
            sol[3][i] += 1
        if i < len(S)-1:
            sol[0][i+1] = sol[0][i]
            sol[1][i+1] = sol[1][i]
            sol[2][i+1] = sol[2][i]
            sol[3][i+1] = sol[3][i]
    for n in range(0, len(P)):
        l = P[n]
        r = Q[n]
        pre_sum = [0,0,0,0]
        if l > 0:
            pre_sum = [sol[0][l],sol[1][l],sol[2][l],sol[3][l]]
        post_sum = [sol[0][r],sol[1][r],sol[2][r],sol[3][r]]
        if post_sum[0]-pre_sum[0] > 0:
            P[n] = 1
        elif post_sum[1]-pre_sum[1] > 0:
            P[n] = 2
        elif post_sum[2]-pre_sum[2] > 0:
            P[n] = 3
        elif post_sum[3]-pre_sum[3] > 0:
            P[n] = 4
        else:
            P[n] = mapping[S[P[n]]];
    return P
    pass
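Answer 0 (score: 2)
Ah, I was working on the same problem and spent a long time debugging, but in the end I managed to get 100/100.
For example, when S='AGT', P=[1], and Q=[2], the function should return 3 for G, but yours (and originally mine) returns 4 for T. The prefix counts at index l already include S[l], so subtracting them drops the first character of the range.
I think this will fix it: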
        if l > 0:
            pre_sum = [sol[0][l-1],sol[1][l-1],sol[2][l-1],sol[3][l-1]]
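For reference, here is a minimal sketch of the question's approach with that one-index change applied. It assumes the same inclusive prefix counts as the original code (entry `i` counts occurrences in `S[0..i]`). The name `solution_fixed` is only illustrative, and unlike the original it builds a new result list instead of overwriting `P`.

```python
def solution_fixed(S, P, Q):
    letters = "ACGT"
    # counts[k][i] = occurrences of letters[k] in S[0..i], inclusive
    counts = [[0] * len(S) for _ in letters]
    for i, ch in enumerate(S):
        for k in range(4):
            counts[k][i] = counts[k][i - 1] if i > 0 else 0
        counts[letters.index(ch)][i] += 1

    result = []
    for l, r in zip(P, Q):
        for k in range(4):
            # Subtract the count *before* position l so that S[l] stays in range
            before = counts[k][l - 1] if l > 0 else 0
            if counts[k][r] - before > 0:
                result.append(k + 1)  # impact factors are 1..4
                break
    return result


print(solution_fixed("AGT", [1], [2]))                   # [3]
print(solution_fixed("CAGCCTA", [2, 5, 0], [4, 5, 6]))   # [2, 4, 1]
```

The first call is the failing case described above; with `l - 1` the G at index 1 is counted and the answer is 3 rather than 4.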
Answer 1 (score: 0)
In case anyone is still interested in this exercise, here is my Python solution (100/100 on Codility):
def solution(S, P, Q):
    count = []
    for i in range(3):
        count.append([0]*(len(S)+1))
    for index, i in enumerate(S):
        count[0][index+1] = count[0][index] + ( i =='A')
        count[1][index+1] = count[1][index] + ( i =='C')
        count[2][index+1] = count[2][index] + ( i =='G')
    result = []
    for i in range(len(P)):
        start = P[i]
        end = Q[i]+1
        if count[0][end] - count[0][start]:
            result.append(1)
        elif count[1][end] - count[1][start]:
            result.append(2)
        elif count[2][end] - count[2][start]:
            result.append(3)
        else:
            result.append(4)
    return result
Answer 2 (score: 0)
100/100:
def solution(S,P,Q):
    d = {"A":0,"C":1,"G":2,"T":3}
    n = len(S)
    # pref[i] holds the counts of A, C, G, T in S[0..i]; the extra slot pref[n]
    # stays all zeros, so pref[-1] acts as the empty prefix when i or P[i] is 0
    pref = [[0,0,0,0]]*(n+1)
    for i in range(0,n):
        pref[i] = [x for x in pref[i-1]]
        pref[i][d[S[i]]] += 1
    lst = []
    for i in range(0,len(P)):
        if Q[i] == P[i]:
            lst.append(d[S[P[i]]]+1)
        else:
            x = 0
            while x < 4:
                if pref[Q[i]][x] - pref[P[i]-1][x] > 0:
                    lst.append(x+1)
                    break
                x += 1
    return lst
Answer 3 (score: 0)
This also scores 100/100:
def solution(S, P, Q):
    res = []
    for i in range(len(P)):
        if 'A' in S[P[i]:Q[i]+1]:
            res.append(1)
        elif 'C' in S[P[i]:Q[i]+1]:
            res.append(2)
        elif 'G' in S[P[i]:Q[i]+1]:
            res.append(3)
        else:
            res.append(4)
    return res
Answer 4 (score: 0)
For Python 3.6, 100%:
def solution(S, P, Q):
    NUCLEOTIDES = 'ACGT'
    # Relies on dicts keeping insertion order (CPython 3.6, guaranteed from 3.7),
    # so the impacts are checked from lowest to highest
    IMPACTS = {nucleotide: impact for impact, nucleotide in enumerate(NUCLEOTIDES, 1)}
    result = []
    for query in range(len(P)):
        sample = S[P[query]:Q[query]+1]
        for nucleotide, impact in IMPACTS.items():
            if nucleotide in sample:
                result.append(impact)
                break
    return result
Answer 5 (score: 0)
I found a good solution to GenomicRangeQuery that scores 100%.
def solution(s,p,q):
    n = len(p)
    r = [0]*n
    for i in range(n):
        pi=p[i]
        qi=q[i]+1
        ts=s[pi:qi]
        if 'A' in ts:
            r[i]=1
        elif 'C' in ts:
            r[i]=2
        elif 'G' in ts:
            r[i]=3
        elif 'T' in ts:
            r[i]=4
    return r

s,p,q = 'CAGCCTA', [2, 5, 0], [4, 5, 6]
solution(s,p,q)  # returns [2, 4, 1]
Answer 6 (score: 0)
An O(N + M) algorithm that scores 100/100 without relying on tricks such as a language-specific implementation of the in or contains operator:
Let's define the prefix as:
 * the last index at which a particular nucleotide occurs at or before the current position. If there is no such occurrence, put -1.
 *
 *
 * indexes:       0   1   2   3   4   5   6
 * factors:       2   1   3   2   2   4   1
 *                C   A   G   C   C   T   A
 *
 * prefix :  A   -1   1   1   1   1   1   6
 *           C    0   0   0   3   4   4   4
 *           G   -1  -1   2   2   2   2   2
 *           T   -1  -1  -1  -1  -1   5   5
 *
 * With the prefix defined this way, the minimal factor of a query can be computed as follows:
 * the subsequence S[p]S[p+1]...S[q-1]S[q] has the lowest factor given by the first condition that holds:
 * 1 if prefix[A][q] >= p
 * 2 if prefix[C][q] >= p
 * 3 if prefix[G][q] >= p
 * 4 if prefix[T][q] >= p
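The answer above gives only the idea, so here is a minimal Python sketch of it under that definition of the prefix. The function name `solution_last_index` and the table name `last` are illustrative choices, not taken from the answer.

```python
def solution_last_index(S, P, Q):
    letters = "ACGT"
    # last[k][i] = largest index j <= i with S[j] == letters[k], or -1 if none
    last = [[-1] * len(S) for _ in letters]
    for i, ch in enumerate(S):
        for k, letter in enumerate(letters):
            if ch == letter:
                last[k][i] = i
            elif i > 0:
                last[k][i] = last[k][i - 1]

    result = []
    for p, q in zip(P, Q):
        # Check nucleotides in order of increasing impact factor
        for k in range(4):
            if last[k][q] >= p:
                result.append(k + 1)
                break
    return result


print(solution_last_index("CAGCCTA", [2, 5, 0], [4, 5, 6]))  # [2, 4, 1]
```

Building the table takes O(N) time (four entries per position) and each query inspects at most four entries, which gives the O(N + M) bound stated above.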
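Answer 7 (score: 0)
For each nucleotide type, we can compute the distance from the current position (i = 0, 1, ..., N-1) back to the nearest occurrence of that nucleotide, where all previous positions and the current position are considered. A value of -1 means the nucleotide has not occurred yet.
The distance array pre_dists will look like this:
    |  C   A   G   C   C   T   A |
----|----------------------------|
A   | -1   0   1   2   3   4   0 |
C   |  0   1   2   0   0   1   2 |
G   | -1  -1   0   1   2   3   4 |
T   | -1  -1  -1  -1  -1   0   1 |
Based on this distance data, I can get the minimal impact factor for any slice: a nucleotide lies in S[p..q] exactly when its distance at q is non-negative and q minus that distance is at least p.
My implementation in Python: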
def solution(S, P, Q):
    N = len(S)
    M = len(P)
    # impact factors
    I = {'A': 1, 'C': 2, 'G': 3, 'T': 4}
    # distance from current position to the nearest nucleotide
    # for each nucleotide type (previous or current nucleotide are considered)
    # e.g. current position is 'A' => the distance dist[0] = 0, index 0 for type A
    #                          'C' => the distance dist[1] = 0, index 1 for type C
    pre_dists = [[-1]*N,[-1]*N,[-1]*N,[-1]*N]
    # initial values
    pre_dists[I[S[0]]-1][0] = 0
    for i in range(1, N):
        for t in range(4):
            if pre_dists[t][i-1] >= 0:
                # increase the distances
                pre_dists[t][i] = pre_dists[t][i-1] + 1
        # reset distance for current nucleotide type
        pre_dists[I[S[i]]-1][i] = 0
    # result keeper
    res = [0]*M
    for k in range(M):
        p = P[k]
        q = Q[k]
        if pre_dists[0][q] >=0 and q - pre_dists[0][q] >= p:
            res[k] = 1
        elif pre_dists[1][q] >=0 and q - pre_dists[1][q] >= p:
            res[k] = 2
        elif pre_dists[2][q] >=0 and q - pre_dists[2][q] >= p:
            res[k] = 3
        else:
            res[k] = 4
    return res
I hope this helps. Thanks!