Code
def clouds_function():
    """
    Extracts Cloud Height and Type from the data
    Returns: Cloud Height and Type CCCXXX
    """
    clouds1 = content[1]
    clouds1 = clouds1[15:len(clouds1)]
    clouds1 = clouds1.split()
    clouds2 = content[2]
    clouds2 = clouds2 + " "
    clouds2 = [clouds2[y-8:y] for y in range(8, len(clouds2)+8, 8)]
    clouds3 = content[3]
    clouds3 = clouds3 + " "
    print(clouds3)
    clouds3 = [clouds3[y-8:y] for y in range(8, len(clouds3)+8, 8)]
    return clouds3

print(clouds_function())
Sample data
content[1] = 'OVC018 BKN006 OVC006 OVC006 OVC017 OVC005 OVC005 OVC016 OVC029 OVC003 OVC002 OVC001 OVC100'
content[2] =' OVC025 OVC010 OVC009 OVC200'
content[3] =' OVC100 '
I tried
def split(s, n):
    if len(s) < n:
        return []
    else:
        return [s[:n]] + split(s[n:], n)
It returns ['OVC100 '].
I need content[3] to result in
['','OVC100','','','','','','','','','','','']
I need a homogeneous array.
The lines have uneven lengths to begin with, which may be the problem.
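For reference, the recursive `split` above returns nothing for any tail shorter than `n`, which is one likely reason a short line comes back with a single entry. Below is a minimal sketch of a chunker that pads the line first, so every line yields the same number of entries. The name `split_padded`, the column width of 8 and the column count of 13 are assumptions taken from the slicing code and the desired result above. It only gives the right answer if the columns really are aligned on 8-character boundaries.

```python
def split_padded(line, width=8, ncols=13):
    """Cut `line` into `ncols` chunks of `width` characters, stripping each chunk."""
    # Pad with spaces so that short lines still yield `ncols` chunks.
    padded = line.ljust(width * ncols)
    return [padded[i:i + width].strip() for i in range(0, width * ncols, width)]


# Hypothetical, perfectly aligned line with one value in the second column.
aligned = " " * 8 + "OVC100"
print(split_padded(aligned))
```

Every result has exactly `ncols` entries, with `''` for each empty column.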
Answer 0 (score: 1)
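Your data has length problems, and the gaps between values differ in size (2 characters or 1):
c[1] = 'OVC018 BKN006 OVC006 OVC006 OVC017 OVC005 OVC005 OVC016 OVC029 OVC003 OVC002 OVC001 OVC100'
c[2] =' OVC025 OVC010 OVC009 OVC200'
c[3] =' OVC100 '
c[2] and c[3] use 9 characters up to the start of the second value, while c[1] uses only 8. 'OVC005 OVC016' has only 1 space in between, where there are normally 2.
Slicing is a good approach if the lengths are fixed or predictable (here they are not). This is better solved with simple string addition and by replacing runs of spaces with a splitter character: pad every line with spaces to a common length, replace every run of spaces of length [8,7,6,2,1] with a single (new) artificial splitter '-', then split at '-':
content= ['OVC018 BKN006 OVC006 OVC006 OVC017 OVC005 OVC005 OVC016 OVC029 OVC003 OVC002 OVC001 OVC100',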
' OVC025 OVC010 OVC009 OVC200',
' OVC100 ']
# extend data
max_len = max(len(data) for data in content)
for i,c in enumerate(content):
    # fix lengths: pad every line with spaces up to the longest one
    content[i] = c + " " * (max_len-len(c))
    # replace stretches of spaces by a splitter character, longest runs first
    content[i] = content[i].replace(" "*8,"-").replace(" "*7,"-").replace(" "*6,"-").replace(" "*2,"-").replace(" ","-")
hom = [c.split("-") for c in content]
for c in hom:
    print(c,"\n")
Output:
['OVC018', 'BKN006', 'OVC006', 'OVC006', 'OVC017', 'OVC005', 'OVC005', 'OVC016', 'OVC029', 'OVC003', 'OVC002', 'OVC001', 'OVC100']
['', 'OVC025', '', '', '', 'OVC010', 'OVC009', '', '', '', '', '', 'OVC200']
['', 'OVC100', '', '', '', '', '', '', '', '', '', '', '']
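The replace chain depends on the exact run lengths found in this particular data. As a hedged alternative, not part of the approach above, each value can be placed by the offset at which it starts. The sketch below assumes a nominal column width of 8 and that every value starts less than half a column away from its nominal position. The name `columns_by_position` and the sample lines are hypothetical.

```python
import re


def columns_by_position(line, width=8, ncols=13):
    """Place each whitespace-separated token in the column nearest its start offset."""
    row = [""] * ncols
    for match in re.finditer(r"\S+", line):
        # round() tolerates a token that starts a character or two off its
        # nominal column; min() keeps the index inside the row.
        col = min(round(match.start() / width), ncols - 1)
        row[col] = match.group()
    return row


# Hypothetical lines: a 1-space gap in the first, a 9-character lead-in in the second.
print(columns_by_position("OVC018  BKN006 OVC006", ncols=4))
# ['OVC018', 'BKN006', 'OVC006', '']
print(columns_by_position(" " * 9 + "OVC100", ncols=4))
# ['', 'OVC100', '', '']
```

If the drift accumulates to half a column or more, two values can land in the same slot and the later one overwrites the earlier, so this is only suitable when the misalignment stays small.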