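Fill missing values in a specified format

Time: 2021-06-10 06:14:02

Tags: python

I have a problem statement in which I must fill the missing values ("_") in a specified way.

Example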

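Input: "_,_,30,_,_,_,50,_,_"
Output: 10,10,12,12,12,12,4,4,4

How is it filled? We fill the missing values from left to right.
a. First, distribute 30 across itself and the two missing values to its left: (10, 10, 10, _, _, _, 50, _, _)
b. Next, distribute the sum (10+50) across those two values and the missing values between them: (10, 10, 12, 12, 12, 12, 12, _, _)
c. Finally, distribute 12 across itself and the missing values to its right: (10, 10, 12, 12, 12, 12, 4, 4, 4)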
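The arithmetic behind the three steps can be checked directly. This is only an illustrative sketch; the variable names a, b and c mirror the step labels and are not part of the original post.

```python
# Step a: 30 is shared by the two blanks to its left plus its own slot (3 slots).
a = 30 / 3
# Step b: the 10 left in 30's slot plus 50 is shared across 5 slots
# (30's slot, the three blanks, and 50's slot).
b = (a + 50) / 5
# Step c: the 12 left in 50's slot is shared with the two trailing blanks (3 slots).
c = b / 3
print(a, b, c)  # 10.0 12.0 4.0
```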

My code is as follows:

s = "_,_,30,_,_,_,50,_,_"

s = s.split(",")  
print(s)
print('***********')
result = []

count = s.index('30') + 1

print(count)
print('****************')
value = int(s[2]) / count
while count > 1:
    result.append(str(int(value)))
    count -= 1
value = int((value + int(s[6])) / (6 - 2 + 1))
count = 6 - 2 + 1
while count > 1:
    result.append(str(value))
    count -= 1
value = int(value / (len(s) - 6))
count = len(s) - 6
while count > 0:
    result.append(str(value))
    count -= 1
print(result)

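The only problem is that I am hard-coding, so my code will not work if the input contains any numbers other than 30 and 50. Can someone help me fix the hard-coded part?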
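The hard-coded 2 and 6 in the code above are the positions of 30 and 50 in the split list. As a minimal sketch (the name known_positions is hypothetical, and the unpacking assumes exactly two known values, as in this example), those positions and the slot counts could be derived from the input:

```python
s = "_,_,30,_,_,_,50,_,_"
parts = s.split(",")

# Indices of every entry that is not the missing-value marker "_".
known_positions = [i for i, item in enumerate(parts) if item != "_"]
print(known_positions)  # [2, 6]

# This example has exactly two known values, so they can be unpacked directly.
first, second = known_positions
print(first + 1)            # 3 slots for the first step
print(second - first + 1)   # 5 slots for the second step
print(len(parts) - second)  # 3 slots for the last step
```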

3 answers:

Answer 0: (score: 1)

A first version; it may be rough on edge cases.

s = "_,_,30,_,_,_,50,_,_"
l = s.split(",")

out = []
last_pos = 0
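for index, elem in enumerate(l):
    if elem == "_":
        # A trailing "_" closes the last span: share the previous value
        # across every slot from last_pos to the end.
        if index == (len(l) - 1) and out:
            elem_to_div = out[-1]
            num = (index - last_pos + 1)
            del out[-1]
            fills = elem_to_div / num
            out.extend([fills]*num)
        continue
    # A known value closes the current span: add whatever was left in the
    # previous slot and share the total across the span.
    elem = int(elem)
    num = (index - last_pos + 1)
    elem_to_div = elem if not out else elem + out[-1]
    fills = elem_to_div / num
    if out:
        del out[-1]
    out.extend([fills]*num)
    last_pos = index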

Which outputs

>>> out
[10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]
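The result above holds floats, while the question shows whole numbers. A small sketch, assuming whole-valued results should print without a decimal point (the helper name format_values is hypothetical):

```python
def format_values(values):
    """Join values with commas, printing whole floats such as 12.0 as 12."""
    parts = []
    for v in values:
        v = float(v)
        parts.append(str(int(v)) if v.is_integer() else str(v))
    return ",".join(parts)


out = [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]
print(format_values(out))  # 10,10,12,12,12,12,4,4,4
```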

Answer 1: (score: 1)

import copy

def is_int(num):
    try:
        int(num)
        return True
    except Exception as e:
        return False


def fill_empty(d_list, start, end, avg):
    for i in range(start, end+1):
        d_list[i] = avg
    return d_list

def get_starts(d_list):
    start = d_list.index('_')
    if start != 0:
        before_value = float(d_list[start-1])  # float() keeps fractional averages; int() would truncate them
        start -= 1
    else:
        before_value = 0
    return start,before_value

def process():
    s = '_,_,30,_,_,_,50,_,_'
    s_list = s.split(",")

    res_list = copy.copy(s_list)
    for i in range(1, len(s_list)):
        pre = s_list[i-1]
        cur = s_list[i]
        end = i

        if pre == '_' and is_int(cur):
            start,before_value = get_starts(res_list)
            length = end-start+1
            avg = (before_value + int(cur))/length
            res_list = fill_empty(res_list, start, end, avg)
        elif cur == '_' and i == len(s_list)-1:
            start,before_value = get_starts(res_list)
            length = end-start+1
            avg = (before_value + 0)/length
            res_list = fill_empty(res_list, start, end, avg)
    return res_list

r = process()
print(r)
# [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]

Answer 2: (score: 0)

Try this recursive approach.

MISSING = '_'

def settle(s):
        water = [x if x == MISSING else float(x) for x in s.split(',')]
        return settle_water(water)

def settle_water(water):
        if MISSING not in water:
                return water
        i = 0
        while i + 1 < len(water) and water[i + 1] != MISSING:
                i += 1
        j = i + 1
        while j + 1 < len(water) and water[j] == MISSING:
                j += 1
        average = ((0 if water[i] == MISSING else water[i]) + (0 if water[j] == MISSING else water[j])) / (j - i + 1)
        water[i : j + 1] = [average] * (j - i + 1)
        return water[:j] + settle_water(water[j:])

assert settle('_,_,30,_,_,_,50,_,_') == [10,10,12,12,12,12,4,4,4]
assert settle('1,_,_,3,_') == [1,1,1,0.5,0.5]
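The same left-to-right averaging can also be written as a loop. The following is a sketch rather than a definitive implementation: the function name fill_missing is hypothetical, treating a missing endpoint as 0 follows the answers above, and turning a lone "_" into 0 is an assumption the question does not cover. It also shows that the total is preserved, since each step only redistributes values that are already present.

```python
MISSING = "_"


def fill_missing(s):
    """Fill '_' entries left to right by averaging across each gap."""
    values = [None if x == MISSING else float(x) for x in s.split(",")]
    n = len(values)
    i = 0
    while i < n - 1:
        if values[i] is not None and values[i + 1] is not None:
            i += 1  # no gap starts here
            continue
        # The span runs from i to the next known value, or to the last slot.
        j = i + 1
        while j < n - 1 and values[j] is None:
            j += 1
        # A missing endpoint contributes 0 to the sum.
        left = values[i] if values[i] is not None else 0.0
        right = values[j] if values[j] is not None else 0.0
        average = (left + right) / (j - i + 1)
        values[i:j + 1] = [average] * (j - i + 1)
        i = j
    # Assumption: a lone "_" with nothing to share becomes 0.
    return [0.0 if v is None else v for v in values]


filled = fill_missing("_,_,30,_,_,_,50,_,_")
print(filled)  # [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]
# The known values sum to 30 + 50, and so does the filled list.
print(sum(filled))  # 80.0
```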