How to fill null values with the previous non-null value, per category

Time: 2019-06-25 13:09:25

Tags: c# linq

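I have the data set below and need to fill every null "Price" value from the previous non-null "Price" value. It looks simple, but there are a few categories to take into account (Cat1, Cat2, DateG and TimeG), and the "Price" belongs to a combination of them.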

class DataLoad
    {
        public int DateG { get; set; }
        public DateTime TimeG { get; set; }
        public string Cat1 { get; set; }
        public string Cat2 { get; set; }
        public double? Price { get; set; }
        public int? Volume { get; set; }

        public static List<DataLoad> GetSomeData()
        {
            return new List<DataLoad>()
            {
                new DataLoad {Cat1 = "A", Cat2 = "A1", DateG = 20190601, TimeG = DateTime.Parse("00:11:00.0000000"), Price = null, Volume = 4209},
                new DataLoad {Cat1 = "A", Cat2 = "A1", DateG = 20190602, TimeG = DateTime.Parse("12:22:00.0000000"), Price = 123.54, Volume = 2109},
                new DataLoad {Cat1 = "A", Cat2 = "A2", DateG = 20190602, TimeG = DateTime.Parse("15:33:00.0000000"), Price = 213.44, Volume = 2119},
                new DataLoad {Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("20:31:00.0000000"), Price = null, Volume = 1134},
                new DataLoad {Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("21:33:00.0000000"), Price = null, Volume = 1824},
                new DataLoad {Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("21:34:00.0000000"), Price = 214.74, Volume = 1111},
                new DataLoad {Cat1 = "A", Cat2 = "A2", DateG = 20190606, TimeG = DateTime.Parse("23:41:00.0000000"), Price = 223.64, Volume = 3456},
                new DataLoad {Cat1 = "B", Cat2 = "B1", DateG = 20190512, TimeG = DateTime.Parse("11:41:00.0000000"), Price = 135.77, Volume = 1956},
                new DataLoad {Cat1 = "B", Cat2 = "B1", DateG = 20190513, TimeG = DateTime.Parse("12:34:00.0000000"), Price = null, Volume = 3457},
                new DataLoad {Cat1 = "B", Cat2 = "B2", DateG = 20190514, TimeG = DateTime.Parse("08:11:00.0000000"), Price = 123.54, Volume = 9873},
                new DataLoad {Cat1 = "B", Cat2 = "B2", DateG = 20190514, TimeG = DateTime.Parse("15:21:00.0000000"), Price = null, Volume = 2890},

            };
        }
    }

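I thought about ordering the data set by DateG, TimeG, Cat1 and Cat2 and then applying some logic, but I always end up with many for loops, which overcomplicates things, and in the end I can't get the desired output.

The desired output should look like the following, with the prices filled in (regardless of the order of DateG, TimeG, Cat1, Cat2):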

Cat1 = "A", Cat2 = "A1", DateG = 20190601, TimeG = DateTime.Parse("00:11:00.0000000"), Price = 123.54, Volume = 4209
Cat1 = "A", Cat2 = "A1", DateG = 20190602, TimeG = DateTime.Parse("12:22:00.0000000"), Price = 123.54, Volume = 2109
Cat1 = "A", Cat2 = "A2", DateG = 20190602, TimeG = DateTime.Parse("15:33:00.0000000"), Price = 213.44, Volume = 2119
Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("20:31:00.0000000"), Price = 213.44, Volume = 1134
Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("21:33:00.0000000"), Price = 213.44, Volume = 1824
Cat1 = "A", Cat2 = "A2", DateG = 20190605, TimeG = DateTime.Parse("21:34:00.0000000"), Price = 214.74, Volume = 1111
Cat1 = "A", Cat2 = "A2", DateG = 20190606, TimeG = DateTime.Parse("23:41:00.0000000"), Price = 223.64, Volume = 3456
Cat1 = "B", Cat2 = "B1", DateG = 20190512, TimeG = DateTime.Parse("11:41:00.0000000"), Price = 135.77, Volume = 1956
Cat1 = "B", Cat2 = "B1", DateG = 20190513, TimeG = DateTime.Parse("12:34:00.0000000"), Price = 135.77, Volume = 3457
Cat1 = "B", Cat2 = "B2", DateG = 20190514, TimeG = DateTime.Parse("08:11:00.0000000"), Price = 123.54, Volume = 9873
Cat1 = "B", Cat2 = "B2", DateG = 20190514, TimeG = DateTime.Parse("15:21:00.0000000"), Price = 123.54, Volume = 2890

Is there a simple way to do this with LINQ?

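3 answers:

Answer 0 (score: 1)

I see a couple of options here.

  • You can use MoreLinq, which has a few methods that may work (for example Lag/Lead and FillForward/FillBackward).
  • You can write your own fairly simple extension method to do the filling for you.

Using MoreLinq:

There are several ways to do this with MoreLinq, but I'll show an example using the Lag extension.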

var result = GetSomeData() 
    // Do ordering if you want
    .OrderByDescending(d => d.DateG)
    .ThenByDescending(t => t.TimeG)
    .ThenByDescending(c1 => c1.Cat1)
    .ThenByDescending(c2 => c2.Cat2)
    // Add the filling logic with .Lag()
    .Lag(1, (current, previous) =>
    {
        if (previous != null) current.Price = current.Price ?? previous.Price;
        return current;
    }).ToList();

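One drawback is that this does not give you the "backfill" you may need. If there are null prices at the start of the list, they stay null and are never filled. You can work around that by handling those cases manually or by running the query again over the reversed list (probably not recommended). Another thing to note is that this edits the actual objects in the list instead of creating new ones, which I generally try to avoid when working with LINQ. You can change the selector to alter that behavior.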
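To make the "run it over the reversed list" workaround concrete, here is a minimal sketch that does not depend on MoreLinq. It works on a plain sequence of nullable prices rather than on DataLoad objects, and the names PriceFill, ForwardFill and FillBothWays are made up for this illustration. It also returns new values instead of editing anything in place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PriceFill
{
    // Replaces each null with the closest non-null value before it.
    // Nulls at the start of the sequence stay null.
    public static IEnumerable<double?> ForwardFill(IEnumerable<double?> prices)
    {
        double? last = null;
        foreach (var price in prices)
        {
            last = price ?? last;
            yield return last;
        }
    }

    // Forward fill, then a second forward fill over the reversed result,
    // which covers the nulls that were left at the start.
    public static List<double?> FillBothWays(IEnumerable<double?> prices)
    {
        var forward = ForwardFill(prices).Reverse();
        return ForwardFill(forward).Reverse().ToList();
    }
}
```

Calling PriceFill.FillBothWays on the prices of one Cat1/Cat2 group, already in chronological order, fills interior gaps from the earlier price and leading gaps from the first later price. A sequence with no price at all is returned unchanged.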

Custom extension method:

Here is what I came up with:

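// Put this in a static class; it needs using System and using System.Collections.Generic.
public static IEnumerable<TSource> Fill<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate, Func<TSource, TSource, TSource> resultSelector)
{
    var backFilled = false;              // true once the first usable element has been seen
    var previous = default(TSource);     // last element yielded, used as the fill source
    var backFill = new List<TSource>();  // leading elements waiting for a value to backfill from
    foreach (var elm in source)
    {
        if (predicate(elm))
        {
            if (!backFilled)
            {
                backFill.Add(elm);
            }
            else
            {
                // Forward fill; keep the filled result so runs of gaps also work
                // with selectors that return new objects
                previous = resultSelector(previous, elm);
                yield return previous;
            }
        }
        else
        {
            if (!backFilled)
            {
                // We've found our first element to be able to backfill with
                for (int i = 0; i < backFill.Count; i++)
                {
                    yield return resultSelector(elm, backFill[i]);
                }
                backFilled = true;
            }
            previous = elm;
            yield return elm;
        }
    }
    // Nothing to fill from: return the held elements unchanged instead of dropping them
    if (!backFilled)
    {
        foreach (var elm in backFill) yield return elm;
    }
}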

Usage

The first parameter is the condition that marks the elements I want to fill. In your case that is DataLoad.Price being null, like this:

data => data.Price == null

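If that condition evaluates to true, the handler function is called with the previous value and the current value, in that order. Yours would look like this: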

(prev, curr) => 
{ 
    curr.Price = prev.Price;
    return curr;
}

Putting it all together:

var result = GetSomeData()
    // Do ordering/filtering/grouping here
    .Fill(
        data => data.Price == null,
        (prev, curr) => 
        { 
            curr.Price = prev.Price;
            return curr;
        })
    .ToList();

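Here is a link to a demo you can play with.

The advantage of this approach is that you get more control over what happens during filling, while the function stays fairly generic. You can apply it to any IEnumerable and it will still work. It also performs the "backfill" that the MoreLinq query does not do out of the box.

Note: this still edits the existing objects in the list, but using a different selector solves that.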
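As a rough illustration of "a different selector", the sketch below fills each Cat1/Cat2 group separately and returns copies, so the input objects are left alone. Everything in it is an assumption made for the example: Row is a trimmed stand-in for DataLoad (TimeG and Volume are left out, so ordering uses DateG only), and Fill is a compact variant of the extension method above that keeps the same argument order (fill source first, gap second).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row   // hypothetical stand-in for DataLoad, trimmed to the fields used here
{
    public string Cat1 { get; set; }
    public string Cat2 { get; set; }
    public int DateG { get; set; }
    public double? Price { get; set; }
}

public static class FillDemo
{
    // Compact variant of the Fill extension: fill(sourceOfValue, gap) returns the filled element.
    public static IEnumerable<T> Fill<T>(this IEnumerable<T> source, Func<T, bool> isGap, Func<T, T, T> fill)
    {
        var pending = new List<T>();   // leading gaps waiting for the first usable element
        var hasPrevious = false;
        var previous = default(T);
        foreach (var item in source)
        {
            if (isGap(item))
            {
                if (!hasPrevious) { pending.Add(item); continue; }
                previous = fill(previous, item);   // forward fill
                yield return previous;
                continue;
            }
            foreach (var gap in pending) yield return fill(item, gap);   // backfill
            pending.Clear();
            hasPrevious = true;
            previous = item;
            yield return item;
        }
        foreach (var gap in pending) yield return gap;   // no usable element at all
    }

    // Fills each (Cat1, Cat2) group on its own and returns copies for the filled rows.
    public static List<Row> FillPrices(IEnumerable<Row> rows)
    {
        return rows
            .GroupBy(r => new { r.Cat1, r.Cat2 })
            .SelectMany(g => g
                .OrderBy(r => r.DateG)
                .Fill(r => r.Price == null,
                      (from, gap) => new Row
                      {
                          Cat1 = gap.Cat1, Cat2 = gap.Cat2, DateG = gap.DateG,
                          Price = from.Price   // only the price comes from the neighbour
                      }))
            .ToList();
    }
}
```

Grouping before filling keeps a price from leaking from one category into the next, which a single pass over the whole sorted list would not guarantee. With the real DataLoad class you would presumably add ThenBy(r => r.TimeG) and copy Volume and TimeG in the selector as well.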

Answer 1 (score: 0)

List<DataLoad> result =
    DataLoad.GetSomeData()
            .OrderByDescending(d => d.DateG)
            .ThenByDescending(t => t.TimeG)
            .ThenByDescending(c1 => c1.Cat1)
            .ThenByDescending(c2 => c2.Cat2)
            .ToList();

Your desired output example is confusing, but if I have guessed correctly what you want, this query should help.

Answer 2 (score: 0)

LINQ gives you a compact way to do this, although the lookup below scans the list once for every null price.

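var list = DataLoad.GetSomeData();

list.Where(x => x.Price == null)
    .ToList()
    .ForEach(x =>
    {
        // Takes the first non-null price in the same Cat1/Cat2 group, which is not
        // necessarily the previous one; First throws if the group has no price at all.
        x.Price = list.First(v => v.Cat1 == x.Cat1 &&
                                  v.Cat2 == x.Cat2 &&
                                  v.Price != null)
                      .Price;
    });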

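If your data is already ordered, you can do an index-based lookup of a neighbouring value whenever the price is null. Another example uses a recursive local function together with OrderBy: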

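// Nulls sort first under OrderBy, so within each Cat1/Cat2 group the null prices
// come before the non-null ones. Materialize once so the index lookups are stable.
var orderedList = GetSomeData()
    .OrderBy(x => x.Cat1)
    .ThenBy(x => x.Cat2)
    .ThenBy(x => x.Price)
    .ToList();

var result = orderedList.Select((e, i) =>
{
    e.Price = e.Price ?? GetPrice(i);
    return e;
}).ToList();

// Returns the next non-null price after the given index. That is the lowest price
// in the group, not the previous one, and the search runs past the group (or past
// the end of the list) if a group has no price at all.
double GetPrice(int index)
{
    return orderedList[++index].Price
           ?? GetPrice(index);
}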

Using the same logic, you could write a loop to accomplish the same thing.
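One possible shape for such a loop is sketched below. It is not the code from any of the answers: Tick is a hypothetical stand-in for DataLoad, and LoopFill.FillInPlace is a name invented here. It sorts so that every Cat1/Cat2 group is contiguous and chronological, carries prices forward in one pass, and then walks backwards to fill whatever is still null at the start of a group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tick   // hypothetical stand-in for DataLoad
{
    public string Cat1 { get; set; }
    public string Cat2 { get; set; }
    public int DateG { get; set; }
    public DateTime TimeG { get; set; }
    public double? Price { get; set; }
}

public static class LoopFill
{
    public static List<Tick> FillInPlace(IEnumerable<Tick> source)
    {
        // Sort so that each (Cat1, Cat2) group is contiguous and in time order.
        var rows = source
            .OrderBy(r => r.Cat1).ThenBy(r => r.Cat2)
            .ThenBy(r => r.DateG).ThenBy(r => r.TimeG)
            .ToList();

        // Pass 1: carry the previous row's price forward within a group.
        for (int i = 1; i < rows.Count; i++)
            if (rows[i].Price == null && SameGroup(rows[i], rows[i - 1]))
                rows[i].Price = rows[i - 1].Price;

        // Pass 2: walk backwards to fill nulls left at the start of a group.
        for (int i = rows.Count - 2; i >= 0; i--)
            if (rows[i].Price == null && SameGroup(rows[i], rows[i + 1]))
                rows[i].Price = rows[i + 1].Price;

        return rows;
    }

    private static bool SameGroup(Tick a, Tick b) => a.Cat1 == b.Cat1 && a.Cat2 == b.Cat2;
}
```

This version edits the objects in place, like most of the snippets above. A group that has no price at all simply stays null, because neither pass ever copies a value across a group boundary.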