Formatting and missing data

Date: 2014-10-16 23:13:04

Tags: c# algorithm sorting csv

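I am a beginner in C#. I have a data structure with ResourceID, EditionId, LocationID, ViewCount and ClickCount. Each entry has a date and its information, as shown in the image below.

On some days I may get different EditionId, ViewCount and ClickCount values for the same ResourceID.

I have the following: var enteries = new Dictionary<IgaAdKey, IgaEntry>(); where IgaAdKey holds ResourceId, LocationID and EditionID, and IgaEntry holds ViewCount and ClickCount.

I also have a HashSet<int> of all resources, and a Dictionary<string, HashSet<int>> resourcesDate that maps each day -> the resource IDs available on that day, so that I can leave blanks in the row: if a resource ID is not present on the date currently being processed, I write empty cells in its place.

Using all of this information I want to format the table, but I still have trouble filling in the data correctly. Sometimes rows end up in the wrong place, and so on.

The code I use to fill in the data: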

 foreach (IgaAdKey adKey in enteries.Keys)
    {
        IgaEntry entry;

        if (enteries.TryGetValue(adKey, out entry))
        {
            streamWriter.Write(adKey.LocationId + ",");
            streamWriter.Write(adKey.EditionId + ",");
            streamWriter.Write(entry.mClickCount + ",");
            streamWriter.Write(entry.mViewCount + ",");
            streamWriter.Write(",");
        }
        else
        {
            for (int i = 0; i < 5; i++)
            {
                streamWriter.Write(",");
            }
        }
    }

[image]

Update: [image]

2 answers:

Answer 0 (score: 2)

First, you can simplify your loop like this:

foreach (var kvp in enteries)
{
    IgaAdKey adKey = kvp.Key;
    IgaEntry entry = kvp.Value;

    streamWriter.Write(adKey.LocationId + ",");
    streamWriter.Write(adKey.EditionId + ",");
    streamWriter.Write(entry.mClickCount + ",");
    streamWriter.Write(entry.mViewCount + ",");
    streamWriter.Write(",");
}

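That is the usual way to iterate over the entries of a dictionary.

It is hard to say why you are getting things in the "wrong place". If you want the items in some particular order, you have to sort them. Dictionary does not guarantee the order of its items, so you cannot count on getting things out of a dictionary in the order you inserted them.

I suspect the trouble you are having has more to do with how items are added to the dictionary. Did you override the GetHashCode and Equals methods for your IgaAdKey class?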
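To illustrate why that question matters, here is a minimal sketch. PlainKey, ValueKey and KeyEqualityDemo are hypothetical names made up for this example, not types from the question. Without the overrides, two keys with identical field values are treated as different dictionary keys, because a class falls back to reference equality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key WITHOUT Equals/GetHashCode overrides: reference equality applies.
public class PlainKey
{
    public int ResourceId, EditionId, LocationId;
}

// Hypothetical key WITH value-based equality.
public class ValueKey : IEquatable<ValueKey>
{
    public int ResourceId, EditionId, LocationId;

    public bool Equals(ValueKey other)
    {
        return other != null
            && ResourceId == other.ResourceId
            && EditionId == other.EditionId
            && LocationId == other.LocationId;
    }

    public override bool Equals(object obj) { return Equals(obj as ValueKey); }

    public override int GetHashCode()
    {
        // Combine the three fields; unchecked lets the multiplication overflow silently.
        unchecked { return ((ResourceId * 397) ^ EditionId) * 397 ^ LocationId; }
    }
}

public static class KeyEqualityDemo
{
    // Adds two field-identical PlainKey instances; both are kept as separate entries.
    public static int CountPlain()
    {
        var d = new Dictionary<PlainKey, int>();
        d[new PlainKey { ResourceId = 1, EditionId = 2, LocationId = 3 }] = 10;
        d[new PlainKey { ResourceId = 1, EditionId = 2, LocationId = 3 }] = 20;
        return d.Count;
    }

    // Adds two field-identical ValueKey instances; the second overwrites the first.
    public static int CountValue()
    {
        var d = new Dictionary<ValueKey, int>();
        d[new ValueKey { ResourceId = 1, EditionId = 2, LocationId = 3 }] = 10;
        d[new ValueKey { ResourceId = 1, EditionId = 2, LocationId = 3 }] = 20;
        return d.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountPlain()); // 2
        Console.WriteLine(CountValue()); // 1
    }
}
```

If merged counts for the same resource, edition and location show up as separate rows, a missing override of this kind is one plausible cause.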

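Answer 1 (score: 2)

The code shown will never output empty rows. That is because you loop over the Keys of the enteries dictionary and then try to get the value for each key. The value will always be found, because you are looping over the keys themselves. It is better to just loop over the dictionary as a collection of KeyValuePair<TKey, TValue> objects.

In a very similar question asked earlier, the problem was to group together dictionary entries that have the same ResourceId but different values for EditionId and LocationId. Assuming this is really the same problem, one approach is to have IgaAdKey implement IComparable<IgaAdKey>, like so: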
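As a rough sketch of what makes the empty branch reachable: iterate over the full set of resource IDs and look each one up in that day's data, rather than iterating over the dictionary's own keys. EmptyCellDemo and WriteRow are hypothetical names, and the data is simplified to be keyed by resource ID only, with an int[] of { clicks, views } standing in for IgaEntry.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class EmptyCellDemo
{
    // Writes "clicks,views," for each resource that has data on this day,
    // and two empty cells (",,") for each resource that does not.
    public static string WriteRow(IEnumerable<int> allResources, IDictionary<int, int[]> dayData)
    {
        var writer = new StringWriter();
        foreach (int resourceId in allResources)
        {
            int[] counts;
            if (dayData.TryGetValue(resourceId, out counts))
                writer.Write(counts[0] + "," + counts[1] + ",");
            else
                writer.Write(",,"); // This branch can now actually be reached.
        }
        return writer.ToString();
    }

    public static void Main()
    {
        var allResources = new[] { 1, 2, 3 };
        var dayData = new Dictionary<int, int[]>
        {
            { 1, new[] { 5, 50 } },
            { 3, new[] { 7, 70 } }
        };
        Console.WriteLine(WriteRow(allResources, dayData)); // 5,50,,,7,70,
    }
}
```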

public class IgaAdKey : IEquatable<IgaAdKey>, IComparable<IgaAdKey>
{
    public int ResourceId;
    public int EditionId;
    public int LocationId;

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj))
            return true;
        else if (ReferenceEquals(null, obj))
            return false;
        if (obj.GetType() != GetType())
            return false;
        var other = (IgaAdKey)obj;
        return ResourceId == other.ResourceId && EditionId == other.EditionId && LocationId == other.LocationId;
    }

    public override int GetHashCode()
    {
        return ResourceId.GetHashCode() ^ EditionId.GetHashCode() ^ LocationId.GetHashCode();
    }

    public override string ToString()
    {
        return string.Format("ResourceId={0}, EditionId={1}, LocationId={2}", ResourceId, EditionId, LocationId);
    }

    #region IEquatable<IgaAdKey> Members

    public bool Equals(IgaAdKey other)
    {
        return Equals((object)other);
    }

    #endregion

    #region IComparable<IgaAdKey> Members

    public int CompareTo(IgaAdKey other)
    {
        if (other == null)
            return 1; // By .NET convention, any instance compares greater than null.
        if (object.ReferenceEquals(this, other))
            return 0;
        int diff;
        if ((diff = ResourceId.CompareTo(other.ResourceId)) != 0)
            return diff;
        if ((diff = EditionId.CompareTo(other.EditionId)) != 0)
            return diff;
        if ((diff = LocationId.CompareTo(other.LocationId)) != 0)
            return diff;
        return 0;
    }

    #endregion
}

Once that is done, you can:

  1. Store your objects in a SortedDictionary:

    var enteries = new SortedDictionary<IgaAdKey, IgaEntry>();
    
    // Build the dictionary
    
    foreach (var pair in enteries)
    {
        // Write to the CSV file
    }
    

    In this case, all keys with the same ResourceId will be adjacent.

  2. Store them in a regular dictionary and sort them when writing:

    var enteries = new Dictionary<IgaAdKey, IgaEntry>();
    
    // Build the dictionary
    
    foreach (var pair in enteries.OrderBy(pair => pair.Key))
    {
        // Write to the CSV file
    }
    
  3. Incidentally, if you are going to use IgaAdKey as a dictionary key, you should make it immutable, for the reasons explained here.
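A small sketch of the immutability point, using made-up types (MutableKey, ImmutableAdKey and ImmutableKeyDemo are hypothetical and not part of the original answer). If a key's fields change after insertion, its hash code changes too, and the dictionary can no longer find the entry. Read-only fields set in the constructor rule that out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key, used only to show the failure mode.
public class MutableKey
{
    public int Id;

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && Id == other.Id;
    }

    public override int GetHashCode() { return Id; }
}

// Hypothetical immutable variant of IgaAdKey: fields are assigned once, in the constructor.
public sealed class ImmutableAdKey : IEquatable<ImmutableAdKey>
{
    public readonly int ResourceId;
    public readonly int EditionId;
    public readonly int LocationId;

    public ImmutableAdKey(int resourceId, int editionId, int locationId)
    {
        ResourceId = resourceId;
        EditionId = editionId;
        LocationId = locationId;
    }

    public bool Equals(ImmutableAdKey other)
    {
        return other != null
            && ResourceId == other.ResourceId
            && EditionId == other.EditionId
            && LocationId == other.LocationId;
    }

    public override bool Equals(object obj) { return Equals(obj as ImmutableAdKey); }

    public override int GetHashCode()
    {
        unchecked { return ((ResourceId * 397) ^ EditionId) * 397 ^ LocationId; }
    }
}

public static class ImmutableKeyDemo
{
    // Returns whether the entry can still be found after its key was mutated.
    public static bool FoundAfterMutation()
    {
        var d = new Dictionary<MutableKey, string>();
        var key = new MutableKey { Id = 1 };
        d[key] = "entry";
        key.Id = 2; // The hash code stored at insertion no longer matches.
        return d.ContainsKey(key);
    }

    // Returns whether an equal, separately constructed immutable key finds the entry.
    public static bool FoundWithImmutableKey()
    {
        var d = new Dictionary<ImmutableAdKey, int>();
        d[new ImmutableAdKey(1, 2, 3)] = 10;
        return d.ContainsKey(new ImmutableAdKey(1, 2, 3));
    }

    public static void Main()
    {
        Console.WriteLine(FoundAfterMutation());    // False
        Console.WriteLine(FoundWithImmutableKey()); // True
    }
}
```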

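    Update

    Although your question is not very clear, from your code I gather that you are trying to output a table that is basically a 2D grid of smaller tables. Along the X axis are all the resource IDs, with five columns of data per resource. Along the Y axis are all the dates, and for each date there are as many rows as are needed for the location + edition combinations found for each resource on that date.

    In that case, you need to:

    1. Collect all the files and index them by date (you are already doing this).

    2. Scan all the files to find all the resource IDs (you are already doing this too).

    3. Sort the list of all resources into a consistent order, so that your program's output does not appear random in any way: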

      var allResourcesInOrder = allResources.ToList();
      allResourcesInOrder.Sort();
      
    4. Output 5 header columns for each resource:

      foreach (int resourceId in allResourcesInOrder)
      {
          stream.Write(resourceId + ",");
          stream.Write("Location ID" + ",");
          stream.Write("Edition ID" + ",");
          stream.Write("Click Count" + ",");
          stream.Write("View Count" + ",");
      }
      stream.Write("\n");
      
    5. For each date, output cells for every resource found in any of the files for that date:

      /// <summary>
      /// Reads and merges all the files for one specific date, creates the IgaEntry values, and writes them to the file.
      /// </summary>
      /// <param name="date"></param>
      /// <param name="files"></param>
      /// <param name="streamWriter"></param>
      private static void ReadMergeAndWriteFilesForDay(
          DateTime date, List<string> files, StreamWriter streamWriter,
          IList<int> allResourcesInOrder  // Specifies the column order.
          )
      {
          var enteries = new Dictionary<IgaAdKey, IgaEntry>();
      
          foreach (string fileName in files)
              ReadFileForDay(fileName, enteries);
      
          var dateResources = new Dictionary<int, List<IgaAdKey>>();
          foreach (var key in enteries.Keys)
              dateResources.Add(key.ResourceId, key);
      
          // Sort the resources to output them in a consistent order.  Not required but good practice.
          dateResources.SortAll();
      
          for (int iRow = 0, nRows = dateResources.MaxCount(); iRow < nRows; iRow++)
          {
              for (int index = 0; index < allResourcesInOrder.Count; index++)
              {
                  if (index == 0)
                      streamWriter.Write(date.ToShortDateString() + ","); // DateTime has no ToDateString(); ToShortDateString() is assumed here.
                  else
                      streamWriter.Write(","); // Date goes under the resource ID for the first resource; otherwise leave it empty.
                  int resourceId = allResourcesInOrder[index];
                  IgaAdKey key;
                  IgaEntry value;
                  if (dateResources.TryGetValue(resourceId, iRow, out key)
                      && enteries.TryGetValue(key, out value))
                  {
                      streamWriter.Write(key.LocationId + ",");
                      streamWriter.Write(key.EditionId + ",");
                      streamWriter.Write(value.mClickCount + ",");
                      streamWriter.Write(value.mViewCount + ",");
                  }
                  else
                  {
                      streamWriter.Write(",");
                      streamWriter.Write(",");
                      streamWriter.Write(",");
                      streamWriter.Write(",");
                  }
              }
              streamWriter.Write("\n"); // End the CSV row, matching the header row.
          }
      }
      

      Note that ReadFileForDay is extracted from the first half of MergeFilesForDay, here.

    6. Add some helpful extension methods to make life easier:

          public static class Returns
          {
              public static bool False<TValue>(out TValue value)
              {
                  value = default(TValue);
                  return false;
              }
          }
      
          public static class ListDictionaryExtensions
          {
              public static void Add<TKey, TValue>(this IDictionary<TKey, List<TValue>> listDictionary, TKey key, TValue value)
              {
                  if (listDictionary == null)
                      throw new ArgumentNullException();
                  List<TValue> values;
                  if (!listDictionary.TryGetValue(key, out values))
                  {
                      listDictionary[key] = (values = new List<TValue>());
                  }
                  values.Add(value);
              }
      
              public static bool TryGetValue<TKey, TValue>(this IDictionary<TKey, List<TValue>> listDictionary, TKey key, int index, out TValue value)
              {
                  List<TValue> list;
                  if (!listDictionary.TryGetValue(key, out list))
                      return Returns.False(out value);
                  if (index < 0 || index >= list.Count)
                      return Returns.False(out value);
                  value = list[index];
                  return true;
              }
      
              public static void SortAll<TKey, TValue>(this IDictionary<TKey, List<TValue>> listDictionary)
              {
                  if (listDictionary == null)
                      return;
                  foreach (var list in listDictionary.Values)
                      list.Sort();
              }
      
              public static int MaxCount<TKey, TValue>(this IDictionary<TKey, List<TValue>> listDictionary)
              {
                  if (listDictionary == null)
                      return 0;
                  int count = 0;
                  foreach (var list in listDictionary.Values)
                      count = Math.Max(count, list.Count);
                  return count;
              }
          }
      
    7. Full code here. Of course I could not test it, since I do not have any input files.
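Since the linked code cannot be run without input files, here is a compact, self-contained sketch of the same grid layout over in-memory data. It is not the answer's code: GridDemo, Row and BuildCsv are hypothetical names, dates are plain ISO strings, and LINQ grouping replaces the list-dictionary extension methods. It follows the same idea, though: sorted resource columns, one block of rows per date, and empty cells where a resource has no data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GridDemo
{
    // Hypothetical flat record standing in for an IgaAdKey + IgaEntry pair plus its date.
    public class Row
    {
        public string Date;
        public int ResourceId, LocationId, EditionId, Clicks, Views;
    }

    public static string BuildCsv(IEnumerable<Row> rows)
    {
        var all = rows.ToList();
        // Column order: every resource ID seen on any date, sorted.
        var resources = all.Select(r => r.ResourceId).Distinct().OrderBy(id => id).ToList();
        var writer = new StringWriter();

        // Header: five columns per resource.
        foreach (int id in resources)
            writer.Write(id + ",Location ID,Edition ID,Click Count,View Count,");
        writer.Write("\n");

        foreach (var day in all.GroupBy(r => r.Date).OrderBy(g => g.Key, StringComparer.Ordinal))
        {
            // Resource ID -> that day's entries, in a stable order.
            var byResource = day.GroupBy(r => r.ResourceId).ToDictionary(
                g => g.Key,
                g => g.OrderBy(r => r.LocationId).ThenBy(r => r.EditionId).ToList());
            int rowCount = byResource.Values.Max(list => list.Count);

            for (int iRow = 0; iRow < rowCount; iRow++)
            {
                for (int i = 0; i < resources.Count; i++)
                {
                    // The date sits under the first resource ID; other blocks leave that cell empty.
                    writer.Write(i == 0 ? day.Key + "," : ",");
                    List<Row> cells;
                    if (byResource.TryGetValue(resources[i], out cells) && iRow < cells.Count)
                    {
                        Row c = cells[iRow];
                        writer.Write(c.LocationId + "," + c.EditionId + "," + c.Clicks + "," + c.Views + ",");
                    }
                    else
                    {
                        writer.Write(",,,,"); // No data: four empty cells.
                    }
                }
                writer.Write("\n");
            }
        }
        return writer.ToString();
    }

    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row { Date = "2014-10-15", ResourceId = 1, LocationId = 10, EditionId = 1, Clicks = 3, Views = 30 },
            new Row { Date = "2014-10-15", ResourceId = 1, LocationId = 11, EditionId = 1, Clicks = 4, Views = 40 },
            new Row { Date = "2014-10-15", ResourceId = 2, LocationId = 10, EditionId = 2, Clicks = 5, Views = 50 },
            new Row { Date = "2014-10-16", ResourceId = 2, LocationId = 12, EditionId = 2, Clicks = 6, Views = 60 }
        };
        Console.Write(BuildCsv(rows));
    }
}
```

With this sample data, resource 1 has two entries on 2014-10-15, so that date takes two rows, and resource 2's block is blank in the second one.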