Question

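I am trying to read a table from an Access database and then sort the data in that table out into a number of text files. The catch is that the filename to write to depends on values in each record. This is my first C# application, so you can consider me "green". I should also mention that I am only working off an Access database until I get the code hammered out; ultimately it will pull from a SQL Server with millions of records.

I have the code working now, but the problem is that there are a huge number of file open/close operations. I want to open each file only once for writing, since these files will be written to a network drive. This is essentially a glue app running on a server, so there are some other restrictions too: I can't save to a local drive and then copy to the network, I can't sort the query before pulling, and I can't adversely affect server resources while running.

Probably the best way to do this is with a hash table: check whether the file has already been opened and, if not, open it and save the file handle in the hash table, then close them all at once when finished. However, I cannot find an example of how to use multiple StreamWriter objects simultaneously.

I expected to find the answer relatively easily, but I can't seem to find a solution. My suspicion is that StreamWriter is the wrong class to use for this.

The closest previous question I have been able to find is from a CodeProject page. On that page they say that keeping file handles open is bad practice and should be avoided, but the page neither explains why nor offers example alternatives. There is a suggestion to load the entire data set into memory and then operate on it, but that is not an option for me because there will be too much data in the tables.

Here is what I have so far.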

String strConnection;
String strQuery;
String strPunchFileNameTemplate;

// Define our Variables
strConnection = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=ClockData.accdb";
strQuery = @"SELECT * FROM ClockPunches";   
strPunchFileNameTemplate = @"C:\PUNCHES\%READER%.TXT";      

// OleDbConnection implements the IDisposable interface, so we must scope out its usage.
// Set up Connection to our data source
using (OleDbConnection ConnObj = new OleDbConnection(strConnection))    {

    // Create a Command with our Query String
    OleDbCommand CmdObj = new OleDbCommand(strQuery,ConnObj);

    // Open our Connection
    ConnObj.Open();

    // OleDbDataReader implements the IDisposable interface, so we must scope out its usage.
    // Execute our Reader
    using (OleDbDataReader ReaderObj = CmdObj.ExecuteReader(CommandBehavior.KeyInfo))   {

        // Load the source table's schema into memory (a DataTable object)
        DataTable TableObj = ReaderObj.GetSchemaTable();

        // Parse through each record in the Reader Object
        while(ReaderObj.Read()) {

            // Extract PunchTime, CardNumber, and Device to separate variables
            DateTime dtTime = ReaderObj.GetDateTime(ReaderObj.GetOrdinal("PunchTime"));
            Int16 intID = ReaderObj.GetInt16(ReaderObj.GetOrdinal("CardNumber"));
            String strReader = ReaderObj.GetString(ReaderObj.GetOrdinal("Device"));

            // Translate the device name into a designated filename (external function)
            strReader = GetDeviceFileName(strReader);

            // Put our dynamic filename into the path template
            String pathStr = strPunchFileNameTemplate.Replace("%READER%",strReader);

            // Check to see if the file exists.  New files need an import Header
            Boolean FileExistedBool = File.Exists(pathStr);

            // StreamWriter implements the IDisposable interface, so we must scope out its usage.
            // Create a Text File for each Device, Append if it exists
            using (StreamWriter outSR = new StreamWriter(pathStr, true))    {

                // Write our Header if required
                if (FileExistedBool == false)   {
                    outSR.WriteLine("EXAMPLE FILE HEADER");
                }

                // Set up our string we wish to write to the file
                String outputStr = dtTime.ToString("MM-dd-yyyy HH:mm:ss") + " " + intID.ToString("000000");

                // Write the String
                outSR.WriteLine(outputStr);

                // End of StreamWriter Scope - should automatically close
            }
        }
        // End of OleDbDataReader Scope - should automatically close
    }
    // End of OleDbConnection Scope - should automatically close
}

Answer 1

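Quite an interesting problem you have got yourself into.

The problem with caching the file handles is that a huge number of open handles drains system resources, making both the program and Windows perform badly.

If the number of devices in your database is not too high (fewer than 100), I think it would be safe to cache the handles.

Alternatively, you could cache a million records, distribute them to the different devices, save some of them, and then read more records.

You could place the records in a Dictionary like this: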

class PunchInfo
{  
    public PunchInfo(DateTime time, int id)
    {
        Id = id;
        Time = time;
    }
    public DateTime Time;
    public int Id;
}

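Dictionary<string, List<PunchInfo>> Devices = new Dictionary<string, List<PunchInfo>>();
int Count = 0;                            // Records currently buffered across all devices
const int Limit = 1000000;                // Start flushing once this many records are buffered
const int LowerLimit = 90 * Limit / 100;  // Stop flushing once the buffer drops below this

void SaveRecord(string device, int id, DateTime time)
{
   PunchInfo info = new PunchInfo(time, id);
   List<PunchInfo> list;
   if (!Devices.TryGetValue(device, out list))
   {
      list = new List<PunchInfo>();
      Devices.Add(device, list);
   }
   list.Add(info);
   Count++;
   if (Count >= Limit)
   {
       // Pick devices to flush until the buffered count drops below LowerLimit
       List<string> writeDevices = new List<string>();
       foreach(KeyValuePair<string, List<PunchInfo>> item in Devices)
       {
           writeDevices.Add(item.Key);
           Count -= item.Value.Count;
           if (Count < LowerLimit) break;
       }

       // Remove in a second pass: a Dictionary cannot be modified while it is being enumerated
       foreach(string writeDevice in writeDevices)
       {
          List<PunchInfo> writeList = Devices[writeDevice];
          Devices.Remove(writeDevice);
          SaveDevices(writeDevice, writeList);   // SaveDevices (not shown) appends the list to the device's file
       }
    }
}

// Call once after the last record has been read
void SaveAllDevices()
{
    foreach(KeyValuePair<string, List<PunchInfo>> item in Devices)
        SaveDevices(item.Key, item.Value);
    Devices.Clear();
    Count = 0;
}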

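This way you avoid both constantly opening and closing files and keeping a lot of files open.

One million records should take up roughly 20 MB of memory, so you could easily raise the limit to 10 million records without problems.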
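
The code above calls SaveDevices without showing it. A minimal sketch of what it might look like follows, assuming the path template, header text and line format from the question, and assuming the device string passed in is already the file name returned by GetDeviceFileName. The PunchFiles class and its PathTemplate field are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Same shape as the PunchInfo class above, repeated so the sketch compiles on its own.
class PunchInfo
{
    public PunchInfo(DateTime time, int id)
    {
        Id = id;
        Time = time;
    }
    public DateTime Time;
    public int Id;
}

// Hypothetical container class; not part of the original answer.
static class PunchFiles
{
    // Assumption: same template as in the question.
    public static string PathTemplate = @"C:\PUNCHES\%READER%.TXT";

    // Appends one device's buffered records to its file with a single open/close.
    public static void SaveDevices(string device, List<PunchInfo> list)
    {
        string path = PathTemplate.Replace("%READER%", device);

        // New files need the import header, as in the question.
        bool existed = File.Exists(path);

        using (StreamWriter writer = new StreamWriter(path, true)) // true = append
        {
            if (!existed)
            {
                writer.WriteLine("EXAMPLE FILE HEADER");
            }
            foreach (PunchInfo info in list)
            {
                writer.WriteLine(info.Time.ToString("MM-dd-yyyy HH:mm:ss") + " " + info.Id.ToString("000000"));
            }
        }
    }
}
```

With this shape, each device's file is opened once per flush rather than once per record, so the number of open/close operations depends on Limit and not on the number of rows.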

Answer 2

I suggest you keep the data in memory and write to disk only when you reach a certain threshold.

const int MAX_MEMORY_BUFFER = 100000; // Characters buffered per device before flushing; set according to your memory limits
String strConnection;
String strQuery;
String strPunchFileNameTemplate;

strConnection = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=ClockData.accdb";
strQuery = @"SELECT * FROM ClockPunches";   
strPunchFileNameTemplate = @"C:\PUNCHES\%READER%.TXT";      

Dictionary<string, StringBuilder> data = new Dictionary<string, StringBuilder>();

using (OleDbConnection ConnObj = new OleDbConnection(strConnection))    
{
    OleDbCommand CmdObj = new OleDbCommand(strQuery,ConnObj);
    ConnObj.Open();

    using (OleDbDataReader ReaderObj = CmdObj.ExecuteReader(CommandBehavior.KeyInfo))   
    {
        while(ReaderObj.Read()) 
        {
            DateTime dtTime = ReaderObj.GetDateTime(ReaderObj.GetOrdinal("PunchTime"));
            Int16 intID = ReaderObj.GetInt16(ReaderObj.GetOrdinal("CardNumber"));
            String strReader = ReaderObj.GetString(ReaderObj.GetOrdinal("Device"));

            strReader = GetDeviceFileName(strReader);

            // Start a new buffer (with the import header) the first time a device is seen
            StringBuilder sb;
            if (!data.TryGetValue(strReader, out sb))
            {
                sb = new StringBuilder("EXAMPLE FILE HEADER\r\n");
                data.Add(strReader, sb);
            }

            String outputStr = dtTime.ToString("MM-dd-yyyy HH:mm:ss") + " " + intID.ToString("000000");
            sb.AppendLine(outputStr);
            if(sb.Length > MAX_MEMORY_BUFFER)
            {
                String pathStr = strPunchFileNameTemplate.Replace("%READER%",strReader);
                using(StreamWriter sw = new StreamWriter(pathStr, true)) // Append mode
                {
                    // Write the buffer and set the length to zero.
                    // Write, not WriteLine: the buffer already ends with a line break.
                    sw.Write(sb.ToString());
                    sb.Length = 0;
                }
            }
        }
    }

    // Write all the data remaining in memory
    foreach(KeyValuePair<string, StringBuilder> info in data)
    {
        if(info.Value.Length > 0)
        {
          String pathStr = strPunchFileNameTemplate.Replace("%READER%",info.Key);
          using(StreamWriter sw = new StreamWriter(pathStr, true)) // Append mode
          {
              // Write, not WriteLine: the buffer already ends with a line break
              sw.Write(info.Value.ToString());
          }
        }
    }
}

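This code needs to be tested, but I hope it gives you the general idea. This way you can balance the IO operations: increasing the memory buffer lowers the number of writes, and vice versa. Of course, you now also need to consider how much memory is available to hold your data.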

Answer 3

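It used to be considered problematic for a single process to keep 100 or 1000 file handles open for an extended period. But times have changed and this is no longer a problem. So if the situation calls for it, do it.

I can have 100, 1000 or even 5000 files open in a process that analyzes the data in those files, and it runs for hours. I measured on Windows whether file read/write performance drops, and it does not. With the memory available on modern machines, holding 5000 file descriptors in memory on the OS side no longer causes any problems. The OS keeps them sorted (I guess), so looking up a descriptor is log(n), and nothing measurable happens.

Keeping these handles (file descriptor structures) open is certainly far better than filling memory with data and then flushing it to disk file by file.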

Answer 4

You need to set up an array of writers. Here is an example of how to do it.
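
A minimal sketch of such a set of writers, keyed by file path as the question's hash-table idea suggests. The WriterCache class is a hypothetical name introduced for this illustration and is not taken from the answer; the header handling mirrors the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: keeps one StreamWriter open per output file.
public sealed class WriterCache : IDisposable
{
    // Windows paths are case-insensitive, so the lookup ignores case.
    private readonly Dictionary<string, StreamWriter> writers =
        new Dictionary<string, StreamWriter>(StringComparer.OrdinalIgnoreCase);
    private readonly string header;

    public WriterCache(string header)
    {
        this.header = header;
    }

    // Returns the open writer for a path, opening it in append mode on first use.
    public StreamWriter Get(string path)
    {
        StreamWriter writer;
        if (!writers.TryGetValue(path, out writer))
        {
            // New files get the import header, as in the question.
            bool existed = File.Exists(path);
            writer = new StreamWriter(path, true);
            if (!existed)
            {
                writer.WriteLine(header);
            }
            writers.Add(path, writer);
        }
        return writer;
    }

    // Flushes and closes every writer that was opened.
    public void Dispose()
    {
        foreach (StreamWriter writer in writers.Values)
        {
            writer.Dispose();
        }
        writers.Clear();
    }
}
```

In the question's loop, the per-record using block around StreamWriter would become `cache.Get(pathStr).WriteLine(outputStr);`, with the whole read loop wrapped in `using (WriterCache cache = new WriterCache("EXAMPLE FILE HEADER"))` so that every file is closed once when the loop ends. Each file is then opened exactly once per run, at the cost of one open handle per device.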


C#: Writing to multiple files without constantly closing/reopening the streams. StreamWriter?
