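Why is file enumeration using DeviceIoControl faster in VB.NET than in C++?

Asked: 2014-12-10 05:37:28

Tags: c++ vb.net windows performance ntfs-mft

I am trying to read the Windows Master File Table (MFT) to enumerate files quickly. So far I have seen two approaches: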

  1. Using DeviceIoControl, as suggested by Jeffrey Cooperstein and Jeffrey Richter
  2. Parsing the MFT directly, as shown in some open-source tools and An NTFS Parser Lib

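For my project, I am focusing on approach [1]. The problem I am facing is mainly related to execution time. To be clear, here is my system and development environment:

  1. IDE - Visual Studio 2013
  2. Language - C++
  3. OS - Windows 7 Professional x64
  4. Both the C++ and the .NET code are built as 32-bit binaries.

Problem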

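I compared the version mentioned in [1] (slightly modified) with a VB.NET implementation available on codeplex. The problem is that if I uncomment the statement in the inner loop, the execution time of the C++ code increases 7-8 times. I have not yet implemented path matching in the C++ code (it is present in the VB code).

Q1. Please suggest how to improve the performance of the C++ code.

Timings for enumerating the C:\ drive on my machine: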

  1. C++ (with the statement in the inner loop uncommented) - 21 seconds
  2. VB.NET (including the additional path-matching code) - 3.5 seconds

For more clarity, the C++ and VB.NET snippets follow.

C++

        bool FindAll()
        {
            if (m_hDrive == NULL) // Handle of, for example, "\\.\C:"
                return false;
        
            USN_JOURNAL_DATA ujd = {0};
            DWORD cb = 0;
            BOOL bRet = FALSE;
            MFT_ENUM_DATA med = {0};
        
            BYTE pData[sizeof(DWORDLONG) + 0x10000] = {0};
        
            bRet = DeviceIoControl(m_hDrive, FSCTL_QUERY_USN_JOURNAL, NULL, 0, &ujd, sizeof(USN_JOURNAL_DATA), &cb, NULL);
            if (bRet == FALSE) return false;
        
            med.StartFileReferenceNumber = 0;
            med.LowUsn = 0;
            med.HighUsn = ujd.NextUsn;
        
            //Outer Loop
            while (TRUE)
            {
                bRet = DeviceIoControl(m_hDrive, FSCTL_ENUM_USN_DATA, &med, sizeof(med), pData, sizeof(pData), &cb, NULL);
                if (bRet == FALSE) {
                    break;
                }
        
                PUSN_RECORD pRecord = (PUSN_RECORD)&pData[sizeof(USN)];
        
                //Inner Loop
                while ((PBYTE)pRecord < (pData + cb))
                {
                    tstring sz((LPCWSTR) ((PBYTE)pRecord + pRecord->FileNameOffset), pRecord->FileNameLength / sizeof(WCHAR));
        
                    bool isFile = ((pRecord->FileAttributes & FILE_ATTRIBUTE_DIRECTORY) != FILE_ATTRIBUTE_DIRECTORY);
                    if (isFile) m_dwFiles++;
                    //m_nodes[pRecord->FileReferenceNumber] = new CNode(pRecord->ParentFileReferenceNumber, sz, isFile);
        
                    pRecord = (PUSN_RECORD)((PBYTE)pRecord + pRecord->RecordLength);
                }
                med.StartFileReferenceNumber = *(DWORDLONG *)pData;
            }
            return true;
        }
        

where m_nodes is defined as typedef std::map<DWORDLONG, CNode*> NodeMap;

VB.NET

        Public Sub FindAllFiles(ByVal szDriveLetter As String, fFileFound As FileFound_Delegate, fProgress As Progress_Delegate, fMatch As IsMatch_Delegate)
        
                Dim usnRecord As USN_RECORD
                Dim mft As MFT_ENUM_DATA
                Dim dwRetBytes As Integer
                Dim cb As Integer
                Dim dicFRNLookup As New Dictionary(Of Long, FSNode)
                Dim bIsFile As Boolean
        
                ' This shouldn't be called more than once.
                If m_Buffer.ToInt32 <> 0 Then
                    Console.WriteLine("invalid buffer")
                    Exit Sub
                End If
        
                ' progress 
                If Not IsNothing(fProgress) Then fProgress.Invoke("Building file list.")
        
                ' Assign buffer size
                m_BufferSize = 65536 '64KB
        
                ' Allocate a buffer to use for reading records.
                m_Buffer = Marshal.AllocHGlobal(m_BufferSize)
        
                ' correct path
                szDriveLetter = szDriveLetter.TrimEnd("\"c)
        
                ' Open the volume handle 
                m_hCJ = OpenVolume(szDriveLetter)
        
                ' Check if the volume handle is valid.
                If m_hCJ = INVALID_HANDLE_VALUE Then
                    Console.WriteLine("Couldn't open handle to the volume.")
                    Cleanup()
                    Exit Sub
                End If
        
                mft.StartFileReferenceNumber = 0
                mft.LowUsn = 0
                mft.HighUsn = Long.MaxValue
        
                Do
                    If DeviceIoControl(m_hCJ, FSCTL_ENUM_USN_DATA, mft, Marshal.SizeOf(mft), m_Buffer, m_BufferSize, dwRetBytes, IntPtr.Zero) Then
                        cb = dwRetBytes
                        ' Pointer to the first record
                        Dim pUsnRecord As New IntPtr(m_Buffer.ToInt32() + 8)
        
                        While (dwRetBytes > 8)
                            ' Copy pointer to USN_RECORD structure.
                            usnRecord = Marshal.PtrToStructure(pUsnRecord, usnRecord.GetType)
        
                            ' The filename within the USN_RECORD.
                            Dim FileName As String = Marshal.PtrToStringUni(New IntPtr(pUsnRecord.ToInt32() + usnRecord.FileNameOffset), usnRecord.FileNameLength / 2)
        
                            'If Not FileName.StartsWith("$") Then
                            ' use a delegate to determine if this file even matches our criteria
                            Dim bIsMatch As Boolean = True
                            If Not IsNothing(fMatch) Then fMatch.Invoke(FileName, usnRecord.FileAttributes, bIsMatch)
        
                            If bIsMatch Then
                                bIsFile = Not usnRecord.FileAttributes.HasFlag(FileAttribute.Directory)
                                dicFRNLookup.Add(usnRecord.FileReferenceNumber, New FSNode(usnRecord.FileReferenceNumber, usnRecord.ParentFileReferenceNumber, FileName, bIsFile))
                            End If
                            'End If
        
                            ' Pointer to the next record in the buffer.
                            pUsnRecord = New IntPtr(pUsnRecord.ToInt32() + usnRecord.RecordLength)
        
                            dwRetBytes -= usnRecord.RecordLength
                        End While
        
                        ' The first 8 bytes is always the start of the next USN.
                        mft.StartFileReferenceNumber = Marshal.ReadInt64(m_Buffer, 0)
        
                    Else
        
                        Exit Do
        
                    End If
        
                Loop Until cb <= 8
        
                If Not IsNothing(fProgress) Then fProgress.Invoke("Parsing file names.")
        
                ' Resolve all paths for Files
                For Each oFSNode As FSNode In dicFRNLookup.Values.Where(Function(o) o.IsFile)
                    Dim sFullPath As String = oFSNode.FileName
                    Dim oParentFSNode As FSNode = oFSNode
        
                    While dicFRNLookup.TryGetValue(oParentFSNode.ParentFRN, oParentFSNode)
                        sFullPath = String.Concat(oParentFSNode.FileName, "\", sFullPath)
                    End While
                    sFullPath = String.Concat(szDriveLetter, "\", sFullPath)
        
                    If Not IsNothing(fFileFound) Then fFileFound.Invoke(sFullPath, 0)
                Next
        
                '// cleanup
                Cleanup() '//Closes all the handles
                If Not IsNothing(fProgress) Then fProgress.Invoke("Complete.")
            End Sub
        

fFileFound is defined as follows:

        Sub(s, l)
            If s.ToLower.StartsWith(sSearchPath) Then
                lCount += 1
                lstFileNames.Add(s.ToLower) '// Dim lstFileNames As List(Of String)
            End If
        End Sub
        

FSNode and CNode have the following structure:

        //C++ version
        class CNode
        {
        public:
            //DWORDLONG m_dwFRN;
            DWORDLONG m_dwParentFRN;
            tstring m_sFileName;
            bool m_bIsFile;
        
        public:
            CNode(DWORDLONG dwParentFRN, tstring sFileName, bool bIsFile = false) : 
                m_dwParentFRN(dwParentFRN), m_sFileName(sFileName), m_bIsFile(bIsFile){
            }
            ~CNode(){
            }
        };
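
The question notes that path matching is not yet implemented on the C++ side. As a rough illustration only, the parent-chain walk that the VB.NET code performs with dicFRNLookup could be sketched in C++ as below. `Node`, `NodeMap` and `BuildFullPath` are hypothetical names introduced for this sketch (they are not part of the original code), nodes are stored by value rather than as `CNode*`, and std::uint64_t stands in for DWORDLONG so the snippet compiles without the Windows headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CNode, stored by value (assumption of this sketch).
struct Node {
    std::uint64_t parentFrn;  // file reference number of the parent directory
    std::wstring  name;       // file or directory name without any path
    bool          isFile;
};

// Hypothetical lookup table keyed by file reference number (FRN).
using NodeMap = std::unordered_map<std::uint64_t, Node>;

// Builds "<drive>\dir\...\name" by following parent FRNs until a parent is
// missing from the table. Returns an empty string if frn itself is unknown.
std::wstring BuildFullPath(const NodeMap& nodes, std::uint64_t frn,
                           const std::wstring& drive)
{
    auto it = nodes.find(frn);
    if (it == nodes.end()) return {};

    std::wstring path = it->second.name;

    // The hop limit guards against a malformed table containing a cycle.
    for (std::size_t hops = 0; hops < nodes.size(); ++hops) {
        const std::uint64_t parent = it->second.parentFrn;
        if (parent == it->first) break;      // entry that is its own parent
        auto up = nodes.find(parent);
        if (up == nodes.end()) break;        // reached the top of the known tree
        path = up->second.name + L"\\" + path;
        it = up;
    }
    return drive + L"\\" + path;
}
```

Like the VB.NET loop, this stops as soon as a parent FRN is not present in the table, so the volume root does not need an entry of its own.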
        

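Note - the VB.NET code spawns a new thread (required because it has a GUI), whereas I call the C++ function on the main thread (a simple console application used for testing).


Update

This was a silly mistake on my part. The DeviceIoControl API works as expected, although the Debug build is somewhat slower than the Release build. See the following post:

how-can-i-increase-the-performance-in-a-map-lookup-with-key-type-stdstring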
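
Since the update points at the build configuration, one way to check this kind of difference is to time the container insertions in isolation and run the same binary source under both Debug and Release. The helper below is a minimal sketch under that assumption; `TimeInsertionsMs` is a hypothetical name, the inserted keys and values are synthetic, and std::uint64_t stands in for DWORDLONG.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Inserts `count` synthetic entries into `out` and returns the elapsed
// wall-clock time in milliseconds. MapT is any map-like container with
// emplace(key, value), for example std::map<std::uint64_t, std::wstring>.
template <typename MapT>
double TimeInsertionsMs(std::size_t count, MapT& out)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        out.emplace(static_cast<std::uint64_t>(i), std::wstring(L"file.txt"));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it with a count close to the number of records on the volume, once per build configuration, gives a figure that excludes the DeviceIoControl calls themselves.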

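1 Answer:

Answer 0 (score: 1)

I have not run your code, but since you say the commented-out line is the problem, the cause is probably the map insertion. In the C++ code you are using std::map, which is implemented as a tree (sorted by key, log(n) access time). In the VB code you are using Dictionary, which is implemented as a hash table (no ordering, constant access time on average). Try using std::unordered_map in the C++ version.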
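
A minimal sketch of what that change might look like is shown below. It is an illustration, not the asker's code: `Node` and `NodeIndex` are hypothetical names, std::uint64_t stands in for DWORDLONG so the snippet builds without the Windows headers, and entries are stored by value instead of through `new CNode(...)`, which also avoids one heap allocation per record. Whether this helps in practice depends on the build configuration and the data.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical by-value replacement for CNode (assumption of this sketch).
struct Node {
    std::uint64_t parentFrn;
    std::wstring  name;
    bool          isFile;
};

// Hypothetical wrapper around a hash table keyed by file reference number.
class NodeIndex {
public:
    // Reserving buckets up front avoids repeated rehashing while the
    // enumeration loop inserts a large number of records.
    explicit NodeIndex(std::size_t expectedCount) { m_nodes.reserve(expectedCount); }

    // Inserts a record; an existing entry with the same FRN is left unchanged.
    void Add(std::uint64_t frn, std::uint64_t parentFrn, std::wstring name, bool isFile)
    {
        m_nodes.emplace(frn, Node{parentFrn, std::move(name), isFile});
    }

    // Returns a pointer to the stored node, or nullptr if the FRN is unknown.
    const Node* Find(std::uint64_t frn) const
    {
        auto it = m_nodes.find(frn);
        return it == m_nodes.end() ? nullptr : &it->second;
    }

    std::size_t Size() const { return m_nodes.size(); }

private:
    std::unordered_map<std::uint64_t, Node> m_nodes;
};
```

In the inner loop of FindAll, the commented-out statement would then become something like a call to Add with pRecord->FileReferenceNumber, pRecord->ParentFileReferenceNumber, sz and isFile.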