Is there a more efficient way to populate a list of file names from a directory with a date filter?
Currently, I'm doing this:
    foreach (FileInfo flInfo in directory.GetFiles())
    {
        DateTime yesterday = DateTime.Today.AddDays(-1);
        String name = flInfo.Name.Substring(3,4);
        DateTime creationTime = flInfo.CreationTime;
        if (creationTime.Date == yesterday.Date)
            yesterdaysList.Add(name);
    }
This iterates over every file in the folder, and I feel there should be a more efficient way.
Answer 0 (score: 18)
First solution:
You can use LINQ:
    List<string> yesterdaysList = directory.GetFiles().Where(x => x.CreationTime.Date == DateTime.Today.AddDays(-1))
                                                      .Select(x => x.Name)
                                                      .ToList();
You can then use this list of names directly.
Second solution:
Another solution that may make it faster:
    DateTime yesterday = DateTime.Today.AddDays(-1); // initialize this variable only once
    foreach (FileInfo flInfo in directory.GetFiles()){
        if (flInfo.CreationTime.Date == yesterday.Date) // use flInfo.CreationTime and flInfo.Name directly, without creating extra variables
            yesterdaysList.Add(flInfo.Name.Substring(3,4));
    }
Benchmark:
I ran a benchmark with this code:
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;

    class Program {
        static void Main( string[ ] args ) {
            DirectoryInfo directory = new DirectoryInfo( @"D:\Films" );
            Stopwatch timer = new Stopwatch( );
            timer.Start( );
            // First solution: LINQ
            for ( int i = 0; i < 100000; i++ ) {
                List<string> yesterdaysList = directory.GetFiles( ).Where( x => x.CreationTime.Date == DateTime.Today.AddDays( -1 ) )
                                                                   .Select( x => x.Name )
                                                                   .ToList( );
            }
            timer.Stop( );
            TimeSpan elapsedtime = timer.Elapsed;
            Console.WriteLine( string.Format( "{0:00}:{1:00}:{2:00}", elapsedtime.Minutes, elapsedtime.Seconds, elapsedtime.Milliseconds / 10 ) );
            timer.Restart( );
            // Second solution: loop with "yesterday" computed once
            DateTime yesterday = DateTime.Today.AddDays( -1 ); // initialize this variable only once
            for ( int i = 0; i < 100000; i++ ) {
                List<string> yesterdaysList = new List<string>( );
                foreach ( FileInfo flInfo in directory.GetFiles( ) ) {
                    if ( flInfo.CreationTime.Date == yesterday.Date ) // use flInfo.CreationTime and flInfo.Name directly, without creating extra variables
                        yesterdaysList.Add( flInfo.Name.Substring( 3, 4 ) );
                }
            }
            timer.Stop( );
            elapsedtime = timer.Elapsed;
            Console.WriteLine( string.Format("{0:00}:{1:00}:{2:00}", elapsedtime.Minutes, elapsedtime.Seconds, elapsedtime.Milliseconds / 10));
            timer.Restart( );
            // Third solution: the code from the question
            for ( int i = 0; i < 100000; i++ ) {
                List<string> list = new List<string>( );
                foreach ( FileInfo flInfo in directory.GetFiles( ) ) {
                    DateTime _yesterday = DateTime.Today.AddDays( -1 );
                    String name = flInfo.Name.Substring( 3, 4 );
                    DateTime creationTime = flInfo.CreationTime;
                    if ( creationTime.Date == _yesterday.Date )
                        list.Add( name );
                }
            }
            elapsedtime = timer.Elapsed;
            Console.WriteLine( string.Format( "{0:00}:{1:00}:{2:00}", elapsedtime.Minutes, elapsedtime.Seconds, elapsedtime.Milliseconds / 10 ) );
        }
    }
Results:
First solution: 00:19:84
Second solution: 00:17:64
Third solution: 00:19:91 //Your solution
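Answer 1 (score: 5)
I think you are after efficiency at the file-system level, not at the C# level. If that is the case, the answer is no: there is no way to tell the file system to filter by date. It will needlessly return everything.
If you are after CPU efficiency: this is pointless, because adding items to a list box is far more expensive than filtering on the date. Optimizing the code will gain you nothing.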
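Answer 2 (score: 4)
I didn't feel like creating enough files with the right creation date to do a decent benchmark, so I wrote a more general version that takes a start and an end time and gives the names of the files that match. Making it give a particular substring of the names of files created yesterday follows naturally from that.
The fastest single-threaded, pure .NET answer I came up with was: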
    private static IEnumerable<string> FilesWithinDates(string directory, DateTime minCreated, DateTime maxCreated)
    {
        foreach(FileInfo fi in new DirectoryInfo(directory).GetFiles())
            if(fi.CreationTime >= minCreated && fi.CreationTime <= maxCreated)
                yield return fi.Name;
    }
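I had expected EnumerateFiles() to be slightly faster, but it turned out to be slightly slower (it might do better if you are going over a network, but I didn't test that).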
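For reference, the EnumerateFiles() variant mentioned above could look roughly like the following. This is only a sketch: the class name CreationDateFilter and the methods FilesWithinDatesLazy and YesterdaysNames are made up for illustration, and the length check before Substring(3, 4) is an added assumption so that short file names do not throw.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class CreationDateFilter
{
    // Lazily yields the names of files whose creation time lies in
    // [minCreated, maxCreated], streaming entries via EnumerateFiles().
    public static IEnumerable<string> FilesWithinDatesLazy(string directory, DateTime minCreated, DateTime maxCreated)
    {
        foreach (FileInfo fi in new DirectoryInfo(directory).EnumerateFiles())
        {
            DateTime created = fi.CreationTime; // read the property once per file
            if (created >= minCreated && created <= maxCreated)
                yield return fi.Name;
        }
    }

    // "Created yesterday" expressed as a range: from midnight yesterday
    // up to the last tick before midnight today (local time).
    public static List<string> YesterdaysNames(string directory)
    {
        DateTime start = DateTime.Today.AddDays(-1);
        DateTime end = DateTime.Today.AddTicks(-1);
        return FilesWithinDatesLazy(directory, start, end)
            .Where(name => name.Length >= 7)      // assumption: skip names too short for Substring(3, 4)
            .Select(name => name.Substring(3, 4)) // same substring as in the question
            .ToList();
    }
}
```

Whether this is faster or slower than GetFiles() on a given disk or share is something to measure rather than assume.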
There is a slight gain with:
    private static ParallelQuery<string> FilesWithinDates(string directory, DateTime minCreated, DateTime maxCreated)
    {
        return new DirectoryInfo(directory).GetFiles().AsParallel()
            .Where(fi => fi.CreationTime >= minCreated && fi.CreationTime <= maxCreated)
            .Select(fi => fi.Name);
    }
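But not much, as it does nothing for the actual call to GetFiles(). If you don't have cores to spare, or GetFiles() doesn't return a large enough result, it will just make things worse (the overhead of AsParallel() being greater than the benefit of filtering in parallel). On the other hand, if you can also do the next steps of your processing in parallel, the overall speed of the application could improve.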
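To illustrate that last point, here is a rough sketch of keeping the follow-up work inside the same parallel query. The class ParallelPipeline, the method ProcessWithinDates and its process callback are hypothetical names standing in for whatever your next step is; none of them come from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class ParallelPipeline
{
    // Filters by creation time and applies a caller-supplied step to each
    // matching file, all within one PLINQ query. Result order is not guaranteed.
    public static List<T> ProcessWithinDates<T>(
        string directory, DateTime minCreated, DateTime maxCreated, Func<FileInfo, T> process)
    {
        return new DirectoryInfo(directory).GetFiles().AsParallel()
            .Where(fi => fi.CreationTime >= minCreated && fi.CreationTime <= maxCreated)
            .Select(process) // the "next step" runs on the worker threads too
            .ToList();
    }
}
```

A call such as ProcessWithinDates(path, from, to, fi => fi.Length) would collect the sizes of the matching files. This only pays off when the per-file step is expensive enough to outweigh the overhead of AsParallel().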
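There seems to be no point in doing this with EnumerateFiles(), because it doesn't appear to parallelise well: it is based on the same approach I come to last, which is inherently serial, needing the previous result to produce the next.
The fastest I got was: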
    // Needs: System, System.Collections.Generic, System.IO, System.Runtime.InteropServices
    public const int MAX_PATH = 260;
    public const int MAX_ALTERNATE = 14;

    [StructLayoutAttribute(LayoutKind.Sequential)]
    public struct FILETIME
    {
        public uint dwLowDateTime;
        public uint dwHighDateTime;
        // Combines the two 32-bit halves into the 64-bit file time value.
        public static implicit operator long(FILETIME ft)
        {
            return (((long)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
        }
    }

    [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Unicode)]
    public struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst=MAX_PATH)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst=MAX_ALTERNATE)]
        public string cAlternate;
    }

    [DllImport("kernel32", CharSet=CharSet.Unicode)]
    public static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);
    [DllImport("kernel32", CharSet=CharSet.Unicode)]
    public static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);
    [DllImport("kernel32.dll")]
    public static extern bool FindClose(IntPtr hFindFile);
    private static IEnumerable<string> FilesWithinDates(string directory, DateTime minCreated, DateTime maxCreated)
    {
        long startFrom = minCreated.ToFileTimeUtc();
        long endAt = maxCreated.ToFileTimeUtc();
        WIN32_FIND_DATA findData;
        IntPtr findHandle = FindFirstFile(@"\\?\" + directory + @"\*", out findData);
        if(findHandle != new IntPtr(-1))
        {
            do
            {
                if(
                    (findData.dwFileAttributes & FileAttributes.Directory) == 0
                    &&
                    findData.ftCreationTime >= startFrom
                    &&
                    findData.ftCreationTime <= endAt
                )
                {
                    yield return findData.cFileName;
                }
            }
            while(FindNextFile(findHandle, out findData));
            FindClose(findHandle);
        }
    }
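It's dicey not having that FindClose() call guaranteed by something IDisposable, and a hand-rolled implementation of IEnumerator<string> should not only make that easier to do (a serious reason for doing it) but hopefully also shave off 3 nanoseconds or so (not a serious reason for doing it), but the above shows the basic idea.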
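One possible way to get that guarantee without hand-rolling an enumerator is to wrap the find handle in a SafeHandle and enumerate inside a using block. The following is a sketch under assumptions, not the answer's own code: SafeFindHandle and NativeFileSearch are hypothetical names, the built-in ComTypes.FILETIME struct replaces the custom FILETIME, and it only works on Windows with an absolute directory path (because of the \\?\ prefix).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

// Hypothetical wrapper: FindClose runs when the handle is disposed or finalized.
sealed class SafeFindHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public SafeFindHandle() : base(true) { }

    [DllImport("kernel32.dll")]
    private static extern bool FindClose(IntPtr hFindFile);

    protected override bool ReleaseHandle() { return FindClose(handle); }
}

static class NativeFileSearch
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternate;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern SafeFindHandle FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(SafeFindHandle hFindFile, out WIN32_FIND_DATA lpFindFileData);

    public static IEnumerable<string> FilesWithinDates(string directory, DateTime minCreated, DateTime maxCreated)
    {
        long startFrom = minCreated.ToFileTimeUtc();
        long endAt = maxCreated.ToFileTimeUtc();
        WIN32_FIND_DATA findData;
        // The using block closes the handle even if the caller stops enumerating early.
        using (SafeFindHandle findHandle = FindFirstFile(@"\\?\" + directory + @"\*", out findData))
        {
            if (findHandle.IsInvalid)
                yield break;
            do
            {
                FILETIME ft = findData.ftCreationTime;
                long created = ((long)ft.dwHighDateTime << 32) | (uint)ft.dwLowDateTime;
                bool isFile = (findData.dwFileAttributes & FileAttributes.Directory) == 0;
                if (isFile && created >= startFrom && created <= endAt)
                    yield return findData.cFileName;
            }
            while (FindNextFile(findHandle, out findData));
        }
    }
}
```

Because foreach disposes the enumerator, the using block (and therefore FindClose) also runs when the consumer breaks out of the loop early, which the plain IntPtr version does not guarantee.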
Answer 3 (score: -1)
I use:
    DirectoryInfo dI = new DirectoryInfo(fileLocation);
    var files = dI.GetFiles().Where(i=>i.CreationTime>=dateFrom && i.CreationTime<=dateTo);