Thursday, January 23, 2014

Fastest way for recursive folder deletion with PowerShell

I had a hard time developing a script that remotely deletes some files, and when I finally put it into production I was surprised by how long it took to delete the folders. I researched better methods for recursively deleting folders, but the best I could find was this post on Microsoft's Scripting Guy blog. The only problem is that nobody there worried about the efficiency of these methods, so that's what this post is about.
Measuring the performance of a deletion method is tricky: depending on the contents and attributes of the files, some methods fail, and fixing those details before deleting may take more time than simply deleting with another method. After some tests I came up with the following "generic force method" to get rid of a folder using PowerShell:

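$folderToDelete = "C:\temp\test"

# Suppress errors so that each fallback method still gets a chance to run
$ErrorActionPreference = 'SilentlyContinue'

# 1. Fastest method: .NET recursive delete
[System.IO.Directory]::Delete($folderToDelete, $true)

# 2. FileSystemObject: force-delete whatever is left ($true = force)
$fso = New-Object -ComObject Scripting.FileSystemObject
$fso.DeleteFolder($folderToDelete, $true)

# 3. robocopy fallback: mirror an empty folder over anything that survived
if (Test-Path $folderToDelete) {
    New-Item -ItemType Directory -Path .\EmptyFolder
    robocopy .\EmptyFolder $folderToDelete /mir
    Remove-Item .\EmptyFolder
    Remove-Item $folderToDelete
}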

If you want, you can just use the above code to quickly get rid of your folders, but if you want to understand the code, continue reading =)

You may notice that I didn't use the "Remove-Item" cmdlet for the recursive deletion. The first reason is that the -Recurse parameter doesn't work as expected until version 4; the other is that it was the slowest method, taking 22′ 27″ to delete a 17.5 GB test folder.
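The post doesn't say how the timings were taken, so here is a minimal sketch of one way to time a deletion method, assuming the built-in Measure-Command cmdlet is acceptable for the job. The function name Measure-FolderDeletion and the sample folder layout are hypothetical, made up for illustration.

```powershell
# Hypothetical helper (not from the original post): runs one deletion
# method against $Path and reports how long it took.
function Measure-FolderDeletion {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [scriptblock] $DeleteAction
    )
    # Measure-Command returns a TimeSpan for the script block's run time
    $elapsed = Measure-Command { & $DeleteAction $Path }
    [pscustomobject]@{
        Deleted = -not (Test-Path -LiteralPath $Path)
        Elapsed = $elapsed
    }
}

# Usage: build a small throwaway tree in the temp folder and time Remove-Item
$sample = Join-Path ([System.IO.Path]::GetTempPath()) ("deltest_" + [guid]::NewGuid())
$sub    = Join-Path $sample 'sub'
New-Item -ItemType Directory -Path $sub | Out-Null
1..20 | ForEach-Object { Set-Content -Path (Join-Path $sub "file$_.txt") -Value 'x' }

$result = Measure-FolderDeletion -Path $sample -DeleteAction {
    param($p)
    Remove-Item -LiteralPath $p -Recurse -Force
}
$result
```

Swapping the script block for another method (for example a call to System.IO.Directory) lets you compare them on the same kind of folder. A tiny tree like this one says little about a 17.5 GB folder, so use a realistic copy of your data for real measurements.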
Another strange thing you may notice is the robocopy command. Although it is designed to copy files, mirroring an empty folder with the /mir parameter is very useful for forcing the deletion of things that give access denied errors with other commands, such as "read-only" files and files with ownership issues. It's not that efficient (20′ 36″ for the same test folder), but it is useful because it always deletes everything.
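The robocopy step can also be wrapped in a function, which makes it easier to check whether it worked. This is only a sketch: the name Clear-FolderWithRobocopy is hypothetical, and the extra switches (/NFL /NDL /NJH /NJS /NP) are there just to keep robocopy quiet. It relies on robocopy's documented convention that exit codes of 8 or higher mean at least one failure. Since robocopy is a native command, $ErrorActionPreference does not hide its failures, so the exit code is the thing to look at.

```powershell
# Hypothetical helper (not from the original post): empties $Path by
# mirroring a temporary empty folder over it, then removes $Path itself.
# Windows only, since it depends on robocopy.exe.
function Clear-FolderWithRobocopy {
    param([Parameter(Mandatory)] [string] $Path)

    $empty = Join-Path ([System.IO.Path]::GetTempPath()) ("empty_" + [guid]::NewGuid())
    New-Item -ItemType Directory -Path $empty | Out-Null
    try {
        # /MIR mirrors the empty source, purging everything under $Path;
        # the remaining switches only suppress logging output
        robocopy $empty $Path /MIR /NFL /NDL /NJH /NJS /NP | Out-Null
        # robocopy exit codes of 8 or higher indicate at least one failure
        $ok = $LASTEXITCODE -lt 8
    }
    finally {
        Remove-Item -LiteralPath $empty -Force
    }

    if ($ok) { Remove-Item -LiteralPath $Path -Force }
    $ok   # $true when robocopy reported no failures
}

# Usage (commented out so that loading the function deletes nothing):
# Clear-FolderWithRobocopy -Path 'C:\temp\test'
```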
The FileSystemObject is also used, as it deletes (almost) everything in most cases. It deleted the whole test folder in 13′ 21″ but, as I said, it doesn't delete all files in every case, so robocopy is still necessary.
Finally, let's talk about the .NET Framework System.IO.Directory class. This is the fastest method, but unfortunately it's also the one that produces the most errors. In the test case I was only able to delete everything with it after recursively removing the read-only attribute (Get-ChildItem $folderToDelete -Recurse | % { if($_.IsReadOnly){$_.IsReadOnly= $false}} ), but doing this degraded performance so much that it took slightly longer than the FileSystemObject: 13′ 28″. If I combine this method with robocopy instead, I get some advantage: 11′ 18″.
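For readers who want to try the "clear read-only, then delete with .NET" variant on its own, here is a minimal sketch. The function name Remove-FolderDotNet is hypothetical, and the -File switch assumes PowerShell 3.0 or later. As the timings above show, this was not the fastest combination in my test.

```powershell
# Hypothetical helper (not from the original post): clears the read-only
# attribute on every file under $Path, then deletes the tree with .NET.
function Remove-FolderDotNet {
    param([Parameter(Mandatory)] [string] $Path)

    # -File limits the pipeline to files (requires PowerShell 3.0+);
    # -Force includes hidden and system files
    Get-ChildItem -LiteralPath $Path -Recurse -Force -File |
        Where-Object { $_.IsReadOnly } |
        ForEach-Object { $_.IsReadOnly = $false }

    # The second argument ($true) requests a recursive delete
    [System.IO.Directory]::Delete($Path, $true)
}

# Usage: build a throwaway folder containing one read-only file, then remove it
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("rodemo_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demo | Out-Null
$locked = Join-Path $demo 'locked.txt'
Set-Content -Path $locked -Value 'x'
(Get-Item -LiteralPath $locked).IsReadOnly = $true

Remove-FolderDotNet -Path $demo
Test-Path -LiteralPath $demo
```

Pass an absolute path: .NET methods resolve relative paths against the process working directory, which is not necessarily the current PowerShell location.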
As you can see, combining methods was better than trying to fix the access issues, so why not combine the fastest methods? That is what I did: I combined System.IO.Directory with FileSystemObject and got the folder deleted in 5′ 26″! Finally, I just added the robocopy command to ensure the folder is cleaned up in all cases.

As you can see, some methods are better than others depending on the case, but generally speaking, the commands above can be the solution for fast folder deletion with PowerShell.
Note: I haven't tested this in version 4 of PowerShell yet, so maybe Remove-Item is still worth considering there.
