Last Updated: February 25, 2016

Find dupe files

This script is finddupe.ps1. It lists likely duplicate files on your PC: files larger than 1 megabyte that are exactly the same size as at least one other file. On my PC it found over eleven thousand. You can run it from any directory, but the first thing it does is change to the root of the current drive, so it covers every file on that drive. Matching on size alone can produce false positives, so normally I'd have it hash the files as well. I'm still thinking about a quick and dirty way of doing that, and I may edit this post with that update. Based on the output I looked at, I don't expect hashing to change the results much. It seems really common to have the same file in two places in Windows.

cd \
$dirlisting = Get-ChildItem -Recurse -file *  | Sort-Object -property length
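# Largest groups first: Export-Csv takes its columns from the first row, so the widest row has to lead
$dirgrouping = $dirlisting | Group-Object -Property length | Sort-Object -property count -Descending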
$table = @()
foreach ($group in $dirgrouping)
{
    if($group.count -le 1)
    {
        continue
    }
    # Group-Object returns Name as a string; cast it so the size is compared numerically
    if([long]$group.name -le 1MB)
    {
        continue
    }
    $i = 0
    $row = New-Object PSObject
    Add-Member -InputObject $row -Name "Size" -Value $group.name -MemberType NoteProperty
    Add-Member -InputObject $row -Name "Count" -Value $group.count -MemberType NoteProperty
    foreach ($file in $group.group)
    {
        Add-Member -InputObject $row -Name "Path $i" -Value $file.fullname -MemberType NoteProperty
        Add-Member -InputObject $row -Name "File $i" -Value $file.name -MemberType NoteProperty
        $i++
    }
    $table += $row
}
$table
$table | export-csv -Path "<drive>:\<dir>\finddupe.$((get-date).tostring("yy-MM-dd-HH-mm-ss")).csv" -NoTypeInformation
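
The hashing step mentioned above could look something like the sketch below. It is not part of finddupe.ps1. The function name Get-DuplicateByHash and its parameters are made up for illustration, and it assumes PowerShell 4 or later, where Get-FileHash is available. The idea is to keep the size grouping as a cheap first pass and hash only the files that already share a size.

```powershell
# Hypothetical helper, not part of finddupe.ps1: confirm same-size files by content hash.
function Get-DuplicateByHash {
    param(
        [Parameter(Mandatory)] [string] $Path,   # directory to scan (assumed parameter)
        [long] $MinimumSize = 1MB                # skip files at or below this size
    )

    Get-ChildItem -LiteralPath $Path -Recurse -File -ErrorAction SilentlyContinue |
        Where-Object { $_.Length -gt $MinimumSize } |
        Group-Object -Property Length |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object {
            # Only files that share a size get hashed, which keeps the expensive step small
            $_.Group |
                ForEach-Object {
                    Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256 -ErrorAction SilentlyContinue
                } |
                Group-Object -Property Hash |
                Where-Object { $_.Count -gt 1 }
        } |
        ForEach-Object {
            # One output row per set of files with identical content
            [PSCustomObject]@{
                Hash  = $_.Name
                Count = $_.Count
                Paths = $_.Group.Path -join '; '
            }
        }
}
```

Calling it as Get-DuplicateByHash -Path 'C:\' would scan the whole drive the way the script above does. Joining the paths into a single column also means every row has the same three properties, so the result can be piped straight to Export-Csv without losing columns. Files that can't be read (locked or access denied) are silently skipped in this sketch.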

#powershell