SharePoint Online: Find All Large Files using PowerShell

Requirement: Find all large files in the SharePoint Online site collection.

How to Find Large Files in SharePoint Online?

Are you looking for a way to find all large files in your SharePoint Online environment quickly? Whether you’re preparing for an organizational audit or simply trying to find all the large files in a given site to clean up some disk space, this guide will show you how to locate large files in SharePoint Online.

To find large files in SharePoint Online, you can use the “Storage Metrics” page.

  • Navigate to your SharePoint Online site collection >> Click on Settings Gear >> Site Settings
  • On the Site Settings page, click on the “Storage Metrics” link under “Site Collection Administration” (https://crescent.sharepoint.com/_layouts/15/storman.aspx)
  • The Storage Metrics page gives you the current Storage consumption of the site. You can navigate to each site/folder object to get the storage stats of the particular object.

You can also add the out-of-the-box “File Size” column to your document library views to see the size of individual files. Explorer view helps here as well.

SharePoint Online: PowerShell to Get All Large Files in a Site Collection

Finding large files by navigating into each folder from the Storage Metrics page above would be cumbersome. Let’s use PowerShell to fetch all large files in a SharePoint Online site collection.

#Load SharePoint CSOM Assemblies
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"
 
#Function to find large files of the web
Function Find-SPOLargeFiles([String]$SiteURL, [Microsoft.SharePoint.Client.Folder]$Folder)
{
    Write-host -f Yellow "Processing Folder: $($SiteCollURL)$($Folder.ServerRelativeURL)"
    Try {
            $LargeFileResult = @()
            #Get all Files from the folder
            $FilesColl = $Folder.Files
            $Ctx.Load($FilesColl)
            $Ctx.ExecuteQuery()
 
            #Iterate through each file and check the size
            Foreach($File in $FilesColl)
            {
                If($File.length -gt 50MB)
                {
                    $FileURL= $SiteCollURL+$File.ServerRelativeURL
                    $Result = New-Object PSObject
                    $Result | Add-Member NoteProperty FileName($File.Name)
                    $Result | Add-Member NoteProperty FileURL($FileURL)
                    $Result | Add-Member NoteProperty Size-MB([math]::Round($File.Length/1MB))
                     
                    #Add the result to an Array
                    $LargeFileResult += $Result
 
                    Write-host -f Green "Found a File '$($File.Name)' with Size $([math]::Round($File.Length/1MB))MB"

                    #Append only the current file to the CSV report (appending the whole array here would write duplicate rows)
                    $Result | Export-CSV $ReportOutput -NoTypeInformation -Append
                }
            }
         
            #Process all Sub Folders
            $SubFolders = $Folder.Folders
            $Ctx.Load($SubFolders)
            $Ctx.ExecuteQuery()
            Foreach($SubFolder in $SubFolders)
            {
                #Exclude "Forms" and Hidden folders
                If( ($SubFolder.Name -ne "Forms") -and (-Not($SubFolder.Name.StartsWith("_"))))
                {
                    #Call the function recursively
                    Find-SPOLargeFiles -SiteURL $SiteURL -Folder $SubFolder
                }
            }
        }
    Catch {
        write-host -f Red "Error Finding Large Files!" $_.Exception.Message
    }
}

#Function to Generate Report on Large Files in a SharePoint Online Site Collection
Function Get-SPOLargeFilesRpt($SiteURL)
{
    #Setup the context
    $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
    $Ctx.Credentials = $Credentials
 
    #Get the web from given URL, its root folder and its subsites
    $Web = $Ctx.Web
    $Ctx.Load($Web)
    $Ctx.Load($Web.RootFolder)
    $Ctx.Load($Web.Webs)
    $Ctx.ExecuteQuery()

    #Call the function to get large files of the web
    Find-SPOLargeFiles -SiteURL $SiteURL -Folder $Web.RootFolder

    #Iterate through each subsite of the current web and call the function recursively
    foreach ($Subweb in $web.Webs)
    {
        #Call the function recursively to process all subsites underneath the current web
        Get-SPOLargeFilesRpt -SiteURL $Subweb.Url
    }
}

#Config Parameters
$SiteCollURL="https://crescent.sharepoint.com"
$ReportOutput="C:\temp\LargeFilesRpt.csv" 

#Setup Credentials to connect
$Cred= Get-Credential
$Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($Cred.Username, $Cred.Password)

#Delete the Output Report, if exists
if (Test-Path $ReportOutput) { Remove-Item $ReportOutput }

#Call the function 
Get-SPOLargeFilesRpt $SiteCollURL

This script exports the list of all files larger than 50 MB to a CSV file with “FileName”, “FileURL” and “Size-MB” columns. To use a different threshold, change the 50MB value in the size check.
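
Once the report exists, you will probably want to look at the biggest files first. Here is a minimal sketch of that, assuming the CSV has the three columns described above. The sample rows are made up for illustration; in practice you would read the real report with Import-Csv, as the comment shows.

```powershell
# Hypothetical sample rows in the same shape as the report (FileName, FileURL, Size-MB)
$SampleCsv = @"
"FileName","FileURL","Size-MB"
"Budget.xlsx","https://crescent.sharepoint.com/Shared Documents/Budget.xlsx","75"
"Townhall.mp4","https://crescent.sharepoint.com/Shared Documents/Townhall.mp4","512"
"Archive.zip","https://crescent.sharepoint.com/Docs/Archive.zip","120"
"@
$Report = $SampleCsv | ConvertFrom-Csv
# With the real report: $Report = Import-Csv -Path "C:\temp\LargeFilesRpt.csv"

# CSV values come back as strings, so cast to a number before sorting
$Sorted = $Report | Sort-Object { [int]$_.'Size-MB' } -Descending

# Keep the ten largest files and add up the size of everything reported
$Largest = $Sorted | Select-Object -First 10
$TotalMB = ($Report | ForEach-Object { [int]$_.'Size-MB' } | Measure-Object -Sum).Sum

$Largest | Format-Table FileName, 'Size-MB' -AutoSize
Write-Host "Total size of reported files: $TotalMB MB"
```

The cast matters: without it, Sort-Object compares the sizes as text, so “75” would sort above “512”.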

PnP PowerShell: Find Large Files in SharePoint Online Site

Let’s find large files >100 MB in a given SharePoint Online site with PnP PowerShell:

#Config Variables
$SiteURL = "https://crescent.sharepoint.com/sites/marketing"
$CSVFilePath = "C:\Temp\LargeFiles.csv"

#Connect to PnP Online
Connect-PnPOnline -Url $SiteURL -Credentials (Get-Credential)

#Get all document libraries
$FileData = @()
$DocumentLibraries = Get-PnPList | Where-Object {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $False}

#Iterate through document libraries
ForEach ($List in $DocumentLibraries)
{
    Write-host "Processing Library:"$List.Title -f Yellow
    
    #Get All Files of the library with size > 100MB 
    $Files = Get-PnPListItem -List $List -PageSize 500 | Where {($_.FileSystemObjectType -eq "File") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)}
    
    #Collect data from each file
    ForEach ($File in $Files) 
    {
        $FileData += [PSCustomObject][ordered]@{
            Library         = $List.Title
            FileName        = $File.FieldValues.FileLeafRef
            URL              = $File.FieldValues.FileRef
            Size            = [math]::Round(($File.FieldValues.SMTotalFileStreamSize/1MB),2)
        }
    }
}
#Sort by size, display the result, and export Files data to CSV File
$FileData = @($FileData | Sort-Object Size -Descending)
$FileData
$FileData | Export-Csv -Path $CSVFilePath -NoTypeInformation
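
If you want to try different size limits without editing the filter each time, the size check can be wrapped in a small function. This is only a sketch: Select-LargeFile and its -ThresholdMB parameter are hypothetical names, not PnP cmdlets, and the mock items below only imitate the shape of the objects returned by Get-PnPListItem so the example can run without a connection.

```powershell
# Hypothetical helper: passes through only file items larger than the threshold
Function Select-LargeFile
{
    Param(
        [Parameter(ValueFromPipeline = $true)] $Item,
        [double] $ThresholdMB = 100
    )
    Process
    {
        # SMTotalFileStreamSize is in bytes, so convert to MB before comparing
        $SizeMB = $Item.FieldValues.SMTotalFileStreamSize / 1MB
        If (($Item.FileSystemObjectType -eq "File") -and ($SizeMB -gt $ThresholdMB)) { $Item }
    }
}

# Mock items that mimic the shape of list items, for illustration only
$MockItems = @(
    [PSCustomObject]@{ FileSystemObjectType = "File";   FieldValues = @{ FileLeafRef = "Townhall.mp4"; SMTotalFileStreamSize = 250MB } }
    [PSCustomObject]@{ FileSystemObjectType = "File";   FieldValues = @{ FileLeafRef = "Notes.docx";   SMTotalFileStreamSize = 2MB } }
    [PSCustomObject]@{ FileSystemObjectType = "Folder"; FieldValues = @{ FileLeafRef = "Archive.2023"; SMTotalFileStreamSize = 900MB } }
)

# Only the 250 MB file passes: the small file and the folder are filtered out
$Large = @($MockItems | Select-LargeFile -ThresholdMB 100)
$Large | ForEach-Object { $_.FieldValues.FileLeafRef }
```

Against a real library, the same function would sit in the pipeline after Get-PnPListItem, for example: Get-PnPListItem -List $List -PageSize 500 | Select-LargeFile -ThresholdMB 250.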

Large Files Report for the Entire Tenant

How about generating a large files report for all SharePoint Online sites in the tenant?

#Config Variables
$TenantAdminURL = "https://crescent-admin.sharepoint.com"
$CSVFilePath = "C:\Temp\LargeFiles.csv"
 
#Connect to Admin Center using PnP Online
Connect-PnPOnline -Url $TenantAdminURL -Interactive

#Delete the Output Report, if exists
if (Test-Path $CSVFilePath) { Remove-Item $CSVFilePath }

#Get All Site collections - Exclude: Search Center, Mysite Host, App Catalog, Content Type Hub, eDiscovery and Bot Sites
$SiteCollections = Get-PnPTenantSite | Where { $_.URL -like '*/sites*' -and $_.Template -NotIn ("SRCHCEN#0", "SPSMSITEHOST#0", "APPCATALOG#0", "POINTPUBLISHINGHUB#0", "EDISC#0", "STS#-1")}

#Libraries to skip in each site - system and publishing libraries (hidden libraries are filtered out separately)
$ExcludedLists = @("Form Templates", "Preservation Hold Library","Site Assets", "Pages", "Site Pages", "Images",
                        "Site Collection Documents", "Site Collection Images","Style Library")

$SiteCounter = 1   
#Loop through each site collection
ForEach($Site in $SiteCollections)
{    
    #Display a Progress bar
    Write-Progress -id 1 -Activity "Processing Site Collections" -Status "Processing Site: '$($Site.URL)' ($SiteCounter of $($SiteCollections.Count))" -PercentComplete (($SiteCounter / $SiteCollections.Count) * 100)
 
    #Connect to the site
    $SiteConn = Connect-PnPOnline -Url $Site.URL -Interactive -ReturnConnection

    #Get all document libraries
    $DocumentLibraries = Get-PnPList -Connection $SiteConn | Where-Object {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $False -and $_.Title -notin $ExcludedLists -and $_.ItemCount -gt 0}

    $ListCounter = 1
    #Iterate through document libraries
    ForEach ($List in $DocumentLibraries)
    {
        $global:counter = 0
        $FileData = @()

        Write-Progress -id 2 -ParentId 1 -Activity "Processing Document Libraries" -Status "Processing Document Library: '$($List.Title)' ($ListCounter of $($DocumentLibraries.Count))" -PercentComplete (($ListCounter / $DocumentLibraries.Count) * 100)

        #Get All Files of the library with size > 100MB
        $Files = Get-PnPListItem -List $List -Connection $Siteconn -Fields FileLeafRef,FileRef,SMTotalFileStreamSize -PageSize 500 -ScriptBlock { Param($items) $global:counter += $items.Count; Write-Progress -Id 3 -parentId 2 -PercentComplete ($global:Counter / ($List.ItemCount) * 100) -Activity "Getting List Items of '$($List.Title)'" -Status "Processing Items $global:Counter to $($List.ItemCount)";} | Where {($_.FileSystemObjectType -eq "File") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)} 

        #Collect data from each file
        ForEach ($File in $Files)
        {
            $FileData += [PSCustomObject][ordered]@{
                Library      = $List.Title
                FileName  = $File.FieldValues.FileLeafRef
                URL            = $File.FieldValues.FileRef
                Size            = [math]::Round(($File.FieldValues.SMTotalFileStreamSize/1MB),2)
            }
        }

        #Sort by size, display the result, and append Files data to the CSV File
        $FileData = @($FileData | Sort-Object Size -Descending)
        $FileData
        $FileData | Export-Csv -Path $CSVFilePath -NoTypeInformation -Append
        $ListCounter++
        #Write-Progress -Activity "Completed Processing List $($List.Title)" -Completed -id 2

    }
    $SiteCounter++
    Disconnect-PnPOnline -Connection $Siteconn
}

This script can help identify any potential storage issues and determine which files need to be relocated or archived.
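
To see where the space is going, you can roll the report up by library. The sketch below assumes the CSV has the “Library”, “FileName”, “URL” and “Size” columns produced above. The sample rows are invented for illustration. Note that libraries with the same title in different sites (such as “Documents”) are merged in this summary, since the report has no site column.

```powershell
# Hypothetical sample rows in the same shape as the tenant report
$SampleCsv = @"
"Library","FileName","URL","Size"
"Documents","Townhall.mp4","/sites/marketing/Shared Documents/Townhall.mp4","512.4"
"Documents","Launch.pptx","/sites/marketing/Shared Documents/Launch.pptx","130.25"
"Videos","Demo.mov","/sites/sales/Videos/Demo.mov","940"
"@
$Report = $SampleCsv | ConvertFrom-Csv
# With the real report: $Report = Import-Csv -Path "C:\Temp\LargeFiles.csv"

# Group rows by library title, then count the files and sum their sizes
$Summary = $Report | Group-Object Library | ForEach-Object {
    [PSCustomObject]@{
        Library     = $_.Name
        FileCount   = $_.Count
        TotalSizeMB = [math]::Round(($_.Group | ForEach-Object { [double]$_.Size } | Measure-Object -Sum).Sum, 2)
    }
} | Sort-Object TotalSizeMB -Descending

$Summary | Format-Table -AutoSize
```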

Salaudeen Rajack

Salaudeen Rajack is a SharePoint Architect with two decades of SharePoint experience. He loves sharing his knowledge and experiences with the SharePoint community through his real-world articles.

3 thoughts on “SharePoint Online: Find All Large Files using PowerShell”

  • I am getting an error, when I try to use Microsoft.SharePoint.Client.Folder constructor:

    $Folder = New-Object -TypeName Microsoft.SharePoint.Client.Folder
    A constructor was not found. Cannot find an appropriate constructor for type Microsoft.SharePoint.Client.Folder

    I am using SharePoint Online Client Components version 16 and I load the same libraries as in the script. Here: https://docs.microsoft.com/en-us/dotnet/api/microsoft.sharepoint.client.folder?view=sharepoint-csom I read that Folder constructor is inside Microsoft.SharePoint.Client.Portable.dll library, but that didn’t help and I am getting the same error. What do I miss ?

  • This was very useful, Thanks!!
