SharePoint Online: Get All Documents Inventory in a Site using PowerShell

Requirement: List All Documents in SharePoint Online Site Collection using PowerShell

SharePoint Online PowerShell to Get All Documents 
Here is the PowerShell to list all documents from document libraries in SharePoint Online site collection and export to CSV.
#Load SharePoint CSOM Assemblies
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"
  
#Function to Generate Report on all documents in a SharePoint Online Site Collection
Function Get-SPODocumentInventory($SiteURL)
{
    Try {
        #Setup the context
        $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
        $Ctx.Credentials = $Credentials
  
        #Get the web from given URL and its subsites
        $Web = $Ctx.web
        $Ctx.Load($Web)
        $Ctx.Load($Web.Lists)
        $Ctx.Load($web.Webs)
        $Ctx.executeQuery()

        #Arry to Skip System Lists and Libraries
        $SystemLists [email protected]("Converted Forms", "Master Page Gallery", "Customized Reports", "Form Templates", "List Template Gallery", "Theme Gallery",
               "Reporting Templates", "Solution Gallery", "Style Library", "Web Part Gallery","Site Assets", "wfpub", "Site Pages", "Images")
    
        Write-host -f Yellow "Processing Site: $SiteURL"

        #Filter Document Libraries to Scan 
        $Lists = $Web.Lists | Where {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $false -and $SystemLists -notcontains $_.Title}
        #Loop through each document library
        Foreach ($List in $Lists)
        {
            #Get All Files of from the document library - Excluding Folders
            $Query =  New-Object Microsoft.SharePoint.Client.CamlQuery
            $Query.ViewXml = "<View Scope='RecursiveAll'><Query><Where><Eq><FieldRef Name='FSObjType' /><Value Type='Integer'>0</Value></Eq></Where><OrderBy><FieldRef Name='ID' /></OrderBy></Query></View>"
            $ListItems=$List.GetItems($Query)
            $Ctx.Load($ListItems)
            $Ctx.ExecuteQuery() 

            Write-host -f Cyan "`t Processing Document Library: '$($List.Title)' with $($List.ItemCount) Item(s)"

            $DocumentInventory = @()
            #Iterate through each file and get data
            Foreach($Item in $ListItems)
            {
                $File = $Item.File
                $Ctx.Load($File)
                $Ctx.ExecuteQuery()

                $DocumentData = New-Object PSObject
                $DocumentData | Add-Member NoteProperty SiteURL($SiteURL)
                $DocumentData | Add-Member NoteProperty DocLibraryName($List.Title)
                $DocumentData | Add-Member NoteProperty FileName($File.Name)
                $DocumentData | Add-Member NoteProperty FileURL($File.ServerRelativeUrl)
                $DocumentData | Add-Member NoteProperty CreatedBy($Item["Author"].Email)
                $DocumentData | Add-Member NoteProperty CreatedOn($File.TimeCreated)
                $DocumentData | Add-Member NoteProperty ModifiedBy($Item["Editor"].Email)
                $DocumentData | Add-Member NoteProperty LastModifiedOn($File.TimeLastModified)
                $DocumentData | Add-Member NoteProperty Size-KB([math]::Round($File.Length/1KB))
                      
                #Add the result to an Array
                $DocumentInventory += $DocumentData  
            }            
            #Export the result to CSV file
            $DocumentInventory | Export-CSV $ReportOutput -NoTypeInformation -Append
        }
 
        #Iterate through each subsite of the current web and call the function recursively
        ForEach ($Subweb in $Web.Webs)
        {
            #Call the function recursively to process all subsites underneaththe current web
            Get-SPODocumentInventory($Subweb.url)
        }
    }
    Catch {
        write-host -f Red "Error Generating Document Inventory!" $_.Exception.Message
    }
}
 
#Config Parameters
$SiteCollURL="https://crescent.sharepoint.com"
$ReportOutput="C:\temp\DocInventory.csv"
 
#Setup Credentials to connect
$Cred= Get-Credential
$Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($Cred.Username, $Cred.Password)
 
#Delete the Output Report, if exists
if (Test-Path $ReportOutput) { Remove-Item $ReportOutput }
 
#Call the function 
Get-SPODocumentInventory $SiteCollURL
Change the site collection URL and report output path in the script and run:
sharepoint online powershell list all documents

This gets detail report of all documents in SharePoint Online site using PowerShell. Below is the result of PowerShell to list all documents:
sharepoint online powershell get all documents
If you just need to list all documents in a single document library instead of whole site collection, use: PowerShell to List All documents in a Library in SharePoint Online

PnP PowerShell to Get All Documents Inventory from a SharePoint Online Site Collection
Here is the PnP PowerShell way to get all files from all document libraries of a given site collection to a CSV file.
#Parameters
$SiteURL = "https://crescent.SharePoint.com/sites/Marketing"
$CSVPath = "C:\Temp\DocumentInventory.csv"
$global:DocumentInventory = @()
$Pagesize = 2000

#Function to scan and collect Document Inventory
Function Get-DocumentInventory
{
    [cmdletbinding()]
    param([parameter(Mandatory = $true, ValueFromPipeline = $true)] $Web)
  
    Write-host "Getting Documents Inventory from Site '$($Web.URL)'" -f Yellow
    #Calculate the URL of the tenant
    If($Web.ServerRelativeUrl -eq "/")
    {
        $TenantURL = $Web.Url
    }
    Else
    {
        $TenantURL = $Web.Url.Replace($Web.ServerRelativeUrl,'')
    }

    #Exclude certain libraries
    $ExcludedLists = @("Form Templates", "Preservation Hold Library","Site Assets", "Pages", "Site Pages", "Images",
                            "Site Collection Documents", "Site Collection Images","Style Library") 
                              
    #Get All Document Libraries from the Web
    Get-PnPList -Web $Web -PipelineVariable List | Where-Object {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $false -and $_.Title -notin $ExcludedLists -and $_.ItemCount -gt 0} | ForEach-Object {
        #Get Items from List   
        $global:counter = 0;
        $ListItems = Get-PnPListItem -List $_ -Web $web -PageSize $Pagesize -Fields Author, Created, File_x0020_Type -ScriptBlock `
                 { Param($items) $global:counter += $items.Count; Write-Progress -PercentComplete ($global:Counter / ($_.ItemCount) * 100) -Activity "Getting Documents from '$($_.Title)'" -Status "Processing Items $global:Counter to $($_.ItemCount)";} | Where {$_.FileSystemObjectType -eq "File"}
        Write-Progress -Activity "Completed Retrieving Documents from Library $($List.Title)" -Completed
    
            #Get Root folder of the List
            $Folder = Get-PnPProperty -ClientObject $_ -Property RootFolder

            #Iterate through each document and collect data           
            ForEach($ListItem in $ListItems)
            {  
                #Collect document data
                $global:DocumentInventory += New-Object PSObject -Property ([ordered]@{
                    SiteName  = $Web.Title
                    SiteURL  = $Web.URL
                    LibraryName = $List.Title
                    ParentFolder = $Folder.ServerRelativeURL
                    FileName = $ListItem.FieldValues.FileLeafRef
                    FileType = $ListItem.FieldValues.File_x0020_Type
                    AbsoluteURL = "$TenantURL$($ListItem.FieldValues.FileRef)"
                    CreatedBy = $ListItem.FieldValues.Author.Email
                    CreatedAt = $ListItem.FieldValues.Created
                    ModifiedBy = $ListItem.FieldValues.Editor.Email
                    ModifiedAt = $ListItem.FieldValues.Modified
                })
            }
        }
}
 
#Connect to Site collection
Connect-PnPOnline -Url $SiteURL -UseWebLogin
   
#Call the Function for Web & all Subwebs
Get-PnPWeb | Get-DocumentInventory
Get-PnPSubWebs -Recurse| ForEach-Object { Get-DocumentInventory $_ }
Disconnect-PnPOnline
  
#Export Documents Inventory to CSV
$Global:DocumentInventory | Export-Csv $CSVPath -NoTypeInformation
Write-host "Documents Inventory Report has been Exported to '$CSVPath'"  -f Green

Here is my another post written for SharePoint On-Premises to get all documents from SharePoint document library: Get All Documents Inventory in SharePoint using PowerShell

5 comments:

  1. Any way you can include the last entry for version history?

    ReplyDelete
    Replies
    1. Last entry for version history? You can generate version history report for documents using: Version History report for SharePoint Online

      Delete
  2. Hi all,

    Im trying to run the "SharePoint Online PowerShell to Get All Documents" but I get an error in the xml query. the code I used is the same displayed above:

    $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
    $Query.ViewXml = "0"
    $ListItems=$List.GetItems($Query)

    The error I get:
    Cannot convert argument "query", with value: "Microsoft.SharePoint.Client.CamlQuery", for "GetItems" to type "Microsoft.SharePoint.Client.CamlQuery":
    "Cannot convert the "Microsoft.SharePoint.Client.CamlQuery" value of type "Microsoft.SharePoint.Client.CamlQuery" to type
    "Microsoft.SharePoint.Client.CamlQuery"."
    At line:3 char:1
    + $ListItems=$List.GetItems($Query)
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodException
    + FullyQualifiedErrorId : MethodArgumentConversionInvalidCastArgument

    Any ideas?

    Thanks in advanced,

    ReplyDelete
    Replies
    1. Why $Query.ViewXml = "0"? Shouldn't it be the Filter to get files?

      Delete
    2. Sorry, wrong copy/paste, the code is:
      ...
      $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
      $Query.ViewXml = "0"
      $ListItems=$List.GetItems($Query)
      ...



      Thanks!

      Delete

Please Login and comment to get your questions answered!

Powered by Blogger.