SharePoint Search Crawl Rules – What Are They and What Do They Do for Us?

Search crawl rules are a mechanism for influencing how the crawler behaves when it crawls specific sites. A crawl rule consists of a URL pattern with wildcards that matches the target sites, plus a set of options that control how the crawler treats those sites. Crawl rules are useful because SharePoint search results are often noisy: along with the document you searched for, the results include the view and edit properties pages, the AllItems.aspx view, and so on.

How to create SharePoint search crawl rules?

To keep unwanted items out of the search results, you need to create or update the crawl rules. For example, to exclude all document views and properties pages, configure an exclusion rule with the path *://*/Forms/*.aspx. To do this, follow the steps below:

1. Go to the Crawl Rules section of the Search Settings page in the Shared Services Provider (SSP)

2. Add crawl rules to exclude the following paths:
*://*webfldr.aspx*
*://*my-sub.aspx*
*://*mod-view.aspx*
*://*allitems.aspx*
*://*all forms.aspx*

You can also test a specific URL against the crawl rules to determine whether the rules will include or exclude the URL during a crawl.
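
Before adding rules to the farm, it can help to approximate the wildcard matching locally. The sketch below is only an illustration: it uses PowerShell's case-insensitive `-like` operator, which is not the crawler's own matching logic, and both the `Test-UrlExcluded` helper and the sample URLs are hypothetical.

```powershell
# Exclusion patterns copied from the list above
$ExclusionPatterns = @(
    "*://*webfldr.aspx*",
    "*://*my-sub.aspx*",
    "*://*mod-view.aspx*",
    "*://*allitems.aspx*",
    "*://*all forms.aspx*"
)

# Hypothetical helper: returns $true when the URL matches any of the patterns.
# Note: -like also treats ? and [ ] as wildcards, which crawl rules do not.
function Test-UrlExcluded {
    param(
        [Parameter(Mandatory=$true)] [string] $Url,
        [Parameter(Mandatory=$true)] [string[]] $Patterns
    )
    foreach ($Pattern in $Patterns) {
        if ($Url -like $Pattern) { return $true }
    }
    return $false
}

# Made-up sample URLs
Test-UrlExcluded -Url "http://intranet/sites/hr/Shared Documents/Forms/AllItems.aspx" -Patterns $ExclusionPatterns   # True
Test-UrlExcluded -Url "http://intranet/sites/hr/Shared Documents/Policy.docx" -Patterns $ExclusionPatterns           # False
```

The built-in test on the crawl rules page remains the authoritative check, since it evaluates the URL against the rules actually configured and in their configured order.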

In SharePoint 2007, the wildcard “*” is the only operator supported in crawl rules. Because it matches everything, it lacks the flexibility to, for example, recognize and omit URLs that contain a mobile phone number.

Technet Article: https://technet.microsoft.com/en-us/library/cc262934%28office.12%29.aspx

SharePoint 2007 Search Crawler not Crawling with Basic Authentication:
Crawling won’t occur on site collections that use Basic Authentication. You will receive the error message: “Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to crawl this content.”

The solution is to extend the web application that hosts these site collections with Integrated Windows Authentication, and to set the extended site as the default in Alternate Access Mappings.

Search Crawl Rules in SharePoint 2010:

In SharePoint 2010, search crawl rules are set from the Search Service Application. Go to Central Administration >> Manage Service Applications >> Search Service Application >> Crawl Rules

Setting Search Crawl Rules with PowerShell:

Besides the Search Service Application web interface, search crawl rules can also be set with PowerShell.

#Load the SharePoint snap-in (not needed in the SharePoint 2010 Management Shell)
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

#Get Search Service Application
$SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

#Create exclusion crawl rules
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*allitems.aspx*" -CrawlAsHttp 1 -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*mod-view.aspx*" -CrawlAsHttp 1 -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*webfldr.aspx*" -CrawlAsHttp 1 -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*my-sub.aspx*" -CrawlAsHttp 1 -Type ExclusionRule
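
After creating rules, it is worth listing what is configured to confirm the paths were stored as intended. This is a minimal sketch that assumes the service application uses the default name "Search Service Application" and that the script runs on a SharePoint 2010 server.

```powershell
# Load the SharePoint snap-in when running outside the SharePoint 2010 Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the service application has the default name
$SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

# List every crawl rule with its path and type (inclusion or exclusion)
Get-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp |
    Select-Object Path, Type |
    Format-Table -AutoSize
```

A stray space inside a quoted path is easy to miss in a script and shows up clearly in this listing.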

Technet reference: https://technet.microsoft.com/ko-kr/library/ff608119.aspx

SharePoint 2010 also adds a new capability in this area: crawl rule paths can use regular expressions.
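
As a rough sketch of a regular expression rule, the example below uses the `-IsAdvancedRegularExpression` parameter of New-SPEnterpriseSearchCrawlRule. The host name `intranet` and the ten-digit pattern are made up for illustration, and the crawler supports only a limited regular expression syntax, so check the pattern with the crawl rule test in Central Administration before relying on it.

```powershell
# Load the SharePoint snap-in when running outside the SharePoint 2010 Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the service application has the default name
$SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

# Hypothetical pattern: URLs under http://intranet that contain ten consecutive digits.
# The protocol and host are written literally; only the path part uses regex operators.
$RegexPath = "http://intranet/.*[0-9]{10}.*"

# Create the exclusion rule and mark its path as a regular expression
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path $RegexPath -IsAdvancedRegularExpression $true -Type ExclusionRule
```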

Salaudeen Rajack
