Search crawl rules are a mechanism for influencing the behavior of the crawler when it crawls specific sites. A single crawl rule is created by specifying a URL wildcard pattern that matches sites, plus a set of options that control how the crawler behaves for those sites. When you search in SharePoint, you often get noisy results: along with the document you searched for, the results also include the view and edit properties pages, the AllItems.aspx view form, and so on.
How to create SharePoint search crawl rules?
To prevent unwanted items from being returned in search results, you need to create or update the crawl rules. To do this, follow the steps below:
1. Go to the crawl rules section of the search settings in the SSP.
2. Add crawl rules to exclude the unwanted paths.
For example, if you would like all the document views and properties pages to be excluded, you can achieve it by configuring the crawl rule: *://*/Forms/*.aspx
You can also test a specific URL against the crawl rules to determine whether the rules will include or exclude the URL during a crawl.
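To get a feel for what the example pattern *://*/Forms/*.aspx covers before creating the rule, it can be tried locally. The sketch below uses PowerShell's own -like wildcard operator as a stand-in for the crawler's matching, which is an assumption and not the crawler's actual implementation. The sample URLs and the host name "intranet" are hypothetical.

```powershell
# Hypothetical sample URLs; none of these come from a real farm.
$urls = @(
    'http://intranet/Docs/Forms/AllItems.aspx',
    'http://intranet/Docs/Forms/DispForm.aspx?ID=7',
    'http://intranet/Docs/Report.docx'
)

# The exclusion pattern from the example above.
$pattern = '*://*/Forms/*.aspx'

foreach ($url in $urls) {
    # -like is PowerShell's wildcard operator (case-insensitive by default).
    # It only approximates the crawler: the rule test in the crawl rules
    # page remains the authoritative check.
    $excluded = $url -like $pattern
    '{0} -> excluded: {1}' -f $url, $excluded
}
```

Under -like, only the first URL matches. The DispForm.aspx URL does not, because the pattern ends at .aspx and the URL continues with a query string. If the crawler treats the pattern the same way, a trailing * (as in *://*/Forms/*.aspx*) would be needed to cover such pages, so it is worth confirming with the built-in URL test.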
In SharePoint 2007, the wildcard operator “*” is the only operator supported in crawl rules. Because it matches everything, it does not have the flexibility to, for example, recognize and omit URLs that contain a mobile phone number.
SharePoint 2007 Search Crawler not Crawling with Basic Authentication:
Crawling won’t occur on site collections that use Basic Authentication. You will receive the error message: “Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to crawl this content.”
The solution is to extend your site collections with Integrated Windows Authentication and set the extended site collections to be the default websites in Alternate Access Mapping.
Search Crawl Rules in SharePoint 2010:
In SharePoint 2010, search crawl rules are set from the Search Service Application. Go to Central Administration >> Manage Service Applications >> Search Service Application >> Crawl Rules.
Setting Search Crawl Rules with PowerShell:
Besides the Search Service Application web interface, search crawl rules can also be set with PowerShell.
# Load the SharePoint snap-in when not running in the SharePoint 2010 Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Get the Search Service Application
$SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

# Create exclusion crawl rules for list views and other noisy system pages
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*allitems.aspx*" -CrawlAsHttp $true -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*mod-view.aspx*" -CrawlAsHttp $true -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*webfldr.aspx*" -CrawlAsHttp $true -Type ExclusionRule
New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp -Path "*://*my-sub.aspx*" -CrawlAsHttp $true -Type ExclusionRule
TechNet reference: http://technet.microsoft.com/ko-kr/library/ff608119.aspx
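Once rules have been created with PowerShell, it can help to list them and, if needed, remove one again. The following is a minimal sketch that must be run on a farm server. It assumes the service application uses the default name "Search Service Application", and the path being removed is just one of the example paths above.

```powershell
# Load the SharePoint snap-in when not running in the SharePoint 2010 Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the service application has the default display name.
$SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

# List all crawl rules with their path and type to confirm what is configured.
$rules = Get-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp
$rules | Select-Object Path, Type

# Remove the rule for one example path by passing the rule object as the identity.
# -Confirm:$false suppresses the confirmation prompt.
foreach ($rule in $rules) {
    if ($rule.Path -eq "*://*my-sub.aspx*") {
        Remove-SPEnterpriseSearchCrawlRule -Identity $rule -SearchApplication $SearchServiceApp -Confirm:$false
    }
}
```

Changes to crawl rules generally take effect on the next crawl, so a full crawl is usually needed before already indexed items disappear from results.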
SharePoint 2010 also adds support for regular expressions in crawl rule paths, which removes the SharePoint 2007 limitation of having only the “*” wildcard.
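As an illustration of the regular expression support, here is a hedged sketch of a rule that excludes URLs containing a ten-digit number, such as a mobile phone number. The host name "intranet", the sample URLs, and the pattern itself are hypothetical. The local check uses PowerShell's -match operator (the .NET regex engine), which is only a stand-in for the crawler's matcher, so confirm the pattern with the crawl rule URL test on your farm. The rule is created only when the SharePoint cmdlets are loaded.

```powershell
# Hypothetical pattern: any URL on the assumed host "intranet" that contains
# ten consecutive digits. Only basic operators (. * [ ] { }) are used.
$pattern = 'http://intranet/.*[0-9]{10}.*'

# Hypothetical sample URLs for a local sanity check.
$samples = @(
    'http://intranet/people/profile.aspx?mobile=5551234567',
    'http://intranet/people/profile.aspx?id=42'
)
foreach ($url in $samples) {
    # Anchor the pattern so it has to cover the whole URL.
    $hit = $url -match ('^' + $pattern + '$')
    '{0} -> excluded: {1}' -f $url, $hit
}

# Create the rule only where the SharePoint cmdlets are available.
if (Get-Command New-SPEnterpriseSearchCrawlRule -ErrorAction SilentlyContinue) {
    # Assumption: the service application has the default display name.
    $SearchServiceApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"
    New-SPEnterpriseSearchCrawlRule -SearchApplication $SearchServiceApp `
        -Path $pattern -Type ExclusionRule -IsAdvancedRegularExpression $true
}
```

The protocol and host are kept literal here and the expression is confined to the path, a cautious choice that keeps the rule from matching unrelated sites.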