
Crawling Large Libraries in SharePoint 2007

I had been experiencing issues crawling a large document library of over 60,000 items in an x64 SharePoint 2007 farm after the index became corrupt and I had to reset the crawled content. The only error I could find in the crawl log was “The item may be too large or corrupt.” The crawler stopped at around 33,000 items from this document library. I searched the internet extensively for this problem and found a few blogs describing it, each with a different solution. What fixed my issue was a mix of what I found. After the changes below, the crawler was able to index all 60,000 items from the one library.

Registry changes:

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\DedicatedFilterProcessMemoryQuota –> Change the value to: 256000000 Hex
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\FilterProcessMemoryQuota –> Change the value to: 256000000 Hex
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\FolderHighPriority –> Change the value to: 500 Hex
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\DeleteOnErrorInterval –> Change the value to: 4 Decimal
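Before typing these numbers into regedit, it is worth checking the radix. The sketch below assumes the four values are REG_DWORD entries (32-bit), which the list above does not state. Under that assumption a value can have at most 8 hex digits, so 256000000 only fits when it is read as decimal. The helper names `fits_dword` and `reg_dword` are made up for illustration.

```python
# Assumption: the Gathering Manager values are REG_DWORD (32-bit unsigned).
DWORD_MAX = 0xFFFFFFFF


def fits_dword(text: str, base: int) -> bool:
    """Return True if `text`, read in the given base, fits in 32 bits."""
    return 0 <= int(text, base) <= DWORD_MAX


def reg_dword(text: str, base: int) -> str:
    """Format a value the way a .reg file writes a DWORD (8 hex digits)."""
    value = int(text, base)
    if not 0 <= value <= DWORD_MAX:
        raise ValueError(f"{text} (base {base}) does not fit in a DWORD")
    return f"dword:{value:08x}"


if __name__ == "__main__":
    # Values as listed above, tried under both radixes.
    for name, text in [
        ("DedicatedFilterProcessMemoryQuota", "256000000"),
        ("FilterProcessMemoryQuota", "256000000"),
        ("FolderHighPriority", "500"),
        ("DeleteOnErrorInterval", "4"),
    ]:
        for base in (10, 16):
            status = reg_dword(text, base) if fits_dword(text, base) else "too large"
            print(f"{name}: {text} in base {base} -> {status}")
```

If the check reports "too large" for the radix you intended, double-check which radio button (Hexadecimal or Decimal) is selected in the regedit dialog before saving.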

Search timeout settings:

1. Central Administration -> Application Management -> Search section -> Manage search service

2. Manage Search Service page -> “Farm-level search settings”

3. Under “Timeout Settings”, change both values from 60 to 500 seconds

The crawler could not communicate with the server

“The crawler could not communicate with the server. Check that the server is available and that the firewall access is configured correctly.”

This is a rather generic error message. I found that it generally covers problems communicating with the index server: the target server either responds with an HTTP 5xx status code (“internal server error”) or does not respond at all.

Quite often, when I requested that particular page on the index server (not the WFE!), I would see the error. For instance, on one site an email contact form was failing because it relied on the Referer header, which is not sent by the indexer or when you open the page directly in a browser. In another case the cause was an error in the web.config.

If you’re having this problem with local SharePoint sites (and you have verified that the page works), remember to test it on the index server, not just the front end, because the index server uses itself for indexing. You might have forgotten to deploy some resources there, and many other errors are possible. Enable the stack trace on the index server (a setting in the web.config), then fix the actual problem.
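A quick way to repeat that check is a small script run on the index server itself. The following is only a sketch: the function names `classify` and `probe` are made up for illustration, and it uses plain `urllib`, which sends no Referer header (much like the contact form case above) but also does not do Windows integrated authentication, so a site that requires it will come back as "access denied" rather than showing the real page.

```python
import urllib.error
import urllib.request


def classify(status):
    """Map an HTTP status code (or None for no response) to a rough label."""
    if status is None:
        return "no response"
    if 500 <= status <= 599:
        return "server error"
    if status in (401, 403):
        return "access denied"
    if 200 <= status <= 299:
        return "ok"
    return "other"


def probe(url, timeout=30):
    """Request `url` once and return (status, label); status is None on failure."""
    request = urllib.request.Request(url)  # no Referer header is added
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        status = err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout: nothing came back.
        status = None
    return status, classify(status)


if __name__ == "__main__":
    import sys

    # Usage: python probe.py <url of the failing page>
    for page in sys.argv[1:]:
        print(page, probe(page))
```

A "server error" or "no response" result for a page that works from the front end points at the index server's own copy of the site, which is where the stack trace should be enabled.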