User Guide for Wiley Professional Microsoft Search: SharePoint 2007 and Search Server 2008 978-0-470-27933-5

       Introduction to Enterprise Search 
 With the explosion of digitally borne information in the workplace, Enterprise Search has become 
more critical than ever before. Gone are the days when you could remember the location of all the 
file shares, web sites, and SharePoint sites where the information you needed was stored. Instead, 
sites with terabytes of data are normal now, rather than being the anomaly they were just a few 
years ago. Remembering where you stored something last year, or even last week, has become an 
exercise in searching for a needle in a haystack. Also, with the growth of Internet Search, 
companies have begun to question why they do not have as good a search engine inside the 
firewall as they do outside the firewall. Internal customers are demanding a robust, scalable 
search infrastructure that returns relevant and timely results. That is a tall order, but 
reading this book will help!  
  Why Enterprise Search 
 Some of you may be scratching your heads, wondering why there is a distinction between 
Enterprise Search and Internet Search. Aren’t the problem sets and technologies the same between 
the two? Yes and no. Some of the algorithms and protocols are the same, but some are different. 
While some Internet technologies grew out of Enterprise Search products, the technologies are 
distinctly different for a number of reasons that we will discuss. 
  A Tale of Two Content Types 
 While there is some overlap between Internet content types and Enterprise Search content types, 
the majority of the corpora remain distinct. The Internet is made up mostly of web files, ranging 
from Hypertext Markup Language (HTML) to Extensible Markup Language (XML), with relatively 
little Office document content. The reverse is true for Enterprise Search, where the majority 
of content is usually Office documents. Of course, this all depends on the types of content you 