Difference between revisions of "Swish-e"

From SME Server

Revision as of 11:28, 15 March 2009


Description

http://www.swish-e.org

Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.

Forum link

http://forums.contribs.org/index.php/topic,43486.0.html

Installation

Download the RPM packages from http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/

wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-debuginfo-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-devel-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-api-2.4.5-4.i386.rpm
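
The five downloads above differ only in the package name, so the URLs can also be generated in a loop. This is a minimal sketch, assuming the same base URL and version as in the wget commands; the helper name rpm_urls is made up for illustration.

```shell
#!/bin/sh
# Base URL and version are copied from the wget commands above.
base_url="http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386"
version="2.4.5-4"

# rpm_urls is a hypothetical helper: it prints one download URL per package.
rpm_urls() {
    for pkg in swish-e swish-e-debuginfo swish-e-devel swish-e-perl swish-e-perl-api; do
        printf '%s/%s-%s.i386.rpm\n' "$base_url" "$pkg" "$version"
    done
}

# Only print the URLs here; nothing is downloaded by this block.
rpm_urls
```

To actually fetch the files, the list can be piped into wget, which reads URLs from standard input when given -i -, for example: rpm_urls | wget -i -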

Install the packages and their dependencies by issuing the following command in the SME Server shell. The command enables Dag's repository, so that repository must be configured first.

How to enable Dag's repository: http://wiki.contribs.org/Dag

yum --enablerepo=dag localinstall swish-e-2.4.5-4.i386.rpm swish-e-d* swish-e-p*

There is no need to reboot. Test:

swish-e -h 

Setup Part 2

To have swish-e index .doc, .xls, and .pdf files, install these additional packages:

yum install --enablerepo=dag perl-Spreadsheet-ParseExcel perl-MIME-Types xpdf catdoc

Test the filters:

swish-filter-test 
swish-filter-test -man
swish-filter-test -headers /path/to/xlsfile.xls
swish-filter-test -headers /path/to/docfile.doc
swish-filter-test -headers /path/to/pdffile.pdf

Configuration

As I was only interested in indexing files in ibays, not web pages, I used the following spider: /usr/libexec/swish-e/DirTree.pl

I modified its check_path function so that it accepts only .doc, .xls, and .pdf files:

sub check_path {
   my $path = shift;
   # Use the match operator =~ here; a plain = would assign to $path instead of testing it.
   return 1 if $path =~ /\.doc$/;  # true if the path ends in .doc
   return 1 if $path =~ /\.xls$/;  # true if the path ends in .xls
   return 1 if $path =~ /\.pdf$/;  # true if the path ends in .pdf
   return 0;  # otherwise false
}
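
Before building an index, it can help to preview which files this filter would let through. The following is a rough sketch using find, assuming the spider applies check_path to regular files only and that the match is case-sensitive, as in the regular expressions above; the function name list_indexable is made up for illustration.

```shell
#!/bin/sh
# list_indexable is a hypothetical helper: it prints every regular file
# under the given directory whose name ends in .doc, .xls or .pdf.
# -name is case-sensitive, so REPORT.DOC is not listed.
list_indexable() {
    find "$1" -type f \( -name '*.doc' -o -name '*.xls' -o -name '*.pdf' \)
}

# Example path from this page; replace ibayname with a real ibay.
ibay_dir="/home/e-smith/files/ibays/ibayname/files"
if [ -d "$ibay_dir" ]; then
    list_indexable "$ibay_dir"
fi
```

If the list is empty or much shorter than expected, check the file extensions (including upper case) before blaming the indexer.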

Next create a config file: ibay.cfg

# ibay.cfg, a swish-e config file
# 
IndexDir /usr/libexec/swish-e/DirTree.pl
#
SwishProgParameters /home/e-smith/files/ibays/ibayname/files
#
StoreDescription HTML <body> 20000
#
# rewrite the stored paths so that search results link to UNC paths
# (the links work in IE; Firefox still needs a fix)
ReplaceRules remove /home/e-smith/files/ibays
ReplaceRules prepend //smeservername
ReplaceRules replace /files/ /
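
To see what the three ReplaceRules lines do to a path, the same steps can be imitated with sed. This is only an approximation for illustration: it applies each rule once, in order, and does not reproduce how swish-e itself treats repeated matches. The file name report.doc and the function name rewrite_path are made up.

```shell
#!/bin/sh
# rewrite_path is a hypothetical helper that mimics the three rules in order:
#   remove  /home/e-smith/files/ibays
#   prepend //smeservername
#   replace /files/ with /
rewrite_path() {
    printf '%s\n' "$1" |
        sed -e 's|/home/e-smith/files/ibays||' \
            -e 's|^|//smeservername|' \
            -e 's|/files/|/|'
}

rewrite_path "/home/e-smith/files/ibays/ibayname/files/report.doc"
# prints //smeservername/ibayname/report.doc
```

The order matters: the remove rule has to run first, because the original path contains /files/ twice and only the second one (inside the ibay) should be collapsed.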

Next, run swish-e. The index files will be placed in the current directory.

swish-e -c ibay.cfg -S prog -v 9

This should create both index.swish-e and index.swish-e.prop in the current directory.
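
A quick way to confirm that the run produced both files is a small check like the one below. It is a sketch only; the function name check_index is made up, and it tests nothing more than that the two files exist and are not empty.

```shell
#!/bin/sh
# check_index is a hypothetical helper: it succeeds only if both index files
# exist in the given directory and are non-empty.
check_index() {
    for f in index.swish-e index.swish-e.prop; do
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $1/$f" >&2
            return 1
        fi
    done
    echo "index files found in $1"
}

# Check the current directory, where swish-e was run.
check_index . || echo "run swish-e first"
```

Once the files are there, a test search can be run against the index with swish-e -f index.swish-e -w someword, where someword is a word you expect to occur in one of the indexed documents.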

Under construction

swish.cgi

Under construction

Options

Under construction

Usage

Under construction