HuntnGather

Subversion Repositories:
?path1? @ 36  →  ?path2? @ 37
/trunk/HuntnGather/HuntnGather.readme
@@ -1,9 +1,9 @@
Short: File indexing and search utilities.
Author: Wizardry and Steamworks
Uploader: "Wizardry and Steamworks" <office@grimore.org>
Uploader: "Wizardry and Steamworks" <mail@grimore.org>
Type: util/dir
Replaces: util/dir/HuntnGather.lha
Version: 1.7.3
Version: 1.7.4
Architecture: m68k-amigaos
 
Hunt & Gather - File search and indexing utilities.
@@ -15,22 +15,24 @@
 
-=:[ Changes ]:=-
 
TBA:
* Testing with MuForce.
* Tested using large entire drives and eliminated some bugs.
* Use locale for string comparisons.
* Gather now requires either of -a (add), -r (remove), -c (create).
* Allow specifying multiple paths when gathering.
20211105:
* Use a single binary for all CPUs.
* Refocus the documentation to match the latest changes.
* MuForce and MuGuardianAngel memory usage stress testing.
* Large storage (cca. 8k directories / 80k files) stress testing.
* Use locale for string comparisons on the Amiga.
* "Gather" now requires either of -a (add), -r (remove), -c (create).
* Allow multiple paths when issuing the "Gather" command.
 
-=:[ Introduction ]:=-
 
 
Hunt and Gather are two utiltities for indexing and then searching
"Hunt" and "Gather" are two utilities for indexing and then searching
files within a drive or directory designed to speed up searching files.
 
The Gather utility is meant to index any path and generate a search
database. The Hunt utility will then open the database generated by
Gather and look for files matching the string provided to Hunt as
The "Gather" utility is meant to index any path and generate a search
database. The "Hunt" utility will then open the database generated by
Gather and look for files matching the string provided to "Hunt" as
parameter.
 
The utility was designed to check large collections of icons but the
@@ -39,77 +41,161 @@
 
-=:[ Design ]:=-
 
Hunt and Gather are designed with constant memory usage in order to be
suitable for all Amiga models. Namely, the Gather utility will search
all files in a given path, sort the files in ascending order by using
an external merge sort (tailored down to a 256KiB memory limit).
"Hunt" and "Gather" are designed with constant memory usage in order
to be suitable for all Amigas. Namely, the "Gather" utility will
search all files in a given path, sort the files in ascending order by
using an external file-based merge sort.
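The merge step of a file-based merge sort can be sketched as follows.
This is only an illustration and not the actual "Gather" source: the
function name "merge_runs", the 1024 byte line limit and the layout of
one path per line are assumptions. The point is that only two line
buffers are held in memory no matter how large the files are.

```c
#include <stdio.h>
#include <string.h>

#define RUN_LINE_LEN 1024 /* assumed upper bound for one path line */

/* Read one line into buf; returns 1 on success, 0 at end of file. */
static int read_line(FILE *fp, char *buf, size_t size)
{
    return fgets(buf, (int)size, fp) != NULL;
}

/*
 * Merge two already sorted text files (one path per line) into a
 * third file. Returns 0 on success, -1 if a file cannot be opened.
 */
int merge_runs(const char *a_name, const char *b_name,
               const char *out_name)
{
    char a[RUN_LINE_LEN], b[RUN_LINE_LEN];
    FILE *fa = fopen(a_name, "r");
    FILE *fb = fopen(b_name, "r");
    FILE *fo = fopen(out_name, "w");
    int has_a, has_b;

    if (fa == NULL || fb == NULL || fo == NULL) {
        if (fa != NULL) fclose(fa);
        if (fb != NULL) fclose(fb);
        if (fo != NULL) fclose(fo);
        return -1;
    }

    has_a = read_line(fa, a, sizeof a);
    has_b = read_line(fb, b, sizeof b);

    /* Always emit the smaller head line, then refill that side. */
    while (has_a && has_b) {
        if (strcmp(a, b) <= 0) {
            fputs(a, fo);
            has_a = read_line(fa, a, sizeof a);
        } else {
            fputs(b, fo);
            has_b = read_line(fb, b, sizeof b);
        }
    }

    /* One side is exhausted; copy whatever remains of the other. */
    while (has_a) {
        fputs(a, fo);
        has_a = read_line(fa, a, sizeof a);
    }
    while (has_b) {
        fputs(b, fo);
        has_b = read_line(fb, b, sizeof b);
    }

    fclose(fa);
    fclose(fb);
    fclose(fo);
    return 0;
}
```

A full external sort would repeat this step over pairs of sorted
temporary files until a single sorted file remains.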
 
Conversely, Hunt uses brute force to search for files but by reading
lines from the database without loading the entire database in RAM.
Perhaps ulterior versions of Hunt might partition the database file
just like Gather does and then build Tries in oder to speed up finding
files on the filesystem.
Conversely, "Hunt" searches for files by reading lines from the
database, without loading the entire database in RAM and without
scanning the filesystem again.
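A line-by-line scan of this kind might look like the sketch below. It
is not the actual "Hunt" source: the function name "hunt_substring" is
hypothetical, the database is assumed to hold one path per line, and a
plain substring test with strstr() stands in for AmigaOS pattern
matching. Memory usage stays fixed at one line buffer.

```c
#include <stdio.h>
#include <string.h>

#define HUNT_LINE_LEN 1024 /* assumed upper bound for one path line */

/*
 * Print every database line that contains "term" to "out".
 * Returns the number of matches, or -1 if the database cannot be
 * opened.
 */
long hunt_substring(const char *db_name, const char *term, FILE *out)
{
    char line[HUNT_LINE_LEN];
    long matches = 0;
    FILE *db = fopen(db_name, "r");

    if (db == NULL) {
        return -1;
    }

    /* Only one line is ever held in memory at a time. */
    while (fgets(line, (int)sizeof line, db) != NULL) {
        line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
        if (strstr(line, term) != NULL) {
            fprintf(out, "%s\n", line);
            matches++;
        }
    }

    fclose(db);
    return matches;
}
```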
 
The project adheres to the ANSI C standard and Amiga-centric semantics
are compiled conditionally (in case the "___AmigaOS__" macro is
defined at compile time).Otherwise, Hunt & Gather should run under any
platform that benefits from an ANSI C compiler.
defined at compile time). Otherwise, "Hunt" and "Gather" should run
on any platform that has an ANSI C compiler.
The project is developed from scratch on a real Amiga using StormC.
 
-=:[ Usage ]:=-
 
First the Gather utility is used to index a path:
The "Gather" utility is used to index a path. The following command:
 
 
Gather RAM:
Gather -c RAM:
 
 
which will create a file in the S: directory named "gahter.db". While
Gather is running, the utility will display the number of indexed
directories and files on the command line.
will create a file in the S: directory named "gather.db". "Gather" is
verbose by default and will show the user what the utility is doing,
but the behaviour can be changed with the "-q" (quiet) flag that will
make "Gather" print only errors.
 
In order to look for a file, the Hunt utility is invoked with an
Amiga search pattern:
In order to look for a file, the "Hunt" utility is then invoked with
an AmigaOS search pattern, for instance, the pattern "#?test#?":
 
 
Hunt #?test#?
 
 
in this case, "#?test#?", that will be compared to all the files
indexed previously by Gather. In case any of the files previously
indexed by Gather contain the term "test", then the Hunt utility will
"Hunt" will then search the database previously generated by the
"Gather" utility and will print out all the paths corresponding to the
files matching the supplied pattern.
In the previous example, in case any of the files previously indexed
by "Gather" contain the term "test", then the "Hunt" utility will
display the path to the file.
At some point you might decide to add some other path to the search
database as well. In that case, "Gather" would be invoked with the
"-a" option instead of "-c" in order to add the files:
Gather -a HDH0:Icons/
"Gather" will then index the additional directory and add the new
files to the database. Adding a path to the index database will
require that "Gather" sorts the database again such that after adding
the new files, "Gather" will proceed with sorting.
Lastly, the "-r" parameter can be used with "Gather" to remove paths
that have been previously indexed. Let's say that you have indexed the
following paths with "Gather":
RAM:
HDH0:Icons/
but now you would like to remove the "RAM:" path and all the files
indexed below that path. In that case, you would issue a "Gather"
command with the "-r" parameter:
Gather -r RAM:
and "Gather" will remove all files matching the "RAM:" path. Removing
a path with the "-r" parameter does not take a long time compared to
adding files to the database.
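One plausible reason why removal is cheap is that dropping lines from
an already sorted file leaves the file sorted, so a single pass is
enough. The sketch below illustrates that idea only; it is not the
actual "Gather" source. The function name "remove_path" is
hypothetical, the database is assumed to hold one path per line with
lines shorter than the buffer, and the comparison is case sensitive
for simplicity.

```c
#include <stdio.h>
#include <string.h>

#define DB_LINE_LEN 1024 /* assumed upper bound for one path line */

/*
 * Copy every line of "db_name" that does not start with "prefix"
 * into "out_name". Returns the number of removed lines, or -1 if a
 * file cannot be opened.
 */
long remove_path(const char *db_name, const char *out_name,
                 const char *prefix)
{
    char line[DB_LINE_LEN];
    size_t prefix_len = strlen(prefix);
    long removed = 0;
    FILE *in = fopen(db_name, "r");
    FILE *out;

    if (in == NULL) {
        return -1;
    }
    out = fopen(out_name, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Single pass: the surviving lines keep their sorted order. */
    while (fgets(line, (int)sizeof line, in) != NULL) {
        if (strncmp(line, prefix, prefix_len) == 0) {
            removed++;
        } else {
            fputs(line, out);
        }
    }

    fclose(in);
    fclose(out);
    return removed;
}
```

The caller would then replace the old database with the filtered
file.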
 
-=:[ Gather ]:=-
 
The Gather utility takes one single parameter representing the path
to be indexed; for example, all the following paths are valid:
"Gather" requires that one of the following parameters is specified:
* -a (add files to an already existing database),
* -r (remove files from an already existing database),
* -c (delete the previous database file and create a new database).
 
The "Gather" utility takes several paths as parameters representing
the paths to be indexed; for example, all the following paths are
valid:
 
 
RAM:
DH0:System/
 
 
When Gather runs, a database is created at "S:gather.db" by
overwriting the previous database. For best results, Gather should
run periodically and should scan a path that is most frequently
searched for files.
When the "Gather" utility runs, a database is created at "S:gather.db"
containing all the found files.
"Gather" can also work with a database file other than the default
database at "S:gather.db" by passing the "-d"
parameter when "Gather" is invoked. For instance, the following
command invocation will create the database file at "T:gather.db" and
index the paths "RAM:" and "HDH0:Icons":
Gather -d T:gather.db RAM: HDH0:Icons
Conversely, the "Hunt" utility can then be used to search specified
database files:
Hunt -d T:gather.db #?test#?
The previous "Hunt" command will search a database file located at
"T:gather.db" for all files matching the pattern "#?test#?".
 
-=:[ Hunt ]:=-
 
Hunt is the counterpart to Gather and will search the database at
"S:gather.db" for files matching the terms passed to Hunt on the
command line.
"Hunt" is the counterpart to "Gather" and will search a given database
generated by the "Gather" utility for files matching the terms passed
to "Hunt" on the command line.
 
For instance:
 
Hunt #?test#?
 
will search all files in the Gather database "S:gather.db" for the
will search all files in the "Gather" database "S:gather.db" for the
term "test". If any file within the database partially matches the
term "test", then Hunt will display the path on the command line.
term "test", then "Hunt" will display the path on the command line.
Hunt uses Amiga-style pattern for matching the file names.
"Hunt" uses AmigaOS patterns for matching file names on AmigaOS.
 
-=:[ Notes ]:=-
 
* The "Gather" utility will be slow and that is the intended
behaviour: slow indexing with "Gather", fast searching with "Hunt".
* Temporary files might end up created in the same location where the
"Gather" utility is invoked. Traditionally the temporary directory
on AmigaOS is maintained in RAM but "Gather" cannot use RAM since it
intends to index very large hierarchies. Fortunately, "Gather" will
delete the temporary files once "Gather" is done indexing.
Nevertheless, in case you intend to index a large filesystem
hierarchy please make sure that you invoke "Gather" from a directory
that is able to hold large temporary files.
 
* The output of the "Hunt" utility can be combined with the pipe
operator (in newer AmigaOS releases) or the PIPE: handler on older
AmigaOS releases in order to perform some action on the found
files. For example, using Thomas Radtke's "from" utility located at:
http://aminet.net/package/util/batch/from
and the Workbench 3.2 "MD5Sum" utility, you could print out the MD5
hashes of all files indexed by "Gather" ending in "#?.library":
Hunt #?.library | from - md5sum $1
Or you could generate a list of versions of all libraries indexed
with the "Gather" utility:
Hunt #?.library | from - version $1
-=:[ Source ]:=-
 
The project is open sourced and licensed under MIT. The source code
@@ -118,12 +204,14 @@
svn co http://svn.grimore.org/HuntnGather
StormC was used as the developer environment.
-=:[ Mentions ]:=-
 
The code includes a shim for "getopt" in order to process command line
parameters on Amiga without changing the semantics. The shim is
created by Daniel J. Barrett, barrett@cs.umass.edu and is available on
AmiNET:
parameters on AmigaOS just like one would on a POSIX system. The shim
was created by Daniel J. Barrett, barrett@cs.umass.edu and is
available on AmiNET:
http://aminet.net/package/dev/misc/GetOpt-1.3