| HsHyperEstraier-0.1: HyperEstraier binding for Haskell |
|
Text.HyperEstraier.Database |
|
|
|
|
Description |
An interface to functions to manipulate databases.
|
|
Types
|
|
data Database |
Database is an opaque object representing a HyperEstraier
database.
|
|
|
data EstError |
EstError represents an error that occurred during an operation. It
is usually thrown as a DynException.
| Constructors | InvalidArgument | An argument passed to the function was invalid.
| AccessForbidden | The operation is forbidden.
| LockFailure | Failed to lock the database.
| DatabaseProblem | The database has a problem.
| IOProblem | An I/O operation failed.
| NoSuchItem | An object you specified does not exist.
| MiscError | Errors for other reasons.
|
| Instances | |
|
|
data AttrIndexType |
AttrIndexType represents an index type for an attribute.
| Constructors | SeqIndex | Map from a document ID to an attribute value. This
type of index increases the efficiency of, say,
getDocAttr.
| StrIndex | Map from an attribute value to a document ID. This
increases the search speed when you search for
documents by an attribute value.
| NumIndex | This is similar to StrIndex but for attributes
whose value is a number.
|
| Instances | |
|
|
data OptimizeOption |
OptimizeOption is an option for the optimizeDatabase action.
| Constructors | NoPurge | Omit the process which purges garbage left by
removed documents.
| NoDBOptimize | Omit the process which optimizes the database
file.
|
| Instances | |
|
|
data RemoveOption |
RemoveOption is an option for the mergeDatabase action and the
removeDocument action.
| Constructors | CleaningRemove | Clean up the region in the database where
the removed documents were placed.
|
| Instances | |
|
|
data PutOption |
PutOption is an option for the putDocument action.
| Constructors | CleaningPut | If the new document overwrites an old one,
clean up the region in the database where
the old document was placed.
| WeightStatically | Statically apply the "@weight"
attribute of the document.
|
| Instances | |
|
|
data GetOption |
GetOption is an option for the getDocument action.
| Constructors | NoAttributes | Don't retrieve the attributes of the document.
| NoText | Don't retrieve the body of the document.
| NoKeywords | Don't retrieve the keywords of the document.
|
| Instances | |
|
|
data OpenMode |
OpenMode represents how to open a database.
| Constructors | Reader [ReaderOption] | Open the database in read-only
mode. You can specify ReaderOption
to modify the behavior of the
database.
| Writer [WriterOption] | Open the database in writable
mode. You can specify WriterOption
to modify the behavior of the
database.
|
| Instances | |
|
|
data ReaderOption |
|
|
data WriterOption |
WriterOption is an option for the Writer constructor.
| Constructors | Create [CreateOption] | Create a database if an old one
doesn't exist. You can specify
CreateOption to modify the
behavior of the database.
| Truncate [CreateOption] | Always create a new database even
if an old one already exists. You
can specify CreateOption to
modify the behavior of the
database.
| WriteLock LockingMode | Specify how to lock the database.
|
| Instances | |
|
|
data LockingMode |
LockingMode represents how to lock the database.
| Constructors | NoLock | Perform no exclusive access control at all. This
option is very unsafe.
| NonblockingLock | Use a non-blocking lock. (The author of this
module doesn't know what happens if this
option is in effect. See the manual and the
source code of HyperEstraier and QDBM.)
|
| Instances | |
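As a rough illustration of how OpenMode, WriterOption and LockingMode nest, the sketch below re-declares the documented constructors as local stand-ins so that it compiles without the binding. The constructors of ReaderOption and CreateOption are not listed in this section, so both are left empty here. The names createIfMissing and mayTruncate are hypothetical helpers, not part of the library.

```haskell
-- Stand-in declarations that mirror the constructors documented above.
-- With HsHyperEstraier installed, delete them and import
-- Text.HyperEstraier.Database instead.
data LockingMode = NoLock | NonblockingLock

data CreateOption -- constructors are not listed in this section
data ReaderOption -- constructors are not listed in this section

data WriterOption
  = Create [CreateOption]
  | Truncate [CreateOption]
  | WriteLock LockingMode

data OpenMode
  = Reader [ReaderOption]
  | Writer [WriterOption]

-- | Writable mode that creates the database only if it is missing and
-- asks for a non-blocking lock.
createIfMissing :: OpenMode
createIfMissing = Writer [Create [], WriteLock NonblockingLock]

-- | True when the mode asks for an existing database to be replaced
-- (hypothetical helper).
mayTruncate :: OpenMode -> Bool
mayTruncate (Reader _) = False
mayTruncate (Writer opts) = any isTruncate opts
  where
    isTruncate (Truncate _) = True
    isTruncate _ = False
```

A check such as mayTruncate can be useful before handing a mode built from user input to an open call, since Truncate discards an existing database.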
|
|
data CreateOption |
|
|
data AnalysisOption |
AnalysisOption is an option for the Analysis constructor.
| Constructors | PerfectNGram | Use the perfect N-gram analyzer.
| CharCategory | Use the character category analyzer.
|
| Instances | |
|
|
data IndexTuning |
IndexTuning is an option for the Index constructor.
| Constructors | Small | Predict the database will have less than 50,000
documents.
| Large | Predict the database will have less than 300,000
documents.
| Huge | Predict the database will have less than 1,000,000
documents.
| Huge2 | Predict the database will have less than 5,000,000
documents.
| Huge3 | Predict the database will have more than 10,000,000
documents.
|
| Instances | |
|
|
data ScoreOption |
ScoreOption is an option for the Score constructor.
| Constructors | Nullified | Nullify anything about the score of
documents.
| StoredAsInt | Store the scores for documents into the
database as 32-bit integers.
| OnlyToBeStored | Store the scores for documents into the
database but don't use them during the
search operation.
|
| Instances | |
|
|
Opening and closing databases
|
|
withDatabase :: FilePath -> OpenMode -> (Database -> IO a) -> IO a |
withDatabase fpath mode f opens a database at fpath and
runs f on it. When the action f finishes or throws an exception,
the database is closed automatically. If withDatabase fails
to open the database, it throws an EstError. See openDatabase.
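The close-on-exit behaviour described here is the usual bracket pattern. The sketch below shows that pattern with the modern Control.Exception module and a generic resource. The names withResource, OpenFailed and demo are hypothetical, and this is not necessarily how the binding implements withDatabase (which, per this page, throws EstError as a DynException).

```haskell
import Control.Exception (Exception, bracket, throwIO)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Hypothetical error type standing in for EstError.
data OpenFailed = OpenFailed deriving Show

instance Exception OpenFailed

-- | Acquire a resource with an action that may fail, run the user
-- action, and always release the resource afterwards. A failed open is
-- rethrown as an exception, so the release step is never run for it.
withResource
  :: Exception e
  => IO (Either e r) -- open
  -> (r -> IO ())    -- close
  -> (r -> IO a)     -- action to run
  -> IO a
withResource open close use =
  bracket (open >>= either throwIO pure) close use

-- | Demo with an in-memory "handle": returns the result of the action
-- and whether the release step ran.
demo :: IO (Int, Bool)
demo = do
  closed <- newIORef False
  let open = pure (Right "handle") :: IO (Either OpenFailed String)
      close _ = writeIORef closed True
  n <- withResource open close (pure . length)
  wasClosed <- readIORef closed
  pure (n, wasClosed)
```

With the real binding the same shape would presumably be obtained by passing openDatabase and closeDatabase as the first two arguments, provided the error type has an exception instance in the GHC version in use.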
|
|
openDatabase :: FilePath -> OpenMode -> IO (Either EstError Database) |
openDatabase fpath mode opens a database at fpath. If it
succeeds it returns Right Database, otherwise it
returns Left EstError.
The Database can be shared by multiple threads, but there is one
important limitation in the current implementation of
HyperEstraier itself. A single process can NOT open the same
database twice simultaneously. Such an attempt results in
AccessForbidden.
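Because openDatabase reports failure through Either rather than an exception, callers normally pattern match on the result. A minimal sketch, assuming only the EstError constructors listed on this page: the data declaration below is a local stand-in so the example compiles without the binding, and explainOpen is a hypothetical helper.

```haskell
-- Stand-in copy of the constructors documented for EstError. With
-- HsHyperEstraier installed, delete this declaration and import
-- Text.HyperEstraier.Database instead.
data EstError
  = InvalidArgument
  | AccessForbidden
  | LockFailure
  | DatabaseProblem
  | IOProblem
  | NoSuchItem
  | MiscError
  deriving (Eq, Show)

-- | Turn the result of an open attempt into a message. The type
-- variable db stands for Database.
explainOpen :: Either EstError db -> String
explainOpen (Right _) = "database opened"
explainOpen (Left AccessForbidden) =
  "access forbidden: is this process already holding the database open?"
explainOpen (Left LockFailure) = "could not lock the database"
explainOpen (Left err) = "open failed: " ++ show err
```

With the real binding this would be used along the lines of openDatabase path mode >>= putStrLn . explainOpen.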
|
|
closeDatabase :: Database -> IO () |
closeDatabase db closes the database db. If db has
already been closed, this operation does nothing.
|
|
Manipulating databases
|
|
addAttrIndex :: Database -> String -> AttrIndexType -> IO () |
addAttrIndex db attr idxType creates an index of type
idxType for the attribute attr in the database db.
|
|
flushDatabase :: Database -> Int -> IO () |
flushDatabase db numWords flushes at most numWords index
words in the cache of the database db. If numWords <= 0 all the
index words will be flushed.
|
|
syncDatabase :: Database -> IO () |
Synchronize a database to the disk.
|
|
optimizeDatabase :: Database -> [OptimizeOption] -> IO () |
Optimize a database.
|
|
mergeDatabase :: Database -> FilePath -> [RemoveOption] -> IO () |
mergeDatabase db fpath opts merges another database at fpath
(source) into db (destination). The flags of the two databases
must be the same. If any documents in the source database have the
same URI as documents in the destination, those documents in
the destination will be overwritten.
|
|
setCacheSize |
:: Database | The database.
| -> Int | Maximum size of the index cache. (default: 64 MiB)
| -> Int | Maximum number of cached attribute records. (default: 8192 records)
| -> Int | Maximum number of cached document texts. (default: 1024 documents)
| -> Int | Maximum number of cached search results. (default: 256 records)
| -> IO () | | Change the size of various caches of a database. Passing negative
values leaves the old values unchanged.
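The rule that a negative argument keeps the previous setting can be modelled in pure code. The sketch below is only a model of that rule, not the binding's implementation: CacheSizes, defaults and resize are hypothetical names, the index cache size is assumed to be in bytes, and zero is treated as an ordinary new value because this page does not say otherwise.

```haskell
-- Hypothetical record mirroring the four parameters of setCacheSize.
data CacheSizes = CacheSizes
  { indexCacheSize :: Int -- assumed to be in bytes
  , attrRecords    :: Int
  , docTexts       :: Int
  , searchResults  :: Int
  } deriving (Eq, Show)

-- | The defaults quoted in the parameter descriptions.
defaults :: CacheSizes
defaults = CacheSizes (64 * 1024 * 1024) 8192 1024 256

-- | Apply new settings; any negative argument keeps the old value.
resize :: Int -> Int -> Int -> Int -> CacheSizes -> CacheSizes
resize size attrs texts results old =
  CacheSizes
    { indexCacheSize = keep size (indexCacheSize old)
    , attrRecords    = keep attrs (attrRecords old)
    , docTexts       = keep texts (docTexts old)
    , searchResults  = keep results (searchResults old)
    }
  where
    keep new previous
      | new < 0 = previous
      | otherwise = new
```

In this model, resize (-1) 16384 (-1) (-1) defaults changes only the attribute cache, which corresponds to a call such as setCacheSize db (-1) 16384 (-1) (-1).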
|
|
|
Getting documents in and out
|
|
putDocument :: Database -> Document -> [PutOption] -> IO () |
Put a document into a database. The document must have an
"@uri" attribute. If the database already has a document whose
URI is the same as that of the new document, the old one will be
overwritten. See setURI and
updateDocAttrs.
|
|
removeDocument :: Database -> DocumentID -> [RemoveOption] -> IO () |
Remove a document from a database.
|
|
updateDocAttrs :: Database -> Document -> IO () |
Update attributes of a document in a database. The document to be
updated is determined by the document ID. It is an error to change
the URI of the document to be the same as that of an existing
document. Note that the document body will not be updated. See
putDocument.
|
|
getDocument :: Database -> DocumentID -> [GetOption] -> IO Document |
Find a document in a database by an ID.
|
|
getDocAttr :: Database -> DocumentID -> String -> IO (Maybe String) |
Get an attribute of a document in a database.
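Since the result is a Maybe String, numeric attributes have to be parsed by the caller. A minimal sketch using readMaybe from base, where attrAsInt is a hypothetical helper and not part of the binding:

```haskell
import Text.Read (readMaybe)

-- | Interpret an optional attribute value as an Int. The result is
-- Nothing both when the attribute is absent and when it is not a
-- number.
attrAsInt :: Maybe String -> Maybe Int
attrAsInt attr = attr >>= readMaybe
```

With the real binding this could be combined as fmap attrAsInt (getDocAttr db docId name), where name is whichever numeric attribute the documents carry.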
|
|
getDocIdByURI :: Database -> URI -> IO DocumentID |
Find a document in a database by a URI and return its ID.
|
|
Statistics of databases
|
|
getDatabaseName :: Database -> IO String |
Get the name of a database.
|
|
getNumOfDocs :: Database -> IO Int |
Get the number of documents in a database.
|
|
getNumOfWords :: Database -> IO Int |
Get the number of words in a database.
|
|
getDatabaseSize :: Database -> IO Integer |
Get the size of a database.
|
|
hasFatalError :: Database -> IO Bool |
Return True iff the database has a fatal error.
|
|
Searching for documents
|
|
searchDatabase :: Database -> Condition -> IO [DocumentID] |
Search for documents in a database by a condition.
|
|
searchDatabase' :: Database -> Condition -> IO ([DocumentID], [(String, Int)]) |
Search for documents in a database by a condition. The second item
of the resulting tuple is a map from each search word to the
number of documents that match the word.
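The second component is an ordinary association list, so the standard list functions apply to it. The sketch below works on a hand-written sample rather than a real search result: hitsFor, rarestFirst and sample are hypothetical names, and Int stands in for DocumentID.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- | Look up how many documents matched a given search word.
hitsFor :: String -> (ids, [(String, Int)]) -> Maybe Int
hitsFor word (_, counts) = lookup word counts

-- | Order the search words from fewest to most matching documents.
rarestFirst :: [(String, Int)] -> [String]
rarestFirst = map fst . sortBy (comparing snd)

-- | Hand-written sample in the shape returned by searchDatabase'.
sample :: ([Int], [(String, Int)])
sample = ([1, 2, 3], [("haskell", 3), ("binding", 1)])
```

For the sample above, hitsFor "binding" sample gives Just 1 and rarestFirst (snd sample) puts "binding" before "haskell".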
|
|
metaSearch :: [Database] -> Condition -> IO [(Database, DocumentID)] |
Search for documents in many databases at once.
|
|
metaSearch' :: [Database] -> Condition -> IO ([(Database, DocumentID)], [(String, Int)]) |
Search for documents in many databases at once. The second item of
the resulting tuple is a map from each search word to the number
of documents that match the word.
|
|
scanDocument :: Database -> Document -> Condition -> IO Bool |
Check whether a document matches every phrase in a condition.
To be honest, the author of this binding doesn't really
know what est_db_scan_doc() does. Its documentation is far too
ambiguous across the board. Moreover, the symbols of
HyperEstraier are very badly named. Can you imagine what, say,
est_db_out_doc() does? How about the constant named
ESTCONDSURE? The author got tired of examining the commentless
source code over and over again to write this binding. Its
functionality is awesome though...
|
|
Produced by Haddock version 0.8 |