Search

Since version 2.0, HNK can be accessed and searched with the client program Bonito.

Installation of Bonito

After downloading Bonito for your operating system (Windows, Linux/Unix/Mac OS) from its official website (http://www.textforge.cz/download), you have to install it. On Windows it is sufficient to unzip the distribution zip file into a directory (e.g. c:\Program Files\Bonito), create a shortcut on the desktop and start the program. To install and run Bonito on Linux/Unix/Mac OS you will also need Tcl/Tk version 8.2 or higher.

Setting options in Bonito

When Bonito is started for the first time, the connection parameters and user options should be set.

Setting connection parameters

Connection parameters are available through the menu Manager -> Connection.

During the testing period of HNK v 2.0, provisional access is granted to all users (user name: gost; no password). If your computer is already connected to the Internet, pressing OK should establish the connection to the HNK server; a button with the name of the default subcorpus will then appear in the upper right corner of the Bonito window.

Setting user options

User options are available through the menu Manager -> Options.

If you want to keep the settings for the current program session only, confirm the changes with Apply. If you confirm with Save, the settings are stored for all subsequent program starts rather than for the current session; they take effect only the next time the program is started.

Selecting the corpus

Before any search, the desired corpus or subcorpus should be selected by pressing the button in the upper right corner of the main program window.

Simple queries

Query criteria, i.e. the keyword whose concordance you want to retrieve, are entered in the New query row at the top of the main program window.

Pressing the arrow at the far right of the input row opens the list of previous queries for selection.

Examples of simple queries:

Complex queries

Queries with regular expressions

Table with examples of usage of basic regular expressions


Expression  Description                                                    Example           Expected results
.           dot: any single character                                      glav.             glava, glave, glavu
*           asterisk: zero or more occurrences of the previous character   glav*             gla, glav, glavv, glavvv, glavvvv, ...
+           plus sign: one or more occurrences of the previous character   glav+             glav, glavv, glavvv, glavvvv, ...
{x}         braces: exactly x repetitions of the previous character        glav.{3}          glavama, glavica, glavicu, glavice, glavici, glavara, glavare, glavari, ...
{x,y}       range in braces: x to y repetitions of the previous character  glav{1,4}         glav, glavv, glavvv, glavvvv
                                                                           .{5,10}           all strings between 5 and 10 characters in length
|           vertical bar: logical OR                                       "glava"|"ruka"    glava, ruka
[ ]         brackets: a set or range of characters to choose from          glav[aeiou]       glava, glave, glavi, glavo, glavu
                                                                           [g-k]lava         glava, hlava, ilava, jlava, klava
( )         parentheses: grouping of subexpressions                        (G|g)lava         Glava, glava
                                                                           ([Gg]|[Pp])lava   Glava, glava, Plava, plava
(?i)        case-insensitive matching                                      (?i)glava         Glava, glava
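
The operators in the table can also be tried outside Bonito. The following is a minimal sketch in Python, whose re module supports the same basic operators. It assumes that an expression has to match a whole token (hence re.fullmatch), and the token list and the function name matching_tokens are invented for illustration; Bonito's own regular-expression engine may differ in details.

```python
import re

# Hypothetical token list; the words are illustrative, not drawn from HNK.
TOKENS = ["gla", "glav", "glavv", "glava", "glave", "Glava",
          "glavama", "glavica", "hlava", "plava", "ruka"]


def matching_tokens(pattern, tokens=TOKENS):
    """Return the tokens that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [token for token in tokens if compiled.fullmatch(token)]


# One example per operator from the table.
EXAMPLES = [
    r"glav.",            # dot: "glav" plus any single character
    r"glav*",            # asterisk: "gla" plus zero or more "v"
    r"glav+",            # plus sign: "gla" plus one or more "v"
    r"glav.{3}",         # braces: "glav" plus exactly three characters
    r"glav{1,4}",        # range in braces: "gla" plus one to four "v"
    r"glava|ruka",       # vertical bar: either word
    r"[g-k]lava",        # brackets: first letter from g to k
    r"([Gg]|[Pp])lava",  # parentheses: grouped alternatives
    r"(?i)glava",        # (?i): ignore case
]

for pattern in EXAMPLES:
    print(pattern, "->", ", ".join(matching_tokens(pattern)))
```

Running the loop prints, for each expression, the tokens from the sample list that it accepts, which makes it easy to see how the asterisk and the plus sign differ on the shortest forms.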

If you want to use a dot ., plus sign +, asterisk *, parentheses ( ), braces { } or brackets [ ] literally, each of these characters should be escaped with \.
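
The effect of escaping can be illustrated with a short Python sketch. The helper escape_literal is hypothetical and covers only the characters listed above; in Bonito the backslashes are typed by hand in the query row. Whole-token matching (re.fullmatch) is again an assumption.

```python
import re

# Characters the text above lists as needing a backslash when meant literally.
SPECIAL = set(".+*(){}[]")


def escape_literal(text):
    """Prefix each listed special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


pattern = escape_literal("1.5")
print(pattern)                              # 1\.5
print(bool(re.fullmatch("1.5", "1x5")))     # True: unescaped dot matches any character
print(bool(re.fullmatch(pattern, "1x5")))   # False: escaped dot matches only a dot
print(bool(re.fullmatch(pattern, "1.5")))   # True
```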

Queries with structural tags in text

The list of structural tags used in the corpus is available through the menu Corpus -> Information summary or with the shortcut Ctrl+I.

Queries with lemma or morphosyntactic description

Notice: This type of query will be available for the whole HNK in v 2.5, scheduled for spring 2006. In the meantime it can be tested on the cw2000 subcorpus only.

An explanation and detailed description of the morphosyntactic descriptions (MSD) used in HNK can be found on the official web pages of the MULTEXT-East recommendations.

The list of attributes used in the corpus (i.e. lemma, MSD etc.) is available through the menu Corpus -> Information summary or with the shortcut Ctrl+I (see illustration above).
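
As an illustration only, a query on token attributes might be assembled as in the Python sketch below. It assumes that the query language accepts bracketed attribute conditions of the form [attribute="value"], combined with &; this syntax is not confirmed by the text above and should be checked against the Bonito manual. The helper name token_query and the example values are hypothetical.

```python
def token_query(**attributes):
    """Build one bracketed token condition from attribute=value pairs.

    Assumed syntax: [name="value" & name="value"]; values may be
    regular expressions such as those in the table of basic expressions.
    """
    parts = ['{}="{}"'.format(name, value) for name, value in attributes.items()]
    return "[" + " & ".join(parts) + "]"


# All word forms of one lemma (example lemma chosen for illustration).
print(token_query(lemma="glava"))

# The same lemma restricted to MSDs beginning with N (an assumed pattern).
print(token_query(lemma="glava", msd="N.*"))
```

The resulting strings would be typed or pasted into the New query row; the attribute names lemma and msd are the ones listed in the corpus information summary.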

Additional information in concordances

Besides the text, the following additional information can be displayed in concordances: source reference, token attributes, structural tags.

Displaying source abbreviation

The source reference can be displayed in concordance through menu selection View -> References or with shortcut F4.

In HNK the source reference is encoded as doc.file, so this reference should be selected in order to display the source abbreviation at the beginning of each concordance row.

Displaying token attribute

The token attribute can be displayed in concordance through menu selection View -> Attributes or with shortcut F5.

In HNK token attributes are encoded as lemma for lemmas and msd for morphosyntactic descriptions. The selected attributes are displayed either next to the keyword (Only in KWIC) or next to each token in the concordance (For all positions).

Displaying structural tags

The structural tags can be displayed in concordance through menu selection View -> Structures or with shortcut F6.

All structural tags used in selected corpus are available for selection. By default they are displayed in green.

Printing and saving the concordance

Concordances can be printed (Concordance -> Print) or saved to a file (Concordance -> Save to file or shortcut F2). When printing, note that Bonito automatically uses the default printer with all its default settings and offers no way to change them interactively; depending on the size of the context, a printer and paper of size A3 will often be needed. The alternative is to save the concordance to a file, which can then be opened and reformatted in another program.

When saving the concordance to a file, it is recommended to select the utf-8 character encoding, since it best preserves characters from different character sets. Files in this universal encoding can now be opened by almost all text editors.
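
A saved concordance can then be processed by other programs. The sketch below, in Python, shows the point about encoding: the file is read with utf-8 stated explicitly. The layout of the saved file is an assumption (one concordance row per line), and the sample rows, file name and function name read_concordance are invented for the demonstration.

```python
import tempfile
from pathlib import Path


def read_concordance(path):
    """Read a saved concordance as a list of non-empty lines, decoded as UTF-8."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


# Invented sample rows standing in for a file saved from Bonito.
SAMPLE = (
    "izvor01\tpodigla je <glavu> i pogledala\n"
    "izvor02\tčovjek bez <glave> i repa\n"
)

with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "concordance.txt"
    saved.write_text(SAMPLE, encoding="utf-8")
    rows = read_concordance(saved)

for row in rows:
    print(row)
```

Stating the encoding explicitly avoids depending on the platform default, which on some systems is not utf-8 and would garble letters such as č.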

Bonito documentation

Additional features of Bonito are described in its manual, which can be opened in HTML format through the menu Help -> Documentation. The same manual is also available in PDF format in the doc subdirectory of the directory where Bonito is installed (e.g. c:\Program Files\Bonito\doc).