Guide To Editing with Lace

Colour Codes

When OCR output is spellchecked, an HTML attribute is applied to each word to indicate its spellcheck status and which spellchecking strategies were applied to it. In the display, these attributes are shown as different colours. Spellchecking is performed against dictionaries, which are large lists of known-good words. Here is how the OCR words relate to dictionary words, with the corresponding colour codes:

Additionally, when text has been identified as pertaining to the Apparatus Criticus, it is bordered with vertical blue lines, thus:

64 θυραις καθ ημεραν] θυραν Κ* (θυραις κ. η. Bᵃ) 66 αμαρτανοντες εις
εμε 8 | ασεβουσιν] + εις 8SᵃA | om με 8 (hab 2ᵇᵃ)

Simple Editing

To edit the OCR output, click on a word. The content of that word is now editable: you can type additional characters or use the backspace key to delete; alternatively you can select and delete a range of characters in the usual manner. When a word is clicked on, a tooltip pops up with the corresponding image range from the OCR'd page. It is usually easier to compare the text with this image than it is to scan the entire page image on the left side of the screen.

Once the editor is assured that the text in the word is what is in the pop-up image, he or she should press the Return (or Enter) key. This action is all that is required to save the edit in the underlying database. The editor will note that the colour of the word changes to light-blue, indicating that the word has been manually verified. The editing cursor now moves to the next word on the page.

In the case where a text is highly accurate, editing will simply entail clicking on the first word, checking that the word text corresponds to the pop-up image and then pressing Return. The process is repeated again and again, the text being changed only when necessary.

Once the whole page has been verified thus, a 'Download' button appears at the end of the page. It is not necessary to use this: the page is stored in the database in any case.

Advanced Editing

The following advanced editing functions are available:
  1. If an editor verifies a word with the Control key held down while pressing Return, the edit function is applied to all applicable words on all pages of the text. So, if the word is unchanged and verified, all words that contain that string are similarly verified (sight unseen) and will appear coloured light-blue, no matter what page they appear on. This is a very powerful function and should be used sparingly, especially at first. However, it may become clear that a word like Ἰσραήλ is very unlikely to be incorrectly identified, in which case verifying all of these words at once saves time. If a word has been changed, all words on all pages of the text that had the original form of the word will be changed to the edited form. Note that 'original form' means the very first form output by the OCR engine. Thus, if all words reading ἐπ’ are changed to ἐπ' (with a different final character) and then, using this function, one of those words is used to do a global change to ἐφ', this will not change words that were originally output as ἐπ’.

  2. Holding both the Control key and the Alt key down while pressing Return causes a new work dialog box to be created for the purpose of notes.

  3. Holding the Alt key down while pressing Return causes the creation of a space to indicate a new section of text.

  4. Holding Shift down while pressing Return causes a new blank line to be inserted into the document, directly following the current one. This, too, is editable, but its content is not broken into words. Pressing Return in this line will save its content as expected. This allows one to add content to the page which has been missed by the OCR engine.

  5. If a word is split in the editor, meaning that it appears in multiple boxes rather than altogether in a single box, you may correct this error by completing the word as it appears in the text in the first editing box and deleting the content of the subsequent box(es). If multiple words are detected as a single word, you may solve this by simply separating them with a space, as the computer will detect them as separate words.
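
The 'original form' rule in item 1 can be easier to follow with a small model. The sketch below is only one reading of that rule, not Lace's actual code: the Word class, the global_edit function and the matching logic are all hypothetical, and it assumes that a global change compares the text being replaced against each word's first OCR output rather than against what is currently displayed.

```python
from dataclasses import dataclass


@dataclass
class Word:
    original: str            # first form output by the OCR engine; never changes
    current: str = ""        # form currently shown in the editor
    verified: bool = False

    def __post_init__(self):
        # A fresh word displays exactly what the OCR engine produced.
        if not self.current:
            self.current = self.original


def global_edit(words, old_text, new_text):
    """Set every word whose ORIGINAL form equals old_text to new_text.

    Hypothetical model of Control+Return; returns the number of words changed.
    """
    changed = 0
    for word in words:
        if word.original == old_text:
            word.current = new_text
            word.verified = True
            changed += 1
    return changed


CURLY = "ἐπ\u2019"    # ends in U+2019 RIGHT SINGLE QUOTATION MARK
STRAIGHT = "ἐπ'"      # ends in U+0027 APOSTROPHE

words = [Word(CURLY), Word(STRAIGHT), Word(CURLY)]

global_edit(words, CURLY, STRAIGHT)    # every word now reads ἐπ'
global_edit(words, STRAIGHT, "ἐφ'")    # matches on the original form only

print([w.current for w in words])
```

Under this assumption only the middle word ends up as ἐφ'. The other two still read ἐπ', because their original form was ἐπ’ and so they did not match the second change.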

Using Unicode

Special reminders regarding character use in the editor:

  1. Unicode is a universal character set that provides a consistent method for encoding plain text. It standardizes text by assigning every character a unique numeric value and name. Because each character has a single agreed code, text is consistent across systems and simple to convert.
  2. The use of Unicode characters is imperative to creating a convertible, searchable document. When typing Greek characters, and especially when using a Greek keyboard, remember that the keys you press may not produce the character that is required. For example, when inserting left or right angle brackets, you must use that specific symbol (ex. U+2329, Ps.) rather than the less-than or greater-than symbols found on your keyboard. Because the editor works with plain Unicode text, formatting words as bold or italic is unnecessary.
  3. When a character is unavailable on the standard keyboard, it is likely to be found in an index of Unicode characters, which can be located with an online search engine.
  4. Font is irrelevant when using Unicode. If a character must be pasted into the editing environment, it may appear on a different coloured background, but provided it is Unicode it will be detectable to the OCR engine.
  5. To switch your keyboard to Greek, various resources are available online for both Mac and PC users to help with typing accents, breathing marks, and iota subscripts. Some programs provide an on-screen keyboard, which is controlled by the mouse. Others may prefer to use their own keyboard with a Greek Polytonic layout.
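
When two characters look alike on screen, their Unicode names and code points show whether they are really the same. The following minimal sketch uses Python's standard unicodedata module to compare the keyboard less-than sign with the angle bracket mentioned in item 2, and the two apostrophe-like characters from the ἐπ’ example. The describe helper is a name chosen for illustration and is not part of Lace.

```python
import unicodedata


def describe(ch):
    """Return the code point, Unicode name and general category of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


for ch in ["<", "\u2329", "'", "\u2019"]:
    print(describe(ch))

# Prints:
# U+003C LESS-THAN SIGN (Sm)
# U+2329 LEFT-POINTING ANGLE BRACKET (Ps)
# U+0027 APOSTROPHE (Po)
# U+2019 RIGHT SINGLE QUOTATION MARK (Pf)
```

The two-letter code in parentheses is the Unicode general category, which is where the "Ps" (opening punctuation) in item 2 comes from.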