This collection now lets users permanently correct the OCR output of certain (though not all) texts. The text of every page can be changed, but only pages connected to an XML database, whose changes are therefore stored permanently, have a gray text background. Most pages have a beige background instead; these can be edited, but the changes are not stored: all edits are lost upon leaving the page, and no other user will see them. Changes to gray-background pages, by contrast, are stored in the database and are visible to all other users when they load the page. The rest of this guide pertains to the gray-background pages.
¹²παρελθεῖν: this word has passed spellcheck; that is, it matches one of the words in the spellcheck dictionary. Note that characters such as the '¹²' here may appear before or after the dictionary word without disrupting the spellcheck.
Κυρίου: this word passed spellcheck when it was transformed to its lowercase form.
(11): this word comprises numbers or punctuation.
κατοικήσομεν: this word has been corrected by substituting one character for another. In this case, an 'α' was replaced with an 'ο'. (The exact substitution is encoded in the HTML attribute but is not visible to the reader.)
εὑρήματα: this word was matched to a dictionary word when a doubled letter was reduced to a single instance of that letter. In this case, the OCR output was εὑρήμματα, a word that is not in the dictionary.
besides, -five: in this case, two dictionary words were separated only by punctuation, with no space between them.
τῆς αὐλητρίδος: these words passed spellcheck only when a space was inserted between them. The original OCR output was τῆςαὐλητρίδος.
ὥμοσεν: this word cannot be matched with a dictionary word by any of the strategies.
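The matching strategies listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the collection's actual spellchecker: the function names, the category labels, the tiny dictionary, and the order in which the strategies are tried are all hypothetical, and the punctuation-between-words case is left out for brevity.

```python
import unicodedata

# Hypothetical toy dictionary; the real spellcheck dictionary is not shown in this guide.
DICTIONARY = {
    unicodedata.normalize("NFC", w)
    for w in ["παρελθεῖν", "κυρίου", "κατοικήσομεν", "εὑρήματα", "τῆς", "αὐλητρίδος"]
}


def strip_affixes(token):
    """Drop leading and trailing non-letter characters, such as the '¹²' verse number."""
    start, end = 0, len(token)
    while start < end and not token[start].isalpha():
        start += 1
    while end > start and not token[end - 1].isalpha():
        end -= 1
    return token[start:end]


def classify(token, dictionary=DICTIONARY):
    """Return a label naming the first strategy that matches (order is an assumption)."""
    core = strip_affixes(unicodedata.normalize("NFC", token))
    if not core:
        return "numbers-or-punctuation"
    if core in dictionary:
        return "exact"
    if core.lower() in dictionary:
        return "lowercase"
    # A doubled letter reduced to a single instance.
    for i in range(len(core) - 1):
        if core[i] == core[i + 1] and core[:i] + core[i + 1:] in dictionary:
            return "doubled-letter"
    # Exactly one character substituted for another.
    for word in dictionary:
        if len(word) == len(core) and sum(a != b for a, b in zip(word, core)) == 1:
            return "substitution"
    # A space inserted between two dictionary words.
    for i in range(1, len(core)):
        if core[:i] in dictionary and core[i:] in dictionary:
            return "split"
    return "unmatched"
```

For instance, `classify("εὑρήμματα")` falls through the exact and lowercase checks and is caught by the doubled-letter strategy, while `classify("ὥμοσεν")` matches nothing in this toy dictionary.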
Additionally, when text has been identified as pertaining to the Apparatus Criticus, it is bordered with vertical blue lines, thus: 64 θυραις καθ ημεραν] θυραν Κ* (θυραις κ. η. Bᵃ) 66 αμαρτανοντες εις
To edit the OCR output, click on a word. The content of that word then becomes editable: you can type additional characters or use the backspace key to delete; alternatively, you can select and delete a range of characters in the usual manner. When a word is clicked on, a tooltip pops up showing the corresponding region of the image of the OCR'd page. It is usually easier to compare the text with this image than it is to scan the entire page image on the left side of the screen.
Once the editor is satisfied that the text of the word matches the pop-up image, he or she should press the Return (or Enter) key. This action is all that is required to save the edit in the underlying database. The colour of the word changes to light blue, indicating that the word has been manually verified, and the editing cursor moves to the next word on the page.
Where a text is highly accurate, editing simply entails clicking on the first word, checking that the word's text corresponds to the pop-up image, and then pressing Return. The process is repeated word by word, with the text changed only when necessary.
Once the whole page has been verified thus, a 'Download' button appears at the end of the page. It is not necessary to use this: the page is stored in the database in any case.
If an editor verifies a word with the Control key held down while pressing Return, the edit is applied to all applicable words on all pages of the text. So, if the word is unchanged and verified, all words that contain that string are similarly verified (sight unseen) and will appear in light blue, no matter what page they appear on. This is a very powerful function and should be used sparingly, especially at first. However, it may become clear that a word like Ἰσραήλ is very unlikely to be incorrectly identified, in which case verifying all of these words at once saves time.
If a word has been changed, all words on all pages of the text that had the original form of the word will be changed to the edited form. Note that 'original form' means the very first form output by the OCR engine. Thus, if all words reading ἐπ’ are changed to ἐπ' (with a different final character) and one of those words is then used to make a global change to ἐφ', the words that were originally output as ἐπ’ will not be changed.
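One way to picture this rule is a minimal Python sketch in which every word remembers its first OCR form separately from the text currently shown. The `Word` class and the `global_change` function are hypothetical names introduced for illustration, and the matching logic is an interpretation of the description above, not the site's actual code.

```python
from dataclasses import dataclass


@dataclass
class Word:
    original: str          # very first form output by the OCR engine; never updated
    current: str           # text currently displayed on the page
    verified: bool = False


def global_change(words, form_before_edit, new_form):
    """Apply an edit to every word whose ORIGINAL OCR form equals form_before_edit.

    Words are matched on `original`, not on `current` (an assumption drawn from
    the description above). Returns the number of words changed.
    """
    changed = 0
    for word in words:
        if word.original == form_before_edit:
            word.current = new_form
            word.verified = True
            changed += 1
    return changed


# The example from the text: "\u2019" is the curly apostrophe, "'" the straight one.
words = [
    Word(original="ἐπ\u2019", current="ἐπ'"),  # first output with the curly apostrophe, later edited
    Word(original="ἐπ\u2019", current="ἐπ'"),
    Word(original="ἐπ'", current="ἐπ'"),       # output with the straight apostrophe from the start
]
count = global_change(words, form_before_edit="ἐπ'", new_form="ἐφ'")
```

In this sketch only the third word is changed to ἐφ', because the first two were originally output with a different final character.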
Holding the Control key down while also holding the Alt key down and pressing Return causes a new work dialog box to be created for the purpose of notes.
Holding the Alt key down while pressing Return creates a space to indicate a new section of text.
Holding Shift down while pressing Return inserts a new blank line into the document, directly after the current one. This line, too, is editable, but its content is not broken into words. Pressing Return in this line saves its content as expected. This allows one to add content to the page that was missed by the OCR engine.
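The Return-key combinations described in this guide can be summarised as a small lookup table. The Python sketch below is only a reader's summary: the names `RETURN_ACTIONS` and `action_for_return` are hypothetical, and combinations the guide does not mention simply return `None` rather than guessing at their behaviour.

```python
# Modifier keys held while pressing Return, mapped to the effect described in this guide.
RETURN_ACTIONS = {
    frozenset(): "save the word and move to the next word",
    frozenset({"ctrl"}): "apply the edit or verification to matching words on all pages",
    frozenset({"ctrl", "alt"}): "create a dialog box for notes",
    frozenset({"alt"}): "create a space marking a new section of text",
    frozenset({"shift"}): "insert an editable blank line after the current one",
}


def action_for_return(modifiers=()):
    """Look up the documented effect of Return with the given modifier keys held."""
    return RETURN_ACTIONS.get(frozenset(modifiers))
```

For example, `action_for_return(["alt", "ctrl"])` gives the notes entry regardless of the order in which the modifiers are listed.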
If a word is split in the editor, meaning that it appears in multiple boxes rather than altogether in a single box, you may correct this error by completing the word as it appears in the text in the first editing box and deleting the content of the subsequent box(es). If multiple words are detected as a single word, you may solve this by simply separating them with a Space, as the computer will detect them as separate words.
greater than symbols found on your keyboard. Because of the nature of Unicode characters, formatting words as bold or italic is unnecessary.
Mouse. Others may prefer to use their own keyboard in Greek Polytonic form.