Coursework: Division of the sentence into phrases

(iv) List No 4: of, to (as Preposition)

(v) List No 5: the, a, an

(vi) List No 6: so much as, so far as, so far, as long as, as soon as, so long as, in order that, in order to, lest, as well as, and, or, nor etc.

(vii) List No 7: such, than, onto, until, all, near, even, when, while, within, last, next, also, less, more, most, whether, much, once, one, any, many, some, where, another, other, each, then, whose, who, whoever, till, until, what, across, whence, according, due to, owing, whereby, prior, wherever, whenever, already, moreover, likewise, however etc.

(viii) List No 8: out, in, on, down etc.
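The Lists above can be thought of as a lookup table that the program consults for each word. The following is only a minimal sketch under stated assumptions, not the author's implementation: the names LISTS and find_list are hypothetical, List No 7 is abridged to a few of its entries, and multi-word entries such as "as soon as" are left out because matching them would need a look at the following words.

```python
# Hypothetical, abridged copies of Lists No 4-8 (single-word entries only).
LISTS = {
    4: {"of", "to"},
    5: {"the", "a", "an"},
    6: {"lest", "and", "or", "nor"},
    7: {"such", "than", "many", "where", "when", "while", "whether"},
    8: {"out", "in", "on", "down"},
}


def find_list(word):
    """Return the number of the List that contains the word, or None."""
    # Normalise case and drop punctuation attached to the word.
    token = word.lower().strip(".,;:")
    for number in sorted(LISTS):
        if token in LISTS[number]:
            return number
    return None


print(find_list("Many"))       # found in List No 7
print(find_list("countries"))  # not registered in any List
```

A word that returns None here corresponds to a word that the program cannot identify and simply records as the next word of the current phrase.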

Some examples of the performance of Algorithm No 2

Below we will present a text divided into phrases according to the instructions for the algorithm:

(i) Many countries also have established or have under construction a free zone, where exporters have access to shipping facilities, a pool of labour and freedom from exchange controls.

(ii) The Caribbean Basin Initiative, a US package of aid and trade incentives to encourage manufacturing, has given an added boost to industrial development in this region.

The analysis of a sentence starts by checking the contents of the memory and sending to print any information stored up to that moment (this is done at the start of each new sentence). The program also ascertains whether the sentence has ended, and records the analysed word in the memory if it is not yet recorded (a procedure carried out after each word). Then the algorithm reads the next word (in No 4a), which in the case of (i) above is many, and proceeds to analyse it in 5. Since it is neither a full stop nor any other Punctuation Mark (5, 7), nor a word specified in 9, 11, 13, 15, 17 or 19, the analysis yields no result until the program gets to operation No 21, where the word many is located in List No 7. Here the program, through operation No 22, checks whether many is followed by yet another word from the Lists. Operation 22ab establishes that it is not, and instructs the program to cut the sentence at this point and to leave three spaces (before many) when recording it, then to return to operation No 2 to start the analysis of the next word.

The next word, countries, cannot be identified (it is not registered in the Lists), so operation 27 instructs the program to record it in the memory as the next consecutive word of the phrase and to return to 2 to continue the analysis of the sentence.

The word also comes next. The program cannot locate it, so it registers the word and proceeds further. The next words, have and established, are dealt with in a similar way. Next comes the Conjunction or. The program locates the word in operation No 17, then checks whether other words from the Lists follow (18). A single space is left before recording it (No 18b). The word have is registered next, and the program reaches under (15), where it draws a dividing line by leaving four spaces (16ab). The analysis carries on in this way till the end of the text.
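The recording step in this walkthrough can be illustrated with a short sketch. It is a simplification under stated assumptions, not the algorithm itself: the gap widths are hard-coded only for the three words whose spacing the walkthrough states (many, or, under), every other word is assumed to be recorded after a single space, and the numbered operations and the look-ahead checks (18, 22) are not reproduced. The names GAP_BEFORE, DEFAULT_GAP and record_with_gaps are hypothetical.

```python
# Gap widths taken from the walkthrough: three spaces before "many",
# one before "or", four before "under".
GAP_BEFORE = {"many": 3, "or": 1, "under": 4}

# Assumption: any other word is recorded after a single space.
DEFAULT_GAP = 1


def record_with_gaps(text):
    """Record each word preceded by the number of spaces assigned to it."""
    recorded = []
    for word in text.split():
        key = word.lower().strip(".,;:")
        gap = GAP_BEFORE.get(key, DEFAULT_GAP)
        recorded.append(" " * gap + word)
    return "".join(recorded)


clause = "Many countries also have established or have under construction a free zone"
divided = record_with_gaps(clause)
print(repr(divided))
```

In the printed string the wide gap before under marks the dividing line described above, while or stays attached to the preceding words by a single space.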

These procedures can be applied to any English-language text. Users of the algorithm can improve it by adding new words to the Lists or by changing the dividing lines to suit other strategies and other interpretations of the boundaries of the English phrase.


Conclusion

Algorithm No 2 was developed with the special purpose of aiding the overall automatic analysis of the sentence. The division of the sentence into smaller units helps us to understand its meaning better, though the division, as presented in this section, is based not on meaning but on formal features. The reader will find a somewhat different and much more accurate interpretation of the boundaries within a sentence in Part 2.

In the course of this study it was observed that each phrase finds further interpretation of its meaning in the phrase that follows it. In other words, the first phrase of a sentence carries a certain meaning, which becomes clearer and more complete with each successive phrase - the next phrase simply adds more information to the meaning of the previous phrase. The phrases depend on one another to varying degrees, which we tried to express by the margin left between them. This is shown graphically in Figure 2.2, which considers two sentences.

The brackets show the dependence of each succeeding phrase both on the previous one and on all preceding ones. In the second sentence, the phrases are separated by equal spaces. Where the space left is smaller, the tie with the previous phrase is stronger (i.e. the next phrase is an integral part of the preceding one). A sudden increase in the interval signals the division between two phrases, as in the example in Figure 2.3. In this example, the second large phrase (Clause) explains the meaning of the first. This is indicated by the interval left and by the brackets.
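The idea that a wider interval marks a weaker tie can be illustrated by recovering the larger units from a recorded line. This is only a sketch under assumptions: the sample string and the threshold of three spaces are chosen for illustration, and split_on_wide_gaps is a hypothetical name, not part of the algorithm.

```python
import re


def split_on_wide_gaps(recorded, min_gap=3):
    """Split a recorded line wherever at least min_gap spaces were left."""
    pattern = r" {%d,}" % min_gap
    # A gap at the very start of the line produces an empty piece; drop it.
    return [piece for piece in re.split(pattern, recorded) if piece]


recorded = "   Many countries also have established or have    under construction a free zone"
for phrase in split_on_wide_gaps(recorded):
    print(phrase)
```

Words separated by a single space stay together in one unit, which mirrors the statement that a smaller space means a stronger tie with the preceding phrase.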

