Jakov Pavlek | University of Zagreb
Address: Zagreb, Grad Zagreb, Croatia
Papers by Jakov Pavlek
The proceedings bring together 57 domestic and international authors who, in 33 papers and from a variety of research perspectives, address recent topics in speech production and perception and their interdependence in the speech process. The book is dedicated to Professor Damir Horga on the occasion of his seventieth birthday.
In every automatic speech synthesis system, some parts of the text must be preprocessed, i.e. normalized, in order to become pronounceable. This generally applies to numbers, abbreviations, symbols for various units, and foreign names. The Croatian writing system is essentially phonological, which makes grapheme-to-phoneme mapping easier in machine speech generation, but foreign names in Croatian as a rule retain their original spelling. In a speech synthesis system they must therefore be transcribed according to Croatian transcription rules. Starting from a comparative analysis of two Croatian megacorpora, the paper first investigates the share of foreign names in average Croatian text and the dynamics of their entry into Croatian. It then describes a procedure for automatic language identification, which was tested on a sample of more than 30,000 foreign names and their oblique (inflected) forms. Based on the results of this classification, the names classified as German or Italian are then transcribed programmatically. Transcription accuracy of over 9...
The public service Hascheck (Croatian Academic Spell CHECKer) is a free, globally available Web service with a continually growing user base and a rapidly increasing service volume. In this paper we discuss the methods used for processing and learning new, previously unknown words in the Hascheck system. As part of the improvement of the Hascheck service, an interface for manual word acquisition has been developed using the Google Web Search engine on appropriate given domains. Existing systematized knowledge resources, specifically Wikipedia and the Croatian Spell Checker for MS Word, have been used intensively for this purpose. Program modules have been developed for the automatic retrieval and classification of word types based on information about domain, language, and spelling. As a result, some 135,000 new word types have been processed and classified into appropriate classes using the developed software. We also evaluate earlier methods used in the same process and compare t...
Proizvodnja i percepcija govora, 2010
This paper presents the framework and results of Project 2, "Multimodal tools and interfaces for the intercommunication between visually impaired and 'deaf and mute' people", which was developed during the eNTERFACE-2006 summer workshop in the context of the SIMILAR NoE. The system aims to provide alternative tools and interfaces to blind and deaf-and-mute persons so as to enable their intercommunication as well as their interaction with the computer. All the technologies involved are integrated into a treasure hunting game application that is played jointly by a blind user and a deaf-and-mute user. The multimodal interfaces were integrated into a game application because it serves its users both as entertainment and as a pleasant educational tool. The proposed application integrates haptics, audio, and visual output, as well as computer vision, sign language analysis and synthesis, and speech recognition and synthesis, in order to provide an interactive environment in which blind and deaf-and-mute users can collaborate to play the treasure hunting game.