Juraj Hreško - Academia.edu
Papers by Juraj Hreško
In this paper, we describe a simple approach to filtering unwanted web pages according to their content. The result of this work is a demo application that is usable for real-time filtering and for non-real-time indexing of any given web pages. We describe the proposed technique step by step, discussing possible alternatives for each part. In the end, we discuss the overall quality and the proposed next steps that could lead to a fully usable business application.
Proceedings of the first international workshop on Entity recognition & disambiguation - ERD '14, 2014
This paper describes our system for the Entity Recognition and Disambiguation Challenge 2014. There are two tasks: one to find entities in queries (Short Track), the other to find entities in texts from web pages (Long Track). We participated in both tracks with the same system, tuned to each task. On the final test set, we reached an F-measure of 71.9% on the Long Track and 66.9% on the Short Track. We describe our system and its components in depth, together with their influence on performance. The specifics of each task are also discussed.