ermiyas birhanu | Bahir Dar University

Papers by ermiyas birhanu

arXiv (Cornell University), Oct 5, 2022

Content delivery networks (CDNs) are key components of high-throughput, low-latency services on the internet. CDN cache servers have limited storage and bandwidth, and they implement state-of-the-art cache admission and eviction algorithms to select the most popular and relevant content for the customers they serve. The aim of this study was to use state-of-the-art recommender system techniques to predict ratings for cache content in CDNs. Matrix factorization was used to predict content popularity, which is valuable input for the content eviction and admission algorithms run on CDN edge servers. A custom matrix factorization class and MyMediaLite were used. The input CDN logs were received from a European telecommunication service provider. We built a matrix factorization model with that data and used grid search to tune its hyperparameters. Experimental results indicate that the proposed approaches are promising, and we showed that a low root mean square error can be achieved on real-life CDN log data.
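As a rough illustration of the technique this abstract names, the following is a minimal sketch of matrix factorization trained by stochastic gradient descent, scored with RMSE and tuned with a small grid search. It is not the authors' custom class and not MyMediaLite. The toy ratings, the function names (`train_mf`, `rmse`, `grid_search`), and all hyperparameter values are assumptions made for the example. The abstract does not describe how CDN logs are turned into ratings, so the data here is simply (user, item, rating) triples.

```python
import math
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Fit rating ~ mu + P[u] . Q[i] by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean offset
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + sum(P[u][f] * Q[i][f] for f in range(k)))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)  # L2-regularized update
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q

def predict(model, u, i):
    mu, P, Q = model
    return mu + sum(p * q for p, q in zip(P[u], Q[i]))

def rmse(model, ratings):
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

def grid_search(train, valid, n_users, n_items, ks=(1, 2), regs=(0.01, 0.1)):
    """Return (validation RMSE, k, reg) for the best grid point."""
    best = None
    for k in ks:
        for reg in regs:
            score = rmse(train_mf(train, n_users, n_items, k=k, reg=reg), valid)
            if best is None or score < best[0]:
                best = (score, k, reg)
    return best

# Invented toy data: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
TRAIN = [(0, 0, 5), (0, 1, 5), (0, 2, 1), (1, 0, 5), (1, 3, 1), (1, 2, 1),
         (2, 0, 1), (2, 2, 5), (2, 3, 5), (3, 1, 1), (3, 2, 5), (3, 3, 5)]
VALID = [(0, 3, 1), (1, 1, 5), (2, 1, 1), (3, 0, 1)]

MODEL = train_mf(TRAIN, n_users=4, n_items=4)
MEAN = sum(r for _, _, r in TRAIN) / len(TRAIN)
BASELINE = math.sqrt(sum((r - MEAN) ** 2 for _, _, r in TRAIN) / len(TRAIN))
BEST = grid_search(TRAIN, VALID, n_users=4, n_items=4)
print("train RMSE:", rmse(MODEL, TRAIN), "mean-only baseline:", BASELINE)
print("best (valid RMSE, k, reg):", BEST)
```

Comparing the trained RMSE against a mean-only baseline is one simple way to check that the learned factors carry information.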

arXiv (Cornell University), Oct 11, 2022

Content delivery networks (CDNs) are the backbone of the Internet and are key in delivering high-quality video on demand (VoD), web content and file services to billions of users. CDNs usually consist of hierarchically organized content servers positioned as close to the customers as possible. CDN operators face a significant challenge when analyzing billions of web server and proxy logs generated by their systems. The main objective of this study was to analyze the applicability of various clustering methods in CDN error log analysis. We worked with real-life CDN proxy logs, identified key features included in the logs (e.g., content type, HTTP status code, time-of-day, host) and clustered the log lines corresponding to different host types offering live TV, video on demand, file caching and web content. Our experiments were run on a dataset consisting of proxy logs collected over a 7-day period from a single, physical CDN server running multiple types of services (VoD, live TV, file). The dataset consisted of 2.2 billion log lines. Our analysis showed that CDN error clustering is a viable approach towards identifying recurring errors and improving overall quality of service.
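To make the idea of clustering error log lines concrete, here is a small sketch that encodes a few of the features the abstract mentions (HTTP status code, time of day, content type) and groups them with k-means. The abstract does not say which clustering methods were evaluated, so k-means is an assumption. The record fields, the `featurize` encoding, and the sample log entries are also invented for illustration.

```python
import math

def featurize(record):
    """Encode one parsed log record (hypothetical fields) as a numeric vector."""
    status_class = record["status"] // 100
    return [
        1.0 if status_class == 4 else 0.0,   # client error (4xx)
        1.0 if status_class == 5 else 0.0,   # server error (5xx)
        record["hour"] / 23.0,               # time of day scaled to [0, 1]
        1.0 if record["content_type"].startswith("video/") else 0.0,
    ]

def kmeans(points, k, iters=20):
    """Plain k-means; initial centroids are the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented sample records, not taken from the study's dataset.
LOGS = [
    {"status": 404, "hour": 2, "content_type": "text/html"},
    {"status": 503, "hour": 20, "content_type": "video/mp4"},
    {"status": 404, "hour": 3, "content_type": "text/html"},
    {"status": 502, "hour": 21, "content_type": "video/mp4"},
    {"status": 403, "hour": 1, "content_type": "text/html"},
    {"status": 503, "hour": 22, "content_type": "video/mp4"},
]

POINTS = [featurize(r) for r in LOGS]
LABELS, CENTROIDS = kmeans(POINTS, k=2)
print(LABELS)
```

In this toy input the night-time 4xx web errors and the evening 5xx video errors form two well-separated groups, which is the kind of recurring pattern error clustering aims to surface.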

Lecture notes in networks and systems, 2024

2022 IEEE 2nd Conference on Information Technology and Data Science (CITDS)

Industrial control systems (ICSs) rely on various sensors and embedded systems to operate. Devices often communicate using protocols such as Siemens Step 7 and Modbus, which were designed many years ago for use in closed networks and are vulnerable to attacks. The goal of this study is to detect anomalies in industrial control systems using a proximity-based approach on the Secure Water Treatment (SWaT) dataset. We encoded categorical data using one-hot encoding and normalized numerical data using min-max scaling. The experiment showed that a proximity-based approach can obtain state-of-the-art 99% precision and 98% recall and identify 35 out of 37 attack points, indicating that the suggested methodology is suitable for use in industrial control system scenarios.
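The preprocessing and detection steps named in this abstract can be sketched as follows: one-hot encoding for categorical values, min-max scaling for numeric values, and a distance-based anomaly score. The abstract does not specify which proximity-based method was used, so the mean distance to the k nearest neighbours is an assumption here. The sensor names and readings are invented and are not SWaT fields.

```python
import math

def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in column]

def one_hot(column):
    """One indicator per distinct category, in sorted category order."""
    categories = sorted(set(column))
    return [[1.0 if v == c else 0.0 for c in categories] for v in column]

def encode(numeric_cols, categorical_cols):
    """Combine scaled numeric columns and one-hot columns into row vectors."""
    scaled = [min_max_scale(c) for c in numeric_cols]
    hot = [one_hot(c) for c in categorical_cols]
    rows = []
    for r in range(len(numeric_cols[0])):
        row = [col[r] for col in scaled]
        for h in hot:
            row.extend(h[r])
        rows.append(row)
    return rows

def knn_scores(rows, k=2):
    """Anomaly score = mean Euclidean distance to the k nearest other rows."""
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical readings: a flow sensor and a valve state; the last row is odd.
FLOW = [2.0, 2.1, 1.9, 2.0, 9.5]
VALVE = ["open", "open", "open", "open", "closed"]

ROWS = encode([FLOW], [VALVE])
SCORES = knn_scores(ROWS, k=2)
SUSPECT = SCORES.index(max(SCORES))
print("most anomalous row:", SUSPECT)
```

A threshold on the score (chosen on validation data) would turn this ranking into attack/normal decisions from which precision and recall can be computed.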

2022 IEEE 2nd Conference on Information Technology and Data Science (CITDS), 2022

International Journal of Emerging Research in Management and Technology, 2018

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, and insurance. Software quality is a measure of how software is designed and how well the software conforms to that design. Some of the variables we look for in software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used in one organization differs from that used in another; for this reason it is better to apply software metrics to measure the quality of software. Attributes gathered from source code through software metrics can be an input for a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we identified applications of machine learning to software defects, gathered from previous research works.

International Journal on Computational Science & Applications, 2018

Software metrics have a direct link with measurement in software engineering. Correct measurement is a precondition in any engineering field, and software engineering is no exception; as the size and complexity of software increase, manual inspection of software becomes a harder task. Most software engineers worry about the quality of software and how to measure and enhance it. The overall objective of this study was to assess and analyze the software metrics used to measure the software product and process. In this study, the researcher used a collection of literature from various electronic databases, available since 2008, to understand the software metrics. The researcher found that software quality is a measure of how software is designed and how well the software conforms to that design. Some of the variables we look for in software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used in one organization differs from that used in others; for this reason it is better to apply software metrics, and the most common current software metrics tools, to measure the quality of software and to reduce the subjectivity of faults during the assessment of software quality. The central contribution of this study is an overview of software metrics that illustrates the development in this area, together with a critical analysis of the main metrics found in the literature.

IOSR Journal of Computer Engineering, 2017

In the 30 years since HIV/AIDS was first discovered, the disease has become a disturbing pandemic, taking the lives of 30 million people around the world. In 2010 alone, HIV/AIDS killed 1.8 million people, 1.2 million of whom were living in sub-Saharan Africa. In Ethiopia, HIV/AIDS is one of the key challenges for the country's overall development, as it has led to a seven-year decrease in life expectancy and a greatly reduced workforce. Although a number of voluntary counseling and testing (VCT) centers working on HIV/AIDS prevention are located in several cities of the country, they have not solved the problems related to HIV/AIDS. In addition, in most of the country's counseling and testing centers, the collected data is simply put together and used mostly for statistical purposes rather than analyzed to discover relevant, interesting, and previously unknown data characteristics, relationships, dependencies, etc. The main objective of this study was to discover patterns and generate interesting hidden association rules from data taken from the Marie Stopes Gondar branch clinic. The contribution of this study is to identify, by analyzing the data of customers who took an HIV/AIDS test at the clinic, which customers are more vulnerable to HIV/AIDS. It helps counselors in VCT centers predict hidden but interesting relationships among the attributes they use during the course of counseling. The methodology comprised data collection and tool selection. After the data was collected, the main data preprocessing tasks were applied to the data sets to clean the data and make it ready for the experiments. Out of 1992 instances of original data, 1861 were made ready for the experiments. The Weka 3.4 tool was used for the experiments, and the well-known association rule mining algorithm Apriori was used to extract interesting rules from the data.
To obtain those rules, three basic experiments were conducted. Experiment I used the whole data set. Experiment II considered only the positive class. Experiment III considered only the positive class, but without the positive class attribute. One result of the experiments showed that customers who do not use condoms during sexual intercourse and unemployed persons are vulnerable to HIV/AIDS.
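For readers unfamiliar with Apriori, the sketch below shows the level-wise search for frequent itemsets and the derivation of rules by confidence. It is a plain illustration of the algorithm, not Weka's implementation, and the thresholds are arbitrary. The records are invented toy data with placeholder attribute names; they do not come from the clinic data set.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        # Count support for the current candidates and keep the frequent ones.
        current = {}
        for cand in level:
            supp = sum(cand <= t for t in transactions) / n
            if supp >= min_support:
                current[cand] = supp
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        keys = list(current)
        joined = {a | b for a in keys for b in keys if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {c for c in joined
                 if all(frozenset(s) in current for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Derive (lhs, rhs, support, confidence) rules from frequent itemsets."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                conf = supp / frequent[lhs]  # subsets of frequent sets are frequent
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, supp, conf))
    return out

# Invented records with placeholder attributes (not the study's data).
RECORDS = [
    {"risk_factor=yes", "employed=no", "result=positive"},
    {"risk_factor=yes", "employed=no", "result=positive"},
    {"risk_factor=yes", "employed=yes", "result=positive"},
    {"risk_factor=no", "employed=yes", "result=negative"},
    {"risk_factor=no", "employed=yes", "result=negative"},
    {"risk_factor=no", "employed=no", "result=negative"},
]

FREQUENT = apriori(RECORDS, min_support=0.3)
RULES = rules(FREQUENT, min_confidence=0.9)
for lhs, rhs, supp, conf in RULES:
    print(sorted(lhs), "->", sorted(rhs), round(supp, 2), round(conf, 2))
```

Restricting the input to one class, as in Experiments II and III, amounts to filtering the records before calling `apriori`.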

International Journal of Advanced Research in Computer Science and Software Engineering, 2017

Software architecture is the structural solution that achieves the overall technical and operational requirements of software development. Software engineers apply software architectures in developing their software systems; however, they are concerned about the basic benchmarks for selecting software architecture styles, possible components, integration methods (connectors), and the exact application of each style. The objective of this research was a comparative analysis of software architecture styles by their weaknesses and benefits, to support programmers in selecting a style at design time. Finally, in this study, the researcher identified architectural styles with their weaknesses, strengths, and application areas, along with the components, connectors, and interfaces of the selected architectural styles.

International Journal of Emerging Research in Management & Technology, 2017


IOSR Journal of Computer Engineering (IOSR-JCE), 2017

International Journal on Computational Science & Applications, 2018

High accuracy in predicting students' performance helps identify low-performing students at the beginning of the learning process. Machine learning techniques, which discover models or patterns in data and support decision-making, are used to attain this objective. The ability to predict student performance is crucial in our present education system. The dataset used in our study was taken from the Wolkite University registrar's office for the College of Computing and Informatics, covering 2004 to 2007 E.C., for each department. We collected students' transcript data, including their final GPA and their grades in all courses. After pre-processing the data, we applied three machine learning methods: neural networks, Naive Bayes, and Support Vector Machine (SMO). Finally, we built a model for each method, evaluated its performance, and compared the results. The aim was to use machine learning to develop a model that can draw conclusions about students' academic success.
I. INTRODUCTION
For higher education institutions whose goal is to contribute to improving the quality of higher education, quality implies providing services that most likely meet the needs of students, academic staff, and other participants in the education system. Tekeste writes that "the golden age of modern education in Ethiopia" is usually dated to the years between 1941 and 1970 (the regime of HIM Hailesellassie). Education was free and it applied more to the poorer section of the population; the rich and the aristocracy were less enticed by the economic returns of education [1][7].
Currently, the Ethiopian Government gives higher education a central position in its strategy for social and economic development. Ethiopia has radically expanded the number of its higher education institutions, from two federal universities to 33; 10 of these opened five years ago, and one of them is Wolkite University. Nowadays, the databases that store data and information for organizations have become complicated and difficult to analyze [2]; we therefore apply machine learning techniques to address this problem. Wolkite University has its own student management information system that was developed by Bahir Dar University Course and Curriculum Management System. However, this database contains so much data that it becomes almost impossible to analyze manually for valuable decision-making information. To analyze this complex database, we can use machine learning techniques. This study was conducted on 993 students from the College of Computing and Informatics at Wolkite University, across their respective departments. We used the WEKA open-source software to test the prediction of student performance. It provides many different algorithms for data mining and machine learning, is freely available, and is platform-independent [3]. Various factors may affect education within Wolkite University, such as environment, the family standard of each student, gender, teachers' educational background, and education policy [1][4][5], but our research does not go through each factor because these are physiological factors rather than learning ones. This study has the following contributions.
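One of the classifiers named in the abstract above is Naive Bayes. The following is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing, applied to invented course grades. It is not WEKA's implementation; the feature layout (two course grades per student), the class labels, and the function names are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict

def fit_nb(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies for categorical data."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values, alpha

def predict_nb(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for y, cy in class_counts.items():
        lp = math.log(cy / total)  # log prior
        for j, v in enumerate(row):
            # Laplace-smoothed likelihood of value v for feature j in class y.
            num = feature_counts[(j, y)][v] + alpha
            den = cy + alpha * len(values[j])
            lp += math.log(num / den)
        if lp > best_lp:
            best, best_lp = y, lp
    return best

# Invented toy transcripts: two course grades per student, and a label.
GRADES = [("A", "A"), ("A", "B"), ("B", "A"), ("C", "D"), ("D", "C"), ("D", "D")]
LABELS = ["high", "high", "high", "low", "low", "low"]

NB = fit_nb(GRADES, LABELS)
print(predict_nb(NB, ("A", "A")))
print(predict_nb(NB, ("D", "C")))
```

Comparing several such models, as the study does, would additionally require a held-out test split and a shared metric such as accuracy.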

Conference Presentations by ermiyas birhanu

23rd Industrial Conference on Data Mining ICDM 2022, 2022

Content delivery networks (CDNs) are the backbone of the Internet and are key in delivering high-quality video on demand (VoD), web content and file services to billions of users. CDNs usually consist of hierarchically organized content servers positioned as close to the customers as possible. CDN operators face a significant challenge when analyzing billions of web server and proxy logs generated by their systems. The main objective of this study was to analyze the applicability of various clustering methods in CDN error log analysis. We worked with real-life CDN proxy logs, identified key features included in the logs (e.g., content type, HTTP status code, time-of-day, host) and clustered the log lines corresponding to different host types offering live TV, video on demand, file caching and web content. Our experiments were run on a dataset consisting of proxy logs collected over a 7-day period from a single, physical CDN server running multiple types of services (VoD, live TV, file). The dataset consisted of 2.2 billion log lines. Our analysis showed that CDN error clustering is a viable approach towards identifying recurring errors and improving overall quality of service.
