Edmond Ho | Northumbria University (original) (raw)

Papers by Edmond Ho

Creating realistic characters that can react to the users' or another character's movemen... more Creating realistic characters that can react to the users' or another character's movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task due to the many different ways two humans can interact. While there are a number of successful researches in adapting the generative adversarial network (GAN) in synthesizing single human actions, there are very few on modelling human-human interactions. In this paper, we propose a semi-supervised GAN system that synthesizes the reactive motion of a character given the active motion from another character. Our key insights are two-fold. First, to effectively encode the complicated spatial-temporal information of a human motion, we empower the generator with a part-based long short-term memory (LSTM) module, such that the temporal movement of different limbs can be effectively modelled. We further include an attention module such ...

2019 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA)

2020 25th International Conference on Pattern Recognition (ICPR)

Facial makeup style transfer is an extremely challenging sub-field of image-to-image-translation.... more Facial makeup style transfer is an extremely challenging sub-field of image-to-image-translation. Due to this difficulty, state-of-the-art results are mostly reliant on the Face Parsing Algorithm, which segments a face into parts in order to easily extract makeup features. However, this algorithm can only work well on high-definition images where facial features can be accurately extracted. Faces in many real-world photos, such as those including a large background or multiple people, are typically of low-resolution, which considerably hinders stateof-the-art algorithms. In this paper, we propose an end-to-end holistic approach to effectively transfer makeup styles between two low-resolution images. The idea is built upon a novel weighted multi-scale spatial attention module, which identifies salient pixel regions on low-resolution images in multiple scales, and uses channel attention to determine the most effective attention map. This design provides two benefits: low-resolution images are usually blurry to different extents, so a multi-scale architecture can select the most effective convolution kernel size to implement spatial attention; makeup is applied on both a macro-level (foundation, fake tan) and a micro-level (eyeliner, lipstick) so different scales can excel in extracting different makeup features. We develop an Augmented CycleGAN network that embeds our attention modules at selected layers to most effectively transfer makeup. Our system is tested with the FBD data set, which consists of many low-resolution facial images, and demonstrate that it outperforms state-of-the-art methods, particularly in transferring makeup for blurry images and partially occluded images.

IEEE Transactions on Neural Systems and Rehabilitation Engineering

The Visual Computer, 2021

3D car models are heavily used in computer games, visual effects, and even automotive designs. As... more 3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly more important. To tackle the challenge, we propose a novel system to reconstruct a 3D car using a single sketch image. The system learns from a synthetic database of 3D car models and their corresponding 2D contour sketches and segmentation masks, allowing effective training with minimal data collection cost. The core of the system is a machine learning pipeline that combines the use of a generative adversarial network (GAN) and lazy learning. GAN, being a deep learning method, is capable of modelling complicated data distributions, enabling the effective modelling of a large variety of cars. Its major weakness is that as a global method, modelling the fine details in the local region is challenging. Lazy learning works well to preserve local features by generating a local subspace with relevant data samples. We ...

ArXiv, 2021

While multiple studies have proposed methods for the formation control of unmanned aerial vehicle... more While multiple studies have proposed methods for the formation control of unmanned aerial vehicles (UAV), the trajectories generated are generally unsuitable for tracking targets where the optimum coverage of the target by the formation is required at all times. We propose a path planning approach called the Flux Guided (FG) method, which generates collision-free trajectories while maximising the coverage of one or more targets. We show that by reformulating an existing least-squares flux minimisation problem as a constrained optimisation problem, the paths obtained are 1.5× shorter and track directly toward the target. Also, we demonstrate that the scale of the formation can be controlled during flight, and that this feature can be used to track multiple scattered targets. The method is highly scalable since the planning algorithm is only required for a sub-set of UAVs on the open boundary of the formation’s surface. Finally, through simulating a 3d dynamic particle system that tra...

Motion, Interaction and Games, 2019

In this paper, we propose a new data-driven framework for 3D hand motion emotion transfer. Specif... more In this paper, we propose a new data-driven framework for 3D hand motion emotion transfer. Specifically, we first capture highquality hand motion using VR gloves. The hand motion data is then annotated with the emotion type and converted to images to facilitate the motion synthesis process and the new dataset will be available to the public. To the best of our knowledge, this is the first public dataset with annotated hand motions. We further formulate the emotion transfer for 3D hand motion as an Image-to-Image translation problem, and it is done by adapting the StarGAN framework. Our new framework is able to synthesize new motions, given target emotion type and an unseen input motion. Experimental results show that our framework can produce high quality and consistent hand motions.

With the advancement in motion sensing technology, acquiring high-quality human motions for creat... more With the advancement in motion sensing technology, acquiring high-quality human motions for creating realistic character animation is much easier than before. Since motion data itself is not the main obstacle anymore, more and more effort goes into enhancing the realism of character animation, such as motion styles and control. In this paper, we explore a less studied area: the emotion of motions. Unlike previous work which encode emotions into discrete motion style descriptors, we propose a continuous control indicator called motion strength, by controlling which a data-driven approach is presented to synthesize motions with fine control over emotions. Rather than interpolating motion features to synthesize new motion as in existing work, our method explicitly learns a model mapping low-level motion features to the emotion strength. Since the motion synthesis model is learned in the training stage, the computation time required for synthesizing motions at run-time is very low. As a...

2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 2021

Early prediction of cerebral palsy is essential as it leads to early treatment and monitoring. De... more Early prediction of cerebral palsy is essential as it leads to early treatment and monitoring. Deep learning has shown promising results in biomedical engineering thanks to its capacity of modelling complicated data with its non-linear architecture. However, due to their complex structure, deep learning models are generally not interpretable by humans, making it difficult for clinicians to rely on the findings. In this paper, we propose a channel attention module for deep learning models to predict cerebral palsy from infants' body movements, which highlights the key features (i.e. body joints) the model identifies as important, thereby indicating why certain diagnostic results are found. To highlight the capacity of the deep network in modelling input features, we utilize raw joint positions instead of hand-crafted features. We validate our system with a real-world infant movement dataset. Our proposed channel attention module enables the visualization of the vital joints to this disease that the network considers. Our system achieves 91.67% accuracy, suppressing other state-of-the-art deep learning methods.

2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 2021

Providing early diagnosis of cerebral palsy (CP) is key to enhancing the developmental outcomes f... more Providing early diagnosis of cerebral palsy (CP) is key to enhancing the developmental outcomes for those affected. Diagnostic tools such as the General Movements Assessment (GMA), have produced promising results in early diagnosis, however these manual methods can be laborious. In this paper, we propose a new framework for the automated classification of infant body movements, based upon the GMA, which unlike previous methods, also incorporates a visualization framework to aid with interpretability. Our proposed framework segments extracted features to detect the presence of Fidgety Movements (FMs) associated with the GMA spatiotemporally. These features are then used to identify the body-parts with the greatest contribution towards a classification decision and highlight the related body-part segment providing visual feedback to the user. We quantitatively compare the proposed framework's classification performance with several other methods from the literature and qualitatively evaluate the visualization's veracity. Our experimental results show that the proposed method performs more robustly than comparable techniques in this setting whilst simultaneously providing relevant visual interpretability. Index Terms-infants, cerebral palsy, general movements assessment, machine learning, explainable AI, visualization This project is supported in part by the Royal Society (Ref: IES/R1/191147).

Sensors, 2020

State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based str... more State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based streams, especially in the scenarios of intelligent remote control and communications, where virtual and augmented reality will soon become outdated and are forecasted to be replaced by point cloud streams providing explorable 3D environments of communication and industrial data. One of the most novel approaches employed in modern object reconstruction methods is to use a priori knowledge of the objects that are being reconstructed. Our approach is different as we strive to reconstruct a 3D object within much more difficult scenarios of limited data availability. Data stream is often limited by insufficient depth camera coverage and, as a result, the objects are occluded and data is lost. Our proposed hybrid artificial neural network modifications have improved the reconstruction results by 8.53% which allows us for much more precise filling of occluded object sides and reduction of noise d...

2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), 2018

Given the complexity of the human facial anatomy, animating facial expressions and lip movements ... more Given the complexity of the human facial anatomy, animating facial expressions and lip movements for speech is a very time-consuming and tedious task. In this paper, a new text-toanimation framework for facial animation synthesis is proposed. The core idea is to improve the expressiveness of lip-sync animation by incorporating facial expressions in 3D animated characters. This idea is realized as a plug-in in Autodesk Maya, one of the most popular animation platforms in the industry, such that professional animators can effectively apply the method in their existing work. We evaluate the proposed system by conducting two sets of surveys, in which both novice and experienced users participate in the user study to provide feedback and evaluations from different perspectives. The results of the survey highlights the effectiveness of creating realistic facial animations with the use of emotion expressions. Video demos of the synthesized animations are available online at

Motion, Interaction and Games, 2019

Proceedings of the 2nd International Conference on Robotics, Computer Vision and Intelligent Systems, 2021

Computers & Graphics, 2021

Studies in Big Data, 2021

2019 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA)

2020 25th International Conference on Pattern Recognition (ICPR)

IEEE Transactions on Neural Systems and Rehabilitation Engineering

The Visual Computer, 2021

ArXiv, 2021

Motion, Interaction and Games, 2019

2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 2021

Sensors, 2020

2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), 2018

Motion, Interaction and Games, 2019

Proceedings of the 2nd International Conference on Robotics, Computer Vision and Intelligent Systems, 2021

Computers & Graphics, 2021

Studies in Big Data, 2021