Hoang Tran Vu | National Chung Cheng University
Papers by Hoang Tran Vu
arXiv (Cornell University), Dec 25, 2017
Many methods have recently been proposed to solve the domain adaptation problem. However, their success implicitly rests on the assumption that information is fully transferable between domains. If this assumption is not satisfied, negative transfer may degrade domain adaptation. In this paper, we propose a learning network that considers three tasks simultaneously: domain adaptation, disentangled representation, and style transfer. First, the learned features are disentangled into common parts and specific parts. The common parts represent the transferable features, which are suitable for domain adaptation with less negative transfer. Conversely, the specific parts characterize the unique style of each individual domain. Based on this, we introduce the concept of feature exchange across domains, which not only enhances the transferability of common features but is also useful for image style transfer. These designs allow us to introduce five types of training objectives to realize the three challenging tasks at the same time. The experimental results show that our architecture adapts well to both full transfer learning and partial transfer learning, given a well-learned disentangled representation. In addition, the trained network demonstrates high potential for generating style-transferred images.
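As a rough illustration of the feature exchange idea in the abstract above, the following sketch splits each feature vector into a common part and a specific part and swaps the specific parts between two domains. The names `split_features` and `exchange_features`, and the choice to split by slicing a single vector at `n_common`, are assumptions made for illustration; the abstract does not say how the network separates the two parts.

```python
import numpy as np

def split_features(features, n_common):
    """Split (batch, dim) features into common and specific parts.

    Slicing at `n_common` is a hypothetical choice for this sketch.
    """
    return features[:, :n_common], features[:, n_common:]

def exchange_features(feat_a, feat_b, n_common):
    """Keep each domain's common part and swap the specific (style) parts."""
    common_a, specific_a = split_features(feat_a, n_common)
    common_b, specific_b = split_features(feat_b, n_common)
    a_with_b_style = np.concatenate([common_a, specific_b], axis=1)
    b_with_a_style = np.concatenate([common_b, specific_a], axis=1)
    return a_with_b_style, b_with_a_style

# Toy features for two domains: 2 samples, 4 dimensions each.
source = np.arange(8, dtype=float).reshape(2, 4)
target = -np.arange(8, dtype=float).reshape(2, 4)
mixed_source, mixed_target = exchange_features(source, target, n_common=3)
```

In a trained network, a decoder would presumably turn the exchanged features back into images; that step is outside this sketch.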
2017 IEEE International Conference on Image Processing (ICIP), 2017
In this demonstration, we show our live, real-time parking space detection system. The detection function is built on a video surveillance system installed in an outdoor parking lot. Implementing a practical vision system in an outdoor environment is challenging owing to dramatic lighting changes and uncontrollable variations in weather conditions. Based on a well-trained deep learning network, we hope to show the audience the robustness of our vacant space detection system and its ability to handle the parking displacement problem, the non-uniform car size problem, the inter-object occlusion problem, and the lighting variation problem. The system can infer the status of 71 parking spaces within 0.6 seconds on an Intel Core i5 3.2 GHz processor with an NVIDIA GeForce GTX TITAN X card. Moreover, the system, built on our deep learning method, has achieved promising detection accuracy in a wide variety of outdoor conditions. The novelty lies in the use of a convolutional spatial transformer network to adaptively transform the cropped local image patch so that the normalized patch is less sensitive to different car sizes and parking displacement. Another novelty of our network is that it also takes the neighboring spaces into account when estimating the status of the target space, so the inter-object occlusion problem can be well addressed. Finally, the robust features extracted by the deep learning network help to overcome the lighting problem. To make our demonstration attractive, we plan to provide a website that shows the live, real-time camera views, indicates the vacant space detection results, and illustrates our deep learning framework. In particular, we hope our system will attract the audience's attention and demonstrate a practical vision application for parking lot management.
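The resampling step at the core of a spatial transformer can be sketched as an affine warp with bilinear interpolation. This is a minimal NumPy sketch, not the network from the abstract: the function name `affine_warp` is hypothetical, and the 2x3 matrix `theta` is set by hand here, whereas a spatial transformer network would predict it from the input patch.

```python
import numpy as np

def affine_warp(patch, theta, out_shape):
    """Resample a 2-D `patch` under a 2x3 affine matrix `theta`.

    `theta` maps normalized output coordinates (x, y, 1) in [-1, 1]
    to normalized input coordinates, as in a spatial transformer.
    """
    h_in, w_in = patch.shape
    h_out, w_out = out_shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h_out),
                         np.linspace(-1, 1, w_out), indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # 3 x N
    src = np.asarray(theta, dtype=float) @ grid                  # 2 x N
    # Convert normalized coordinates to pixel coordinates and clamp to the patch.
    x = np.clip((src[0] + 1) * (w_in - 1) / 2, 0, w_in - 1)
    y = np.clip((src[1] + 1) * (h_in - 1) / 2, 0, h_in - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w_in - 1)
    y1 = np.minimum(y0 + 1, h_in - 1)
    wx, wy = x - x0, y - y0
    # Bilinear interpolation between the four neighboring pixels.
    out = (patch[y0, x0] * (1 - wx) * (1 - wy) + patch[y0, x1] * wx * (1 - wy)
           + patch[y1, x0] * (1 - wx) * wy + patch[y1, x1] * wx * wy)
    return out.reshape(h_out, w_out)

# A horizontal ramp: pixel value equals its column index.
ramp = np.tile(np.arange(5, dtype=float), (5, 1))
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
zoom_in = np.array([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]])
same = affine_warp(ramp, identity, (5, 5))
zoomed = affine_warp(ramp, zoom_in, (5, 5))
```

With `zoom_in`, the output covers only the central half of the patch, which is the kind of normalization that could compensate for a small or displaced car.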
arXiv: Computer Vision and Pattern Recognition, 2017
To solve the unsupervised domain adaptation problem, recent methods focus on adversarial learning to learn a common representation among domains. Although many designs have been proposed, they seem to ignore the negative influence of domain-specific characteristics in the transfer process. Moreover, they tend to obliterate these characteristics when extracted, although they are useful for other tasks and somehow help preserve the data. Taking these issues into account, in this paper we design a novel domain-adaptation architecture that disentangles learned features into multiple parts to answer two questions: what features to transfer across domains, and what to preserve within domains for other tasks. Towards this, besides jointly matching domain distributions at both the image level and the feature level, we offer a new idea of feature exchange across domains, combined with a novel feedback loss and a semantic consistency loss, to not only enhance the transferability...
2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), 2019
Many methods have recently been proposed to solve the domain adaptation problem. However, their success implicitly rests on the assumption that information is fully transferable between domains. If this assumption is not satisfied, negative transfer may degrade domain adaptation. In this paper, we propose a learning network that considers three tasks simultaneously: domain adaptation, disentangled representation, and style transfer. First, the learned features are disentangled into common parts and specific parts. The common parts represent the transferable features, which are suitable for domain adaptation with less negative transfer. Conversely, the specific parts characterize the unique style of each individual domain. Based on this, we introduce the concept of feature exchange across domains, which not only enhances the transferability of common features but is also useful for image style transfer. These designs allow us to introduce five types of training objectives to realize the three challenging tasks at the same time. The experimental results show that our architecture adapts well to both full transfer learning and partial transfer learning, given a well-learned disentangled representation. In addition, the trained network demonstrates high potential for generating style-transferred images.
IEEE Transactions on Circuits and Systems for Video Technology, 2017
In a practical environment, the viewing angle and height of a video surveillance camera are uncontrollable. This may cause severe inter-object occlusion and complicate the detection problem. In this paper, we propose a novel multi-layer inference framework for vacant parking space detection. The framework consists of an Image layer, a Patch layer, a Space layer, and a Lot layer. In the Image layer, image patches are selected based on the 3D parking lot structure. We found that the occlusion pattern within each patch reveals cues to the parking status. Thus, in the Patch layer, our system extracts lighting-invariant features from the patches and trains weak classifiers to recognize the occlusion pattern. The outputs of the classifiers, which represent the types of inter-object occlusion, are treated as mid-level features and passed to the Space layer. Next, a boosted space classifier is trained to recognize the mid-level features and output the status of a 3-space unit as a probability. In the Lot layer, we regard the local status decisions of the 3-space units as high-level evidence and use a Markov Random Field to refine the parking status. In addition, we extend the framework to bridge multiple cameras and integrate their complementary information for vacant space detection. Our results show that the proposed framework can overcome inter-object occlusion and achieve better status inference under many environmental variations and different weather conditions. We also present a real-time system to demonstrate the computing efficiency and robustness of the system.
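To give a feel for how a Markov Random Field can refine local probabilistic decisions, the sketch below finds the most probable labeling of a single row of spaces with the Viterbi algorithm. Everything specific here is an assumption: the abstract does not describe the graph structure or the potentials, so the chain layout, the Potts-style pairwise penalty `smooth`, and the function name `refine_chain` are hypothetical stand-ins.

```python
import numpy as np

def refine_chain(p_occupied, smooth=0.5):
    """MAP labels (0 = vacant, 1 = occupied) on a chain MRF via Viterbi.

    Unary costs are negative log-probabilities from a local classifier;
    the pairwise cost adds `smooth` whenever neighbors disagree
    (an assumed Potts penalty, chosen only for illustration).
    """
    p = np.clip(np.asarray(p_occupied, dtype=float), 1e-6, 1 - 1e-6)
    unary = -np.log(np.stack([1 - p, p], axis=1))   # shape (n, 2)
    pair = smooth * (1 - np.eye(2))                 # pair[prev, cur]
    n = len(p)
    cost = unary[0].copy()
    back = np.zeros((n, 2), dtype=int)
    for i in range(1, n):
        total = cost[:, None] + pair                # cost of each (prev, cur)
        back[i] = total.argmin(axis=0)              # best predecessor per label
        cost = total.min(axis=0) + unary[i]
    labels = np.zeros(n, dtype=int)
    labels[-1] = cost.argmin()
    for i in range(n - 1, 0, -1):                   # backtrack
        labels[i - 1] = back[i, labels[i]]
    return labels

local_probs = [0.9, 0.45, 0.9]       # the middle space is ambiguous
independent = refine_chain(local_probs, smooth=0.0)
refined = refine_chain(local_probs, smooth=0.5)
```

With `smooth=0.0` each space is decided on its own; a positive `smooth` lets confident neighbors pull an ambiguous space toward their label.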
2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), 2015
In this paper, we propose a new multi-layer discriminative framework for vacant parking space detection. From bottom to top, the framework consists of an image feature extraction layer, a patch classification layer, a weighted combination layer, and a status inference layer. In the feature extraction layer, the framework extracts lighting-invariant features to reduce the effects of lighting and shadow. In the patch classification layer, image patches are selected, and each patch is normalized to overcome perspective distortion. For each type of patch, we train classifiers to recognize the occlusion patterns, which are treated as middle-level features of the parking status. In the weighted combination layer, three spaces are grouped as a unit so that inter-object occlusion is easier to handle. Based on the middle-level features, a boosted space classifier is trained to determine the local status of a 3-space unit. In the status inference layer, we regard these local status decisions as high-level evidence and infer the final status of the parking lot. The results in an outdoor parking lot show that our system handles inter-object occlusion well and achieves robust vacant space detection under many environmental variations. A real-time system was also implemented to demonstrate its computing efficiency.
2014 International Conference on Connected Vehicles and Expo (ICCVE), 2014
In this paper, we propose a crowd-sensing idea for reconstructing the driving environment so that drivers have a better understanding of their surroundings on the roadway. We assume that intelligent vehicles will embed a sensing system composed of three basic modules: inter-vehicle communication, vehicle license plate verification, and distance estimation. With the help of inter-vehicle communication, a vehicle can receive a set of IDs from its nearby vehicles. The received IDs, together with the license plate numbers of the nearby vehicles, can further improve license plate verification in an uncontrolled environment. Moreover, we propose a regression method, which models the relationship between the image coordinate and the geometric distance, to estimate the distance to the vehicle in front. Finally, by fusing the vehicle verification and distance information from nearby vehicles, the system provides a global view that tells the driver which vehicles are nearby and how far away they are. Compared with existing advanced driver assistance systems (ADAS), this system supports a wider view of the driving environment and provides a more comfortable and safer driving experience. To realize the sensing system, this paper details a license plate verification method assisted by inter-vehicle communication and a regression method for distance estimation. Based on the results, our system can verify license plates with a high accuracy rate and provide robust distance estimation.
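One simple way to model the relationship between an image coordinate and distance is sketched below. The abstract does not give the form of the regression, so this sketch assumes a flat road and a pinhole camera, under which the inverse distance is linear in the image row of the vehicle's base; the function names and the calibration values are hypothetical.

```python
import numpy as np

def fit_distance_model(rows, distances):
    """Fit 1/distance = a * row + b by least squares.

    Assumes (for this sketch only) a flat road and a pinhole camera,
    so that inverse distance is linear in the image row.
    """
    rows = np.asarray(rows, dtype=float)
    inv_d = 1.0 / np.asarray(distances, dtype=float)
    a, b = np.polyfit(rows, inv_d, 1)
    return a, b

def predict_distance(row, a, b):
    """Estimate distance (same unit as the training distances) from an image row."""
    return 1.0 / (a * row + b)

# Hypothetical calibration pairs: image row of the vehicle base, measured distance in meters.
calib_rows = [300, 400, 500, 600]
calib_dist = [10.0, 5.0, 1000.0 / 300.0, 2.5]
a, b = fit_distance_model(calib_rows, calib_dist)
estimate = predict_distance(450, a, b)
```

The calibration pairs above were generated from `distance = 1000 / (row - 200)`, so the fitted model reproduces that curve; real measurements would be noisy and might call for a richer model.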
2015 IEEE International Conference on Consumer Electronics - Taiwan, 2015
2016 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), 2016
By fusing a sequence of images taken at different exposures, we can generate a high dynamic range (HDR) image and enhance the image details. However, to display the HDR image on a low dynamic range (LDR) device, HDR compression is necessary. In this paper, we propose a new method for HDR compression based on the matting Laplacian. The main assumption is that the tone-mapped LDR image must preserve the local structure of the HDR image so that the image details are well represented. Specifically, we treat the HDR image as a guidance image and embed its object structure into a matting Laplacian matrix. We then formulate HDR compression as an optimization problem. By incorporating the matting Laplacian matrix into the objective function, the optimal LDR image is forced to have local structures similar to those of the HDR image. Our experiments show that the resulting LDR image enhances the image details well without introducing severe edge effects or color artifacts.
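The sketch below shows one way a matting Laplacian can enter a tone-mapping objective. The grayscale matting Laplacian follows the standard closed-form matting definition; the rest is assumed, since the abstract does not state the objective: the quadratic cost `x^T L x + lam * ||x - t||^2`, the Reinhard-style global curve used as the target `t`, the log-luminance guide, and the names `matting_laplacian` and `compress_hdr`. Dense matrices are used, so this only suits very small images.

```python
import numpy as np

def matting_laplacian(guide, eps=1e-4, r=1):
    """Dense grayscale matting Laplacian built from (2r+1)x(2r+1) windows."""
    h, w = guide.shape
    idx = np.arange(h * w).reshape(h, w)
    flat = guide.ravel()
    size = (2 * r + 1) ** 2
    L = np.zeros((h * w, h * w))
    for y in range(r, h - r):
        for x in range(r, w - r):
            win = idx[y - r:y + r + 1, x - r:x + r + 1].ravel()
            d = flat[win] - flat[win].mean()
            # Window term: delta_ij - (1 + d_i d_j / (var + eps/size)) / size
            G = (1.0 + np.outer(d, d) / (flat[win].var() + eps / size)) / size
            L[np.ix_(win, win)] += np.eye(size) - G
    return L

def compress_hdr(hdr, lam=0.1, eps=1e-4):
    """Minimize x^T L x + lam * ||x - t||^2 (an assumed objective)."""
    log_hdr = np.log1p(hdr)
    span = log_hdr.max() - log_hdr.min()
    # Guide: normalized log luminance carries the HDR local structure.
    guide = (log_hdr - log_hdr.min()) / span if span > 0 else np.zeros_like(log_hdr)
    L = matting_laplacian(guide, eps)
    target = (hdr / (1.0 + hdr)).ravel()   # Reinhard-style global curve (assumption)
    # Setting the gradient to zero gives (L + lam * I) x = lam * t.
    ldr = np.linalg.solve(L + lam * np.eye(L.shape[0]), lam * target)
    return np.clip(ldr, 0.0, 1.0).reshape(hdr.shape)

rng = np.random.default_rng(0)
hdr_image = rng.random((6, 6)) * 100.0
lap = matting_laplacian(rng.random((5, 5)))
ldr_image = compress_hdr(hdr_image)
flat_image = compress_hdr(np.full((4, 4), 2.0))
```

Because every row of the Laplacian sums to zero, a constant input passes through unchanged apart from the global curve, which is a quick sanity check on the construction.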