From Individual to Whole: Reducing Intra-class Variance by Feature Aggregation

References

Carreira-Perpiñán, M. Á. (2006). Fast nonparametric clustering with Gaussian blurring mean-shift. In ICML.
Chang, X., Hospedales, T. M., & Xiang, T. (2018). Multi-level factorisation net for person re-identification. In CVPR.
Chen, D., Li, H., Xiao, T., Yi, S., & Wang, X. (2018a). Video person re-identification with competitive snippet-similarity aggregation and co-attentive snippet embedding. In CVPR.
Chen, G., Zhang, T., Lu, J., & Zhou, J. (2019). Deep meta metric learning. In ICCV.
Chen, K., Wang, J., Yang, S., Zhang, X., Xiong, Y., Loy, C. C., & Lin, D. (2018b). Optimizing video object detection via a scale-time lattice. In CVPR.
Chen, Y., Zhu, X., & Gong, S. (2017). Person re-identification by deep learning multi-scale representations. In ICCV.
Chen, Z., Huang, S., & Tao, D. (2018c). Context refinement for object detection. In ECCV.
Cubuk, E. D., Zoph, B., Mane, D., Vasudevan, V., & Le, Q. V. (2019). AutoAugment: Learning augmentation strategies from data. In CVPR.
Cubuk, E. D., Zoph, B., Shlens, J., & Le, Q. V. (2020). RandAugment: Practical automated data augmentation with a reduced search space. In CVPRW.
Damen, D., Doughty, H., Farinella, G. M., Fidler, S., Furnari, A., Kazakos, E., Moltisanti, D., Munro, J., Perrett, T., Price, W., et al. (2018). Scaling egocentric vision: The EPIC-KITCHENS dataset. In ECCV (pp. 720–736).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In CVPR.
DeVries, T. & Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552.
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning optical flow with convolutional networks. In ICCV.
Feichtenhofer, C., Pinz, A., & Zisserman, A. (2017). Detect to track and track to detect. In ICCV.
Fu, Y., Wang, X., Wei, Y., & Huang, T. (2019a). STA: Spatial-temporal attention for large-scale video-based person re-identification. In AAAI.
Fu, Y., Wei, Y., Zhou, Y., Shi, H., Huang, G., Wang, X., Yao, Z., & Huang, T. (2019b). Horizontal pyramid matching for person re-identification. In AAAI.
Gu, X., Ma, B., Chang, H., Shan, S., & Chen, X. (2019). Temporal knowledge propagation for image-to-video person re-identification. In ICCV.
Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality reduction by learning an invariant mapping. In CVPR.
Han, W., Khorrami, P., Paine, T. L., Ramachandran, P., Babaeizadeh, M., Shi, H., Li, J., Yan, S., & Huang, T. S. (2016). Seq-NMS for video object detection. arXiv:1602.08465.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In ICCV.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In CVPR.
Hermans, A., Beyer, L., & Leibe, B. (2017). In defense of the triplet loss for person re-identification. arXiv:1703.07737.
Hou, R., Ma, B., Chang, H., Gu, X., Shan, S., & Chen, X. (2019). VRSTC: Occlusion-free video person re-identification. In CVPR.
Hu, H., Gu, J., Zhang, Z., Dai, J., & Wei, Y. (2018). Relation networks for object detection. In CVPR.
Ioffe, S. & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML.
Jegou, H., Harzallah, H., & Schmid, C. (2007). A contextual dissimilarity measure for accurate and efficient image search. In CVPR.
Kang, K., Li, H., Yan, J., Zeng, X., Yang, B., Xiao, T., Zhang, C., Wang, Z., Wang, R., Wang, X., et al. (2017). T-CNN: Tubelets with convolutional neural networks for object detection from videos. In TCSVT.
Kang, K., Ouyang, W., Li, H., & Wang, X. (2016). Object detection from video tubelets with convolutional neural networks. In CVPR.
Li, J., Wang, J., Tian, Q., Gao, W., & Zhang, S. (2019a). Global-local temporal representations for video person re-identification. In ICCV.
Li, J., Zhang, S., & Huang, T. (2019b). Multi-scale 3D convolution network for video based person re-identification. In AAAI.
Li, S., Bak, S., Carr, P., & Wang, X. (2018a). Diversity regularized spatiotemporal attention for video-based person re-identification. In CVPR.
Li, W., Zhao, R., Xiao, T., & Wang, X. (2014). DeepReID: Deep filter pairing neural network for person re-identification. In CVPR.
Li, W., Zhu, X., & Gong, S. (2018b). Harmonious attention network for person re-identification. In CVPR.
Lin, Y., Zheng, L., Zheng, Z., Wu, Y., & Yang, Y. (2017). Improving person re-identification by attribute and identity learning. arXiv:1703.07220.
Liu, C.-T., Wu, C.-W., Wang, Y.-C. F., & Chien, S.-Y. (2019). Spatially and temporally efficient non-local attention network for video-based person re-identification. In BMVC.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016a). SSD: Single shot multibox detector. In ECCV.
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., & Song, L. (2017). SphereFace: Deep hypersphere embedding for face recognition. In CVPR.
Liu, W., Wen, Y., Yu, Z., & Yang, M. (2016b). Large-margin softmax loss for convolutional neural networks. In ICML.
Lu, Y., Lu, C., & Tang, C.-K. (2017). Online video object detection using association LSTM. In ICCV.
Luo, C., Chen, Y., Wang, N., & Zhang, Z. (2019a). Spectral feature transformation for person re-identification. In ICCV.
Luo, H., Jiang, W., Zhang, X., Fan, X., Qian, J., & Zhang, C. (2019b). AlignedReID++: Dynamically matching local information for person re-identification. Pattern Recognition, 94, 53–61.
Meila, M. & Shi, J. (2001). A random walks view of spectral segmentation. In AISTATS.
Movshovitz-Attias, Y., Toshev, A., Leung, T. K., Ioffe, S., & Singh, S. (2017). No fuss distance metric learning using proxies. In ICCV.
Oh Song, H., Xiang, Y., Jegelka, S., & Savarese, S. (2016). Deep metric learning via lifted structured feature embedding. In CVPR.
Qian, X., Fu, Y., Jiang, Y.-G., Xiang, T., & Xue, X. (2017). Multi-scale deep learning architectures for person re-identification. In ICCV.
Qin, D., Gammeter, S., Bossard, L., Quack, T., & Van Gool, L. (2011). Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors. In CVPR.
Ristani, E., Solera, F., Zou, R., Cucchiara, R., & Tomasi, C. (2016). Performance measures and a data set for multi-target, multi-camera tracking. In ECCV workshop.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. In IJCV.
Sarfraz, M. S., Schumann, A., Eberle, A., & Stiefelhagen, R. (2018). A pose-sensitive embedding for person re-identification with expanded cross neighborhood re-ranking. In CVPR.
Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In CVPR.
Shen, Y., Li, H., Xiao, T., Yi, S., Chen, D., & Wang, X. (2018a). Deep group-shuffling random walk for person re-identification. In CVPR.
Shen, Y., Li, H., Yi, S., Chen, D., & Wang, X. (2018b). Person re-identification with deep similarity-guided graph neural network. In ECCV.
Si, J., Zhang, H., Li, C.-G., Kuen, J., Kong, X., Kot, A. C., & Wang, G. (2018). Dual attention matching network for context-aware feature sequence based person re-identification. In CVPR.
Sohn, K. (2016). Improved deep metric learning with multi-class n-pair loss objective. In NeurIPS.
Subramaniam, A., Nambiar, A., & Mittal, A. (2019). Co-segmentation inspired attention networks for video-based person re-identification. In ICCV.
Suh, Y., Wang, J., Tang, S., Mei, T., & Lee, K. M. (2018). Part-aligned bilinear representations for person re-identification. In ECCV.
Sun, Y., Zheng, L., Deng, W., & Wang, S. (2017). SVDNet for pedestrian retrieval. In ICCV.
Sun, Y., Zheng, L., Yang, Y., Tian, Q., & Wang, S. (2018). Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In ECCV.
Tripathi, S., Lipton, Z. C., Belongie, S., & Nguyen, T. (2016). Context matters: Refining object detection in video with recurrent neural networks. arXiv:1607.04648.
Wang, C., Zhang, Q., Huang, C., Liu, W., & Wang, X. (2018a). Mancs: A multi-task attentional network with curriculum sampling for person re-identification. In ECCV.
Wang, F., Cheng, J., Liu, W., & Liu, H. (2018b). Additive margin softmax for face verification. IEEE Signal Processing Letters, 25(7), 926–930.
Wang, G., Yuan, Y., Chen, X., Li, J., & Zhou, X. (2018c). Learning discriminative features with multiple granularities for person re-identification. In ACM MM.
Wang, H., Wang, Y., Zhou, Z., Ji, X., Gong, D., Zhou, J., Li, Z., & Liu, W. (2018d). CosFace: Large margin cosine loss for deep face recognition. In CVPR.
Wang, S., Zhou, Y., Yan, J., & Deng, Z. (2018e). Fully motion-aware network for video object detection. In ECCV.
Wang, X. & Gupta, A. (2018). Videos as space-time region graphs. In ECCV.
Wang, Y., Chen, Z., Wu, F., & Wang, G. (2018f). Person re-identification with cascaded pairwise convolutions. In CVPR.
Wang, Y., Wang, L., You, Y., Zou, X., Chen, V., Li, S., Huang, G., Hariharan, B., & Weinberger, K. Q. (2018g). Resource aware person re-identification across multiple resolutions. In CVPR.
Wei, L., Zhang, S., Gao, W., & Tian, Q. (2018). Person transfer GAN to bridge domain gap for person re-identification. In CVPR.
Wei, L., Zhang, S., Yao, H., Gao, W., & Tian, Q. (2017). GLAD: Global–local-alignment descriptor for pedestrian retrieval. In ACM MM.
Wen, Y., Zhang, K., Li, Z., & Qiao, Y. (2016). A discriminative feature learning approach for deep face recognition. In ECCV.
Wu, H., Chen, Y., Wang, N., & Zhang, Z. (2019). Sequence level semantics aggregation for video object detection. In ICCV.
Wu, Y., Lin, Y., Dong, X., Yan, Y., Ouyang, W., & Yang, Y. (2018). Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning. In CVPR.
Xiao, F. & Lee, Y. J. (2018). Video object detection with an aligned spatial-temporal memory. In ECCV.
Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In CVPR.
Yang, W., Huang, H., Zhang, Z., Chen, X., Huang, K., & Zhang, S. (2019). Towards rich feature discovery with class activation maps augmentation for person re-identification. In CVPR.
Yu, R., Zhou, Z., Bai, S., & Bai, X. (2017). Divide and fuse: A re-ranking approach for person re-identification. In BMVC.
Yun, S., Han, D., Oh, S. J., Chun, S., Choe, J., & Yoo, Y. (2019). CutMix: Regularization strategy to train strong classifiers with localizable features. In ICCV.
Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv:1710.09412.
Zhao, Y., Shen, X., Jin, Z., Lu, H., & Hua, X. (2019). Attribute-driven feature disentangling and temporal aggregation for video person re-identification. In CVPR.
Zheng, L., Bie, Z., Sun, Y., Wang, J., Su, C., Wang, S., & Tian, Q. (2016). MARS: A video benchmark for large-scale person re-identification. In ECCV.
Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., & Tian, Q. (2015). Scalable person re-identification: A benchmark. In ICCV.
Zheng, Z., Zheng, L., & Yang, Y. (2017a). A discriminatively learned CNN embedding for person reidentification. ACM Transactions on Multimedia Computing, Communications, and Applications, 14(1), 13.
Zheng, Z., Zheng, L., & Yang, Y. (2017b). Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In ICCV.
Zhong, Z., Zheng, L., Cao, D., & Li, S. (2017). Re-ranking person re-identification with k-reciprocal encoding. In CVPR.
Zhong, Z., Zheng, L., Kang, G., Li, S., & Yang, Y. (2020). Random erasing data augmentation. In AAAI.
Zhu, X., Dai, J., Yuan, L., & Wei, Y. (2018). Towards high performance video object detection. In CVPR.
Zhu, X., Wang, Y., Dai, J., Yuan, L., & Wei, Y. (2017a). Flow-guided feature aggregation for video object detection. In ICCV.
Zhu, X., Xiong, Y., Dai, J., Yuan, L., & Wei, Y. (2017b). Deep feature flow for video recognition. In CVPR.
