References
[1] torchinfo. https://pypi.org/project/torchinfo/, 2014.
[2] Covid-19. https://www.mohw.gov.tw/cp-4634-52410-1.html, 2019.
[3] Sota. https://reurl.cc/r5kp3x, 2019.
[4] Itaewon. https://reurl.cc/V8mvaN, 2022.
[5] sports events. https://reurl.cc/Gez8Gv, 2022.
[6] Shuai Bai, Zhiqun He, Yu Qiao, Hanzhe Hu, Wei Wu, and Junjie Yan. Adaptive
dilated network with self-correction supervision for counting. In 2020 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), pages 4593–4602,
2020.
[7] Lokesh Boominathan, Srinivas S S Kruthiventi, and R. Venkatesh Babu. Crowdnet:
A deep convolutional network for dense crowd counting. New York, NY, USA, 2016.
Association for Computing Machinery.
[8] Junxu Cao, Qi Chen, Jun Guo, and Ruichao Shi. Attention-guided context feature
pyramid network for object detection, 2020.
[9] Antoni B. Chan and Nuno Vasconcelos. Bayesian poisson regression for crowd counting. In 2009 IEEE 12th International Conference on Computer Vision, pages 545–551, 2009.
[10] Ke Chen, Chen Change Loy, Shaogang Gong, and Tony Xiang. Feature mining for
localised crowd counting. In Proceedings of the British Machine Vision Conference,
pages 21.1–21.11. BMVA Press, 2012.
[11] Zhi-Qi Cheng, Qi Dai, Hong Li, JingKuan Song, Xiao Wu, and Alexander G. Hauptmann. Rethinking spatial invariance of convolutional networks for object counting,
2022.
[12] François Chollet. Xception: Deep learning with depthwise separable convolutions. In
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages
1800–1807, 2017.
[13] Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen
Wei. Deformable convolutional networks, 2017.
[14] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern
Recognition (CVPR’05), 1:886–893 vol. 1, 2005.
[15] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua
Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold,
Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words:
Transformers for image recognition at scale, 2020.
[16] Juergen Gall, Angela Yao, Nima Razavi, Luc Van Gool, and Victor S. Lempitsky.
Hough forests for object detection, tracking, and action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33:2188–2202, 2011.
[17] Chenqiang Gao, Jun Liu, Qi Feng, and Jing Lv. People-flow counting in complex
environments by combining depth and color information. Multimedia Tools and Applications, 75:9315–9331, 2016.
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning
for image recognition. CoRR, abs/1512.03385, 2015.
[19] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning
for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pages 770–778, 2016.
[20] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang,
Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: Efficient convolutional neural networks for mobile vision applications, 2017.
[21] Haroon Idrees, Imran Saleemi, Cody Seibert, and Mubarak Shah. Multi-source multi-scale counting in extremely dense crowd images. In 2013 IEEE Conference on Computer Vision and Pattern Recognition, pages 2547–2554, 2013.
[22] Xiaoheng Jiang, Li Zhang, Mingliang Xu, Tianzhu Zhang, Pei Lv, Bing Zhou, Xin
Yang, and Yanwei Pang. Attention scaling for crowd counting. In 2020 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), pages 4705–4714,
2020.
[23] Harold W. Kuhn. The hungarian method for the assignment problem. Naval Research
Logistics (NRL), 52, 1955.
[24] Harold W. Kuhn. The hungarian method for the assignment problem. In Michael
Jünger, Thomas M. Liebling, Denis Naddef, George L. Nemhauser, William R. Pulleyblank, Gerhard Reinelt, Giovanni Rinaldi, and Laurence A. Wolsey, editors, 50
Years of Integer Programming 1958-2008 - From the Early Years to the State-of-the-Art, pages 29–47. Springer, 2010.
[25] Yinjie Lei, Yan Liu, Pingping Zhang, and Lingqiao Liu. Towards using count-level
weak supervision for crowd counting. Pattern Recognition, 109:107616, 2021.
[26] Min Li, Zhaoxiang Zhang, Kaiqi Huang, and Tieniu Tan. Estimating the number of
people in crowded scenes by mid based foreground segmentation and head-shoulder
detection. 2008 19th International Conference on Pattern Recognition, pages 1–4,
2008.
[27] Sheng-Fuu Lin, Jaw-Yeh Chen, and Hung-Xin Chao. Estimation of number of people
in crowded scenes using perspective transformation. IEEE Trans. Syst. Man Cybern.
Part A, 31:645–654, 2001.
[28] Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and
Serge Belongie. Feature pyramid networks for object detection. In 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pages 936–944,
2017.
[29] Jiang Liu, Chenqiang Gao, Deyu Meng, and Alexander G. Hauptmann. Decidenet:
Counting varying density crowds through attention guided detection and density
estimation, 2017.
[30] Weizhe Liu, Mathieu Salzmann, and Pascal Fua. Context-aware crowd counting,
2018.
[31] Weizhe Liu, Mathieu Salzmann, and Pascal Fua. Context-aware crowd counting,
2019.
[32] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin,
and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV),
pages 9992–10002, 2021.
[33] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin,
and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted
windows. CoRR, abs/2103.14030, 2021.
[34] Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell,
and Saining Xie. A convnet for the 2020s. In 2022 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), pages 11966–11976, 2022.
[35] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.
[36] Ze Lu, Xudong Jiang, and Alex Kot. Deep coupled resnet for low-resolution face
recognition. IEEE Signal Processing Letters, 25(4):526–530, 2018.
[37] Zhiheng Ma, Xing Wei, Xiaopeng Hong, and Yihong Gong. Bayesian loss for crowd
count estimation with point supervision, 2019.
[38] Zhiheng Ma, Xing Wei, Xiaopeng Hong, and Yihong Gong. Bayesian loss for crowd
count estimation with point supervision. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 6141–6150, 2019.
[39] Yunqi Miao, Zijia Lin, Guiguang Ding, and Jungong Han. Shallow feature based
dense attention network for crowd counting, 2020.
[40] Xianfeng Ou, Pengcheng Yan, Yiming Zhang, Bing Tu, Guoyun Zhang, Jianhui Wu,
and Wujing Li. Moving object detection method via resnet-18 with encoder–decoder
structure in complex scenes. IEEE Access, 7:108152–108160, 2019.
[41] N. Paragios and V. Ramesh. A mrf-based approach for real-time subway monitoring.
In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision
and Pattern Recognition. CVPR 2001, volume 1, pages I–I, 2001.
[42] Viet Quoc Pham, Tatsuo Kozakaya, Osamu Yamaguchi, and Ryuzo Okada. Count
forest: Co-voting uncertain number of targets using random forest for crowd density
estimation. 2015 IEEE International Conference on Computer Vision (ICCV), pages
3253–3261, 2015.
[43] Viet-Quoc Pham, Tatsuo Kozakaya, Osamu Yamaguchi, and Ryuzo Okada. Count
forest: Co-voting uncertain number of targets using random forest for crowd density
estimation. In 2015 IEEE International Conference on Computer Vision (ICCV),
pages 3253–3261, 2015.
[44] David Ryan, Simon Denman, Clinton Fookes, and Sridha Sridharan. Crowd counting
using multiple local features. In 2009 Digital Image Computing: Techniques and
Applications, pages 81–88, 2009.
[45] Payam Sabzmeydani and Greg Mori. Detecting pedestrians by learning shapelet
features. 2007 IEEE Conference on Computer Vision and Pattern Recognition, pages
1–8, 2007.
[46] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh
Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In 2018 IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
[47] Miaojing Shi, Zhaohui Yang, Chao Xu, and Qijun Chen. Revisiting perspective
information for efficient crowd counting, 2018.
[48] Xiaowen Shi, Xin Li, Caili Wu, Shuchen Kong, Jing Yang, and Liang He. A real-time deep network for crowd counting. In ICASSP 2020 - 2020 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2328–2332,
2020.
[49] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition, 2014.
[50] Vishwanath A. Sindagi and Vishal M. Patel. Cnn-based cascaded multi-task learning
of high-level prior and density estimation for crowd counting. In 2017 14th IEEE
International Conference on Advanced Video and Signal Based Surveillance (AVSS),
pages 1–6, 2017.
[51] Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie
Wang, Jilin Li, Feiyue Huang, and Yang Wu. Rethinking counting and localization
in crowds: A purely point-based framework, 2021.
[52] Russell Stewart, Mykhaylo Andriluka, and Andrew Y. Ng. End-to-end people detection in crowded scenes. In 2016 IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages 2325–2333.
IEEE Computer Society, 2016.
[53] Pongpisit Thanasutives, Ken-ichi Fukui, Masayuki Numao, and Boonserm Kijsirikul.
Encoder-decoder based convolutional neural networks with multi-scale-aware modules for crowd counting. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 2382–2389, 2021.
[54] Pongpisit Thanasutives, Ken-ichi Fukui, Masayuki Numao, and Boonserm Kijsirikul.
Encoder-decoder based convolutional neural networks with multi-scale-aware modules for crowd counting. In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, Jan. 2021.
[55] Vladimir Vapnik, Steven E. Golowich, and Alex Smola. Support vector method for
function approximation, regression estimation and signal processing. In Proceedings of the 9th International Conference on Neural Information Processing Systems,
NIPS’96, pages 281–287, Cambridge, MA, USA, 1996. MIT Press.
[56] Paul A. Viola and Michael Jones. Robust real-time face detection. International
Journal of Computer Vision, 57:137–154, 2001.
[57] Paul A. Viola, Michael J. Jones, and Daniel Snow. Detecting pedestrians using patterns of motion and appearance. International Journal of Computer Vision, 63:153–161, 2003.
[58] Boyu Wang, Huidong Liu, Dimitris Samaras, and Minh Hoai. Distribution matching
for crowd counting, 2020.
[59] Qian Wang and Toby P. Breckon. Crowd counting via segmentation guided attention networks and curriculum loss. IEEE Transactions on Intelligent Transportation
Systems, 23(9):15233–15243, 2022.
[60] Yi Wang, Junhui Hou, Xinyu Hou, and Lap-Pui Chau. A self-training approach for
point-supervised object detection and counting in crowds. IEEE Transactions on
Image Processing, 30:2876–2887, 2021.
[61] Zijun Wei, Boyu Wang, Minh Hoai, Jianming Zhang, Xiaohui Shen, Zhe Lin, Radomír
Mech, and Dimitris Samaras. Sequence-to-segments networks for detecting segments
in videos. IEEE Trans. Pattern Anal. Mach. Intell., 43(3):1009–1021, 2021.
[62] Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So
Kweon, and Saining Xie. Convnext v2: Co-designing and scaling convnets with
masked autoencoders, 2023.
[63] Bo Wu and Ramakant Nevatia. Detection of multiple, partially occluded humans
in a single image by bayesian combination of edgelet part detectors. Tenth IEEE
International Conference on Computer Vision (ICCV’05) Volume 1, 1:90–97 Vol. 1,
2005.
[64] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. Aggregated
residual transformations for deep neural networks. In 2017 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pages 5987–5995, 2017.
[65] Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions, 2015.
[66] Cong Zhang, Hongsheng Li, Xiaogang Wang, and Xiaokang Yang. Cross-scene crowd
counting via deep convolutional neural networks. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 833–841, 2015.
[67] Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, and Yi Ma. Single-image
crowd counting via multi-column convolutional neural network. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[68] Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, and Yi Ma. Single-image
crowd counting via multi-column convolutional neural network. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 589–597, 2016.