Deep learning has demonstrated remarkable performance in object detection. However, its heavy computational and memory requirements make it difficult to deploy on edge devices that have limited computing resources and must operate in real time. To address this issue, this paper proposes a YOLO-based object detection network built on a binary backbone network. The backbone primarily uses the binary convolution operations of ReActNet, which significantly reduces the number of model parameters and the model size. Furthermore, a flexible architecture, BB-YOLO, is introduced using a hierarchical, modular design approach. In addition, a hardware accelerator for BB-YOLO combines a pipelined design with fixed-point arithmetic in place of floating-point operations, increasing neural network inference speed and reducing hardware resource usage. Experimental results show that the hardware accelerator requires 9.0967 μs to detect objects in a single image, demonstrating excellent inference acceleration compared with a computer server equipped with a Graphics Processing Unit (GPU). The proposed BB-YOLO hardware accelerator thus combines a flexible architecture with real-time inference, offering a viable solution for real-time object detection in hardware-constrained environments.
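To illustrate why binary convolution is cheap in hardware, the following is a minimal sketch of the generic XNOR-popcount formulation commonly used in binarized networks. It is not the BB-YOLO or ReActNet implementation: the function names (`binarize`, `pack_bits`, `xnor_popcount_dot`, `binary_conv1d`) are hypothetical, the example is one-dimensional for brevity, and ReActNet-specific components such as learnable thresholds and activation shifts are omitted.

```python
def binarize(values):
    """Map real values to +1/-1 with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer: bit i is 1 for +1 and 0 for -1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so the dot product is
    2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask
    matches = bin(agree).count("1")  # popcount
    return 2 * matches - n


def binary_conv1d(activations, weights):
    """Valid-mode 1D binary convolution (cross-correlation form)."""
    signs = binarize(activations)
    k = len(weights)
    w_word = pack_bits(binarize(weights))
    out = []
    for start in range(len(signs) - k + 1):
        window = pack_bits(signs[start:start + k])
        out.append(xnor_popcount_dot(window, w_word, k))
    return out


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    print(binary_conv1d([0.5, -1.2, 0.3, 2.0, -0.7], [1.0, -0.4, 0.9]))
```

Because each multiply-accumulate reduces to a bitwise XNOR followed by a population count, the operation maps onto simple logic rather than floating-point multipliers, which is the general reason binary backbones suit resource-constrained accelerators.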