Research period: 10208~10307
As multiprocessors become mainstream in both mainframe and embedded systems, their scalability and concurrency remain limited by the power wall and the memory wall. Thermal issues and limited battery capacity must also be considered in multiprocessor systems. Thread affinity further prevents resources from being fully utilized, so part of the processors becomes dark silicon. Locality and load balance are two decisive design issues in exploiting the performance of many-core architectures. However, optimizing for only one of them can erode the potential benefit of the other, and their highly correlated impacts call for cooperative optimization. Furthermore, dynamic voltage and frequency scaling and other power management techniques change the computing capability of cores on the fly, which can also lead to load imbalance. This proposal will develop highly efficient optimization algorithms to enhance the locality and load balance of massively parallel applications. The project considers three issues of power management for multiprocessor systems: energy, thermal behavior, and performance penalty. Reducing energy consumption or lowering temperature may incur performance penalties, including frequency degradation and wakeup delay. In this project, power management is co-optimized with the load-balance problem to obtain shorter latency. Because the project focuses on many-core architectures, we will design scalable, online policies. In summary, the current design environment lacks an effective methodology to explore the huge design space; this proposal addresses both performance and power issues in multiprocessor systems.
The research in this project is scheduled as follows: in the first year, we will study static locality-aware optimization and prepare the fundamental policies for power management; in the second year, we will develop dynamic load-balance-aware thread management and hierarchical power management techniques; and in the last year, we will develop architecture-aware co-optimized runtime management. Finally, we will combine the results of the three years into a cross-layer, multi-objective methodology for many-core architectures.