Abstract: | A job-parallel grid system treats each program to be executed as a job and dispatches it to a suitable machine on the grid. Its main advantages are that (1) job execution is easy for users to manage, and (2) existing executables can be submitted without re-engineering the program. Its disadvantage is limited programmability: jobs cannot use complex communication during computation. In contrast, a worldwide computing grid system uses the Internet and virtual machine technology to integrate heterogeneous computing resources into a single, uniform computing platform. It provides high-level communication primitives for better programmability, supports numerous coordination approaches for distributed computing, and enables dynamic system reconfiguration for dynamic load balancing. In this proposal, we propose a novel, dynamic worldwide computing platform built on top of a job-parallel computing system. The proposed platform uses Condor, a job-parallel grid system, as its backbone, and submits the virtual machines of SALSA, a worldwide computing system, together with SALSA applications to unspecified computing resources in the Condor system for execution. The resulting platform should be more flexible and easier to manage while retaining the advantages of worldwide computing, because it is in effect a system with two faces: a job-parallel one and a worldwide computing one.
To construct the proposed platform, we will first use Condor's tools to gather the available computing resources into a Condor pool (a set of computing resources on which jobs can run). Next, we will devise a mechanism to dynamically create and remove SALSA virtual machines on the Condor pool, and then develop the execution interface. Our goal is to build middleware that bridges the gap between SALSA users and the Condor system. All research results from this project, including the source code, will be released as open source for public download. ; Research period: 9808 ~ 9907 |
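
The idea of submitting SALSA virtual machines to a Condor pool as ordinary jobs can be illustrated with a minimal sketch. It is not the project's middleware. It only generates the text of a Condor submit description that would start several SALSA virtual machines (theaters). The function name `build_theater_submit`, the jar name `salsa.jar`, the Java path, and the port are hypothetical, and the theater main class `wwc.messaging.Theater` is assumed from SALSA's usual launch convention rather than taken from the abstract.

```python
from textwrap import dedent


def build_theater_submit(num_theaters: int,
                         salsa_jar: str = "salsa.jar",      # hypothetical jar name
                         port: int = 4040,                  # assumed theater port
                         java_path: str = "/usr/bin/java"   # assumed JVM location
                         ) -> str:
    """Return submit-description text that starts SALSA theaters as Condor jobs.

    Each queued job runs one JVM hosting a SALSA theater. The main class
    wwc.messaging.Theater is an assumption, not something the abstract states.
    """
    if num_theaters < 1:
        raise ValueError("num_theaters must be at least 1")
    return dedent(f"""\
        universe                = vanilla
        executable              = {java_path}
        transfer_executable     = false
        arguments               = "-cp {salsa_jar} wwc.messaging.Theater {port}"
        should_transfer_files   = YES
        when_to_transfer_output = ON_EXIT
        transfer_input_files    = {salsa_jar}
        output                  = theater.$(Cluster).$(Process).out
        error                   = theater.$(Cluster).$(Process).err
        log                     = theater.$(Cluster).log
        queue {num_theaters}
        """)


if __name__ == "__main__":
    # Print a description that would request three theaters from the pool.
    print(build_theater_submit(3))
```

Under these assumptions, saving the generated text to a file and passing it to `condor_submit` would add virtual machines to the pool, and running `condor_rm` on the resulting cluster would remove them, which is one plausible way to realize the dynamic create/remove mechanism described above.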