A large class of loop programs used in solving differential equations, Fourier transforms, image processing, and neural processing can be translated or rewritten into a vector execution form with a pi-block dependence graph. In this paper we propose a multithreading strategy that partitions such vectorized loops into a multithreaded execution form. Each partitioned thread consists of statement instances that exhibit locality in vector registers. The scheme is a novel combination of loop unrolling, statement instance reordering, index shifting, exploitation of vector register reuse, and multithreading. For some loop programs with a pi-block dependence graph, experimental results show that our scheme helps the vector compilers of the Convex C38 series reduce the number of memory accesses and synchronizations among CPUs.