
[Subcontract Proposal]

Advanced Parallelizing Compiler Technology Proposal

Research and Development Theme

"Development of APC Technology"

May 15, 2000

University

The Toho University

Representative Mitsumasa Okada, Chief of Department of Science

Location

2-2-1 Mitsuyama, Funabashi-city, Chiba (274-8510)

Contact Akimasa Yoshida, Full-time Lecturer

Information Science Studies of Department of Science

TEL 047-472-8201

FAX 047-470-0966

e-mail yoshida@is.sci.toho-u.ac.jp

[Body]

Name of research and development project:
Advanced Parallelizing Compiler Technology

Research and development theme:
Technologies for the Evaluation of Parallelizing Compiler Performance

[1] Details and Targets of Research and Development, and Research and Development Capability

1. Proposed Details and Targets of Research and Development

1-1. Overview of Research and Development

This project will develop automatic data distribution technology for the multigrain parallelization technology used in the research and development item "Development of APC Technology." To achieve high effective performance in multigrain parallel processing, as much parallelism as possible must be extracted among tasks at multiple grain sizes throughout the entire program, and data transfer between processors must be minimized. This project therefore develops automatic data distribution technology that allows data to be passed between tasks through local memory.

1-2. Details and Targets of Research and Development

To achieve high effective performance in multigrain parallel processing, this project proposes an automatic data distribution technique that extracts a high degree of parallelism while minimizing data transfer between processors.

In conventional approaches to multigrain parallel processing, a compiler automatically extracts parallelism among tasks at multiple grain sizes, and coarse-grain tasks are then assigned to processors while the program runs. Data shared between coarse-grain tasks is stored in centralized shared memory. To achieve more efficient parallel processing, the overhead of transferring data through the centralized shared memory must be reduced.

To accomplish this goal, this project will develop an automatic data distribution method by which data is passed between coarse-grain tasks, over a wide range of the program, through processor local memory rather than through centralized shared memory. The approach is to focus on sets of coarse-grain tasks between which the volume of data transfer is large, and to distribute data and processing so that the range of data each task defines matches the range that is subsequently referenced. In addition, a run-time scheduling method will be developed for the distributed multigrain tasks so that data can be passed through local memory.
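The idea of matching definition and reference ranges can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes two coarse-grain tasks (a producer loop and a consumer loop) over the same index range, and the names `aligned_partitions` and `run_localized` are hypothetical. Each processor is simulated sequentially, and a dictionary stands in for its local memory.

```python
def aligned_partitions(n, num_procs):
    """Split the index range [0, n) into contiguous, near-equal chunks."""
    base, extra = divmod(n, num_procs)
    chunks = []
    start = 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def run_localized(n, num_procs):
    """Run a producer loop and a consumer loop with aligned partitions.

    Both loops use the same chunking, so the range of A defined by the
    producer on a processor equals the range referenced by the consumer
    on that processor, and no element crosses a processor boundary.
    """
    result = [0] * n
    for proc, chunk in enumerate(aligned_partitions(n, num_procs)):
        local = {}  # stands in for the local memory of processor `proc`
        # Coarse-grain task 1 (producer): defines A[i] for i in chunk.
        for i in chunk:
            local[i] = i * i
        # Coarse-grain task 2 (consumer): references the same A[i].
        for i in chunk:
            result[i] = local[i] + 1
    return result
```

In this sketch the consumer never reads an element that another processor produced, which is the property that lets the shared array stay out of centralized shared memory.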

The novelty of this method is that, whereas conventional automatic data distribution technology for dynamic scheduling environments could be applied only to specific loop sets, this method passes data through local memory between coarse-grain tasks over a wide range of the program, reducing data transfer overhead.

<Final target>

The final target of this project is to use the automatic data distribution technology developed herein to achieve a performance improvement of 30% or more in one or more of the evaluation programs selected in the final year of the project, in comparison with cases where this technology is not used.
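The target can be checked mechanically once execution times are measured. The sketch below assumes that "performance improvement" means the speedup over the run without the technology, expressed as a percentage; the proposal does not define the metric, and the function names are hypothetical.

```python
def improvement_percent(time_without, time_with):
    """Speedup over the baseline, as a percentage (assumed definition)."""
    return (time_without / time_with - 1.0) * 100.0


def meets_final_target(timings, threshold=30.0):
    """Return True if at least one program reaches the threshold.

    `timings` maps a program name to a pair
    (time without the technology, time with the technology).
    """
    return any(
        improvement_percent(t_without, t_with) >= threshold
        for t_without, t_with in timings.values()
    )
```

Under this definition, a program that runs in 13 seconds without the technology and 10 seconds with it shows an improvement of about 30%.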

1-3. Research and Development Plan

2. Research and Development Capability

2-1. Research and Development Track Record

I, Akimasa Yoshida, have long been active in research on automatic data distribution (data localization) for automatic parallelizing compilers. My work has been published in six academic journals and presented at five international conferences. A list of my papers from recent years is given below.

1) An Interloop Same-level Data Localization Method in Hierarchical Coarse-grain Parallelization (A. Yoshida, K. Koshizuka, M. Okamoto, H. Kasahara), Journal of the Information Processing Society, Vol. 40, No. 5, pp. 2054-2063, May 1999

2) A Data-Localization Compilation Scheme Using Partial-Static Task Assignment (H. Kasahara, A. Yoshida), Journal of Parallel Computing, Vol. 24, No. 3, pp. 579-596, May 1998

3) Data-Localization Scheduling inside Processor-Cluster for Multigrain Parallel Processing (A. Yoshida, K. Koshizuka, W. Ogata, H. Kasahara), Journal of the Electronic Information Communication Society, Vol. E80-D, No. 4, pp. 473-479, April 1997

4) Data-Localization among Doall and Sequential Loops in Coarse Grain Parallel Processing (A. Yoshida, Y. Ujinaga, M. Obata, K. Kimura, H. Kasahara), Proc. of Seventh International Workshop on Compilers for Parallel Computers, pp. 266-277, Jun. 1998

2-2. Equipment Used in Research and Development

Type of Equipment: Details

Sun Enterprise 3000 (University equipment): Parallel computer (4 CPUs)

Sun Ultra5: Workstations (2)

2-3. Effects of Research and Development on Industry

The proposed automatic data distribution technology is indispensable for achieving high effective performance in parallel processing on multiprocessor systems equipped with both local memory and distributed shared memory (DSM). The achievement of this technology is expected to contribute significantly to the commercialization of automatic parallelizing compilers for multiprocessor systems.

However, a tradeoff exists between exploiting multigrain parallelism in a program and improving data locality. Commercialization of fully automatic, optimal data distribution over an entire program is not expected to happen soon, but automatic data distribution technology for sets of coarse-grain tasks connected by data dependence edges can be expected to be commercialized within the next few years.

[2] FY 2000 Research Plan (first year)

1. Details of Research and Development (FY 2000)

In this project, in connection with the automatic multigrain parallelizing technology developed as part of the APC technology, an automatic data distribution technology is developed to reduce data transfer overhead between processors. To achieve high effective performance in multigrain parallel processing, sufficient parallelism must be extracted among tasks at multiple grain sizes, and data transfer overhead must be reduced.

In FY 2000, focusing on coarse-grain task sets that do not include conditional branching, an automatic data distribution method will be developed that enables data to be passed between coarse-grain tasks connected by data dependence edges, over a wide range of the program, through local memory on the same processor rather than through centralized shared memory.

First, methods will be developed to determine the domains to which data passing through local memory is applied, taking coarse-grain task parallelism into consideration. Second, coarse-grain task decomposition methods will be developed to distribute data and processing so that data definition ranges and reference ranges are equal within the domains (coarse-grain task sets) that are candidates for data passing through local memory. Third, dynamic scheduling methods will be developed to allocate data and processing to processors when the program is run, so that data can be passed between the allocated multigrain tasks through local memory. Through the development of these technologies, the project team expects data transfer through local memory to become applicable over a wide range of the program, substantially reducing data transfer overhead.
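The third step, locality-aware scheduling, can be sketched as follows. This is a simplified simulation under stated assumptions rather than the scheduler the project will develop: tasks have no conditional branches, each task carries a hypothetical `group` label naming the task set that shares its data, and every task of a group is sent to the processor that ran the first task of that group. The function name `schedule` and the tie-break by task name are also assumptions.

```python
def schedule(tasks, num_procs):
    """Simulate locality-aware assignment of coarse-grain tasks.

    tasks: dict mapping task name -> (cost, list of predecessors, group).
    Returns a dict mapping task name -> (processor, start, finish).
    """
    proc_free = [0.0] * num_procs  # time at which each processor is idle
    group_proc = {}                # group -> processor holding its data
    placed = {}
    remaining = dict(tasks)
    while remaining:
        # A task is ready once all of its predecessors have been placed.
        ready = [
            name
            for name, (_, deps, _) in remaining.items()
            if all(d in placed for d in deps)
        ]
        if not ready:
            raise ValueError("cyclic dependence among tasks")
        name = min(ready)  # deterministic tie-break by task name
        cost, deps, group = remaining.pop(name)
        if group is not None and group in group_proc:
            # Reuse the processor whose local memory already holds the data.
            proc = group_proc[group]
        else:
            # Otherwise take the processor that becomes idle first.
            proc = min(range(num_procs), key=lambda p: proc_free[p])
            if group is not None:
                group_proc[group] = proc
        earliest = max([placed[d][2] for d in deps], default=0.0)
        start = max(proc_free[proc], earliest)
        finish = start + cost
        proc_free[proc] = finish
        placed[name] = (proc, start, finish)
    return placed
```

Binding a whole group to one processor trades some load balance for locality, which is the same tradeoff between parallelism and data locality that the proposal discusses.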

The novelty of this method is that, whereas conventional automatic data distribution technology used in dynamic scheduling environments could only be applied to specific loop sets, this method automatically distributes to each processor data shared between coarse-grain tasks over a wide range of the program.

[3] Research and Development System

1. Research Organization and Management System

1-1. Officers Responsible for Research Implementation

Mitsumasa Okada, Chief of Department of Science, The Toho University

Accounting Manager

Tatsuro Yamatsuchi, Director of Secretariat of Department of Science, The Toho University

1-2. Organization Chart

1-3. Research Locations

2-2-1 Mitsuyama, Funabashi-city, Chiba

Yoshida Laboratory, Information Science Studies of Department of Science, The Toho University (Room 4641 and Room 4660 of IV Block, 70 m² in total)

2. Names of Researchers

Name: Akimasa Yoshida (Doctor of Engineering)

Location and position: Full-time Lecturer, Information Science Studies of Department of Science, The Toho University

Main research track record and results:

  1. An Interloop Same-level Data Localization Method in Hierarchical Coarse-grain Parallelization (A. Yoshida, K. Koshizuka, M. Okamoto, H. Kasahara), Journal of the Information Processing Society, Vol. 40, No. 5, pp. 2054-2063, May 1999
  2. A Data-Localization Compilation Scheme Using Partial-Static Task Assignment (H. Kasahara, A. Yoshida), Journal of Parallel Computing, Vol. 24, No. 3, pp. 579-596, May 1998
  3. Data-Localization Scheduling inside Processor-Cluster for Multigrain Parallel Processing (A. Yoshida, K. Koshizuka, W. Ogata, H. Kasahara), Journal of the Electronic Information Communication Society, Vol. E80-D, No. 4, pp. 473-479, April 1997
  4. Data-Localization among Doall and Sequential Loops in Coarse Grain Parallel Processing (A. Yoshida, Y. Ujinaga, M. Obata, K. Kimura, H. Kasahara), Proc. of Seventh International Workshop on Compilers for Parallel Computers, pp. 266-277, Jun. 1998

Years of research experience: 9 years

[4] Agreement on Contracts

In presenting this proposal, Prof. Mitsumasa Okada confirms that the contract for the research and development project "advanced parallelizing compiler technology" is not dependent on the conclusion of a contract based on the conditions described in the Industrial Technology Research and Development Contract Agreement and Contract (model) submitted by the Organization.