Execution and Cache Performance of a Decoupled Non-Blocking Multithreaded Architecture

Article

Execution and Cache Performance of a Decoupled Non-Blocking Multithreaded Architecture

Krishna M. Kavi,Roberto Giorgi,Joseph Arul-2000-01-01

0

TL;DRAbstract

In this paper we will present an evaluation of the execution performance and cache behavior of a new multithreaded architecture being investigated by the authors. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are post-stored after the completion of the thread execution. The decoupling of memory accesses from thread execution requires a separate unit to perform the necessary pre-loads and post-stores, and to control the allocation of hardware thread contexts to the enabled threads. The non-blocking nature of threads reduces the number of context switches, thus reducing the overhead in scheduling threads. Our functional execution paradigm eliminates complex hardware required for dynamic scheduling of instructions used in modern superscalar architectures. We will present our preliminary results obtained from

Chat with Paper

AI Agents for this Paper

In this paper we will present an evaluation of the execution performance and cache behavior of a new multithreaded architecture being investigated by the authors. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are post-stored after the completion of the thread execution. The decoupling of memory accesses from thread execution requires a separate unit to perform the necessary pre-loads and post-stores, and to control the allocation of hardware thread contexts to the enabled threads. The non-blocking nature of threads reduces the number of context switches, thus reducing the overhead in scheduling threads. Our functional execution paradigm eliminates complex hardware required for dynamic scheduling of instructions used in modern superscalar architectures. We will present our preliminary results obtained from

Keywords

Computer scienceBlocking (statistics)Parallel computingCacheOperating systemComputer architectureComputer network

Chat

Click to start Chat