Tatung University Digital Thesis System

Title page for etd-0813108-143807


URN etd-0813108-143807
Statistics This thesis has been viewed 2729 times and downloaded 1449 times.
Author Ming-Yuan Zhong
Author's Email Address Not public
Department Computer Science and Engineering
Year 2007
Semester 2
Degree Master
Type of Document Master's Thesis
Language English
Page Count 46
Title Power Improvement Using Block-Based Loop Buffer with Innermost Loop Control
Keywords
  • Basic block
  • Trace cache
  • Innermost loop
  • Loop buffer
Abstract A loop buffer is a small memory located between the CPU and the level-one instruction cache (hereafter IL1). Unlike a cache dedicated to instructions, the loop buffer only keeps instructions in sequence, so it is smaller and faster than the main cache. The instruction fetch unit benefits most when the loop buffer is large enough to hold every instruction in a loop: the instructions are fetched from the cache only once, and the buffer then delivers them to the CPU core at a very low energy cost.
    In previous research, the controller begins detecting the innermost loop at the fetch stage, where whether a branch is treated as taken or not taken depends mainly on the branch predictor. Once a backward or forward branch in the loop is mispredicted, the controller has to flush the instructions in the buffer, then detect and refill a new loop from the main cache. Forward branches in particular are so unstable that the predictor is of little use for them, and the resulting flushes waste more fetch power. We therefore attempt to introduce the concept of a trace cache, which is itself quite bulky and complicated, into the architecture of the loop buffer. Using a trace cache directly as a loop buffer does save energy, but it degrades overall performance because of the long latency at the fetch stage. We propose two methods: (1) detecting the innermost loop at the commit stage while filling and activating the buffer at the fetch stage; and (2) helping the loop buffer store innermost loops that contain forward branches by packing the instructions captured from the instruction cache into basic blocks. With these modifications, we aim to strengthen the loop buffer so that it improves performance and saves more power.
    Results with SPEC2000 show reductions in instruction fetch power of up to 45% (integer benchmarks) and 55% (floating-point benchmarks) compared with a design without a loop buffer. Furthermore, we obtain a power improvement of 3% (integer benchmarks) and 2% (floating-point benchmarks) over a loop buffer design that handles loops at the fetch stage.
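    The idea of detecting an innermost loop at commit and serving later iterations from the buffer can be illustrated with a toy model. The sketch below is not the thesis's design: the class `LoopBuffer`, the function `simulate`, the fixed 4-byte instruction size, and the per-instruction fill policy are all assumptions made for illustration, and branch misprediction is not modeled (fetch and commit see the same instruction stream). A taken backward branch seen at commit marks a loop candidate; if the loop fits in the buffer, its instructions are copied in as they are fetched from IL1, and repeated fetches are then served from the buffer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

WORD = 4  # assumed fixed instruction size in bytes (illustrative)


@dataclass
class LoopBuffer:
    """Toy loop buffer: hypothetical model, not the thesis's controller."""
    capacity: int                                  # max instructions held
    loop: Optional[Tuple[int, int]] = None         # (start_pc, end_pc)
    stored: Set[int] = field(default_factory=set)  # PCs currently buffered
    il1_fetches: int = 0
    buffer_fetches: int = 0

    def fetch(self, pc: int) -> None:
        """Serve one fetch from the buffer if possible, otherwise from IL1."""
        if self.loop is not None and not (self.loop[0] <= pc <= self.loop[1]):
            # Execution left the tracked loop: deactivate and clear the buffer.
            self.loop, self.stored = None, set()
        if self.loop is not None and pc in self.stored:
            self.buffer_fetches += 1
            return
        self.il1_fetches += 1
        if self.loop is not None:
            # Fill the buffer while the instruction comes from IL1.
            self.stored.add(pc)

    def commit_branch(self, pc: int, target: int) -> None:
        """At commit, a taken backward branch marks an innermost-loop candidate."""
        if target > pc:
            return  # taken forward branch: does not close a loop
        size = (pc - target) // WORD + 1
        if size <= self.capacity and self.loop != (target, pc):
            self.loop, self.stored = (target, pc), set()


def simulate(trace: List[int], capacity: int) -> Tuple[int, int]:
    """Replay a dynamic PC trace; return (IL1 fetches, loop-buffer fetches)."""
    lb = LoopBuffer(capacity)
    for i, pc in enumerate(trace):
        lb.fetch(pc)
        if i + 1 < len(trace) and trace[i + 1] != pc + WORD:
            # Non-sequential successor: treat as a committed taken branch.
            lb.commit_branch(pc, trace[i + 1])
    return lb.il1_fetches, lb.buffer_fetches


if __name__ == "__main__":
    # Four-instruction loop executed five times, then fall-through.
    trace = [0x100, 0x104, 0x108, 0x10C] * 5 + [0x110]
    print(simulate(trace, capacity=8))  # (9, 12)
    print(simulate(trace, capacity=2))  # (21, 0): loop does not fit
```

    In this toy model the first iteration is spent detecting the loop and the second filling the buffer, so only the third and later iterations avoid IL1. Instructions skipped by a taken forward branch during the fill are simply fetched from IL1 and added the first time they are reached; the thesis's basic-block packing is a more structured way of handling that case.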
Advisory Committee
  • Jong-Jiann Shieh - advisor
  • Chia-Ming Chang - co-chair
  • Rung-Bin Lin - co-chair
Files indicate access worldwide
Date of Defense 2008-06-26
Date of Submission 2008-08-13

