Thursday, January 6, 2022

How does OpenSPARC T1 flush its pipeline?

I came across this section in ifu/rtl/sparc_ifu_fcl.v:

//-------------------------
// Rollback
//-------------------------

   // 04/05/02
   // Looks like we made a mistake with rollback.  Should never
   // rollback to S.  In the event of a dmiss or mul contention, just
   // kill all the instructions and rollback to F.  This adds one
   // cycle to the dmiss penalty and to the mul latency if we have to
   // wait, both not a very high price to pay.  This would have saved
   // lots of hours of design and verif time.
   //    
   assign rb2_inst_d = thr_match_dw & inst_vld_d & dtu_fcl_rollback_g;
   assign rb1_inst_s = thr_match_fw & inst_vld_s & dtu_fcl_rollback_g;
   assign rb0_inst_bf = thr_match_nw & switch_bf & dtu_fcl_rollback_g;


   assign retract_iferr_d1 = erb_dtu_ifeterr_d1 & inst_vld_d1;

   assign retract_inst_d = retract_iferr_d1 & thr_match_de &
                           fcl_dtu_inst_vld_d |
                           mark4rb_d |
                           dtu_fcl_retract_d;

   assign rt1_inst_s = thr_match_fd & inst_vld_s & dtu_fcl_retract_d |
                       mark4rb_s;


So it looks like there is both rollback (rb) and kill. The code above also has a separate retract (rt) path.



   // determine rollback amount
   assign rb_frome = {4{(rb2_inst_e | rt2_inst_e) &
                        (inst_vld_e | intr_vld_e)}} & thr_e;
   assign rb_fromd = {4{(rb1_inst_d | rt1_inst_d) &
                        (inst_vld_d | intr_vld_d)}} & thr_d;
   assign rb_froms = {4{rb_stg_s & inst_vld_s_crit}} & thr_f;
   assign rb_w2 = rb_frome | rb_fromd;
   assign rb_for_iferr_e = {4{retract_iferr_e}} & thr_e;
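
The `{4{...}} & thr_e` pattern above can be read as "replicate a 1-bit condition across four bits, then mask it with the thread vector". Below is a minimal Python sketch of that reading. It assumes thr_e and thr_d are one-hot 4-bit thread vectors, which I have not confirmed in the RTL, and the helper names replicate4 and rb_from_stage are made up for illustration.

```python
def replicate4(bit):
    """Model Verilog {4{bit}}: copy a 1-bit value into a 4-bit vector."""
    return 0b1111 if bit else 0b0000


def rb_from_stage(rb_inst, rt_inst, inst_vld, intr_vld, thr_onehot):
    """Per-thread rollback mask for one stage, following the shape of rb_frome.

    thr_onehot is assumed to be a one-hot 4-bit thread vector (an assumption).
    """
    cond = (rb_inst or rt_inst) and (inst_vld or intr_vld)
    return replicate4(cond) & thr_onehot


# A rollback request in E for the thread at bit 2, and nothing pending in D.
rb_frome = rb_from_stage(True, False, True, False, thr_onehot=0b0100)
rb_fromd = rb_from_stage(False, False, True, False, thr_onehot=0b0001)
rb_w2 = rb_frome | rb_fromd  # same OR as "assign rb_w2 = rb_frome | rb_fromd"
print(format(rb_w2, "04b"))
```

Under that assumption, only the bit of the thread that owns the offending instruction is set, so the other three threads are not rolled back.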



I feel there should be a part that controls the pipeline registers directly, but I haven't found it yet.

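When I flushed a pipeline in my own designs, I reset the pipeline registers so that they were cleared to 0. That is equivalent to inserting pipeline bubbles, since NOP is encoded as 00000000.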

 

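It turns out OpenSPARC T1 does not take this approach. An instruction has really "executed" only when it has modified a register or memory, or otherwise affected system state. Take an add instruction, add x2 x0 x3, which copies the value of register x2 into x3. The instruction can be cleared to 0 in the inst pipeline register, or it can simply never write the regfile at the end. In the second case the instruction is also as good as not executed (it did execute, but it had no effect).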
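
To make the two flush styles concrete, here is a minimal Python sketch. It is a toy model, not OpenSPARC code: the instruction encoding, the decode helper, and both run_* functions are hypothetical, and the only thing taken from the text is the idea that an all-zero word acts as a NOP.

```python
NOP = 0x00000000  # assumption from the post: an all-zero word is a NOP


def decode(word):
    """Hypothetical toy encoding: (rd << 16) | (rs << 8) | imm; 0 means NOP."""
    if word == NOP:
        return None
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


def run_with_bubble(regs, word, flush):
    """Flush by clearing the instruction register, so a NOP flows down the pipe."""
    if flush:
        word = NOP
    op = decode(word)
    if op is not None:
        rd, rs, imm = op
        regs[rd] = regs[rs] + imm
    return regs


def run_with_kill(regs, word, kill):
    """Flush by letting the instruction compute, then gating the write enable."""
    rd, rs, imm = decode(word)
    result = regs[rs] + imm  # the ALU still does the work
    wen = not kill           # the kill only clears the write enable
    if wen:
        regs[rd] = result
    return regs


add = (3 << 16) | (2 << 8) | 0  # toy "x3 = x2 + 0"
print(run_with_bubble({2: 7, 3: 0}, add, flush=True))
print(run_with_kill({2: 7, 3: 0}, add, kill=True))
```

Both calls leave x3 unchanged, which is the point: seen from the register file, a killed write and a bubble are indistinguishable.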

OpenSPARC presumably works the second way. Take the ifu_exu_kill_e signal, for example; its comment states the purpose quite clearly

(generated by the ifu and sent to the exu, it kills the instruction currently in pipeline stage E)

input        ifu_exu_kill_e;         // kill instruction in e-stage


It enters sparc_exu() -> sparc_exu_ecl()

From there ifu_exu_kill_e goes into each of

sparc_exu_eclccr()
sparc_exu_ecl_wb()
sparc_exu_eclbyplog_rs1()
sparc_exu_eclbyplog byplog_rs2()
sparc_exu_eclbyplog byplog_rs3()
sparc_exu_eclbyplog byplog_rs3h()


The register bypassing logic all needs this signal, and presumably almost every place that updates a register has to use it.

Look at the main one, sparc_exu_ecl_wb(), the writeback control logic:

//  Module Name: sparc_exu_ecl_wb
//      Description:  Implements the writeback logic for the exu.
//              This includes the control signals for the w1 and w2 input
//      muxes as well as keeping track of the wen signal for ALU ops.

"keeping track of the wen signal for ALU ops" is presumably where ifu_exu_kill_e comes in.


   assign wen_w_inst_vld = valid_w | inst_vld_noflush_wen_w;
   assign ecl_irf_wen_w = ifu_exu_inst_vld_w & wen_w_inst_vld | wen_no_inst_vld_w;

   // bypass valid logic and flops
   dff_s dff_wb_d2e(.din(ifu_exu_wen_d), .clk(clk), .q(wb_e), .se(se),
                  .si(), .so());
   dff_s dff_wb_e2m(.din(valid_e), .clk(clk), .q(wb_m), .se(se),
                  .si(), .so());
   dffr_s dff_wb_m2w(.din(valid_m), .clk(clk), .q(wb_w), .se(se),
                  .si(), .so(), .rst(reset));
   assign  valid_e = wb_e & ~ifu_exu_kill_e & ~restore_e & ~wrsr_e;// restore doesn't finish on time
   assign  bypass_m = wb_m;// bypass doesn't need to check for traps or sehold
   assign  valid_m = bypass_m & ~rml_ecl_kill_m & ~sehold;// sehold turns off writes from this path
   assign  valid_w = (wb_w & ~early_flush_w & ~ifu_tlu_flush_w);// check inst_vld later
   // don't check flush for bypass
   assign  bypass_w = wb_w | inst_vld_noflush_wen_w | wen_no_inst_vld_w;

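In the end ifu_exu_kill_e is combined with other signals and passes through several pipeline stages (valid_e -> wb_m -> valid_m -> wb_w -> valid_w -> wen_w_inst_vld) before it finally affects ecl_irf_wen_w. That signal is an output of sparc_exu_ecl() and goes into bw_r_irf irf(), which is the integer register file.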
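
The chain from ifu_exu_kill_e to ecl_irf_wen_w can be stepped through with a small cycle-level Python model. This is only a sketch transcribed from the assign statements quoted above: the three flops become fields of a hypothetical WbState class, reset and the scan pins are ignored, and every input not mentioned in a call is assumed inactive.

```python
from dataclasses import dataclass


@dataclass
class WbState:
    """The three flops from the excerpt: dff_wb_d2e, dff_wb_e2m, dff_wb_m2w."""
    wb_e: bool = False
    wb_m: bool = False
    wb_w: bool = False


def step(s, ifu_exu_wen_d, ifu_exu_kill_e=False, restore_e=False,
         wrsr_e=False, rml_ecl_kill_m=False, sehold=False):
    """Advance one clock: compute valid_e/valid_m, then load the flops."""
    valid_e = s.wb_e and not ifu_exu_kill_e and not restore_e and not wrsr_e
    valid_m = s.wb_m and not rml_ecl_kill_m and not sehold
    return WbState(wb_e=ifu_exu_wen_d, wb_m=valid_e, wb_w=valid_m)


def ecl_irf_wen_w(s, ifu_exu_inst_vld_w=True, early_flush_w=False,
                  ifu_tlu_flush_w=False, inst_vld_noflush_wen_w=False,
                  wen_no_inst_vld_w=False):
    """Write enable seen by the register file in W."""
    valid_w = s.wb_w and not early_flush_w and not ifu_tlu_flush_w
    wen_w_inst_vld = valid_w or inst_vld_noflush_wen_w
    return (ifu_exu_inst_vld_w and wen_w_inst_vld) or wen_no_inst_vld_w


def trace(kill_in_e):
    """Follow one ALU op with wen set in D until it reaches W."""
    s = step(WbState(), ifu_exu_wen_d=True)                           # D -> E
    s = step(s, ifu_exu_wen_d=False, ifu_exu_kill_e=kill_in_e)         # E -> M
    s = step(s, ifu_exu_wen_d=False)                                   # M -> W
    return ecl_irf_wen_w(s)


print(trace(kill_in_e=False), trace(kill_in_e=True))
```

In this model a kill asserted while the instruction is in E clears the bit that would have become wb_w two clocks later, so the write enable never reaches the register file.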


//  Module Name: bw_r_irf
//      Description: Register file with 3 read ports and 2 write ports.  Has
//                              32 registers per thread with 4 threads.  Reading and writing
//                              the same register concurrently produces x.



module bw_r_irf (/*AUTOARG*/
   // Outputs
   so, irf_byp_rs1_data_d_l, irf_byp_rs2_data_d_l,
   irf_byp_rs3_data_d_l, irf_byp_rs3h_data_d_l,
   // Inputs
   rclk, reset_l, si, se, sehold, rst_tri_en, ifu_exu_tid_s2,
   ifu_exu_rs1_s, ifu_exu_rs2_s, ifu_exu_rs3_s, ifu_exu_ren1_s,
   ifu_exu_ren2_s, ifu_exu_ren3_s, ecl_irf_wen_w, ecl_irf_wen_w2,
   ecl_irf_rd_m, ecl_irf_rd_g, byp_irf_rd_data_w, byp_irf_rd_data_w2,
   ecl_irf_tid_m, ecl_irf_tid_g, rml_irf_old_lo_cwp_e,
   rml_irf_new_lo_cwp_e, rml_irf_old_e_cwp_e, rml_irf_new_e_cwp_e,
   rml_irf_swap_even_e, rml_irf_swap_odd_e, rml_irf_swap_local_e,
   rml_irf_kill_restore_w, rml_irf_cwpswap_tid_e, rml_irf_old_agp,
   rml_irf_new_agp, rml_irf_swap_global, rml_irf_global_tid
   ) ;


This register file is fairly complex and has a lot of ports... But it is clear that if the write enable (ecl_irf_wen_w) is not asserted, the register file is not written.

 

But why do it this way instead of using pipeline bubbles? I haven't figured that out yet...

However, ifu_exu_kill_e does not seem to reach the lsu. For example:

ld [%L1],%L2 
ld invalid address (triggers an exception)

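On an exception or interrupt the pipeline has to be flushed. If that is done the OpenSPARC T1 way, the preceding instruction is still sent to the lsu, and only its final result is kept from being written back to the register. But by then this step would already have triggered cache-related activity, wouldn't it?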
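
The worry can be phrased as a toy Python model. To be clear, this does not describe how the real T1 lsu behaves, which I have not traced; ToyCache and execute_load are hypothetical, and the sketch only shows what it would mean if a load were killed at writeback and nowhere earlier.

```python
class ToyCache:
    """Hypothetical cache that only records which addresses were accessed."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = set()

    def load(self, addr):
        self.lines.add(addr)  # side effect: the access is now visible in the cache
        return self.memory[addr]


def execute_load(cache, regs, addr, rd, kill_at_writeback):
    """Assumed behavior: the access happens first, the kill only gates the write."""
    value = cache.load(addr)
    if not kill_at_writeback:
        regs[rd] = value
    return regs


cache = ToyCache({0x40: 123})
regs = execute_load(cache, {"L2": 0}, 0x40, "L2", kill_at_writeback=True)
print(regs, cache.lines)
```

In this model the register keeps its old value while the cache has still recorded the access. Whether the real design lets a killed load get that far is exactly the open question.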
