05 Flip-Flops, Registers, and Counters

This chapter transitions from combinational to sequential logic. It starts with the basic SR latch built from cross-coupled NOR or NAND gates, then introduces the gated SR latch, gated D latch, and the crucial edge-triggered D flip-flop (master-slave configuration). T flip-flops and JK flip-flops are also covered.

Visual tour: latches → flip-flops

1. Gated SR Latch — adds a clock/enable input to the basic SR latch. The cross-coupled NOR loop only responds when CLK=1. The S=R=1 input is still forbidden.

Figure: gated SR latch (inputs S, R, CLK; two AND gates feeding two cross-coupled NOR gates; output Q). Q sets/resets only while CLK=1.

Timing diagram: CLK, S, R, and Q over t0 to t10.

2. Gated D Latch — eliminates the forbidden state by deriving R from D̄. With one D input, S = D·CLK and R = D̄·CLK, so Q always follows D when the latch is enabled. Still level-sensitive: while CLK=1, every change of D propagates through.

Figure: gated D latch (inputs D, CLK; one NOT gate and two AND gates feeding two cross-coupled NOR gates; output Q). Q transparently follows D while CLK=1 and holds when CLK=0.

Timing diagram: CLK, D, and Q over t0 to t10.
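The level-sensitive behavior described above can be sketched in Verilog roughly as follows. This is a minimal illustration, not the textbook's listing; the module name d_latch_sketch is an assumption.

```verilog
// Gated D latch sketch: Q follows D while Clk = 1 and holds while Clk = 0.
// The module name is hypothetical, chosen for illustration only.
module d_latch_sketch (
  input      D, Clk,
  output reg Q
);
  // There is no else branch, so Q must keep its old value when Clk = 0.
  // That implied memory is what makes a synthesizer infer a latch.
  always @(D, Clk)
    if (Clk)
      Q <= D;
endmodule
```

Because both D and Clk are in the sensitivity list, any change on D while Clk is high propagates to Q, which is the transparency the timing diagram shows.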

3. Edge-Triggered D Flip-Flop (master-slave): two D latches in series, enabled on opposite clock phases. While CLK=0 the master tracks D (the slave is locked); when CLK rises, the master locks and the slave opens, capturing exactly one snapshot of D per rising edge. The result behaves as an edge-triggered element: D can change freely between edges, outside the setup and hold window around the rising edge, without affecting Q.

Figure: edge-triggered D flip-flop built from a master D latch and a slave D latch (each with D, EN, Q), with NOT gates on the clock path; output Q. Q samples D only on the rising edge of CLK.

Timing diagram: CLK, D, and Q over t0 to t10.
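One way to see the master-slave idea in code is to wire two latch modules to opposite clock phases. The sketch below is illustrative only: the module names d_latch_cell and ms_dff_sketch and the internal wire Qm are assumptions, and real designs would normally just write always @(posedge CLK).

```verilog
// Level-sensitive D latch used as the building block (hypothetical name).
module d_latch_cell (
  input      D, EN,
  output reg Q
);
  always @(D, EN)
    if (EN)
      Q <= D;          // transparent while EN = 1, holds while EN = 0
endmodule

// Master-slave D flip-flop sketch: positive-edge behavior from two latches.
module ms_dff_sketch (
  input  D, CLK,
  output Q
);
  wire Qm;             // master output, hidden from the outside

  // Master is open while CLK = 0, so it tracks D before the rising edge.
  d_latch_cell master (.D(D),  .EN(~CLK), .Q(Qm));
  // Slave is open while CLK = 1, so it passes the value the master froze.
  d_latch_cell slave  (.D(Qm), .EN(CLK),  .Q(Q));
endmodule
```

At no time are both latches transparent, so there is never a direct path from D to Q; Q can only change when CLK rises.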

4. T Flip-Flop: a toggle flip-flop. When T=1, Q flips on each rising clock edge; when T=0, Q holds. Built from a D-FF with D = T ⊕ Q. Used as the building block of binary counters (with T=1, each FF divides its clock frequency by 2).

Figure: T flip-flop (T and Q feed an XOR gate that drives the D input of a positive-edge D flip-flop; output Q). Q toggles on the rising edge when T=1 and holds when T=0.

Timing diagram: CLK, T, and Q over t0 to t10.
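The relation D = T ⊕ Q translates almost directly into Verilog. The following is a minimal sketch; the module name t_ff_sketch and the active-low asynchronous reset Resetn are assumptions added so that Q has a defined starting value in simulation.

```verilog
// T flip-flop sketch built on the D = T ^ Q relation.
// Resetn (active-low, asynchronous) is an assumption, not part of the figure.
module t_ff_sketch (
  input      T, Clock, Resetn,
  output reg Q
);
  always @(posedge Clock or negedge Resetn)
    if (!Resetn)
      Q <= 1'b0;       // known start state
    else
      Q <= T ^ Q;      // T = 1 toggles Q, T = 0 keeps Q
endmodule
```

With T held at 1, Q changes once per rising clock edge, so its frequency is half the clock frequency.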

5. JK Flip-Flop: the most general of the single-bit storage elements covered here, with four actions: hold (00), reset (01), set (10), and toggle (11). It effectively combines SR and T behavior. Historically common in TTL designs; modern FPGA designs use D-FFs instead, since the fabric provides D-type registers and JK behavior is built from a D-FF plus logic.

Figure: edge-triggered JK flip-flop (inputs J, K, CLK; output Q).

Characteristic table:

J | K | Q⁺ | action
0 | 0 | Q | hold
0 | 1 | 0 | reset
1 | 0 | 1 | set
1 | 1 | Q̄ | toggle

Timing diagram: CLK, J, K, and Q over t0 to t10.
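The characteristic table maps naturally onto a case statement. The sketch below is one possible behavioral description; the module name jk_ff_sketch is an assumption.

```verilog
// JK flip-flop sketch: one case item per row of the characteristic table.
module jk_ff_sketch (
  input      J, K, Clock,
  output reg Q
);
  always @(posedge Clock)
    case ({J, K})
      2'b00: Q <= Q;      // hold
      2'b01: Q <= 1'b0;   // reset
      2'b10: Q <= 1'b1;   // set
      2'b11: Q <= ~Q;     // toggle
    endcase
endmodule
```

The four rows are equivalent to the single equation Q⁺ = J·Q̄ + K̄·Q, which is how the same behavior would be built around a D flip-flop.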

Summary: latches are level-sensitive (transparent while enabled); flip-flops are edge-sensitive (sample on a clock edge, hold otherwise). The master-slave construction is the bridge — two latches with opposite clock phases turn level-sensitivity into edge-sensitivity.

Timing parameters (setup time, hold time, propagation delay) are defined. The chapter then builds up to registers, shift registers (serial/parallel), and counters. Counter types include asynchronous (ripple) counters, synchronous counters, counters with parallel load, BCD counters, ring counters, and Johnson counters. The Verilog section introduces always @(posedge clk) for sequential logic and the distinction between blocking and non-blocking assignments.
What is blocking vs. non-blocking?

In Verilog/SystemVerilog, the choice between = (blocking) and <= (non-blocking) is fundamental. The difference matters most inside always @(posedge clk) blocks — getting it wrong silently changes your hardware.

The core distinction:

  • Blocking (=): Executes sequentially, like software. Each statement finishes before the next begins.
  • Non-blocking (<=): All right-hand sides are evaluated first using the old values, then all left-hand sides update simultaneously at the end of the time step — modeling how real flip-flops work.

Classic shift-register example — same code, different hardware:

Non-blocking (correct: 3-stage shift):

always @(posedge clk) begin
    b <= a;
    c <= b;
    d <= c;
end
// a → [FF] → b → [FF] → c → [FF] → d
// Three flip-flops, true shift register

Blocking (collapses to 1 flip-flop):

always @(posedge clk) begin
    b = a;
    c = b;
    d = c;
end
// b, c, d all = a after the edge
// Synthesizer collapses to one FF

Why the blocking version "collapses": after each clock edge b, c, and d always hold the same value, so the three inferred registers are equivalent and a synthesizer can merge them into one flip-flop instead of three. The result also depends on statement order (reversing the three statements would restore the shift), which real hardware does not do.

Why non-blocking matches hardware: physical flip-flops sample their D input at the clock edge and propagate to Q after a small delay (Tco). They can't see each other's new outputs within the same edge. Non-blocking semantics (evaluate-then-update) capture this exactly.

Two-rule guideline that prevents almost all assignment bugs:

  • always @(posedge clk) — always use <=
  • always @* (combinational) — always use =

Mixing them in a single always block is where simulation/synthesis mismatches creep in. If your testbench passes but the synthesized netlist behaves differently on the FPGA, check the assignment types first; this is a common culprit.
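A common way to follow the two-rule guideline is to split a design into one combinational block and one clocked block. The sketch below assumes a hypothetical 4-bit register with an enable input; the names two_rule_sketch, en, d, q, and next_q are illustrative only.

```verilog
// Two-rule sketch: blocking (=) in the combinational block,
// non-blocking (<=) in the clocked block, never mixed in one block.
module two_rule_sketch (
  input            Clock, en,
  input      [3:0] d,
  output reg [3:0] q
);
  reg [3:0] next_q;

  // Combinational next-state logic: blocking assignments.
  // The default assignment comes first so no latch is inferred.
  always @* begin
    next_q = q;
    if (en)
      next_q = d;
  end

  // State register: non-blocking assignment only.
  always @(posedge Clock)
    q <= next_q;
endmodule
```

Keeping the two kinds of assignment in separate blocks makes it easy to check the rule by inspection.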

The Verilog section also shows how to describe registers and counters. A reaction-timer design example ties everything together, and timing analysis of flip-flop circuits is covered.

Topics: SR / D / T / JK Flip-Flops; Latches vs Flip-Flops; Master-Slave; Setup / Hold Time; Shift Registers; Synchronous Counters; Ring / Johnson Counters; Blocking vs Non-Blocking; Timing Analysis
5.1 (Easy, Tier 1)
Draw the circuit for a basic SR latch using NOR gates. Write the characteristic table showing all four input combinations including the forbidden state.
5.2 (Easy, Tier 1)
Explain the difference between a latch and a flip-flop. Why are edge-triggered flip-flops preferred in synchronous circuits?
5.3 (Medium, Tier 1)
Draw the timing diagram for a positive-edge-triggered D flip-flop with the following D input sequence: 0,1,1,0,1,0,0,1 (one value per clock cycle). Assume Q starts at 0.
5.4 (Medium, Tier 1)
Show how a T flip-flop can be built from a D flip-flop with an XOR gate. Write the characteristic equation.
5.5 (Medium, Tier 1)
Design a synchronous 3-bit up counter using T flip-flops. Derive the logic for each T input and draw the complete circuit.
5.6 (Medium, Tier 1)
Compare the maximum clock frequency of a 4-bit ripple counter versus a 4-bit synchronous counter. Assume each flip-flop has a propagation delay of 5 ns.
5.7 (Medium, Tier 1)
Design a synchronous 4-bit up/down counter with a control signal UpDown. When UpDown=1 the counter counts up; when 0 it counts down. Give the Verilog code.
5.8 (Medium, Tier 1)
Explain the difference between blocking (=) and non-blocking (<=) assignments in Verilog. Show a code example where using the wrong type produces incorrect hardware.
5.9 (Medium, Tier 1)
Write Verilog code for a D flip-flop with synchronous reset and an enable signal. When Enable=0, the flip-flop holds its current value.
5.10 (Easy, Tier 1)
What are the setup time, hold time, and clock-to-Q delay of a flip-flop? Why does each matter for correct operation?
5.11 (Medium, Tier 1)
Draw the master-slave D flip-flop circuit using two gated D latches. Explain why it is edge-triggered rather than level-sensitive.
5.12 (Medium, Tier 1)
Explain what metastability is and why it matters. What is the purpose of a synchronizer (two flip-flops in series)?
5.13 (Easy, Tier 1E)
Write a Verilog module D_latch with inputs D and Clk and output Q that uses an always block sensitive to D and Clk with an if (Clk) assignment, so that a level-sensitive gated D latch is inferred.
5.14 (Easy, Tier 1E)
Write a Verilog module for a positive-edge-triggered D flip-flop using always @(posedge Clock).
5.15 (Medium, Tier 1E)
Inside an always @(posedge Clock) block, two blocking assignments Q1 = D; Q2 = Q1; appear. Explain why this does NOT synthesize cascaded flip-flops, and draw the circuit that is actually inferred.
5.16 (Medium, Tier 1E)
Show that replacing the blocking assignments in Example 5.3 with non-blocking assignments (Q1 <= D; Q2 <= Q1;) produces a two-stage shift register. Explain the semantic difference.
Context: Examples 5.3 / 5.4 — blocking vs non-blocking in clocked always blocks

Recap of Example 5.3 (blocking — wrong for cascading). The textbook starts with this code:

module example5_3 (D, Clock, Q1, Q2);
  input D, Clock;
  output reg Q1, Q2;
  always @(posedge Clock) begin
    Q1 = D;     // blocking
    Q2 = Q1;    // blocking — Q1 already updated
  end
endmodule

Blocking assignments evaluate sequentially. After Q1 = D;, the variable Q1 already holds the new value D, so Q2 = Q1; reads D as well. Both FFs latch the same input on every clock edge:

Figure 5.37 (blocking): two parallel D flip-flops, FF1 and FF2, both with D as their data input and both clocked by Clock. Q1 = Q2 = D, so one flip-flop is redundant.

Problem 5.16 — non-blocking (right): swap = for <=:

always @(posedge Clock) begin
  Q1 <= D;     // non-blocking
  Q2 <= Q1;    // reads OLD Q1 (value at block entry)
end

Non-blocking assignments all sample their right-hand sides at the moment the always block fires, then commit new values together at the end. So Q2 <= Q1; reads the previous value of Q1 — exactly what a real cascaded FF does. The synthesized hardware is a true 2-stage shift register:

Figure 5.39 (non-blocking): cascaded D flip-flops forming a true shift register. D feeds FF1, Q1 feeds FF2, and both are clocked by Clock.

Side-by-side semantics for one rising edge. Suppose before the edge: $D=1, Q_1=0, Q_2=0$.

Style | $Q_1$ after edge | $Q_2$ after edge
Blocking = | $D = 1$ | new $Q_1 = 1$
Non-blocking <= | $D = 1$ | old $Q_1 = 0$

After the second clock edge with non-blocking, $Q_2$ finally becomes 1 — the input has propagated through two stages, hence "shift register".
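The two-edge behavior can be checked in simulation with a small testbench. The sketch below is a hypothetical setup: the module names shift2 and shift2_tb, the instance name dut, and the timing values are assumptions, and the hierarchical writes to dut.Q1 and dut.Q2 are a simulation-only shortcut for establishing the starting state.

```verilog
// Two-stage shift register from Problem 5.16 (non-blocking assignments).
module shift2 (input D, Clock, output reg Q1, Q2);
  always @(posedge Clock) begin
    Q1 <= D;
    Q2 <= Q1;   // reads the value Q1 held before the edge
  end
endmodule

// Hypothetical testbench reproducing the starting values used above.
module shift2_tb;
  reg  D, Clock;
  wire Q1, Q2;

  shift2 dut (.D(D), .Clock(Clock), .Q1(Q1), .Q2(Q2));

  initial begin
    // Before the first edge: D = 1, Q1 = 0, Q2 = 0.
    D = 1'b1; Clock = 1'b0;
    dut.Q1 = 1'b0; dut.Q2 = 1'b0;   // simulation-only initialization

    #5 Clock = 1'b1;                // first rising edge
    #1 $display("edge 1: Q1=%b Q2=%b", Q1, Q2);
    #4 Clock = 1'b0;
    #5 Clock = 1'b1;                // second rising edge
    #1 $display("edge 2: Q1=%b Q2=%b", Q1, Q2);
    $finish;
  end
endmodule
```

Under these assumptions the first line should report Q1=1 Q2=0 and the second Q1=1 Q2=1, matching the non-blocking row of the table and the two-stage delay.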

Rule of thumb. Use non-blocking (<=) for everything inside a clocked always @(posedge clk) block. Use blocking (=) only in combinational always @(*) blocks where sequential evaluation matches gate-level dataflow. Mixing the two in the same block is a common source of synthesis-vs-simulation mismatches.

Source: Example 5.3
5.17 (Medium, Tier 1E)
In an always @(posedge Clock) block with blocking assignments f = x1 & x2; g = f | x3;, draw the circuit that is synthesized. Explain why an AND gate appears between the input and the OR gate that feeds the g flip-flop.
5.18 (Medium, Tier 1E)
Repeat Example 5.5 with non-blocking assignments and explain why the synthesized circuit changes: the OR gate that feeds g is now driven by the previous-cycle output of the f flip-flop instead of the AND-gate output.
Source: Example 5.5
5.19 (Easy, Tier 1E)
Write Verilog for a D flip-flop with an asynchronous active-low reset (Resetn) using a sensitivity list of negedge Resetn, posedge Clock.
5.20 (Easy, Tier 1E)
Write Verilog for a D flip-flop with a synchronous active-low reset, using a sensitivity list of posedge Clock only.
5.21 (Easy, Tier 1E)
Write a parameterized Verilog module regn for an $n$-bit register with asynchronous clear, where $n$ is a parameter with a default value.
What is this problem asking for?

Four requirements:

1. parameterized Verilog module regn — module name is regn ("register, parameterized n"). Must use Verilog's parameter mechanism so the bit width is not hard-coded — the same module file works for any width by passing a different value at instantiation.

2. for an $n$-bit register — a bank of $n$ D flip-flops in parallel: input bus D[n-1:0], output bus Q[n-1:0], one shared clock. On each rising clock edge, the $n$ inputs latch into the $n$ outputs simultaneously.

3. with asynchronous clear — a reset signal that clears all $n$ bits to 0 immediately when asserted, not waiting for the next clock edge. Implementation-wise, the reset signal goes into the always-block sensitivity list along with the clock (always @(posedge Clock or negedge Resetn)), so the block fires the moment reset asserts. Contrast: a synchronous clear would only take effect on the next rising clock edge.

4. where $n$ is a parameter with a default value:

  • n declared with parameter (not localparam, not a hard-coded number).
  • Default value provided so instantiating without specifying width gets a sensible fallback (e.g., 8 bits).

What the grader checks: one module file that supports regn r1(); (default), regn #(.n(16)) r2(); (16-bit), regn #(32) r3(); (32-bit) — all without editing the source. Asserting Resetn drops Q to 0 between clock edges (visible in waveform). All $n$ bits load D on the rising clock edge when Resetn is inactive.
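Putting the four requirements together, one possible shape for the module is sketched below. It is not the textbook's listing: the default width of 8 follows the "e.g., 8 bits" suggestion above, and the port names D, Clock, Resetn, and Q are taken from the surrounding text.

```verilog
// n-bit register with asynchronous active-low clear (sketch).
// The default width n = 8 is an assumption; any sensible default works.
module regn #(parameter n = 8) (
  input      [n-1:0] D,
  input              Clock, Resetn,
  output reg [n-1:0] Q
);
  // Resetn is in the sensitivity list, so the clear does not wait for Clock.
  always @(posedge Clock or negedge Resetn)
    if (!Resetn)
      Q <= {n{1'b0}};   // clear all n bits
    else
      Q <= D;           // load all n bits on the rising clock edge
endmodule
```

Removing negedge Resetn from the sensitivity list, while keeping the if test, would turn this into a synchronous clear that only takes effect on a rising clock edge.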

5.22 (Medium, Tier 1E)
Write hierarchical Verilog code for a 4-bit parallel-load shift register, instantiating four copies of a muxdff subcircuit (a D flip-flop with a 2-to-1 multiplexer on its D input). Inputs: R (parallel data), L (load), w (serial in), Clock; output Q.
Context: what is a muxdff?

A muxdff is the workhorse cell of a parallel-load / shift register: a single D flip-flop preceded by a 2-to-1 multiplexer that selects between two next-state values. The MUX's select line is usually called L (load):

  • L = 0 — MUX picks the "shift" input w; the FF clocks in the bit coming from a neighbour cell. The chain shifts.
  • L = 1 — MUX picks the "parallel" input R; the FF clocks in a fresh data bit from outside. The register loads.

It's the smallest building block that lets a register do both jobs (shift OR load) under a single control bit.

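Figure: muxdff cell. A 2-to-1 MUX selects R (input 1) or w (input 0) under control of L and drives the D input of a D flip-flop clocked by Clock; the flip-flop output is Q.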

Verilog for the muxdff cell:

module muxdff (
  input  R, w, L, Clock,
  output reg Q
);
  always @(posedge Clock)
    Q <= L ? R : w;   // L=1 → load R; L=0 → shift in w
endmodule

Building a 4-bit parallel-load shift register by chaining four muxdff cells:

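Figure: chain of four cells. muxdff[3] selects R[3] or w and drives Q[3]; muxdff[2] selects R[2] or Q[3]; muxdff[1] selects R[1] or Q[2]; muxdff[0] selects R[0] or Q[1].

Each cell's shift input w is wired to the Q output of the next-higher cell; the top cell, muxdff[3], takes the serial input w. With L=0, the contents move one position toward Q[0] on each rising clock edge (shift). With L=1, every cell loads its corresponding R[i] in parallel.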

Hierarchical Verilog (problem 5.22's task):

module shift_load_4 (
  input  [3:0] R,
  input        L, w, Clock,
  output [3:0] Q
);
  muxdff bit3 (.R(R[3]), .w(w),    .L(L), .Clock(Clock), .Q(Q[3]));
  muxdff bit2 (.R(R[2]), .w(Q[3]), .L(L), .Clock(Clock), .Q(Q[2]));
  muxdff bit1 (.R(R[1]), .w(Q[2]), .L(L), .Clock(Clock), .Q(Q[1]));
  muxdff bit0 (.R(R[0]), .w(Q[1]), .L(L), .Clock(Clock), .Q(Q[0]));
endmodule

Why the muxdff abstraction matters: almost every register in a real design needs more than just "clock D into Q on each edge". Adding load, shift, hold, and reset capabilities can pile up ifs and edge cases. Capturing each combination as a tiny named cell (muxdff, plus enable/reset variants) and instantiating n copies for an n-bit register is the cleanest way to scale — verify the cell once, reuse everywhere.
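The "instantiate n copies" idea can be expressed with a generate loop instead of four hand-written instances. The sketch below assumes the muxdff module shown above is compiled alongside it; the names shift_load_n, chain, stage, and u are illustrative assumptions.

```verilog
// n-bit parallel-load shift register built from n muxdff cells (sketch).
// Assumes the muxdff module defined earlier is available.
module shift_load_n #(parameter n = 4) (
  input  [n-1:0] R,
  input          L, w, Clock,
  output [n-1:0] Q
);
  // chain[i+1] is the shift input of cell i.
  // The top cell takes the serial input w; every other cell takes Q[i+1].
  wire [n:0] chain;
  assign chain[n]     = w;
  assign chain[n-1:0] = Q;

  genvar i;
  generate
    for (i = 0; i < n; i = i + 1) begin : stage
      muxdff u (.R(R[i]), .w(chain[i+1]), .L(L), .Clock(Clock), .Q(Q[i]));
    end
  endgenerate
endmodule
```

With n = 4 this elaborates to the same four instances and wiring as the hand-written version, while other widths need only a different parameter value.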

5.23 (Medium, Tier 1E)
Write an alternative non-hierarchical Verilog description of the 4-bit parallel-load shift register, using a single always @(posedge Clock) block with non-blocking assignments to perform the load or shift action.
5.24 (Medium, Tier 1E)
Write a parameterized Verilog module for an $n$-bit shift register with parallel-load and serial-input. Use a for loop to express the bit-by-bit shift in the else branch.
5.25 (Easy, Tier 1E)
Write Verilog for a 4-bit synchronous up-counter with asynchronous reset Resetn and enable input E. The count increments on the positive edge of Clock when E=1.
5.26 (Easy, Tier 1E)
Extend the up-counter of Example 5.13 with a parallel-load input L and a 4-bit data input R, so that the counter loads $R$ on the next clock edge when $L=1$.
Source: Example 5.13
5.27 (Easy, Tier 1E)
Write Verilog for a parameterized down-counter with parallel load. On each positive clock edge, if L=1 load R; otherwise if E=1 decrement.
5.28 (Medium, Tier 1E)
Write Verilog for a parameterized up/down counter with parallel load and a direction control up_down that selects whether the counter increments or decrements.
5.29 (Hard, Tier 1E)
Two flip-flops $Q_1$ and $Q_2$ are connected through combinational logic, with clock-signal delays $\delta_1$ and $\delta_2$ at each flip-flop. Define $t_{skew}=\delta_2-\delta_1$ and derive (a) the minimum clock period $T_{min}=t_{cQ}+t_L+t_{su}-t_{skew}$ and (b) the hold-time condition $t_{cQ}+t_l \ge t_h+t_{skew}$ for the shortest path delay $t_l$.
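As a hedged outline of where the two conditions come from, assume the active clock edge reaches $Q_1$ at time $\delta_1$ and reaches $Q_2$ at $\delta_2$ (and at $T+\delta_2$ one period later). Data launched from $Q_1$ must settle at $Q_2$ at least $t_{su}$ before the next edge there, and must not change until $t_h$ after the current edge there:

```latex
\begin{align*}
\delta_1 + t_{cQ} + t_L + t_{su} &\le T + \delta_2
  && \text{(setup at } Q_2\text{)} \\
T &\ge t_{cQ} + t_L + t_{su} - (\delta_2 - \delta_1)
   = t_{cQ} + t_L + t_{su} - t_{skew} \\[4pt]
\delta_1 + t_{cQ} + t_l &\ge \delta_2 + t_h
  && \text{(hold at } Q_2\text{)} \\
t_{cQ} + t_l &\ge t_h + t_{skew}
\end{align*}
```

Positive skew therefore relaxes the setup constraint (a shorter $T_{min}$) but tightens the hold constraint.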
5.30 (Medium, Tier 1E)
For a small combinational circuit driven by a square-wave input $C$ with 50% duty cycle (Figure 5.70a), draw a timing diagram showing the waveforms at internal nodes $A$ and $B$, assuming each gate has propagation delay $\Delta$.
Figure 5.70a — circuit reference

Figure 5.70a: two cascaded inverters, each with delay $\Delta$. Input C drives the first inverter, whose output is node A; A drives the second inverter, whose output is node B (left open).

So $A = \overline{C}$ delayed by $\Delta$, and $B = C$ delayed by $2\Delta$ (i.e. $A$ further inverted).

Source: Example 5.18
5.31 (Medium, Tier 1E)
Determine the functional behavior of the two-JK-flip-flop circuit in Figure 5.71, in which input $w$ is driven by a square-wave signal. Show that successive values of $Q_1 Q_0$ form the sequence $00,01,10,00,\ldots$ and conclude that the circuit is a modulo-3 counter.
Context: Example 5.19 — Figure 5.71 mod-3 counter (two JK FFs)

The circuit. Two edge-triggered JK flip-flops share clock input $w$ and a common asynchronous Clear. Both K inputs are tied to logic 1, so each FF either resets (when $J=0$, since $JK=01$) or toggles (when $J=1$, since $JK=11$). The combinational wiring is:

  • $J_0 = \overline{Q_1}$, $K_0 = 1$  (FF0 toggles unless $Q_1=1$, in which case it resets)
  • $J_1 = Q_0$,  $K_1 = 1$  (FF1 toggles when $Q_0=1$, otherwise resets)

Figure: FF0 and FF1 are both clocked by $w$, with $K_0 = K_1 = 1$, $J_0 = \overline{Q_1}$, and $J_1 = Q_0$. A common asynchronous Clear (not drawn) resets both flip-flops to 0.

JK behavior recap. On each rising edge of $w$:

$J$ | $K$ | $Q^{+}$ | name
0 | 0 | $Q$ | hold
0 | 1 | 0 | reset
1 | 0 | 1 | set
1 | 1 | $\overline{Q}$ | toggle

State walk-through. Start from Clear $\Rightarrow Q_1Q_0=00$:

before | $J_1K_1$ | $J_0K_0$ | FF1 action | FF0 action | after
$Q_1Q_0=00$ | $0,1$ | $1,1$ | reset → 0 | toggle → 1 | $01$
$Q_1Q_0=01$ | $1,1$ | $1,1$ | toggle → 1 | toggle → 0 | $10$
$Q_1Q_0=10$ | $0,1$ | $0,1$ | reset → 0 | reset → 0 | $00$

The cycle closes after three pulses: $00 \to 01 \to 10 \to 00 \to \ldots$ — exactly the count sequence 0, 1, 2, 0, 1, 2, … so the circuit is a modulo-3 (divide-by-3) counter.

Timing diagram (matches Figure 5.72; both FFs are positive-edge-triggered on $w$):

Waveforms: $w$, $Q_0$, $Q_1$. Successive values of $Q_1Q_0$: 00, 01, 10, 00, 01, 10.

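Why this wiring works. State $11$ is unreachable from the three counting states: from $00$ only FF0 toggles (next state $01$), from $01$ both flip-flops toggle (next state $10$), and from $10$ both reset (next state $00$). If the circuit ever did enter $11$, $J_1K_1=11$ and $J_0K_0=01$ would drive it to $00$ on the next edge. With only three reachable states, three clock pulses return to the start, which is the defining property of a mod-3 counter.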
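The same wiring can be written as a short behavioral model for simulation. This is a sketch under stated assumptions: the module name mod3_jk_sketch is hypothetical, and Clear is modeled as active-high because the text does not give its polarity.

```verilog
// Mod-3 counter sketch using the JK wiring J0 = ~Q1, J1 = Q0, K0 = K1 = 1.
// Clear is assumed active-high and asynchronous (polarity is an assumption).
module mod3_jk_sketch (
  input      w, Clear,
  output reg Q1, Q0
);
  wire J0 = ~Q1;
  wire J1 =  Q0;

  // With K = 1, the JK equation Q+ = J*~Q + ~K*Q reduces to Q+ = J & ~Q.
  always @(posedge w or posedge Clear)
    if (Clear) begin
      Q1 <= 1'b0;
      Q0 <= 1'b0;
    end else begin
      Q0 <= J0 & ~Q0;   // 00 -> 1, 01 -> 0, 10 -> 0
      Q1 <= J1 & ~Q1;   // 00 -> 0, 01 -> 1, 10 -> 0
    end
endmodule
```

Stepping the two next-state expressions through 00, 01, and 10 reproduces the rows of the walk-through table, and from 11 both expressions evaluate to 0.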

Connection to Chapter 6. This is a hand-crafted FSM. In §6.7 the same mod-3 counter is re-derived systematically: pick a state assignment, build the excitation table for JK FFs (using the inverse of the JK characteristic table), and read $J_i$, $K_i$ off K-maps. The wiring above is the optimal result of that procedure.

Source: Example 5.19
5.32 (Hard, Tier 1E)
Design a vending-machine controller with inputs $Q$ (quarter), $D$ (dime), $N$ (nickel), $\text{Coin}$ pulse, and $\overline{\text{Resetn}}$. The output $Z$ is asserted when the cumulative deposit reaches at least 30 cents (no change is given). Use a 6-bit adder, a 6-bit register, and gates only.
5.33 (Hard, Tier 1E)
Write Verilog code that implements the 30-cent vending-machine controller of Example 5.20, including the encoding of $N$, $D$, $Q$ into a 5-bit increment, the 6-bit accumulator register clocked on the negative edge of Coin, and the output expression $Z = s_5 + s_4 s_3 s_2 s_1$.
Source: Example 5.20
5.34 (Hard, Tier 1E)
Redesign a 4-bit synchronous counter to reduce the cascaded AND-gate delay between flip-flops. Refactor the gating so that the longest path becomes $T_{min}=t_{cQ}+t_{AND}+t_{XOR}+t_{su}$, and compute $F_{max}$ given $t_{cQ}=1.0$ ns, $t_{AND}=1.4$ ns, $t_{XOR}=1.2$ ns, $t_{su}=0.6$ ns.
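For the numerical part, substituting the given delays into the stated expression for $T_{min}$ gives the following (a sketch of the arithmetic only; the circuit redesign itself is not shown):

```latex
\begin{align*}
T_{min} &= t_{cQ} + t_{AND} + t_{XOR} + t_{su}
         = 1.0 + 1.4 + 1.2 + 0.6 = 4.2\ \text{ns} \\
F_{max} &= \frac{1}{T_{min}} = \frac{1}{4.2\ \text{ns}} \approx 238.1\ \text{MHz}
\end{align*}
```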
5.35 (Hard, Tier 1E)
A circuit has three flip-flops $Q_1, Q_2, Q_3$ with corresponding clock delays $\delta_1, \delta_2, \delta_3$. Given $t_{su}=0.6$ ns, $t_h=0.4$ ns, $0.8 \le t_{cQ} \le 1.0$ ns, and gate delay $1+0.1k$ ns for $k$-input gates, compute $F_{max}$ for the clock-skew sets (i) $\delta_1=\delta_2=\delta_3=0$, (ii) $\delta_1=\delta_3=0$, $\delta_2=0.7$ ns, and (iii) $\delta_1=1$, $\delta_2=0$, $\delta_3=0.5$ ns. Check for hold-time violations in each case.