Category Archives: FPGA & VHDL

Articles, tips and advice on FPGA & VHDL development

UART, SPI, I2C and More


Pretty much every FPGA design has to interface to the real world through sensors or external interfaces. Some systems require large volumes of data to be moved around very quickly, in which case high-speed communications interfaces like PCI-X, Ethernet, USB, Fire/SpaceWire, and CAN, or those featuring multi-gigabit transceivers may be employed. However, these interfaces have a considerable overhead in terms of system design and complexity, and they would be overkill for many applications.

Systems with simple control and monitor interfaces — systems that do not have large volumes of data to transfer — can employ one or more of the simpler communications protocols. The four simplest, and therefore most commonly used, protocols are as follows.

    • UART (Universal Asynchronous Receiver Transmitter): UARTs are commonly paired with a number of standards defined by the Electronic Industries Alliance (EIA), the most popular being the RS-232, RS-422, and RS-485 interfaces. These standards are often used for inter-module communication (that is, the transfer of data and supervisory control between different modules forming the system) as opposed to communication between the FPGA and peripherals on the same board, although I am sure there are plenty of applications that do this also. Between them, the standards cover a mixture of point-to-point and multi-drop buses.


    • SPI (Serial Peripheral Interface): This is a full-duplex, serial, four-wire interface that was originally developed by Motorola, but which has developed into a de facto standard. This standard is commonly used for intra-module communication (that is, transferring data between peripherals and the FPGA within the same system module). Often used for memory devices, analog-to-digital converters (ADCs), CODECs, and MultiMediaCard (MMC) and Secure Digital (SD) memory cards, the system architecture of this interface consists of a single master device and multiple slave devices.


    • I2C (Inter-Integrated Circuit): This is a multi-master, two-wire serial bus that was developed by Philips in the early 1980s with a similar purpose to SPI. Due to the two-wire nature of this interface, communications are only possible in half-duplex mode.


  • Parallel: Perhaps the simplest method of transferring data between an FPGA and an on-board peripheral, this supports half-duplex communications between the master and the slave. Depending upon the width of data to be transferred, coupled with the addressable range, a parallel interface may be small and simple or large and complex.

Over the next few weeks, I will be exploring each of these protocols in depth, explaining their histories, pros and cons, and typical uses. Also, I will be discussing how to implement these protocols inside an FPGA (there may even be some code floating about as well).

But before we examine these protocols in depth, it’s well worth our while to spend a little time recapping some of the terminology we use when describing protocols and their behavior, as follows.

  • Point-to-point: The simplest of communication protocols that involves just two devices exchanging data (a common example is RS-232).
  • Multi-drop: A more complicated structure that consists of a single master and a number of slaves, thereby facilitating more complicated data transfers and control architectures. Some protocols, such as I2C, can have multiple masters.
  • Simplex: This refers to data communication that is unidirectional. A typical implementation of a simplex data link could be a sensor broadcasting data to an FPGA or a microcontroller.
  • Duplex: This term is used when discussing point-to-point links. The ability of a protocol to support communication in one direction at a time is referred to as “half duplex.” If the protocol can support communication in both directions simultaneously it is referred to as “full duplex.”

As we discuss the different protocols in future columns, we will also be referring to the Open Systems Interconnection (OSI) seven-layer model. This is an abstract model used to describe how protocols function.

  • Layer One: Physical Layer. Describes the physical interconnection.
  • Layer Two: Data Link Layer. Describes the means for actually transferring data over the physical layer (Physical Addressing).
  • Layer Three: Network Layer. Describes the means to transfer data between different networks (Logical Addressing).
  • Layer Four: Transport Layer. Provides transfer of data between end users.
  • Layer Five: Session Layer. Controls the connections between end users.
  • Layer Six: Presentation Layer. Transforms data presentation between higher-level and low-level layers.
  • Layer Seven: Application Layer. Interfaces to the final application, where the transferred data is used or where data is gathered for transfer.

It is worth noting at this point that not all layers of the OSI model are required to be present within a particular protocol. We will see examples where only a subset of the layers is employed.

As an aside, while I was writing this, I thought about some of the more obscure EIA protocols out there, such as RS-423, RS-449, and EIA-530.


RS232 & UART


In this column, we look at using a UART (Universal Asynchronous Receiver/Transmitter) for the RS-232 standard. This has to be one of the first communication protocols many of us were introduced to at university or college.

Simply put, RS-232 is a single-ended serial data communication protocol between a DTE (Data Terminal Equipment) and a DCE (Data Circuit-terminating Equipment). So what do these terms actually mean? Well, the DTE is the “master” piece of equipment; for example, your laptop or an ASIC or an FPGA or a microcontroller — whatever is initiating the communications. By comparison, the DCE is the “slave” piece of equipment; for instance, a MODEM back in the days of “dial-up” communications or any other device that is subservient to the main piece of equipment.

RS-232 is one of the oldest communication protocols still in common use. It was originally developed in 1962 which makes it 50 years old this year. The “RS” came from the fact that this was first introduced by the Radio Sector of the Electronic Industries Alliance (EIA), but RS-232 is now generally understood to stand for “Recommended Standard 232.”

The standard itself defines only the physical layer of the protocol — addressing the electrical signal requirements, interface connectors, and so on — but nothing at the data link layer such as framing or error correction.

The complete RS-232 standard defines the signals shown in the table below. However, the simplest of RS-232 links can be achieved with just three lines (Transmitted Data, Received Data, and Common Ground), with flow control between the DTE and DCE being implemented using commands transmitted over the communications lines (this is often realized using ASCII control characters).

Due to its age, RS-232 uses voltage levels that are considerably higher than those used commonly within designs today. For both transmitted (Tx) and received (Rx) data, the standard defines a logic 1 as a voltage between −3 volts and −15 volts, while a logic 0 is defined as a voltage between +3 volts and +15 volts. For this reason, many designs will contain a transceiver that translates between internal logic levels and the correct RS-232 voltages.

A UART (Universal Asynchronous Receiver/Transmitter) can be used to implement many protocols, with RS-232 being perhaps one of the most common, closely followed by RS-485. Implementing a UART within an FPGA is very simple, requiring only a baud rate generator and a shift register for transmission (with the correct start, parity, and stop bits). Following the start bit, the transmitter transmits the data, from the LSB (least-significant bit) through to the MSB (most-significant bit), followed by the parity bit and finally the stop bit(s).

As a small aside, the “stop bit” is not a real bit, but is instead the minimum period of time the line must be idle at the end of each word. On PCs this period can have one of three lengths: the time equal to 1, 1.5, or 2 bits.
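The transmit path described above can be sketched as follows. This is a minimal illustration rather than a definitive core: the entity name, port names, and the 50 MHz clock are assumptions, and the frame is fixed at one start bit, eight data bits, no parity, and one stop bit.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY uart_tx IS
  GENERIC (
    clk_freq  : positive := 50_000_000;  -- assumed system clock frequency in Hz
    baud_rate : positive := 115_200      -- assumed baud rate
  );
  PORT (
    clk   : IN  std_logic;
    start : IN  std_logic;                     -- one-clock pulse to send "data"
    data  : IN  std_logic_vector(7 DOWNTO 0);
    tx    : OUT std_logic;
    busy  : OUT std_logic
  );
END ENTITY uart_tx;

ARCHITECTURE rtl OF uart_tx IS
  CONSTANT bit_period : positive := clk_freq / baud_rate;  -- clocks per bit
  -- Frame layout: stop bit (9), data (8 DOWNTO 1), start bit (0); bit 0 is sent first
  SIGNAL shift_reg : std_logic_vector(9 DOWNTO 0) := (OTHERS => '1');
  SIGNAL baud_cnt  : natural RANGE 0 TO bit_period - 1 := 0;
  SIGNAL bits_left : natural RANGE 0 TO 10 := 0;
BEGIN
  tx   <= shift_reg(0);                        -- line idles high
  busy <= '0' WHEN bits_left = 0 ELSE '1';

  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF bits_left = 0 THEN
        IF start = '1' THEN
          shift_reg <= '1' & data & '0';       -- stop bit, data, start bit
          baud_cnt  <= 0;
          bits_left <= 10;
        END IF;
      ELSIF baud_cnt = bit_period - 1 THEN     -- baud rate generator tick
        baud_cnt  <= 0;
        shift_reg <= '1' & shift_reg(9 DOWNTO 1);  -- shift out LSB first, refill with idle
        bits_left <= bits_left - 1;
      ELSE
        baud_cnt <= baud_cnt + 1;
      END IF;
    END IF;
  END PROCESS;
END ARCHITECTURE rtl;
```

Adding a parity bit would mean widening the shift register by one position and loading the computed parity between the data and the stop bit.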

Implementing a receiver is a little more complicated. Because most FPGAs operate at clock frequencies well in excess of the RS-232 baud rate, the receiving FPGA can oversample the incoming receive line. In turn, this allows the leading edge of the start bit to be detected, thereby providing the timing reference for the recovery of the remaining bits. The simplest method is to sample in the middle of the nominal bit period; however, an alternative method of sampling at one third and two thirds of the bit period and confirming that the two sampled values agree is also often used.
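A receiver using the simpler mid-bit sampling method might look something like the sketch below. The names, the 50 MHz clock, and the 8N1 frame format are assumptions made for illustration; the incoming line is passed through two registers before use because it is asynchronous to the FPGA clock.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY uart_rx IS
  GENERIC (
    clk_freq  : positive := 50_000_000;  -- assumed system clock frequency in Hz
    baud_rate : positive := 115_200      -- assumed baud rate
  );
  PORT (
    clk   : IN  std_logic;
    rx    : IN  std_logic;                     -- asynchronous serial input
    data  : OUT std_logic_vector(7 DOWNTO 0);
    valid : OUT std_logic                      -- one-clock pulse when "data" updates
  );
END ENTITY uart_rx;

ARCHITECTURE rtl OF uart_rx IS
  CONSTANT bit_period : positive := clk_freq / baud_rate;
  CONSTANT half_bit   : natural  := bit_period / 2;
  TYPE state_t IS (idle, start_bit, data_bits, stop_bit);
  SIGNAL state     : state_t := idle;
  SIGNAL rx_sync   : std_logic_vector(1 DOWNTO 0) := "11";
  SIGNAL baud_cnt  : natural RANGE 0 TO bit_period - 1 := 0;
  SIGNAL bit_idx   : natural RANGE 0 TO 7 := 0;
  SIGNAL shift_reg : std_logic_vector(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      rx_sync <= rx_sync(0) & rx;              -- resynchronise the input
      valid   <= '0';
      CASE state IS
        WHEN idle =>
          baud_cnt <= 0;
          IF rx_sync(1) = '0' THEN             -- leading edge of the start bit
            state <= start_bit;
          END IF;
        WHEN start_bit =>
          IF baud_cnt = half_bit THEN          -- roughly the middle of the start bit
            baud_cnt <= 0;
            IF rx_sync(1) = '0' THEN
              bit_idx <= 0;
              state   <= data_bits;
            ELSE
              state <= idle;                   -- line went high again: treat as a glitch
            END IF;
          ELSE
            baud_cnt <= baud_cnt + 1;
          END IF;
        WHEN data_bits =>
          IF baud_cnt = bit_period - 1 THEN    -- one bit period later: mid data bit
            baud_cnt  <= 0;
            shift_reg <= rx_sync(1) & shift_reg(7 DOWNTO 1);  -- LSB arrives first
            IF bit_idx = 7 THEN
              state <= stop_bit;
            ELSE
              bit_idx <= bit_idx + 1;
            END IF;
          ELSE
            baud_cnt <= baud_cnt + 1;
          END IF;
        WHEN stop_bit =>
          IF baud_cnt = bit_period - 1 THEN
            baud_cnt <= 0;
            IF rx_sync(1) = '1' THEN           -- only accept the word if the stop bit is valid
              data  <= shift_reg;
              valid <= '1';
            END IF;
            state <= idle;
          ELSE
            baud_cnt <= baud_cnt + 1;
          END IF;
      END CASE;
    END IF;
  END PROCESS;
END ARCHITECTURE rtl;
```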


The example here is an RS-232 receiver that receives a signal at 115,200 baud. In the example provided, the received data is wired out to eight LEDs on my development board, allowing them to be turned on or off over the serial link.

Testing this RS-232 Receiver also gave me the opportunity to try out the Oscium LogiScope, which turns your iPad into a very good logic analyzer. In the image below, we see my FPGA development board at the bottom of the image. Behind this board is an iPad. The Oscium LogiScope hardware is the small black box sticking out of the right-hand side of the iPad, while the software (which can be downloaded from the iTunes Store) is running on the iPad.


This enabled me to break out the Rx signal going into the FPGA along with the internal capture register through which the Rx signal is shifted as it is captured. I must say I was very impressed with how easy it was to use the LogiScope logic analyzer and set up the triggering etc. since, being a typical engineer, I did not read the user manual. The LogiScope provides the option to perform advanced triggering on patterns or multi-level events, and it also decodes I2C, which will come in useful when we come to discuss that protocol in a future column.

Extracting the screen shots and logs from the iPad was also very simple. Using the email option, it was easy to send these to my email account, thereby allowing me to include them in this blog as shown below:


The beauty of an FPGA-based UART is that it can be easily adapted to interface to other protocols like RS-485 and RS-422. This allows you as the FPGA designer to develop a soft UART core that can be reused across a number of projects. Have you developed such a core — or used someone else’s? If so, please share your experiences with the rest of us.




In a previous column, we discussed the difference between registers and latches, so I decided to dedicate this column to explaining what metastability is, what causes it, and how we can learn to live with it, since its occurrence cannot be totally prevented.

As illustrated below, metastability can happen to registers when their setup or hold times are violated; that is, if the data input changes within the capture window. As a result, the output of the register may enter a metastable state, in which it can hover at an intermediate level or oscillate between logic 0 and 1 values. If not treated, this metastable condition may propagate through the system, causing issues and errors. The register will eventually recover from its metastable state and “settle” on a logic 0 or 1 value; the time it takes for this to occur is called the recovery time.


Metastability within an FPGA design will typically occur in one of two ways:

    1. When an incoming signal is asynchronous with regard to the clock domain. This may be an external input signal or a signal crossing between clock domains. In this case, the design engineer is expected to resynchronise the signal to address metastability, which is certain to occur eventually. This is where a multi-stage synchroniser is typically employed as discussed below.
    2. When multiple register elements in a synchronous design are using the same clock, but phase alignments or clock skew issues mean that the output from one register violates another register’s setup and hold time. This may be addressed by modifying the place-and-route constraints or by changing the logic design itself.

Let’s consider the case of an incoming signal that is asynchronous with respect to the system clock. It is the engineer’s responsibility to create the design in such a way as to mitigate against any resultant metastability issues. Many engineers will be familiar with the concept of a two-stage synchronizer, but I wonder how many really understand just how it performs its magic?


A two-stage synchronizer.

In fact, the two-stage synchronizer works by permitting the first register to go metastable. The idea is that the system clock is running — and therefore “sampling” the external signal — significantly faster than the external signal is changing from one state to another. If it should happen that a transition on the asynchronous signal causes the first register to become metastable, then ideally this register will have recovered by the time the next clock edge arrives and loads this value into the second register.
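In VHDL, a two-stage synchronizer of this kind can be sketched as a two-bit shift register. The entity and port names here are assumptions; “synch_reg(0)” is the first register (the one allowed to go metastable) and “synch_reg(1)” is the second.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY synchroniser IS
  PORT (
    clk      : IN  std_logic;
    async_in : IN  std_logic;   -- asynchronous with respect to clk
    sync_out : OUT std_logic    -- safe to use within the clk domain
  );
END ENTITY synchroniser;

ARCHITECTURE rtl OF synchroniser IS
  SIGNAL synch_reg : std_logic_vector(1 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      -- synch_reg(0) samples the asynchronous input and may go metastable;
      -- synch_reg(1) samples synch_reg(0) one clock later, after it has had time to settle
      synch_reg <= synch_reg(0) & async_in;
    END IF;
  END PROCESS;

  sync_out <= synch_reg(1);
END ARCHITECTURE rtl;
```

Only “sync_out” should be used by downstream logic; “synch_reg(0)” should drive nothing other than the second register.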

Now, this is where some people become confused. Let’s assume that the original value on the asynchronous signal was a logic 0, and that this value has already been loaded into both of the synchronizer registers. At some stage the asynchronous signal transitions to a logic 1. Let’s explore the possibilities as follows:

The first possibility is that the transition on the asynchronous signal doesn’t violate the first register’s setup or hold times. In this case, the first active clock edge (shown as “Edge #1” in the illustration below) following the transition on the asynchronous signal loads the new value into the first register, and the second active clock edge will copy this new value from the first register into the second as shown below:


Transition on input doesn’t cause any problems.

The second possibility is that the transition on the asynchronous signal does violate the first register’s setup or hold times, which means the first active clock edge causes the first register to enter a metastable state. At some stage — hopefully before the next clock edge — the first register will recover, by which we mean it will settle into either a logic 0 or a 1 value. Let’s assume that the first register ends up settling into a logic 1 as shown below:


Metastable state settles on logic 1 value.

This is, of course, what we wanted in the first place. In this case, the second active clock edge will load this 1 into the second register (which originally contained a logic 0). Thus, the end result — as seen at the output from the second register — is exactly the same as if the first register had not gone metastable at all.

The final possibility (at least, the last one we will consider in this column) is that, following a period of metastability, the first register settles into a logic 0 as shown below:


Metastable state settles on logic 0 value.

In this case, the second active clock edge will load this 0 into the second register (which already contained a 0). At the same time, this second active clock edge will load the logic 1 on the asynchronous signal into the first register. Thus, it is the third active clock edge that eventually causes the second register to be loaded with a logic 1.

The end result of using our two-stage synchronizer is that — in a worst-case scenario — the desired output from the synchronizer is delayed by a single clock cycle. Having said this, there is a slight chance that the first register will not recover in time, which might cause the second stage of the synchronizer to enter its own metastable condition.

The alternative would be to have three or even more stages, so how do we determine if two stages are acceptable… or not?

Well, as engineers we can actually calculate this. Sadly, this does involve some math, but I will try and keep the painful parts to a minimum. The mean time between failure (MTBF) for a flip-flop (register) depends upon the manufacturing process. Let’s start with the equation for a single flip-flop as follows:


Based on this, we can calculate the MTBF for a multi-stage synchroniser using the following equation:


For both equations:



I really am sorry about this, and I will do my best to keep math out of future columns. Having said this, by means of this equation, it is possible to determine the mean time between metastability events for your chosen synchronizer structure (two or more flip-flops). If the resulting MTBF for a two-stage synchronizer shows that the time between metastable events is not acceptable (that is, they will occur too often), then you can introduce a third flip-flop.
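For reference, a commonly quoted textbook form of these relationships is sketched below. The symbol names are assumptions chosen for this illustration and may differ from those used in any particular vendor's documentation: $t_r$ is the resolution time available before the register's output is sampled, $\tau$ and $T_0$ are constants that depend on the device and its manufacturing process, $f_c$ is the clock frequency, and $f_d$ is the rate at which the asynchronous input changes.

```latex
% Single flip-flop
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_0 \, f_c \, f_d}

% Multi-stage synchroniser: the resolution times available after each
% register (t_{r1}, t_{r2}, ...) add together in the exponent
\mathrm{MTBF} = \frac{e^{\,(t_{r1} + t_{r2} + \dots)/\tau}}{T_0 \, f_c \, f_d}
```

Because the resolution time appears in the exponent, each additional stage multiplies the MTBF by a further factor of $e^{t_r/\tau}$, which is why a third flip-flop is such an effective remedy.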

The use of a three-stage synchronizer is often required in the case of high-speed or high-reliability systems. If you are designing for a high-reliability application, then you will need to demonstrate that a metastability event is sufficiently unlikely to occur during the operating life of the equipment (at a minimum). This MTBF (or, more correctly, its reciprocal, which is the failure rate) can also be fed into the system-level reliability calculations to determine the overall reliability of the entire system.

When it comes time to simulate these synchronizers, it quickly becomes obvious that the tools are limited in regard to the way in which they can model metastable events. For example, consider the following results generated by simulating an RTL version of a two-stage synchronizer:


The RTL simulation appears to indicate that there are no problems.

Even though there is, in fact, a problem with this design, no errors are detected or displayed, due to the fact that the RTL does not — in this case — contain any timing information.

For a simulation to exhibit metastability, you have to simulate at the gate level using a Standard Delay Format (SDF) file that contains the appropriate timing information. The synthesis tool extracts this timing information from the library associated with the target FPGA component. For example, consider the following gate-level simulation results for the same two-stage synchronizer:


The gate-level simulation reveals a timing error (where the traces go red).

Also, the following warning messages were generated as part of this gate-level simulation:


If you wish, you can replicate these results for yourself by downloading this ZIP file, which contains the following files:

  • meta_testbench.vhd — The VHDL testbench
  • meta_rtl.vhd — The RTL version of the design
  • meta_gate.vhd — The synthesized gate-level version of the design
  • meta_gate.sdf — The delays associated with the gate-level version of the design

You can replicate the RTL simulation using the “meta_testbench.vhd” and “meta_rtl.vhd” files. Similarly, you can replicate the gate-level simulation using the “meta_testbench.vhd” and “meta_gate.vhd” files with the delays in the “meta_gate.sdf” file being applied to the “/uut/” region.

The RTL simulation of our two-stage synchronizer indicated that there weren’t any problems. However, the gate-level simulation did reveal a timing error (where the traces go red). It’s also interesting to note that even the gate-level simulation will not fully behave like the real-world synchroniser, because the flip-flop models do not contain the required information on recovery time. This means that the unknown “X” state will be propagated through the design and will affect any downstream elements in the design. It’s up to us as the engineers to handle this correctly.

Of course, saying things like “It’s up to us to handle this correctly” sounds good if you say it quickly and wave your arms around a lot, but how do we actually do this? Well, since the flip-flops in an FPGA are already fabricated, this means that — as FPGA designers — we have only two options (ASIC/SoC designers have a third option, because they can implement a special synchroniser flip-flop which has an acceptably high MTBF with regard to these metastable events).

The first approach is to disable all timing checks within the simulator, but this will hide other timing issues that really need to be investigated. The alternate — and often preferred — technique is to find the first register (“synch_reg(0)” in this case) in the SDF file and set the setup and hold time information to 0. This is shown in the example below where the red highlighted text is changed from the original settings to the updated values required for simulation.


Original settings for “synch_reg_0” in the SDF file (shown in red).
Modified settings for “synch_reg_0” in the SDF file (shown in red).

Doing this will prevent the register from being able to experience the metastable event. This is only acceptable, however, if you have already analysed your design and you are confident that your synchroniser has the required MTBF.


Edge Detection on Signals


I was recently looking at some code for a friend who is learning VHDL (no, I am not going to name names as to who it was). Along the way, I came across an interesting mistake that they had made, and I thought this would make an excellent addition to our discussions here on All Programmable Planet.

The essence of the problem was that the coder was attempting to detect rising and falling edges on signals in a synthesisable module. Not surprisingly for someone new to VHDL working within a clocked process, they attempted to use the “rising_edge” and “falling_edge” functions to detect edges on the signals of interest. A snapshot of the code I was sent is shown below. (I know there are additional issues with this code, but for the moment we will focus only upon the rising and falling edge usage.)


Those who know VHDL will not be surprised to hear that when this code was synthesized, it did not get very far before failing with the following message:

“Logic for signal is controlled by a clock but does not appear to be a valid sequential description.”

To a beginner this might be a little confusing. Why can’t you use the “rising_edge” and “falling_edge” functions to detect these edges? Actually, things can quickly become even more confusing, because it’s possible to create code that will simulate quite happily, but that will fail to synthesize as expected.

Now, if the sort of code shown above was being employed for something like a testbench (i.e., if you never intended to synthesize this code), then, with a little tweaking (in the case of the example above), it would simulate perfectly fine and everyone would be happy.

In fact, this is all tied up with the levels of abstraction that are possible with VHDL. If you create something at too high a level of abstraction, it is possible that your code might simulate and appear to function as desired, thereby giving rise to false confidence that your solution is valid and good progress is being made on the project, only to run into issues downstream. One reason for this is that, depending on your simulation tool settings (see the example shown below), you may fail to receive a warning on how synthesizable/unsynthesizable your code will be.


In the case of our example code, the problem arises when we try and implement this code within an FPGA, because then we are working within the stricter confines of supported synthesizable instructions and templates.

Sadly, the example code presented earlier does fall outside the template recognized by most synthesis tools as a clocked process. This is due to the multiple “rising_edge” and “falling_edge” function calls, which make it impossible for the synthesis tool to determine which calls should clock the register elements and which calls are being used only to detect signal edges and hence are not clocking registers.

To ensure synthesis, your process must contain only one “rising_edge” or “falling_edge” function call as shown in the code below, which implements a simple D-Type register. (Some FPGAs do have flip-flops that support double data rate; i.e., data changing on both the rising and falling edge of the clock, but we will address these in a future column so as to keep things simple here.)
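A minimal sketch of such a process, with a single “rising_edge” call on the clock and nothing else, is shown here. The entity and port names are assumptions made for illustration.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY d_type IS
  PORT (
    clk : IN  std_logic;
    d   : IN  std_logic;
    q   : OUT std_logic
  );
END ENTITY d_type;

ARCHITECTURE rtl OF d_type IS
BEGIN
  PROCESS (clk)
  BEGIN
    -- The only edge function call in the process, applied to the clock
    IF rising_edge(clk) THEN
      q <= d;
    END IF;
  END PROCESS;
END ARCHITECTURE rtl;
```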


So, how do we detect edges on signals within a design without using the “rising_edge” or “falling_edge” functions? Actually, this is very simple and can be achieved using two registers connected in series, such that one register contains the same signal as the other register, but delayed by one clock cycle. The engineer can then use the values contained within these registers to determine if the signal is indeed rising or falling.

So what does this actually look like in code and when implemented? Well, take a look at the code below along with the corresponding schematic diagram:
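One way to write the two-register edge detector is sketched below. The names are assumptions, and the input is assumed to be already synchronous to “clk” (an asynchronous input would first need to pass through a synchronizer). “det_reg(0)” holds the most recent sample and “det_reg(1)” the sample from one clock earlier.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY edge_detect IS
  PORT (
    clk      : IN  std_logic;
    sig_in   : IN  std_logic;   -- assumed synchronous to clk
    rise_det : OUT std_logic;   -- one-clock pulse on a rising edge of sig_in
    fall_det : OUT std_logic    -- one-clock pulse on a falling edge of sig_in
  );
END ENTITY edge_detect;

ARCHITECTURE rtl OF edge_detect IS
  SIGNAL det_reg : std_logic_vector(1 DOWNTO 0) := "00";
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      det_reg  <= det_reg(0) & sig_in;   -- two registers in series
      rise_det <= '0';
      fall_det <= '0';
      IF det_reg = "01" THEN             -- was 0, now 1
        rise_det <= '1';
      ELSIF det_reg = "10" THEN          -- was 1, now 0
        fall_det <= '1';
      END IF;
    END IF;
  END PROCESS;
END ARCHITECTURE rtl;
```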


Of course, the state of the “det_reg” could also be used in other synchronous processes and compared against a predefined constant to detect if the signal edge was rising or falling. This has the advantage that by simply changing the value of the constant, the type of edge being detected — and hence the action taken — can be changed without the need to modify the code itself.
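As a sketch of this idea, the hypothetical module below counts edges on a signal, with the edge type selected by a constant supplied as a generic (the generic name “edge_sel” and the counter are assumptions made for illustration).

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY edge_count IS
  GENERIC (
    edge_sel : std_logic_vector(1 DOWNTO 0) := "01"  -- "01" = rising, "10" = falling
  );
  PORT (
    clk    : IN  std_logic;
    sig_in : IN  std_logic;                          -- assumed synchronous to clk
    count  : OUT std_logic_vector(7 DOWNTO 0)
  );
END ENTITY edge_count;

ARCHITECTURE rtl OF edge_count IS
  SIGNAL det_reg   : std_logic_vector(1 DOWNTO 0) := "00";
  SIGNAL count_int : unsigned(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      det_reg <= det_reg(0) & sig_in;
      IF det_reg = edge_sel THEN         -- compare against the predefined constant
        count_int <= count_int + 1;
      END IF;
    END IF;
  END PROCESS;

  count <= std_logic_vector(count_int);
END ARCHITECTURE rtl;
```

Instantiating this with “edge_sel” set to "10" counts falling edges instead, with no change to the architecture.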



Coding Styles


The major differences among coding styles relate to how the design engineer decides to handle VHDL keywords and any user-defined items (the names of signals, variables, functions, procedures, etc.). Although there are many different possibilities, in practice there are only three commonly used approaches. The first technique is to use a standard text editor and simply enter everything in lowercase (and black-and-white) as shown in the following example of a simple synchronizer:


B&W: Keywords and user-defined items in lowercase.

The second method is to follow the VHDL Language Reference Manual (LRM), which says that keywords (if, case, when, select, etc.) should be presented in lowercase. Strangely, the LRM doesn’t have anything to say about how user-defined items should be presented, but the common practice (when the keywords are in lowercase) is to present user-defined items in uppercase as shown below:


B&W: Keywords lowercase; user-defined items uppercase.

The third approach — and my personal preference — is to turn the LRM on its head; to capitalize keywords (IF, CASE, WHEN, SELECT, etc.) and to present any user-defined items in lowercase as shown below:


B&W: Keywords uppercase; user-defined items lowercase.

The main argument for using uppercase for keywords and lowercase for user-defined items, or vice versa, is that this helps design engineers and reviewers to quickly locate and identify the various syntactical elements in the design. However, this line of reasoning has been somewhat negated by the use of today’s context-sensitive editors, which are language-aware and therefore able to automatically assign different colors to different items in the design.

Let’s look at our original examples through the “eyes” of a context-sensitive editor as illustrated in the following three images:


Color: Keywords and user-defined items in lowercase.

Color: Keywords lowercase; user-defined items uppercase.

Color: Keywords uppercase; user-defined items lowercase.

Although context-sensitive editors are extremely efficacious, it’s important to remember that some users may be color-blind. Also, even if the code is captured using a context-sensitive editor, it may be that some members of the team end up viewing it using a non-context-aware (black-and-white) editor. Furthermore, the code may well be printed out using a black-and-white printer. For all these reasons, my personal preference is to capitalize keywords and for everything else to be in lowercase.

Aside from how we handle the keywords, most companies will have their own sets of coding guidelines, which will also address other aspects of coding, such as:

    • Naming conventions for clock signals; possibly requiring each clock name to include the frequency, e.g., clk_40MHz, clk_100MHz.
    • Naming and handling of reset signals to the device (active-high, active-low), along with the synchronization of the signal and its assertion and de-assertion within the FPGA. The de-assertion of a reset is key, as removing this signal at the wrong time (too close to an active clock edge) could lead to metastability issues in flip-flops.
    • Naming conventions for signals that are active-low (this is common with external enables). Such signals often have a “_z” or “_n” attached to the end, thereby indicating their status as active-low (“ram_cs_n” or “ram_cs_z,” for example).
    • The permissible and/or preferred packages, which determine the standard types that can be used. Commonly used packages for any implementable code are “std_logic_1164” (this is the base package that everyone uses) and “numeric_std” for mathematical operations (as opposed to “std_logic_arith”). Meanwhile, other packages such as “textio,” “math_real,” and “math_complex” will be used when creating test benches.
    • The use of “std_logic” or the unresolved “std_ulogic” types on entities and signals (“std_ulogic” is the unresolved base type of “std_logic”; because it has no resolution function, a signal of this type with more than one driver is reported as an error, which helps to catch unintended multiple assignments).
    • Permissible port styles. Many companies prefer to use only “in,” “out,” and “inout” (for buses only) as opposed to “buffer,” as the need to read back an output can be handled internally with a signal. Also, many companies may limit the types permissible in entities to just “std_logic” or “std_logic_vector” to ease interfacing between large design teams.
    • It is also common for companies working on specialist safety-critical designs to have rules regarding the use of variables.
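To make a few of these guidelines concrete, here is a sketch of a reset synchroniser that asserts the internal reset asynchronously and de-asserts it synchronously. The names are assumptions made for illustration, but they follow the conventions listed above: the clock name carries its frequency, active-low signals end in “_n,” and only “in” and “out” ports of type “std_logic” are used.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY reset_sync IS
  PORT (
    clk_100MHz : IN  std_logic;
    rst_n      : IN  std_logic;   -- external reset, active-low, asynchronous
    rst_sync_n : OUT std_logic    -- internal reset, active-low, released synchronously
  );
END ENTITY reset_sync;

ARCHITECTURE rtl OF reset_sync IS
  SIGNAL rst_reg : std_logic_vector(1 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS (clk_100MHz, rst_n)
  BEGIN
    IF rst_n = '0' THEN
      rst_reg <= (OTHERS => '0');        -- assert immediately, no clock required
    ELSIF rising_edge(clk_100MHz) THEN
      rst_reg <= rst_reg(0) & '1';       -- release only after two clock edges
    END IF;
  END PROCESS;

  rst_sync_n <= rst_reg(1);
END ARCHITECTURE rtl;
```

Because the release of “rst_sync_n” is aligned to the clock, downstream flip-flops do not see the reset removed too close to an active clock edge.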

These coding styles can be very detailed, depending upon the end application the FPGA is being developed for.


Handling the Don’t Care Value


In my previous blog, I talked about the nine-value VHDL logic system. As part of those discussions, I mentioned the “Don’t Care” value ‘–’. In particular, we noted how this value is not really meant to be used in comparison operations such as “IF” and “CASE” statements. Instead, the “Don’t Care” value is traditionally used as an assignment when specifying output values.

Having said this, people often ask “Is it possible to use the ‘Don’t Care’ value with comparison operators as it could be very useful?” Like many things, the answer is “yes” with the help of a very useful function — the “std_match()” function provided in the “numeric_std” package (“USE ieee.numeric_std.ALL”).

The “std_match()” function allows the use of “Don’t Care” values during comparison operations. This comes in very useful when we are trying to implement functions like priority encoders, because it allows the “CASE” or “IF” statement to be specified in many fewer lines of code as demonstrated below where a simple priority encoder is implemented.

If we were to perform a comparison just using the “=” operator, we would receive a synthesis warning similar to the following:

Possible simulation mismatch due to ‘-‘ in condition expression

By comparison (no pun intended), if we were to employ the “std_match()” function as demonstrated in the following snippet of code, we would not receive this warning:
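A four-input priority encoder written with “std_match()” might look something like the sketch below. The entity, port names, and output encoding are assumptions made for illustration; bit 3 of “req” is given the highest priority.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;   -- provides std_match()

ENTITY pri_enc IS
  PORT (
    req   : IN  std_logic_vector(3 DOWNTO 0);
    grant : OUT std_logic_vector(1 DOWNTO 0);  -- index of the highest-priority request
    valid : OUT std_logic                      -- '0' when no request is active
  );
END ENTITY pri_enc;

ARCHITECTURE rtl OF pri_enc IS
BEGIN
  PROCESS (req)
  BEGIN
    valid <= '1';
    IF std_match(req, "1---") THEN      -- '-' positions are ignored in the comparison
      grant <= "11";
    ELSIF std_match(req, "01--") THEN
      grant <= "10";
    ELSIF std_match(req, "001-") THEN
      grant <= "01";
    ELSIF std_match(req, "0001") THEN
      grant <= "00";
    ELSE
      grant <= "00";
      valid <= '0';
    END IF;
  END PROCESS;
END ARCHITECTURE rtl;
```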


In an attempt to reduce the need to use the “std_match()” function, VHDL 2008 (which was published by Accellera in 2008, and which is now the official IEEE 1076-2008 standard) introduced many useful upgrades to the language. In addition to including support for new fixed- and floating-point number systems, and the incorporation of the Property Specification Language (PSL) into VHDL, VHDL 2008 also introduced the matching “CASE?” statement. This allows the ‘-’ to be used as a “Don’t Care” value in the choices, provided the choices do not overlap. Thus, using the new “CASE?” statement, the code above would be rewritten as follows:
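As a sketch, a four-input priority encoder written with “CASE?” could look like this (the entity and port names are assumptions made for illustration, and the file must be compiled with VHDL 2008 enabled in your tool). Note that no two choices can match the same input value.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY pri_enc_2008 IS
  PORT (
    req   : IN  std_logic_vector(3 DOWNTO 0);
    grant : OUT std_logic_vector(1 DOWNTO 0);  -- index of the highest-priority request
    valid : OUT std_logic                      -- '0' when no request is active
  );
END ENTITY pri_enc_2008;

ARCHITECTURE rtl OF pri_enc_2008 IS
BEGIN
  PROCESS (req)
  BEGIN
    valid <= '1';
    CASE? req IS
      WHEN "1---" => grant <= "11";     -- '-' matches any value in that position
      WHEN "01--" => grant <= "10";
      WHEN "001-" => grant <= "01";
      WHEN "0001" => grant <= "00";
      WHEN OTHERS =>
        grant <= "00";
        valid <= '0';
    END CASE?;
  END PROCESS;
END ARCHITECTURE rtl;
```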


The only remaining question would be why have I included both the “std_match()” approach and the newer “CASE?” method of performing this operation in this column? Why didn’t I just talk about the “CASE?” approach and have done with it?

Well, the answer is both simple and mundane. Although VHDL 2008 was published by Accellera in 2008, and was accepted by the IEEE in January 2009, it takes time for the various tool manufacturers to support new language features. Often it is pressure from the users that is responsible for the manufacturers introducing new features. As is often the case in life, if you do not ask you will not get. (“The squeaky wheel is the one that gets the oil,” as the old saying goes.)


Logic Values


As I touched upon in my earlier column on metastability, a VHDL signal that uses the “std_logic_1164” package can take on one of nine different values.

This may come as a shock to newer FPGA designers, who might wonder why a signal that spends the majority of its time as a 0 or 1 needs seven other values to accurately represent its behavior. As with a lot of these things, there is an interesting history as to why we ended up with this nine-value system.

These nine values are defined by the IEEE 1164 standard, which was introduced in 1993. This supplemented the earlier IEEE 1076-1987 language standard, whose logical type, the “bit”, could assume one of only two values: 0 or 1. Sadly, the “bit” type’s limited value system caused issues with models of buses, in which multiple tri-state drive gates may be connected to the same signal.

In order to address this problem, the multi-valued logic system defined by the IEEE 1164 standard not only introduced a set of new values, but it also defined an order of precedence. This order of precedence is very important to ensure that the simulator can correctly resolve the value on a signal.


Another way of visualizing this graphically is illustrated below. What this image tells us is that a conflict between a signal carrying a “U” (Uninitialized) value (for example, a register that has not been loaded with a 0 or 1) and any other value will result in a “U.” A conflict between a “-” (Don’t Care) value and any other value (apart from a “U”) will result in an “X.” A conflict between a “0” and a “1” will result in an “X.” A conflict between a “0” and an “L,” “W,” or “H” will result in a “0.” And so on and so forth…


It’s important to note that not all of these values can be assigned in synthesizable code, but they may be used in modeling and, in the case of the “U” (Uninitialized) value, for example, they may be seen as the initial values on signals and registers during simulation.

While the meanings of the “0,” “1,” and “Z” values (which are typically used to model the behavior of both signals and buses) are reasonably straightforward (doing what the “bit” type could not), the uses of the other values might not be immediately apparent. The “H” (Weak High) and “L” (Weak Low) states are used to model high-value pull-up and pull-down resistance values on a signal, respectively. These can subsequently be used to accurately represent the actions of wired-AND and wired-OR circuits during simulation. In turn, this led to the “W” (Weak Unknown) value, which represents the case where a signal is being driven by both “L” and “H” values from two different drivers.

The “-” (Don’t Care) value is not intended to be used for branch or loop selection such as “IF signal_a = ‘-‘ THEN…” as this will not be true unless “signal_a” actually is “-“; instead, the “-” value is intended to be used for defining outputs; for example:
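A minimal sketch of this kind of output assignment is shown here. It assumes a hypothetical decoder in which the input code "11" never occurs in normal operation; the entity and port names are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical decoder: only three of the four input codes are expected.
entity dont_care_output is
  port (
    sel : in  std_logic_vector(1 downto 0);
    op  : out std_logic_vector(2 downto 0)
  );
end entity dont_care_output;

architecture rtl of dont_care_output is
begin
  process (sel)
  begin
    case sel is
      when "00"   => op <= "001";
      when "01"   => op <= "010";
      when "10"   => op <= "100";
      -- "11" is not expected, so the output value is left unspecified.
      when others => op <= "---";
    end case;
  end process;
end architecture rtl;
```

A synthesis tool is free to pick whichever output value gives the simplest logic for the unspecified case, although how much use it makes of this freedom depends on the tool. In simulation, the output will show '-' if the unexpected code ever appears.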


You may have noticed my mentioning that the simulator resolves the signal to the correct value. In fact, two base types are defined within the 1164 package: “std_ulogic” and “std_logic.” Both of these types use the same nine-value logic system; however, “std_logic” is resolved while “std_ulogic” is unresolved. What this means is that — when using “std_logic” — if multiple drivers are present the simulator is capable of resolving the situation and determining the resultant state of the signal using the resolution table below:
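The effect of resolution can be seen in a small simulation-only sketch in which two concurrent assignments drive the same “std_logic” signal. The entity, label, and signal names are assumptions for illustration; the expected values follow the resolution function defined in the “std_logic_1164” package.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only sketch: two drivers on one resolved signal.
entity resolution_demo is
end entity resolution_demo;

architecture sim of resolution_demo is
  signal wire_sig : std_logic;
begin
  -- Each concurrent assignment creates its own driver for wire_sig.
  driver_a : wire_sig <= '0', 'Z' after 10 ns, 'L' after 20 ns;
  driver_b : wire_sig <= '1', 'H' after 10 ns;

  check : process
  begin
    wait for 5 ns;   -- '0' against '1'
    assert wire_sig = 'X' report "expected X" severity error;
    wait for 10 ns;  -- 'Z' against 'H'
    assert wire_sig = 'H' report "expected H" severity error;
    wait for 10 ns;  -- 'L' against 'H'
    assert wire_sig = 'W' report "expected W" severity error;
    wait;
  end process check;
end architecture sim;
```

If “wire_sig” were declared as “std_ulogic” instead, the second driver would be rejected as an error rather than resolved.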


Of course, this does not mean that the resolved signal will be one the engineer expects and/or one that is useable in the final design. Frequently, if the engineer has inadvertently connected multiple drivers together, the result will be an “X” (Forcing Unknown).

This problem is somewhat solved by using the “std_ulogic” type, because this will cause the simulator to report an error if multiple drivers are present (generally speaking, most signals in logic designs should have only one driver).

The “std_ulogic” type is not commonly used with FPGA designs (although this depends on each company’s coding rules). However, this type is very popular with ASIC designs, where inadvertent multiple drivers can have a serious effect on the ensuing silicon realization.


Registers and Latches


Along with other simple logic functions like multiplexers, the programmable blocks in an FPGA primarily consist of lookup tables (LUTs) and registers (flip-flops). Registers are the storage elements within the FPGA that we use to form the cores of things like counters, shift registers, state machines, and DSP functions: basically anything that requires us to store a value between clock edges.

A register’s contents are updated on the active edge of the clock signal (“clk” in the diagram below), at which time whatever value is present on the register’s data input (“a” in the diagram below) is loaded into the register and stored until the next clock edge. (These discussions assume that any required setup and hold times are met. If not, a metastable state may ensue. This will be the topic of a future column.) Some registers are triggered by a rising (or positive) edge on the clock, others are triggered by a falling (or negative) clock edge. The following illustration reflects a positive-edge-triggered register:

Excluding any blocks of on-chip memory, there is another type of storage element that may be used in a design: the transparent latch. Transparent latches differ from registers in that registers are edge-triggered while latches are level-sensitive. When the enable signal (“en” in the diagram below) is in its active state, whatever value is presented to the latch’s data input (“a” in the diagram below) is transparently passed through the latch to its output.

If the value on the data input changes while the enable is in its active state, then this new value will be passed through to the output. When the enable returns to its inactive state, the current data value is stored in the latch. If the enable signal is in its inactive state, any changes on the data input will have no effect. Some latches have active-high enables, others have active-low enables. The following illustration reflects a transparent latch with an active-high enable:

In some FPGA architectures, a register element in a programmable block may be explicitly configured to act as a latch. In other cases, the latch may be implemented using combinatorial logic (in the form of LUTs) with feedback. Designers may explicitly decide to create their VHDL code in such a way that one or more latches will result. More commonly, however, a designer doesn’t actually mean to use a latch, but the synthesis engine infers a latch based on the way the code is captured.

Generally speaking, the use of latches within an FPGA is frowned upon. As we previously noted, the fundamental programmable fabric in an FPGA is based around a LUT and a register. These devices are designed to have local and global clock networks with low skew. FPGA architectures are not intended to have enable signals replacing clock signals.

This means that if the VHDL code implies the use of latches, the synthesis tool will be forced to implement those latches in a non-preferred way, which will vary depending on the targeted device architecture. In addition to unwanted race conditions, the use of latches can cause downstream issues in the verification phase with regard to things like static timing analysis and formal equivalence checking. For this reason, most FPGA designers agree that latches are best avoided and registers should be used wherever local storage elements are required.

Latches are generally created when the engineer accidentally forgets to cover all assignment conditions within a combinatorial process, therefore implying storage as illustrated in the following VHDL code example:
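A minimal sketch of code that would infer a latch in this way is shown here. The signal names “a”, “en”, and “op” follow the discussion; the entity name is an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Combinatorial process with an incomplete IF: a latch is inferred on op.
entity latch_example is
  port (
    a  : in  std_logic;
    en : in  std_logic;
    op : out std_logic
  );
end entity latch_example;

architecture rtl of latch_example is
begin
  process (en, a)
  begin
    if en = '1' then
      op <= a;
    end if;  -- no ELSE branch, so op must hold its value when en = '0'
  end process;
end architecture rtl;
```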

The IF statement does not check all conditions on the “en” signal, which means that storage is implied and a latch will be inferred for the “op” output signal. In fact, when this code was synthesised, the tool issued the following warning:

WARNING:Xst:737 — Found 1-bit latch for signal <op>. Latches may be generated from incomplete case or if statements. We do not recommend the use of latches in FPGA/CPLD designs, as they may lead to timing problems.
Unit synthesized.

So, how do we correct for this error? Well, in this case we can make the process synchronous as illustrated in the following VHDL code example:
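A hedged sketch of the synchronous alternative is shown here. It assumes a clock input named “clk” is available; the entity name is again illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Clocked process with an enable: a register is inferred instead of a latch.
entity register_example is
  port (
    clk : in  std_logic;
    a   : in  std_logic;
    en  : in  std_logic;
    op  : out std_logic
  );
end entity register_example;

architecture rtl of register_example is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        op <= a;  -- op is updated only on a clock edge while en is high
      end if;
    end if;
  end process;
end architecture rtl;
```

The storage is still there, but it now lives in an edge-triggered register with a clock enable, which is what the FPGA fabric is designed to provide.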

This code will not result in latch generation. It is important to note, however, that only the design engineers know exactly what it is they are trying to achieve, so it is up to them to decide whether they do require storage (in which case they can use a synchronous process and an enable signal) or whether a purely combinatorial process is desired.

Have you had any experiences with unwanted latches being inferred by the synthesis engine? Alternatively, have you created any FPGA designs in which you decided latches were a necessary evil? If so, please explain.



Sensitivity Lists and Simulation


If you are unfamiliar with the VHDL hardware description language (HDL) and are interested in knowing things like the difference between a signal and a variable, or the difference between a function and a procedure, please stay tuned.

As a starter, I thought I would explain the importance of the sensitivity list, which is employed in a VHDL process. At some point, everyone developing VHDL will end up writing something similar to the following lines of code to implement a simple two-stage synchronizer:
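A minimal sketch of such a synchronizer might look like this. The names “reset”, “clock”, “ip”, and “sync” follow the discussion; the entity name, the active-high reset, the internal two-bit “sync” signal, and the “sync_out” port are assumptions made to keep the example self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage synchronizer for a single asynchronous input.
entity synchronizer is
  port (
    reset    : in  std_logic;
    clock    : in  std_logic;
    ip       : in  std_logic;
    sync_out : out std_logic
  );
end entity synchronizer;

architecture rtl of synchronizer is
  signal sync : std_logic_vector(1 downto 0);
begin
  process (reset, clock)
  begin
    if reset = '1' then
      sync <= (others => '0');
    elsif rising_edge(clock) then
      sync <= sync(0) & ip;  -- shift the input through two registers
    end if;
  end process;

  sync_out <= sync(1);  -- second-stage output
end architecture rtl;
```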

In this case, the “sync” signal acts as the output from this process. The “ip” signal is an input. That signal would be declared in the entity associated with the process, but that’s a topic for another column.

The “(reset,clock)” portion of the process is referred to as the sensitivity list. It contains the signals that will cause a simulation tool (e.g., ModelSim) to execute the process and update the signals. This is required because all processes are executed concurrently, so the simulation tool needs to know which processes need updating as its simulation cycle progresses.

In the example above, events occurring on the “reset” or “clock” signals will result in the process being executed. The process will execute sequentially, updating the “sync” signal on the rising edge of the clock.

For clocked processes such as the one presented here, the sensitivity list requires only the “reset” and “clock” signals. Any other signals that appear on the righthand side of the assignment operator (<=) in a clocked process do not need to be included in the sensitivity list, because their states are effectively being sampled on the edge of the “clock.” You could include both the “sync” and “ip” signals in the sensitivity list of the above process, but events on them would simply cause the process to execute without changing anything, because the assignment takes place only on the rising edge of the “clock.”

By comparison, in the case of combinatorial processes, it is necessary for the sensitivity list to include all the input signals used within the process if you wish to avoid potential issues. Consider a simple 4:1 multiplexer, for example.
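A minimal sketch of such a multiplexer is shown here. The signal names “select_ip”, “a”, “b”, “c”, “d”, and “op” follow the discussion; the entity name and the single-bit data width are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4:1 multiplexer described as a combinatorial process.
entity mux4 is
  port (
    select_ip  : in  std_logic_vector(1 downto 0);
    a, b, c, d : in  std_logic;
    op         : out std_logic
  );
end entity mux4;

architecture rtl of mux4 is
begin
  -- Every signal read inside the process appears in the sensitivity list.
  process (select_ip, a, b, c, d)
  begin
    case select_ip is
      when "00"   => op <= a;
      when "01"   => op <= b;
      when "10"   => op <= c;
      when others => op <= d;
    end case;
  end process;
end architecture rtl;
```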

Note that, in this case, the “op” signal is the output from the process. Once again, this signal would be declared in the entity associated with the process.

As you can see, this process will be executed whenever a change occurs on the “select_ip” signal (which is a two-bit signal, by the way) or when the values on the “a,” “b,” “c,” or “d” data inputs are updated.

This is the really important point. If we were to include only the “select_ip” signal in our sensitivity list, then we would have a problem. The result would still be legal VHDL. However, the simulator would not respond to any of the changes on the multiplexer’s data inputs. Even worse, the resulting gate-level representation generated by the synthesis tool would be different, because the synthesis engine looks not only at the sensitivity list, but also at the code to extract the behavior of the process when implementing the logic. (This is often referred to as behavioral extraction.)

The result would be a mismatch between the simulation of the original RTL (register transfer level) representation and the simulation of the gate-level representation generated by the synthesis engine. For this reason, the synthesis engine normally reports any signals missing from the sensitivity list that may result in a simulation mismatch. These warnings, which will appear in the synthesis log file, should be investigated and corrected, but it’s a lot easier to create your code in a way that ensures the problem never arises in the first place.
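One way of ensuring this for combinatorial processes, assuming your simulation and synthesis tools support VHDL 2008, is to use the “ALL” keyword in place of an explicit sensitivity list. The following is a sketch only, with an illustrative entity name.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4:1 multiplexer using the VHDL 2008 "all" sensitivity list.
entity mux4_all is
  port (
    select_ip  : in  std_logic_vector(1 downto 0);
    a, b, c, d : in  std_logic;
    op         : out std_logic
  );
end entity mux4_all;

architecture rtl of mux4_all is
begin
  -- "all" makes the process sensitive to every signal it reads.
  process (all)
  begin
    case select_ip is
      when "00"   => op <= a;
      when "01"   => op <= b;
      when "10"   => op <= c;
      when others => op <= d;
    end case;
  end process;
end architecture rtl;
```

With this form there is no list to fall out of step with the code when an input is added or removed.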


Securing your FPGA Design


Let’s start by considering the high-level issues we face as engineers attempting to secure our designs. These include the following:

  1. Competitors reverse engineering our design
  2. Unauthorized production runs
  3. Unauthorized modification of the design
  4. Unauthorized access to the data within the design
  5. Unauthorized control of the end system

The severity and impact of each of these will vary depending upon the end function of the design. In the case of an industrial control system, for example, someone being able to take unauthorized control could be critical and cause untold damage and loss of life. A secure data processing system will place the emphasis on the integrity of the data. By comparison, in the case of a commercial product, preventing reverse engineering, unauthorized production runs, or even modification might be the driving factors.

Luckily, as engineers, we can use a number of approaches to prevent this sort of thing from happening.

The first, and most critical, is taking control of your design data — source code, schematics, mechanical assemblies, etc. — and ensuring it’s secure. This information is the lifeblood of your company and must be protected all the way through the project life cycle, and beyond, to keep your competitive edge. Sadly, in this age of cyberattacks by anything from individuals to organized groups to nation states, this means having very good firewalls — maybe even an “air gap” — between your design network and one connected to the external world.

There are also efforts that can be undertaken to secure your design within the design process itself. These efforts can be split into the following approaches, which are in no way mutually exclusive:

No. 1: Restrict physical access to the FPGA
One of the first methods that can be undertaken is to limit physical access to the unit — especially the circuit card and the FPGA(s). This involves using methods to detect someone tampering with the unit and taking action suitable for the system upon detection of any threat.

Examples of suitable action would be to safely power down the unit or to erase functional parameters preventing further use of the unit. This is often the case in many industrial control systems or military systems to prevent unauthorized access attempts. Depending upon the end application, other physical methods can be undertaken, such as conformal coating or potting to prevent identification of key components. The use of soldered — as opposed to socketed — components also goes without saying.

No. 2: Encryption of configuration streams
Many applications use SRAM-based FPGAs due to the ability to update the design in the field. Typically, these designs require a configuration device that loads the FPGA configuration at power-up and other times. This configuration data stream may be accessed by a third party (depending upon what physical precautions you have taken).

Many devices these days allow for encryption (normally AES) of the data stream, or even the need to know an encryption key before the device can be programmed further or data read back. Physically, the designer of the PCB can also limit people’s ability to probe these points by using a multi-layer PCB and by not routing tracks on the top of the board, but instead using internal layers. This is especially efficacious if external termination resistors are not required or can be embedded in the PCB itself (this does add cost).

No. 3: Disable read back or even reconfiguration
Many devices provide the option to prevent the reading back of data over the JTAG interface. Some devices even provide the option to prevent upgrading the device if a certain flag is set, thereby turning a re-programmable device into a one-time programmable (OTP) component. Of course, if you take this course of action, you need to be certain that you will not need to change the design and that you are programming the correct file. (I am sure we have all, at one point, programmed the wrong file into a device. Or is that just me?)

No. 4: Protect that JTAG port
Most access attempts to reverse engineer, modify, or change the functionality of your design are going to be made initially via your JTAG chain. There is a very interesting paper on this topic that you can access by clicking here. It is therefore imperative that you protect your JTAG interface, which should never appear on an external connector, but instead require that the unit be disassembled in order to access the connector.

Your physical security measures in the field should also provide protection for this interface. It’s also a good idea to provide several small chains that can be joined together via numerous tap controllers or external cabling, instead of creating large JTAG chains. Obviously, your design should not indicate on the silk screen where or what the JTAG connectors are. Some more secure designs do not include physical JTAG connectors, but rather just pads on the PCB to which a “bed of nails” type approach can be used to program the devices.

If the device TAP controller contains the optional TRST pin, then it is possible to fit a zero-ohm link to ground after programming to hold the TAP in reset, thereby preventing the TAP controller from working. You can do the same with the TCK pin if the TRST pin is not available. This means your attacker has to find and remove this link before the port will work.

No. 5: Differential power analysis
This is a technique that hackers can use to determine when the unit is processing data or when it is idling. As the power profile changes, it is possible to determine a significant amount about the design and the data passing through the system. One solution to this is to ensure the module or system draws the same power regardless of whether it is processing data full-out or sitting idle, thereby preventing this information from being collected. This requires more complicated power management and thermal management systems, but can be achieved by means of a shunt regulator, which becomes a constant current load on the main power supply.

No. 6: Design in the ability to detect counterfeits
There is always the possibility that, no matter how many precautions you have taken, your design, or portions thereof, can be copied and reused. However, there are systems you can implement within your code that will enable you to detect if your design has been copied. One potential method is the DesignTag approach from Algotronix, which uses a unique and innovative method of identifying your design.

The discussions above present just some of the possible threats that are out there, along with a selection of techniques that can be undertaken to secure your design.