Using Customizable Microcontrollers to Beef up Performance and Cut Power Drain
Tim Kubitschek and Jay Johnson, Atmel Corporation
The year-long design cycle for a custom ASIC is often longer than the life of the end-product, which may be as short as six months. On top of that, the NRE costs for
Figure 1. D-type Flip-flop in 130 nm MPCF and 130 nm Standard Cell
It is faster and more cost-effective to implement designs in standard off-the-shelf microcontrollers, many of which are systems-on-chip (SoCs) that offer extensive networking capability and human interface functions such as LCD controllers and camera interfaces. These off-the-shelf SoCs frequently have all the functionality, performance and low power consumption that can be achieved with a cell-based ASIC. However, all too often some computationally intensive portion of the design requires hardware acceleration. Turbo coding, GPS correlators and graphics processing are all candidates for implementation in hardware. Increasingly, these DSP-type functions are being implemented in FPGAs, which have all but replaced platform ASICs.
However, FPGAs have their drawbacks, most notably very high power consumption, slower performance and the relative lack of security for the IP in the FPGA. Power consumption is a particularly serious issue in wireless systems, since so many of them are battery powered. FPGAs are also still expensive: although their costs have declined rapidly, volume price reductions stall at about $10 for 10,000 units.
A new ASIC technology that employs a metal-programmable cell fabric (MPCF) achieves very high gate densities of between 170K and 210K gates/mm². MPCF silicon efficiency is comparable to that of cell-based ASICs. For example, an MPCF cell implementing a D flip-flop (DFF) occupies nearly the same area as a standard-cell DFF in the same 130 nm process.
MPCF technology is being used in existing MCUs with SoC-level integration to create a customizable SoC platform that achieves the very low unit prices of cell-based ASICs with the quick turn-
Figure 2. AT91CAP9 Block Diagram
A simpler, ARM7-based version, the AT91CAP7, offers a USB device, SPI master and slave, two USARTs, three 16-bit timer counters, an 8-channel, 10-bit analog-to-digital converter, the interrupt control and supervisory functions noted above, plus an MP block equivalent to 28K or 50K FPGA LUTs (250K or 450K routable ASIC gates).
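The LUT and gate figures above can be cross-checked with a little arithmetic. The sketch below derives the implied gates-per-LUT ratio and, as a rough illustration only, the silicon area such a block might occupy at the 170K to 210K gates/mm² density quoted earlier. Applying that density directly to "routable" gates, and ignoring the RAM blocks, is an assumption made here and not a figure from the article; the function and constant names are purely illustrative.

```python
# Figures quoted in the article for the two AT91CAP7 MP block options.
MP_BLOCK_OPTIONS = [
    {"fpga_luts": 28_000, "asic_gates": 250_000},
    {"fpga_luts": 50_000, "asic_gates": 450_000},
]

# MPCF gate density range quoted in the article, in gates per mm^2.
DENSITY_MIN = 170_000
DENSITY_MAX = 210_000


def gates_per_lut(option):
    """Routable ASIC gates that stand in for one FPGA LUT."""
    return option["asic_gates"] / option["fpga_luts"]


def area_range_mm2(gates):
    """(smallest, largest) area in mm^2 for a gate count.

    Assumption: the quoted density applies directly to routable gates
    and RAM blocks are ignored, so this is only a rough estimate.
    """
    return gates / DENSITY_MAX, gates / DENSITY_MIN


if __name__ == "__main__":
    for opt in MP_BLOCK_OPTIONS:
        smallest, largest = area_range_mm2(opt["asic_gates"])
        print(
            f"{opt['fpga_luts']} LUTs: {gates_per_lut(opt):.1f} gates/LUT, "
            f"about {smallest:.2f} to {largest:.2f} mm^2 of logic"
        )
```

Both options work out to roughly nine routable gates per LUT, which suggests the two equivalence figures were derived with the same conversion factor.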
The MP block, implemented in MPCF technology, is large enough to implement a second ARM processor core, a digital signal processor (DSP), additional standard (or non-standard) interfaces and complex logic blocks such as GPS correlators. It has multiple distributed Single- and Dual-Port RAM blocks that can be tightly coupled to the logic elements that require them. The MP Block is supplied by all the clocks originating from the Clock Generator and Power Management Controller for maximum flexibility in clocking the application-specific logic elements implemented in it.
As many as 24 peripheral DMA channels, including up to 13 channels in the MP block, are managed by a peripheral DMA controller that offloads data-moving tasks between memories and peripherals. Thus, a 20 Mb/s SPI transfer that could completely overwhelm a conventional ARM9-based MCU can occur with 88% of the ARM9's cycles free for application processing. A separate 4-channel DMA controller handles the Ethernet MAC, LCD controller and camera interface.
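To put the 88% figure in perspective, the sketch below converts it into a per-byte cycle budget. The SPI rate and the free-cycle fraction come from the text; the CPU clock is a hypothetical value chosen only for illustration, since the article does not state the clock at which the measurement was made.

```python
SPI_RATE_BITS_PER_S = 20_000_000  # 20 Mb/s SPI transfer, from the article
CPU_FREE_FRACTION = 0.88          # share of ARM9 cycles left free, from the article
ASSUMED_CPU_HZ = 200_000_000      # hypothetical CPU clock, NOT from the article


def bytes_per_second(bit_rate):
    """Convert a serial bit rate to bytes per second."""
    return bit_rate / 8


def cycles_spent_per_byte(cpu_hz, free_fraction, bit_rate):
    """CPU cycles consumed per transferred byte, given the free-cycle share."""
    busy_cycles_per_s = cpu_hz * (1.0 - free_fraction)
    return busy_cycles_per_s / bytes_per_second(bit_rate)


if __name__ == "__main__":
    rate = bytes_per_second(SPI_RATE_BITS_PER_S)
    cost = cycles_spent_per_byte(ASSUMED_CPU_HZ, CPU_FREE_FRACTION, SPI_RATE_BITS_PER_S)
    print(f"Transfer rate: {rate / 1e6:.1f} MB/s")
    print(f"CPU cost at the assumed clock: {cost:.1f} cycles per byte")
```

Under the assumed clock, the transfer costs the CPU on the order of ten cycles per byte. A different clock scales the result linearly.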
A 12-layer high-speed bus matrix on the configurable MCU provides six masters dedicated to functions including CPU data, CPU instruction, the peripheral DMA controller, Ethernet and USB Host, plus six additional bus masters dedicated to the MP block. The bus slaves are the memories, the USB device and the peripheral bus bridge, plus three slaves for the MP block. Any master can take control of any available bus when needed. Since there are as many busses as masters, there is never any bus contention.
The MP Block is also served by a set of interrupt lines for the peripherals implemented in it, a set of peripheral enable lines, two parallel sets of dedicated I/O ports and a multiplexed connection to the USB device transceiver that allows a second USB Device to be implemented in the MP Block.
An external bus interface (EBI) supports SDRAM, NAND Flash with error correcting code (ECC) and CompactFlash with True IDE mode, providing an interface to GByte-plus on-board or removable memory, including USB sticks.
The design flow of an MPCF-based configurable microcontroller is basically identical to that of a system with an off-the-shelf ARM7 or ARM9 MCU and a Xilinx or Altera FPGA. In fact, the MCU-plus-FPGA design may be manufactured in production volumes to test the market. Once the product's success is verified, the entire design can be migrated directly to the customizable microcontroller.
Figure 3. AT91CAP7 Block Diagram
The RTL code for the MP Block is validated for compatibility with the fixed portion of the microcontroller. The RTL code is then synthesized using process-specific target libraries supplied by the vendor and functional simulations are performed on the entire device.
The low-level device drivers for the platform are supplied by the MCU vendor, and those for the MP Block originate from the customer or third-party design house. These are integrated with the application modules that program the MCU and peripherals/interfaces. Most popular operating systems have been ported to AT91CAP9 devices.
The AT91CAP emulation board includes a full complement of memories, standard interfaces and network connections together with additional connections that can be configured for the requirements of the application.
Emulation almost always highlights errors in the hardware, the software or the hardware/software interface of the device. The ability to correct and re-test the complete design at this stage is a major factor in reducing design time and cost, and it increases the probability of right-first-time silicon and software. An additional benefit is that the emulated version of the final design can be used as the starting point for future design iterations, at a substantial saving of design effort.
ASIC Price & Performance without ASIC NREs and Design Cycles
Using a customizable microcontroller with a metal-programmable cell fabric allows designers to integrate their custom IP into a near-off-the-shelf solution. It offers the cost, power consumption and performance benefits of a full
By eliminating the external FPGA, Atmel's customizable microcontrollers also eliminate its nearly 2 W of static power consumption. The ARM7- and ARM9-based CAP MCUs consume 3 to 4 mW of static power, about 99.8% less than an FPGA. Performance is also much better: the logic in the AT91CAP customizable MCU can run at 400 MHz, about eight times the maximum clock for FPGA logic.
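The percentages in this comparison follow directly from the quoted figures, as the short check below shows. All inputs are taken from the text; the FPGA clock it prints is only what the "eight times" ratio implies, not a measured value, and the names are illustrative.

```python
FPGA_STATIC_W = 2.0          # "nearly 2 W" of FPGA static power, from the article
CAP_STATIC_MW = (3.0, 4.0)   # CAP MCU static power range in mW, from the article
CAP_LOGIC_MHZ = 400.0        # CAP logic clock, from the article
SPEEDUP_VS_FPGA = 8.0        # "about eight times", from the article


def reduction_percent(baseline_w, new_w):
    """Percentage by which new_w is lower than baseline_w."""
    return 100.0 * (1.0 - new_w / baseline_w)


def implied_fpga_clock_mhz(cap_mhz, speedup):
    """FPGA logic clock implied by the quoted speed ratio."""
    return cap_mhz / speedup


if __name__ == "__main__":
    for mw in CAP_STATIC_MW:
        pct = reduction_percent(FPGA_STATIC_W, mw / 1000.0)
        print(f"{mw:.0f} mW vs {FPGA_STATIC_W:.0f} W: {pct:.2f}% lower static power")
    clock = implied_fpga_clock_mhz(CAP_LOGIC_MHZ, SPEEDUP_VS_FPGA)
    print(f"Implied FPGA logic clock: about {clock:.0f} MHz")
```

The upper end of the static power range gives the 99.8% figure; the lower end is slightly better still.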
Finally, unit IC costs are 30% to 60% lower than for an ARM-plus-FPGA combination, net of NRE charges.
About the Author
Tim Kubitschek is the marketing manager for Atmel's CAP customizable microcontroller products. Jay Johnson manages the Americas ASIC Marketing group for Atmel Corporation's Advanced Products Business Unit in Colorado Springs, Colorado.