The File - Dec 16, 2008 - (Page 8)

In Focus | DSPs — Design, debug with DSPs (continued)

…ing, the algorithm is divided into sub-algorithms that run on different DSPs, with each DSP passing its intermediate processed data on to the next DSP in the multi-processor chain. In parallel multi-processing, multiple instances of a single algorithm, or multiple algorithms, execute on the various DSPs in the chain. The DSPs usually share a memory resource, which could be an SRAM or SDRAM. An on-chip external bus arbiter enables seamless, glue-less interconnection between the DSPs in such multi-processing systems: at any point in time, only one DSP is the bus master driving the external shared-memory bus, while the others tri-state it.

Using two (or more) processing elements in a single core—Some DSPs are equipped with two processing elements within the same core (figure 2). Such DSPs have two sets of computational units, each made up of an ALU (arithmetic and logic unit), a MAC (multiplier-accumulator) and a barrel shifter, but a single program sequencer and instruction decoder-execution unit. These DSPs are commonly referred to as SIMD (single instruction, multiple data) DSPs because the same instruction is executed on two processing elements operating on different sets of data. This architecture mainly targets applications where a set of instructions is repeated in a loop: execution time for such loop computations is halved.

Multi-core multi-processing—In this architecture, the DSP has multiple cores, and each core has its own local memory (figure 3 shows a dual-core example). The cores also share a bigger memory space, often located outside the DSP, called shared memory. These processors can run multiple instances of the same application, or multiple applications, on different cores.

Figure 3: Dual-core multi-processing.
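The loop-halving effect of a SIMD core described above can be illustrated in plain C. A real SIMD DSP issues one instruction to both processing elements in hardware; the sketch below only mimics that behaviour with a manual two-way unrolled loop feeding two accumulators, one per (hypothetical) compute element.

```c
#include <assert.h>

/* Plain dot product: one MAC operation per loop iteration. */
long dot(const short *x, const short *h, int n)
{
    long acc = 0;
    for (int i = 0; i < n; i++)
        acc += (long)x[i] * h[i];
    return acc;
}

/* Two-way unrolled version: each iteration feeds two MACs, mirroring
 * how a SIMD DSP drives both processing elements with one instruction.
 * n is assumed even in this sketch. */
long dot_simd_style(const short *x, const short *h, int n)
{
    long acc0 = 0, acc1 = 0;              /* one accumulator per element */
    for (int i = 0; i < n; i += 2) {
        acc0 += (long)x[i]     * h[i];    /* "element 0" */
        acc1 += (long)x[i + 1] * h[i + 1];/* "element 1" */
    }
    return acc0 + acc1;
}
```

Both functions compute the same result; the second runs half as many loop iterations, which is exactly the benefit the SIMD datapath delivers without source-level unrolling.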
The internal bus and memory architecture form a critical portion of the chip design: they must avoid large latencies and deadlock situations that leave the DSP cores data-hungry, however fast and capable each core is individually.

Utilisation of resources

• On-chip memory utilisation—Almost all DSPs have on-chip memory, and the DSP usually runs application code from it to achieve higher performance, since on-chip memory is certainly faster than any external memory interfaced to the processor. However, it adds significantly to the DSP's cost and is therefore limited in size. It must be used efficiently, both to avoid wasting it and to avoid conflicts between the units that access it, which add latency.

DSPs follow the Harvard architecture: memory is divided into PROGRAM (or CODE) and DATA sections, each with its own address and data buses. With this multi-bus architecture, two values can be fetched simultaneously in a single cycle, which is the biggest advantage for core DSP algorithms like filtering. A fixed set of data called coefficients is stored in PROGRAM memory, whereas real-time discrete data is stored in and read from the DATA memory segment; in a single cycle, an instruction can fetch both a coefficient and the data value to be operated on. DSPs have an instruction cache to handle the conflict that arises when an instruction fetch and a data fetch are both attempted from PROGRAM memory: the instruction is cached, and on its next occurrence it is fetched from the cache instead.

Most DSPs have an on-chip DMA (direct memory access) controller to move data from peripherals like serial and parallel ports directly in and out of the DSP's internal memory without any core intervention. Earlier DSPs used dual-ported, double-pumped internal SRAMs, but with increasing core speeds it has become quite expensive to design such high-speed memories.
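The coefficients-in-PROGRAM-memory scheme above maps naturally onto a FIR filter kernel. The sketch below is generic C; on a real Harvard-architecture DSP the coefficient table would be placed in PROGRAM memory via a toolchain-specific section pragma (the commented pragma is illustrative only, not a real directive) so that each MAC iteration fetches one coefficient and one data sample in the same cycle.

```c
#include <assert.h>

#define TAPS 4

/* On a Harvard-architecture DSP this table would live in PROGRAM
 * memory, e.g. via a vendor-specific placement pragma such as:
 *   #pragma section("pm_data")   -- illustrative name, assumed
 * Here it is an ordinary const array. */
static const short coeff[TAPS] = {1, 2, 2, 1};   /* fixed coefficients */

/* Direct-form FIR: y[n] = sum over k of coeff[k] * x[n - k].
 * Caller must guarantee n >= TAPS - 1 so all taps are in range. */
long fir(const short *x, int n)
{
    long acc = 0;
    for (int k = 0; k < TAPS; k++)
        acc += (long)coeff[k] * x[n - k];   /* dual fetch on real hardware */
    return acc;
}
```

With the dual-bus fetch, the loop body costs one cycle per tap instead of two, which is why filtering is the canonical showcase for the Harvard memory split.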
Internal memory now usually runs at the same speed as the core but is organised as multiple blocks. So while the core accesses one block to fetch code or data, software should ensure that the DMA uses a different memory block to guarantee high performance.

Off-chip memory, software overlay—Sometimes the internal memory is not sufficient to store the entire application, even after optimising the code for size. This does not mean the DSP cannot be used: software overlay techniques, driven by an intelligent overlay manager, should be implemented. Some functions are stored in external memory (live space) and brought into internal memory (run space) only when they are required. Copying code from the live space into the internal run space is usually done by the DMA in the background, while the core continues to execute other code that is already in internal memory.

Read the full article to learn more about efficient utilisation of DSP core resources, and how to successfully debug common DSP issues.
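The live-space/run-space scheme above can be sketched as a minimal overlay manager. All names and sizes here are illustrative, not from any particular toolchain, and `memcpy` stands in for the background DMA transfer a real system would use; the manager's one essential job, shown below, is skipping the copy when the requested image is already resident.

```c
#include <assert.h>
#include <string.h>

/* Minimal overlay-manager sketch.  "Live space" images sit in external
 * memory; at most one of them occupies the internal "run space" at a
 * time.  RUN_SPACE_SIZE and the struct layout are assumptions. */
#define RUN_SPACE_SIZE 64

static unsigned char run_space[RUN_SPACE_SIZE];  /* internal memory */
static int current_overlay = -1;                 /* resident image id */

struct overlay {
    const unsigned char *live;   /* image location in external memory */
    unsigned size;               /* image size, <= RUN_SPACE_SIZE */
};

/* Bring overlay `id` into run space, unless it is already resident. */
void overlay_load(const struct overlay *tab, int id)
{
    if (current_overlay == id)
        return;                  /* already resident: no transfer needed */
    /* A real manager would start a DMA transfer here and let the core
     * keep executing resident code until the copy completes. */
    memcpy(run_space, tab[id].live, tab[id].size);
    current_overlay = id;
}
```

In a production overlay manager the copy would be a queued DMA descriptor, and the call into the overlaid function would block only if the transfer had not yet finished.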
EE Times-India | December 16-31, 2008 | www.eetindia.com
