The File - Nov 1, 2008 - (Page 9)
In Focus | Multicore/multiprocessor design

Optimise software for multi-core processors
continued from page

• ... for them to complete their work.
• Pipelining — The developer breaks a computation into steps, with a different thread handling each step. Different data sets occupy different stages of the pipeline at any given time.
• Peer — A thread creates other threads and then participates as a peer, sharing the work.

Consider an example of how worker threads can implement parallelism on a quad-core system. In code listing 1, a single-threaded function called fill_array() updates a large, two-dimensional array. The function simply iterates through the array, updating each element. Because the function updates the elements independently (the value of element N does not depend on element N-1), it is easily parallelised. To speed up fill_array(), the developer can create one worker thread per CPU and have each worker thread update a portion of the array. Code listing 2 shows how the developer can create these threads with the POSIX pthread_create() call. To specify a thread's entry point (where the thread starts running), the developer passes a function pointer to pthread_create(). In this case, each worker thread starts in the fill_array_fragment() function, shown in code listing 3. Each worker thread determines which portion of the array it should update, ensuring no overlap with other worker threads. All worker threads can then proceed in parallel, taking advantage of SMP's ability to schedule any thread on any available CPU core.

The main thread must wait for the array to be fully updated before performing additional operations. To ensure that the main thread waits, this example uses barrier synchronisation, pthread_barrier_wait(): any thread that has finished its work waits at the barrier until all other threads arrive. Now that the software uses a multi-threaded approach, it can scale with the number of CPUs.
The example uses a four-CPU system, but it is easy to adjust the number of worker threads to accommodate more or fewer processors. Read the full article to learn which visualisation tools you can use to optimise software, and about detecting and reducing resource contention, and more.

void multi_thread_fill_array()
{
    int thread, rc;
    pthread_t worker_threads[NUM_CPUS];
    int thread_index[NUM_CPUS];

    // Synchronise NUM_CPUS+1 threads: the main thread plus the workers
    pthread_barrier_init(&barrier, NULL, NUM_CPUS + 1);

    for (thread = 0; thread < NUM_CPUS; ++thread) {
        thread_index[thread] = thread;
        rc = pthread_create(&worker_threads[thread], NULL,
                            fill_array_fragment,
                            (void *)&thread_index[thread]);
        if (rc) {
            // handle error
        }
    }

    pthread_barrier_wait(&barrier);
    pthread_barrier_destroy(&barrier);
    return;
}

Code listing 2: The main thread creates the synchronisation object, then creates worker threads to update the array in parallel.

void *fill_array_fragment(void *arg)
{
    int *thread_index = (int *)arg;
    int col = 0;
    int start_row = 0;
    int end_row = 0;

    // Compute the range of rows this thread updates
    start_row = *thread_index * (NUM_ROWS / NUM_CPUS);
    end_row = start_row + (NUM_ROWS / NUM_CPUS) - 1;

    while (start_row <= end_row) {
        for (col = 0; col < NUM_COLUMNS; col++) {
            array[start_row][col] = ((start_row / 2 * col) / 3.2) + 1.0;
        }
        ++start_row;
    }

    // Wait at the barrier for all threads to complete
    pthread_barrier_wait(&barrier);
    return NULL;
}

Code listing 3: Each worker thread updates a portion of the array, then waits at the barrier.

Online
Apply parallel programming basics to multi-processor designs
Apply parallel programming basics to multi-processor designs, Part 2

Managing threads, communications in multi-core partitioning
continued from page
... the right software partitioning for their apps.
Think non-sequentially
Key to this effort, however, is writing the application code to be multi-threaded from the outset, and this remains a challenge for many developers. "It takes a long time for developers to get used to the consequences of threads interacting in multi-threaded design on a single processor," concurred Mark Zarins, vice president of products at GrammaTech. "The consequences of thread interaction are worse in multi-processing," he added.

This challenge has prompted at least one multi-core processor vendor to take an entirely different approach: Ambric has its customers develop software for the company's massively parallel processor architecture (MPPA) using a structural object programming model. Developers describe their applications in block-diagram form as sets of software objects that link together. The company's tools then map each software object to its own processor in an MPPA array of hundreds of processors connected through self-synchronising communications channels. Because the partitioning is inherent in the software's structure, no code needs to be written before partitioning.

Regardless of approach, developers must face the challenges of multi-core development sooner or later. "This is a natural progression," said Brehmer.

Online
Embedded multi-core enables greener networks
Commentary: Multicore needs communications by design

EE Times-India | November 1-15, 2008 | www.eetindia.com