{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T08:04:59Z","timestamp":1767859499988,"version":"3.49.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2011,10,1]],"date-time":"2011-10-01T00:00:00Z","timestamp":1317427200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Excellent Research Projects of National Taiwan University","award":["99R80304"],"award-info":[{"award-number":["99R80304"]}]},{"name":"Macronix International Co., LTD.","award":["99-S-C25"],"award-info":[{"award-number":["99-S-C25"]}]},{"name":"Etron Technology Inc.","award":["10R70152"],"award-info":[{"award-number":["10R70152"]}]},{"DOI":"10.13039\/501100001868","name":"National Science Council Taiwan","doi-asserted-by":"publisher","award":["NSC 100-2220-E-002-015, NSC 100-2219-E-002-030, NSC 100-2219-E-002-027"],"award-info":[{"award-number":["NSC 100-2220-E-002-015, NSC 100-2219-E-002-030, NSC 100-2219-E-002-027"]}],"id":[{"id":"10.13039\/501100001868","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2011,10]]},"abstract":"<jats:p>As technology continues to shrink, reducing leakage is critical to achieving energy efficiency. Previous studies on low-power GPUs (Graphics Processing Units) focused on techniques for dynamic power reduction, such as DVFS (Dynamic Voltage and Frequency Scaling) and clock gating. In this paper, we explore the potential of adopting architecture-level power gating techniques for leakage reduction on GPUs. We propose three strategies for applying power gating on different modules in GPUs. The Predictive Shader Shutdown technique exploits workload variation across frames to eliminate leakage in shader clusters. Deferred Geometry Pipeline seeks to minimize leakage in fixed-function geometry units by utilizing an imbalance between geometry and fragment computation across batches. Finally, the simple time-out power gating method is applied to nonshader execution units to exploit a finer granularity of the idle time. Our results indicate that Predictive Shader Shutdown eliminates up to 60% of the leakage in shader clusters, Deferred Geometry Pipeline removes up to 57% of the leakage in the fixed-function geometry units, and the simple time-out power gating mechanism eliminates 83.3% of the leakage in nonshader execution units on average. All three schemes incur negligible performance degradation, less than 1%.<\/jats:p>","DOI":"10.1145\/2019608.2019612","type":"journal-article","created":{"date-parts":[[2011,10,18]],"date-time":"2011-10-18T13:01:58Z","timestamp":1318942918000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":34,"title":["Power gating strategies on GPUs"],"prefix":"10.1145","volume":"8","author":[{"given":"Po-Han","family":"Wang","sequence":"first","affiliation":[{"name":"National Taiwan University, Taipei, Taiwan (R.O.C.)"}]},{"given":"Chia-Lin","family":"Yang","sequence":"additional","affiliation":[{"name":"National Taiwan University, Taipei, Taiwan (R.O.C.)"}]},{"given":"Yen-Ming","family":"Chen","sequence":"additional","affiliation":[{"name":"National Taiwan University, Taipei, Taiwan (R.O.C.)"}]},{"given":"Yu-Jung","family":"Cheng","sequence":"additional","affiliation":[{"name":"National Taiwan University, Taipei, Taiwan (R.O.C.)"}]}],"member":"320","published-online":{"date-parts":[[2011,10,18]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1201775.882348"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/1018419.1019622"},{"key":"e_1_2_1_3_1","unstructured":"Beyond3D. 2008. Ati rv635 chip details. http:\/\/www.beyond3d.com\/resources\/chip\/127.  Beyond3D. 2008. Ati rv635 chip details. http:\/\/www.beyond3d.com\/resources\/chip\/127."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.782564"},{"key":"e_1_2_1_5_1","unstructured":"Butler H. 2010. Nvidia geforce gtx 480 1 536mb review. http:\/\/www.bittech.net\/hardware\/2010\/03\/27\/nvidia-geforce-gtx-480-1-5gb-review\/10.  Butler H. 2010. Nvidia geforce gtx 480 1 536mb review. http:\/\/www.bittech.net\/hardware\/2010\/03\/27\/nvidia-geforce-gtx-480-1-5gb-review\/10."},{"key":"e_1_2_1_6_1","first-page":"473","article-title":"Graphics pipeline performance. In GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics, R. Fernando, Ed., Pearson Higher Education","volume":"28","author":"Cebenoyan C.","year":"2004","journal-title":"Chapter"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Chandrakasan A. Bowhill W. J. and Fox F. 2000. Design of High-Performance Microprocessor Circuits. Wiley-IEEE Press.   Chandrakasan A. Bowhill W. J. and Fox F. 2000. Design of High-Performance Microprocessor Circuits. Wiley-IEEE Press.","DOI":"10.1109\/9780470544365"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/285305.285318"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-007-0081-1"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/258694.258706"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2006.1620807"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Conference on Asia South Pacific Design Automation\/VLSI Design (ASPDAC'02)","author":"Duarte D."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 29th Annual International Symposium on Computer Architecture (ISCA'02)","author":"Flautner K"},{"key":"e_1_2_1_14_1","unstructured":"Goodhead P. 2010. Matrix hd 5870 power consumption and thermals. http:\/\/www.bittech.net\/hardware\/graphics\/2010\/07\/15\/asus-matrix-hd-5870-graphics-card-review\/7.  Goodhead P. 2010. Matrix hd 5870 power consumption and thermals. http:\/\/www.bittech.net\/hardware\/graphics\/2010\/07\/15\/asus-matrix-hd-5870-graphics-card-review\/7."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391659"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTAS.2008.33"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146909.1147063"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815998"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013249"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/285305.285321"},{"key":"e_1_2_1_21_1","unstructured":"ITRS. 2006. International technology roadmap for semiconductors.  ITRS. 2006. International technology roadmap for semiconductors."},{"key":"e_1_2_1_22_1","unstructured":"Iyer A. 2006. Demystify power gating and stop leakage cold. http:\/\/www.eetimes.com\/design\/automotive-design\/4012054\/Demystif'y-power-gating-and-stop-leakage-cold.  Iyer A. 2006. Demystify power gating and stop leakage cold. http:\/\/www.eetimes.com\/design\/automotive-design\/4012054\/Demystif'y-power-gating-and-stop-leakage-cold."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1011528.1011531"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2005.862716"},{"key":"e_1_2_1_25_1","unstructured":"Kanter D. 2008. Nvidia's gt200: Inside a parallel processor. http:\/\/www.realworldtech.com\/page.cfm? ArticleID=RWT090808195242.  Kanter D. 2008. Nvidia's gt200: Inside a parallel processor. http:\/\/www.realworldtech.com\/page.cfm? ArticleID=RWT090808195242."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.848210"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/379240.379268"},{"key":"e_1_2_1_28_1","unstructured":"Khronos. 2010. Opencl overview. http:\/\/www.khronos.org\/opencl\/.  Khronos. 2010. Opencl overview. http:\/\/www.khronos.org\/opencl\/."},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Lindholm E. and Oberman S. 2007. Nvidia geforce 8800 gpu. In Hot Chips 19: A Symposium on High Performance Chips.  Lindholm E. and Oberman S. 2007. Nvidia geforce 8800 gpu. In Hot Chips 19: A Symposium on High Performance Chips.","DOI":"10.1109\/HOTCHIPS.2007.7482490"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/54.914592"},{"key":"e_1_2_1_31_1","unstructured":"Mantor M. 2007. Radeon r600 a 2nd generation unified shader architecture. In Hot Chips 19: A Symposium on High Performance Chips.  Mantor M. 2007. Radeon r600 a 2nd generation unified shader architecture. In Hot Chips 19: A Symposium on High Performance Chips."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'06)","author":"Mochocki B."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146909.1147062"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 22nd ACM SIGGRAPH\/EUROGRAPHICS Symposium on Graphics hardware (GH'07)","author":"Nam B.-G."},{"key":"e_1_2_1_35_1","unstructured":"Nvidia. 2010. Cuda zone. http:\/\/www.nvidia.com\/object\/cuda..home..new.html.  Nvidia. 2010. Cuda zone. http:\/\/www.nvidia.com\/object\/cuda..home..new.html."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/344166.344526"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276489"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1058129.1058142"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2005.1430559"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2006.872869"},{"key":"e_1_2_1_41_1","unstructured":"Spille C. V\u00f6tter R. and Sauter M. 2010. Geforce gtx 480 and gtx 470 reviewed: Fermi performance benchmarks. http:\/\/www.pcgameshardware.comlaid. 7 43498\/Geforce-GTX-480-and -GTX -4 70-reviewed- Fermiperformance-benchmarkslReviews\/.  Spille C. V\u00f6tter R. and Sauter M. 2010. Geforce gtx 480 and gtx 470 reviewed: Fermi performance benchmarks. http:\/\/www.pcgameshardware.comlaid. 7 43498\/Geforce-GTX-480-and -GTX -4 70-reviewed- Fermiperformance-benchmarkslReviews\/."},{"key":"e_1_2_1_42_1","unstructured":"Voicu A. 2008. Ati rv770 - architecture overview. http:\/\/www.rage3d.comlreviews\/video\/atirv770\/architecture\/.  Voicu A. 2008. Ati rv770 - architecture overview. http:\/\/www.rage3d.comlreviews\/video\/atirv770\/architecture\/."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/L-CA.2009.1"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.22"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2019608.2019612","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2019608.2019612","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:07:42Z","timestamp":1750273662000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2019608.2019612"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,10]]}},"alternative-id":["10.1145\/2019608.2019612"],"URL":"https:\/\/doi.org\/10.1145\/2019608.2019612","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,10]]},"assertion":[{"value":"2009-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-10-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}