Graphics processing unit

From Wikipedia, the free encyclopedia
Components of a GPU

A graphics processing unit (GPU) is a specialized electronic circuit initially designed to accelerate computer graphics and image processing (either on a video card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles). After their initial design, GPUs were found to be useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. Other non-graphical uses include the training of neural networks and cryptocurrency mining.

History

1970s

Arcade system boards have used specialized graphics circuits since the 1970s. In early video game hardware, RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor.[1]

A specialized barrel shifter circuit helped the CPU animate the framebuffer graphics for various 1970s arcade video games from Midway and Taito, such as Gun Fight (1975), Sea Wolf (1976), and Space Invaders (1978).[2] The Namco Galaxian arcade system in 1979 used specialized graphics hardware that supported RGB color, multi-colored sprites, and tilemap backgrounds.[3] The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega, and Taito.[4]

Atari ANTIC microprocessor on an Atari 130XE motherboard

The Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor.[5] Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list": the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer).[clarification needed][6] 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction.[clarification needed][7] ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU.[8]

1980s

NEC μPD7220A

The NEC μPD7220 was the first implementation of a personal computer graphics display processor as a single large-scale integration (LSI) integrated circuit chip. This enabled the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology. It became the best-known GPU until the mid-1980s.[9] It was the first fully integrated VLSI (very large-scale integration) metal–oxide–semiconductor (NMOS) graphics display processor for PCs, supported up to 1024×1024 resolution, and laid the foundations for the emerging PC graphics market. It was used in a number of graphics cards and was licensed for clones such as the Intel 82720, the first of Intel's graphics processing units.[10] The Williams Electronics arcade games Robotron 2084, Joust, Sinistar, and Bubbles, all released in 1982, contain custom blitter chips for operating on 16-color bitmaps.[11][12]

In 1984, Hitachi released the ARTC HD63484, the first major CMOS graphics processor for personal computers. The ARTC could display up to 4K resolution when in monochrome mode. It was used in a number of graphics cards and terminals during the late 1980s.[13] In 1985, the Amiga was released with a custom graphics chip including a blitter for bitmap manipulation, line drawing, and area fill. It also included a coprocessor with its own simple instruction set that was capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter. In 1986, Texas Instruments released the TMS34010, the first fully programmable graphics processor.[14] It could run general-purpose code, but it had a graphics-oriented instruction set. During 1990–1992, this chip became the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards.

The IBM 8514 Micro Channel adapter, with memory add-on

In 1987, the IBM 8514 graphics system was released. It was one of the first video cards for IBM PC compatibles to implement fixed-function 2D primitives in electronic hardware. Sharp's X68000, released in 1987, used a custom graphics chipset[15] with a 65,536 color palette and hardware support for sprites, scrolling, and multiple playfields.[16] It served as a development machine for Capcom's CP System arcade board. Fujitsu's FM Towns computer, released in 1989, had support for a 16,777,216 color palette.[17] In 1988, the first dedicated polygonal 3D graphics boards were introduced in arcades with the Namco System 21[18] and Taito Air System.[19]

VGA section on the motherboard in IBM PS/55

IBM introduced its proprietary Video Graphics Array (VGA) display standard in 1987, with a maximum resolution of 640×480 pixels. In November 1988, NEC Home Electronics announced its creation of the Video Electronics Standards Association (VESA) to develop and promote a Super VGA (SVGA) computer display standard as a successor to VGA. Super VGA enabled graphics display resolutions up to 800×600 pixels, about 56% more pixels than VGA (480,000 versus 307,200).[20]

1990s

Tseng Labs ET4000/W32p
S3 Graphics ViRGE
Voodoo3 2000 AGP card

In 1991, S3 Graphics introduced the S3 86C911, which its designers named after the Porsche 911 as an indication of the performance increase it promised.[21] The 86C911 spawned a variety of imitators: by 1995, all major PC graphics chip makers had added 2D acceleration support to their chips.[22] Fixed-function Windows accelerators surpassed expensive general-purpose graphics coprocessors in Windows performance, and such coprocessors faded from the PC market.

Throughout the 1990s, 2D GUI acceleration evolved. As manufacturing capabilities improved, so did the level of integration of graphics chips. Additional application programming interfaces (APIs) arrived for a variety of tasks, such as Microsoft's WinG graphics library for Windows 3.x, and their later DirectDraw interface for hardware acceleration of 2D games in Windows 95 and later.

In the early and mid-1990s, real-time 3D graphics became increasingly common in arcade, computer, and console games, which led to increasing public demand for hardware-accelerated 3D graphics. Early examples of mass-market 3D graphics hardware can be found in arcade system boards such as the Sega Model 1, Namco System 22, and Sega Model 2, and the fifth-generation video game consoles such as the Saturn, PlayStation, and Nintendo 64. Arcade systems such as the Sega Model 2 and SGI Onyx-based Namco Magic Edge Hornet Simulator in 1993 were capable of hardware T&L (transform, clipping, and lighting) years before it appeared in consumer graphics cards.[23][24] Another early example is the Super FX chip, a RISC-based on-cartridge graphics chip used in some SNES games, notably Doom and Star Fox. Some systems used DSPs to accelerate transformations. Fujitsu, which worked on the Sega Model 2 arcade system,[25] began working on integrating T&L into a single LSI solution for use in home computers in 1995;[26] the Fujitsu Pinolite, the first 3D geometry processor for personal computers, was released in 1997.[27] The first hardware T&L GPU on home video game consoles was the Nintendo 64's Reality Coprocessor, released in 1996.[28] In 1997, Mitsubishi released the 3Dpro/2MP, a GPU capable of transformation and lighting, for workstations and Windows NT desktops;[29] ATi used it for its FireGL 4000 graphics card, released in 1997.[30]

The term "GPU" was coined by Sony in reference to the 32-bit Sony GPU (designed by Toshiba) in the PlayStation video game console, released in 1994.[31]

In the PC world, notable failed attempts for low-cost 3D graphics chips included the S3 ViRGE, ATI Rage, and Matrox Mystique. These chips were essentially previous-generation 2D accelerators with 3D features bolted on. Many were pin-compatible with the earlier-generation chips for ease of implementation and minimal cost. Initially, 3D graphics were possible only with discrete boards dedicated to accelerating 3D functions (and lacking 2D graphical user interface (GUI) acceleration entirely) such as the PowerVR and the 3dfx Voodoo. However, as manufacturing technology continued to progress, video, 2D GUI acceleration, and 3D functionality were all integrated into one chip. Rendition's Vérité chipsets were among the first to do this well. In 1997, Rendition collaborated with Hercules and Fujitsu on a "Thriller Conspiracy" project which combined a Fujitsu FXG-1 Pinolite geometry processor with a Vérité V2200 core to create a graphics card with a full T&L engine years before Nvidia's GeForce 256. This card, designed to reduce the load placed upon the system's CPU, never made it to market.[citation needed] Nvidia's RIVA 128 was one of the first consumer-facing GPUs to integrate a 3D processing unit and a 2D processing unit on a single chip.

OpenGL was introduced in the early 1990s by SGI as a professional graphics API, with proprietary hardware support for 3D rasterization. In 1994, Microsoft acquired Softimage, the dominant CGI movie production tool used for early CGI movie hits like Jurassic Park, Terminator 2 and Titanic. With that deal came a strategic relationship with SGI and a commercial license of SGI's OpenGL libraries, enabling Microsoft to port the API to the Windows NT OS but not to the upcoming release of Windows 95. Although it was little known at the time, SGI had contracted with Microsoft to transition from Unix to the forthcoming Windows NT OS; the deal, which was signed in 1995, was not announced publicly until 1998. In the intervening period, Microsoft worked closely with SGI to port OpenGL to Windows NT. In that era, OpenGL had no standard driver model that would let hardware accelerators compete on the basis of support for higher-level 3D texturing and lighting functionality.

In 1994, Microsoft announced DirectX 1.0 and support for gaming in the forthcoming Windows 95 consumer OS. In 1995, Microsoft announced the acquisition of UK-based Rendermorphics Ltd and the Direct3D driver model for the acceleration of consumer 3D graphics. The Direct3D driver model shipped with DirectX 2.0 in 1996. It included standards and specifications for 3D chip makers to compete to support 3D texture, lighting and Z-buffering. ATI, which was later to be acquired by AMD, began development on the first Direct3D GPUs. Nvidia quickly pivoted from a failed deal with Sega in 1996 to aggressively embracing support for Direct3D. In this era, Microsoft merged its internal Direct3D and OpenGL teams and worked closely with SGI to unify driver standards for both industrial and consumer 3D graphics hardware accelerators. Microsoft ran annual events for 3D chip makers called "Meltdowns" to test that their 3D hardware and drivers worked with both Direct3D and OpenGL.

It was during this period of strong Microsoft influence over 3D standards that 3D accelerator cards moved beyond being simple rasterizers to become more powerful general purpose processors, as support for hardware-accelerated texture mapping, lighting, Z-buffering and compute created the modern GPU. During this period, the same Microsoft team responsible for Direct3D and OpenGL driver standardization introduced their own Microsoft 3D chip design called Talisman. Details of this era are documented extensively in the books "Game of X" v.1 and v.2 by Rusel DeMaria, "Renegades of the Empire" by Mike Drummond, "Opening the Xbox" by Dean Takahashi, and "Masters of Doom" by David Kushner. The Nvidia GeForce 256 (also known as NV10) was the first consumer-level card with hardware-accelerated T&L. While the OpenGL API provided software support for texture mapping and lighting, the first 3D hardware acceleration for these features arrived with the first Direct3D-accelerated consumer GPUs.

2000s

Nvidia was first to produce a chip capable of programmable shading: the GeForce 3. Each pixel could now be processed by a short program that could include additional image textures as inputs, and each geometric vertex could likewise be processed by a short program before it was projected onto the screen. Used in the Xbox console, this chip competed with the one in the PlayStation 2, which used a custom vector unit for hardware-accelerated vertex processing (commonly referred to as VU0/VU1). The earliest incarnations of shader execution engines used in the Xbox were not general purpose and could not execute arbitrary pixel code. Vertices and pixels were processed by different units which had their own resources, with pixel shaders having tighter constraints (because they execute at higher frequencies than vertices). Pixel shading engines were actually more akin to a highly customizable function block and did not really "run" a program. Many of these disparities between vertex and pixel shading were not addressed until the Unified Shader Model.

In October 2002, with the introduction of the ATI Radeon 9700 (also known as R300), the world's first Direct3D 9.0 accelerator, pixel and vertex shaders could implement looping and lengthy floating point math, and were quickly becoming as flexible as CPUs, yet orders of magnitude faster for image-array operations. Pixel shading is often used for bump mapping, which adds texture to make an object look shiny, dull, rough, or even round or extruded.[32]

With the introduction of the Nvidia GeForce 8 series and new generic stream processing units, GPUs became more generalized computing devices. Parallel GPUs are making computational inroads against the CPU, and a subfield of research, dubbed GPU computing or GPGPU for general purpose computing on GPU, has found applications in fields as diverse as machine learning,[33] oil exploration, scientific image processing, linear algebra,[34] statistics,[35] 3D reconstruction, and stock options pricing. GPGPU was the precursor to what is now called a compute shader (e.g. CUDA, OpenCL, DirectCompute) and actually abused the hardware to a degree by treating the data passed to algorithms as texture maps and executing algorithms by drawing a triangle or quad with an appropriate pixel shader.[clarification needed] This entails some overheads since units like the scan converter are involved where they are not needed (nor are triangle manipulations even a concern, except to invoke the pixel shader).[clarification needed]

Nvidia's CUDA platform, first introduced in 2007,[36] was the earliest widely adopted programming model for GPU computing. OpenCL is an open standard defined by the Khronos Group that allows for the development of code for both GPUs and CPUs with an emphasis on portability.[37] OpenCL solutions are supported by Intel, AMD, Nvidia, and ARM, and according to a report in 2011 by Evans Data, OpenCL had become the second most popular HPC tool.[38]

2010s

In 2010, Nvidia partnered with Audi to power their cars' dashboards, using the Tegra GPU to provide increased functionality to cars' navigation and entertainment systems.[39] Advances in GPU technology in cars helped advance self-driving technology.[40] AMD's Radeon HD 6000 series cards were released in 2010, and in 2011 AMD released its 6000M Series discrete GPUs for mobile devices.[41] The Kepler line of graphics cards by Nvidia was released in 2012 and was used in Nvidia's 600 and 700 series cards. One feature of this GPU microarchitecture was GPU Boost, a technology that raises or lowers the clock speed of a video card according to its power draw.[42] The Kepler microarchitecture was manufactured on the 28 nm process.[clarification needed]

The PS4 and Xbox One were released in 2013; they both use GPUs based on AMD's Radeon HD 7850 and 7790.[43] Nvidia's Kepler line of GPUs was followed by the Maxwell line, manufactured on the same process. Nvidia's 28 nm chips were manufactured by TSMC in Taiwan. Compared to the earlier 40 nm technology, this manufacturing process allowed a 20 percent boost in performance while drawing less power.[44][45] Virtual reality headsets have high system requirements; manufacturers recommended the GTX 970 and the R9 290X or better at the time of their release.[46][47] Cards based on the Pascal microarchitecture were released in 2016. The GeForce 10 series of cards belongs to this generation. They are made using the 16 nm manufacturing process, which improves upon previous microarchitectures.[48] Nvidia released one non-consumer card under the new Volta architecture, the Titan V. Changes from the Titan XP, Pascal's high-end card, include an increase in the number of CUDA cores, the addition of tensor cores, and HBM2. Tensor cores are designed for deep learning, while high-bandwidth memory is on-die, stacked, lower-clocked memory that offers an extremely wide memory bus. To emphasize that the Titan V is not a gaming card, Nvidia removed the "GeForce GTX" prefix it adds to consumer gaming cards.

In 2018, Nvidia launched the RTX 20 series GPUs that added ray-tracing cores to GPUs, improving their performance on lighting effects.[49] Polaris 11 and Polaris 10 GPUs from AMD are fabricated by a 14 nm process. Their release resulted in a substantial increase in the performance per watt of AMD video cards.[50] AMD also released the Vega GPU series for the high-end market as a competitor to Nvidia's high-end Pascal cards, also featuring HBM2 like the Titan V.

In 2019, AMD released the successor to their Graphics Core Next (GCN) microarchitecture/instruction set. Dubbed RDNA, the first product featuring it was the Radeon RX 5000 series of video cards.[51]

The company announced that the successor to the RDNA microarchitecture would be incremental (a refresh). AMD unveiled the Radeon RX 6000 series, its RDNA 2 graphics cards with support for hardware-accelerated ray tracing.[52] The product series, launched in late 2020, consisted of the RX 6800, RX 6800 XT, and RX 6900 XT.[53][54] The RX 6700 XT, which is based on Navi 22, was launched in early 2021.[55]

The PlayStation 5 and Xbox Series X and Series S were released in 2020; they both use GPUs based on the RDNA 2 microarchitecture with incremental improvements and different GPU configurations in each system's implementation.[56][57][58]

Intel first entered the GPU market in the late 1990s, but produced lackluster 3D accelerators compared to the competition at the time. Rather than attempting to compete with the high-end manufacturers Nvidia and ATI/AMD, they began integrating Intel Graphics Technology GPUs into motherboard chipsets, beginning with the Intel 810 for the Pentium III, and later into CPUs. They began with the Intel Atom 'Pineview' laptop processor in 2009, continuing in 2010 with desktop processors in the first generation of the Intel Core line and with contemporary Pentiums and Celerons. This resulted in a large nominal market share, as the majority of computers with an Intel CPU also featured this embedded graphics processor. These generally lagged behind discrete processors in performance. Intel re-entered the discrete GPU market in 2022 with its Arc series, which competed with the then-current GeForce 30 series and Radeon 6000 series cards at competitive prices.[citation needed]

2020s

In the 2020s, GPUs have been increasingly used for calculations involving embarrassingly parallel problems, such as training of neural networks on the enormous datasets that are needed for large language models. Specialized processing cores on some modern workstation GPUs are dedicated to deep learning: they deliver significant FLOPS increases by performing 4×4 matrix multiply-accumulate operations in hardware, resulting in performance of up to 128 TFLOPS in some applications.[59] These tensor cores are expected to appear in consumer cards as well.[needs update][60]
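The 4×4 matrix operation attributed to tensor cores above is commonly described as a fused multiply-add on small tiles, D = A×B + C. The sketch below only illustrates that arithmetic in plain Python; the function name `tile_fma` and the sample tiles are made up for illustration, and real hardware works on reduced-precision inputs across many tiles in parallel.

```python
# Illustrative sketch of a tile-level fused multiply-add, D = A x B + C.
# Assumption: 4x4 tiles stored as lists of rows; tile_fma is a hypothetical name.

def tile_fma(a, b, c):
    """Return a*b + c for square tiles given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j] for j in range(n)]
        for i in range(n)
    ]

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]

# identity x ones + ones gives a tile filled with 2.0
d = tile_fma(identity, ones, ones)
```

Each of the 16 outputs needs four multiply-add pairs, so one 4×4 tile operation amounts to 64 multiply-adds. Performing this in a single hardware step is what raises throughput on matrix-heavy workloads.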

GPU companies

Many companies have produced GPUs under a number of brand names. In 2009,[needs update] Intel, Nvidia, and AMD/ATI were the market share leaders, with 49.4%, 27.8%, and 20.6% market share respectively. In addition, Matrox[61] produces GPUs. Modern smartphones use mostly Adreno GPUs from Qualcomm, PowerVR GPUs from Imagination Technologies, and Mali GPUs from ARM.

Computational functions

Modern GPUs have traditionally used most of their transistors to do calculations related to 3D computer graphics. In addition to the 3D hardware, today's GPUs include basic 2D acceleration and framebuffer capabilities (usually with a VGA compatibility mode). Newer cards such as AMD/ATI HD5000–HD7000 lack dedicated 2D acceleration; it is emulated by 3D hardware. GPUs were initially used to accelerate the memory-intensive work of texture mapping and rendering polygons. Later, units[clarification needed] were added to accelerate geometric calculations such as the rotation and translation of vertices into different coordinate systems. Recent developments in GPUs include support for programmable shaders which can manipulate vertices and textures with many of the same operations that are supported by CPUs, oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces.

Several factors of GPU construction affect the performance of the card for real-time rendering, such as the size of the connector pathways in the semiconductor device fabrication, the clock signal frequency, and the number and size of various on-chip memory caches. Performance is also affected by the number of streaming multiprocessors (SM) for Nvidia GPUs, compute units (CU) for AMD GPUs, or Xe cores for Intel discrete GPUs. These terms describe the on-silicon processor units within the GPU chip that perform the core calculations, typically working in parallel with the other SMs or CUs on the GPU. GPU performance is typically measured in floating point operations per second (FLOPS); GPUs in the 2010s and 2020s typically deliver performance measured in teraflops (TFLOPS). This is an estimated performance measure, as other factors can affect the actual display rate.[62]
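A rough way to see where a TFLOPS figure comes from is to multiply the number of shader cores by the clock frequency and by the floating-point operations each core can issue per cycle. The sketch below assumes two operations per cycle (one fused multiply-add), which is a common convention rather than something stated in this article; the function name and the example configuration are hypothetical.

```python
# Back-of-the-envelope peak throughput estimate.
# Assumptions: each shader core issues one fused multiply-add per cycle,
# counted as 2 floating-point operations; the sample GPU is made up.

def peak_tflops(shader_cores, clock_ghz, flops_per_core_per_cycle=2):
    """Theoretical peak in TFLOPS: cores x GHz x ops per cycle / 1000."""
    return shader_cores * clock_ghz * flops_per_core_per_cycle / 1000.0

# Hypothetical GPU: 2560 shader cores at 1.5 GHz
estimate = peak_tflops(2560, 1.5)
print(f"{estimate:.2f} TFLOPS")
```

Like the measure described above, this is a theoretical ceiling. Memory bandwidth, cache behavior, and how well the workload fills the cores decide how much of it is actually reached.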

GPU accelerated video decoding and encoding

The ATI HD5470 GPU (above, with copper heatpipe attached) features UVD 2.1 which enables it to decode AVC and VC-1 video formats.

Most GPUs made since 1995 support the YUV color space and hardware overlays, important for digital video playback, and many GPUs made since 2000 also support MPEG primitives such as motion compensation and iDCT. This hardware-accelerated video decoding, in which portions of the video decoding process and video post-processing are offloaded to the GPU hardware, is commonly referred to as "GPU accelerated video decoding", "GPU assisted video decoding", "GPU hardware accelerated video decoding", or "GPU hardware assisted video decoding".
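To make the YUV color-space support mentioned above more concrete, the following sketch converts one pixel from YCbCr (a digital form of YUV) to RGB on the CPU. It assumes the full-range BT.601 coefficients used by JFIF; other video standards use different constants and value ranges, and the function names are illustrative only. A GPU applies the same per-pixel arithmetic in hardware during playback.

```python
# Per-pixel YCbCr -> RGB conversion.
# Assumption: full-range BT.601 (JFIF) coefficients; names are hypothetical.

def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit YCbCr sample to an 8-bit (R, G, B) tuple."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

# Neutral chroma (Cb = Cr = 128) leaves a pure gray level
mid_gray = ycbcr_to_rgb(128, 128, 128)
```

Because every pixel is converted independently, this is the kind of work that maps well onto parallel graphics hardware.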

Recent graphics cards decode high-definition video on the card, offloading the central processing unit. The most common APIs for GPU accelerated video decoding are DxVA for Microsoft Windows operating systems and VDPAU, VAAPI, XvMC, and XvBA for Linux-based and UNIX-like operating systems. All except XvMC are capable of decoding videos encoded with MPEG-1, MPEG-2, MPEG-4 ASP (MPEG-4 Part 2), MPEG-4 AVC (H.264 / DivX 6), VC-1, WMV3/WMV9, Xvid / OpenDivX (DivX 4), and DivX 5 codecs, while XvMC is only capable of decoding MPEG-1 and MPEG-2.

There are several dedicated hardware video decoding and encoding solutions.

Video decoding processes that can be accelerated

Video decoding processes that can be accelerated by modern GPU hardware are:

These operations also have applications in video editing, encoding, and transcoding.

2D graphics APIs

An earlier GPU may support one or more 2D graphics APIs for 2D acceleration, such as GDI and DirectDraw.[63]

3D graphics APIs

A GPU can support one or more 3D graphics APIs, such as DirectX, Metal, OpenGL, OpenGL ES, and Vulkan.

GPU forms

Terminology

In the 1970s, the term "GPU" originally stood for graphics processor unit and described a programmable processing unit working independently from the CPU that was responsible for graphics manipulation and output.[64][65] In 1994, Sony used the term (now standing for graphics processing unit) in reference to the PlayStation console's Toshiba-designed Sony GPU.[31] The term was popularized by Nvidia in 1999, who marketed the GeForce 256 as "the world's first GPU".[66] It was presented as a "single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines".[67] Rival ATI Technologies coined the term "visual processing unit" or VPU with the release of the Radeon 9700 in 2002.[68] The AMD Alveo MA35D, released in 2023, features dual VPUs, each using the 5 nm process.[69]

In personal computers, there are two main forms of GPUs. Each has many synonyms:[70]

Usage-specific GPU

Most GPUs are designed for a specific use, real-time 3D graphics, or other mass calculations:

  1. Gaming
  2. Cloud Gaming
  3. Workstation
  4. Cloud Workstation
  5. Artificial Intelligence training and Cloud
  6. Automated/Driverless car

Dedicated graphics processing unit

Dedicated graphics processing units use RAM that is dedicated to the GPU rather than relying on the computer's main system memory. This RAM is usually specially selected for the expected serial workload of the graphics card (see GDDR). Sometimes systems with dedicated discrete GPUs were called "DIS" systems as opposed to "UMA" systems (see next section).[71]

Dedicated GPUs are not necessarily removable, nor do they necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.

Graphics cards with dedicated GPUs typically interface with the motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP). They can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot is not available.

Technologies such as Scan-Line Interleave by 3dfx, SLI and NVLink by Nvidia and CrossFire by AMD allow multiple GPUs to draw images simultaneously for a single screen, increasing the processing power available for graphics. These technologies, however, are increasingly uncommon; most games do not fully use multiple GPUs, as most users cannot afford them.[72][73][74] Multiple GPUs are still used on supercomputers (as in Summit), on workstations to accelerate video (processing multiple videos at once)[75][76][77] and 3D rendering,[78] for VFX,[79] GPGPU workloads and for simulations,[80] and in AI to expedite training, as is the case with Nvidia's lineup of DGX workstations and servers, Tesla GPUs, and Intel's Ponte Vecchio GPUs.

Integrated graphics processing unit

The position of an integrated GPU in a northbridge/southbridge system layout
An ASRock motherboard with integrated graphics, which has HDMI, VGA and DVI-out ports

Integrated graphics processing units (IGPU), integrated graphics, shared graphics solutions, integrated graphics processors (IGP), or unified memory architectures (UMA) use a portion of a computer's system RAM rather than dedicated graphics memory. IGPs can be integrated onto a motherboard as part of its northbridge chipset,[81] or on the same die as the CPU (like AMD APU or Intel HD Graphics). On certain motherboards,[82] AMD's IGPs can use dedicated sideport memory: a separate fixed block of high performance memory that is dedicated for use by the GPU. As of early 2007, computers with integrated graphics accounted for about 90% of all PC shipments.[83][needs update] They are less costly to implement than dedicated graphics processing, but tend to be less capable. Historically, integrated processing was considered unfit for 3D games or graphically intensive programs but could run less intensive programs such as Adobe Flash. Examples of such IGPs would be offerings from SiS and VIA circa 2004.[84] However, modern integrated graphics processors such as AMD Accelerated Processing Unit and Intel Graphics Technology (HD, UHD, Iris, Iris Pro, Iris Plus, and Xe-LP) can handle 2D graphics or low-stress 3D graphics.

Since GPU computations are memory-intensive, integrated processing may compete with the CPU for relatively slow system RAM, as it has minimal or no dedicated video memory. IGPs use system memory with bandwidth up to a current maximum of 128 GB/s, whereas a discrete graphics card may have a bandwidth of more than 1000 GB/s between its VRAM and GPU core. This memory bus bandwidth can limit the performance of the GPU, though multi-channel memory can mitigate this deficiency.[85] Older integrated graphics chipsets lacked hardware transform and lighting, but newer ones include it.[86][87]
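The gap between system RAM and dedicated video memory described above follows from simple arithmetic: peak bandwidth is the bus width in bytes multiplied by the number of transfers per second. The sketch below applies that formula to two hypothetical configurations chosen only for illustration (dual-channel DDR4-3200 and a 256-bit GDDR6 bus at 14 GT/s); neither is taken from this article.

```python
# Peak memory bandwidth = (bus width in bytes) x (transfers per second).
# Assumption: both example configurations are hypothetical, for illustration.

def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bus_width_bits / 8 * transfers_per_second / 1e9

# Two 64-bit channels of DDR4-3200 (3.2e9 transfers per second)
system_ram = peak_bandwidth_gb_s(2 * 64, 3.2e9)

# A 256-bit GDDR6 bus at 14e9 transfers per second
video_ram = peak_bandwidth_gb_s(256, 14e9)

print(f"system RAM: {system_ram:.1f} GB/s, VRAM: {video_ram:.1f} GB/s")
```

The same formula shows why multi-channel memory helps an integrated GPU: doubling the number of channels doubles the effective bus width and therefore the peak bandwidth.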

On systems with "Unified Memory Architecture" (UMA), including modern AMD processors with integrated graphics,[88] modern Intel processors with integrated graphics,[89] Apple processors, the PS5 and Xbox Series (among others), the CPU cores and the GPU block share the same pool of RAM and memory address space. This allows the system to dynamically allocate memory between the CPU cores and the GPU block based on memory needs (without needing a large static split of the RAM). Thanks to zero-copy transfers, it also removes the need either to copy data over a bus between physically separate RAM pools or to copy between separate address spaces on a single physical pool of RAM, allowing more efficient transfer of data.

Hybrid graphics processing

Hybrid GPUs compete with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and Nvidia's TurboCache.

Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. They share memory with the system and have a small dedicated memory cache, to make up for the high latency of the system RAM. Technologies within PCI Express make this possible. While these solutions are sometimes advertised as having as much as 768 MB of RAM, this refers to how much can be shared with the system memory.

Stream processing and general purpose GPUs (GPGPU)

It is common to use a general purpose graphics processing unit (GPGPU) as a modified form of stream processor (or a vector processor), running compute kernels. This turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete (see "Dedicated graphics processing unit" above) GPU designers, AMD and Nvidia, are pursuing this approach with an array of applications. Both Nvidia and AMD teamed with Stanford University to create a GPU-based client for the Folding@home distributed computing project for protein folding calculations. In certain circumstances, the GPU calculates forty times faster than the CPUs traditionally used by such applications.[90][91]

GPGPUs can be used for many types of embarrassingly parallel tasks including ray tracing. They are generally suited to high-throughput computations that exhibit data-parallelism to exploit the wide vector width SIMD architecture of the GPU.

GPU-based high performance computers play a significant role in large-scale modelling. Three of the ten most powerful supercomputers in the world take advantage of GPU acceleration.[92]

GPUs support API extensions to the C programming language such as OpenCL and OpenMP. Furthermore, each GPU vendor introduced its own API which only works with their cards: AMD APP SDK from AMD, and CUDA from Nvidia. These allow functions called compute kernels to run on the GPU's stream processors. This makes it possible for C programs to take advantage of a GPU's ability to operate on large buffers in parallel, while still using the CPU when appropriate. CUDA was the first API to allow CPU-based applications to directly access the resources of a GPU for more general purpose computing without the limitations of using a graphics API.[citation needed]
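The compute-kernel model described above can be pictured as one small function that is run once per element of a large buffer. The sketch below emulates that model sequentially on the CPU in plain Python; it is not CUDA or OpenCL code, and the names `saxpy_kernel` and `launch` are hypothetical. On a GPU, the loop inside `launch` would be replaced by many hardware threads, each receiving its own index.

```python
# CPU-side emulation of the compute-kernel programming model.
# Assumption: saxpy_kernel and launch are illustrative names, not a real API.

def saxpy_kernel(i, a, x, y, out):
    """Work done by one 'thread': out[i] = a * x[i] + y[i]."""
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    """Run the kernel for indices 0..n-1 (sequential here, parallel on a GPU)."""
    for i in range(n):
        kernel(i, *args)

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
```

Because each invocation touches only its own index, the invocations are independent of one another, which is what lets the hardware run them in any order or all at once.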

Since 2005 there has been interest in using the performance offered by GPUs for evolutionary computation in general, and for accelerating the fitness evaluation in genetic programming in particular. Most approaches compile linear or tree programs on the host PC and transfer the executable to the GPU to be run. Typically a performance advantage is only obtained by running the single active program simultaneously on many example problems in parallel, using the GPU's SIMD architecture.[93] However, substantial acceleration can also be obtained by not compiling the programs, and instead transferring them to the GPU, to be interpreted there.[94] Acceleration can then be obtained by either interpreting multiple programs simultaneously, simultaneously running multiple example problems, or combinations of both. A modern GPU can simultaneously interpret hundreds of thousands of very small programs.
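The interpreted approach can be sketched in a few lines of Python. This is a minimal CPU-side illustration, not the method of the cited work: the reverse Polish program encoding, the three-operator instruction set, and the names `interpret` and `fitness` are all assumptions made for the example. Each opcode is applied to every example problem (fitness case) at once, which stands in for the SIMD lanes of a GPU.

```python
# Sketch of interpreting genetic-programming individuals instead of compiling
# them. Programs are lists in reverse Polish notation; each opcode is applied
# to all fitness cases at once, mimicking SIMD lanes. Illustrative only.

def interpret(program, xs):
    """Evaluate one RPN program on every fitness case in xs."""
    stack = []
    for token in program:
        if token == "x":
            stack.append(list(xs))                    # load the input variable
        elif isinstance(token, (int, float)):
            stack.append([float(token)] * len(xs))    # broadcast a constant
        else:
            b = stack.pop()
            a = stack.pop()
            if token == "+":
                stack.append([p + q for p, q in zip(a, b)])
            elif token == "-":
                stack.append([p - q for p, q in zip(a, b)])
            elif token == "*":
                stack.append([p * q for p, q in zip(a, b)])
            else:
                raise ValueError(f"unknown opcode: {token!r}")
    return stack.pop()


def fitness(program, xs, targets):
    """Sum of absolute errors over all fitness cases (lower is better)."""
    outputs = interpret(program, xs)
    return sum(abs(o - t) for o, t in zip(outputs, targets))


xs = [0.0, 1.0, 2.0, 3.0]
targets = [x * x + 1.0 for x in xs]      # hypothetical target: x^2 + 1

population = [
    ["x", "x", "*", 1, "+"],             # x*x + 1
    ["x", 2, "*"],                       # 2*x
    ["x", 1, "+"],                       # x + 1
]

# On a GPU, the programs and the fitness cases could both be spread across
# threads; here the outer loop is sequential.
scores = [fitness(p, xs, targets) for p in population]
print(scores)
```

No compilation step is needed when the population changes: a new individual is just a new list of tokens handed to the same interpreter.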

External GPU (eGPU)

An external GPU is a graphics processor located outside of the housing of the computer, similar to a large external hard drive. External graphics processors are sometimes used with laptop computers. Laptops might have a substantial amount of RAM and a sufficiently powerful central processing unit (CPU), but often lack a powerful graphics processor, and instead have a less powerful but more energy-efficient on-board graphics chip. On-board graphics chips are often not powerful enough for playing video games, or for other graphically intensive tasks, such as editing video or 3D animation/rendering.

Therefore, it is desirable to attach a GPU to some external bus of a notebook. PCI Express is the only bus used for this purpose. The port may be, for example, an ExpressCard or mPCIe port (PCIe ×1, up to 5 or 2.5 Gbit/s respectively), a Thunderbolt 1, 2, or 3 port (PCIe ×4, up to 10, 20, or 40 Gbit/s respectively), or an OCuLink port. Those ports are only available on certain notebook systems.[95][96] eGPU enclosures include their own power supply (PSU), because powerful GPUs can consume hundreds of watts.[97]
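To put the link rates quoted above in perspective, the following Python sketch estimates how long a fixed payload would take to cross each port at its nominal rate. The 1 GB payload is a hypothetical figure, and the calculation ignores encoding and protocol overhead, so real transfers would be slower.

```python
# Nominal link rates from the text, in Gbit/s. Encoding and protocol overhead
# are ignored, so these are best-case figures.
LINK_GBITS = {
    "mPCIe (PCIe x1)": 2.5,
    "ExpressCard (PCIe x1)": 5.0,
    "Thunderbolt 1": 10.0,
    "Thunderbolt 2": 20.0,
    "Thunderbolt 3": 40.0,
}


def transfer_seconds(size_bytes, gbit_per_s):
    """Best-case time to move size_bytes over a link of the given rate."""
    return size_bytes * 8 / (gbit_per_s * 1e9)


payload = 1_000_000_000  # hypothetical 1 GB of texture or buffer data

for name, rate in LINK_GBITS.items():
    print(f"{name}: {transfer_seconds(payload, rate):.2f} s")
```

Under these assumptions the slowest and fastest ports listed differ by a factor of sixteen for the same payload.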

Official vendor support for external GPUs has gained traction. A milestone was Apple's decision to support external GPUs with macOS High Sierra 10.13.4.[98] Several major hardware vendors (HP, Razer) released Thunderbolt 3 eGPU enclosures.[99][100] This support fuels eGPU implementations by enthusiasts.[clarification needed][101]

Energy efficiency

Graphics processing units (GPUs) have continued to increase in energy usage, while CPU designers have recently focused on improving performance per watt. High performance GPUs may draw a large amount of power, so intelligent techniques are required to manage GPU power consumption. Measures like 3DMark2006 score per watt can help identify more efficient GPUs.[102] However, that may not adequately incorporate efficiency in typical use, where much time is spent doing less demanding tasks.[103]

With modern GPUs, energy usage is an important constraint on the maximum computational capabilities that can be achieved. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design.

Since GPUs may also be used for some general purpose computation, sometimes their performance is measured in terms also applied to CPUs, such as FLOPS per watt.
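The relationship between performance per watt and peak performance under a fixed power budget can be made concrete with a small calculation. The two designs and all figures below are hypothetical, and the sketch assumes ideal linear scaling when more chips are added, which real systems only approximate.

```python
def perf_per_watt(gflops, watts):
    """Efficiency of a single chip in GFLOPS per watt."""
    return gflops / watts


def peak_under_power_cap(gflops_per_watt, power_cap_watts):
    """Best-case system throughput if the whole power budget is spent on
    chips of the given efficiency (assumes ideal linear scaling)."""
    return gflops_per_watt * power_cap_watts


# Hypothetical designs: (per-chip GFLOPS, per-chip watts).
designs = {
    "design A": (8000.0, 250.0),
    "design B": (6000.0, 150.0),
}
power_cap = 300.0  # hypothetical power and cooling budget in watts

results = {}
for name, (gflops, watts) in designs.items():
    efficiency = perf_per_watt(gflops, watts)
    results[name] = peak_under_power_cap(efficiency, power_cap)
    print(f"{name}: {efficiency:.1f} GFLOPS/W, "
          f"{results[name]:.0f} GFLOPS at {power_cap:.0f} W")
```

In this made-up example, design B is slower per chip but more efficient, so it reaches the higher system throughput once the power budget is the binding constraint.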

Sales

In 2013, 438.3 million GPUs were shipped globally and the forecast for 2014 was 414.2 million. However, by the third quarter of 2022, shipments of integrated GPUs totaled around 75.5 million units, down 19% year-over-year.[104][needs update][105]

See also

Hardware

APIs

Applications

References

  1. ^ Hague, James (September 10, 2013). "Why Do Dedicated Game Consoles Exist?". Programming in the 21st Century. Archived from the original on May 4, 2015. Retrieved November 11, 2015.
  2. ^ "mame/8080bw.c at master · mamedev/mame · GitHub". GitHub. Archived from the original on 2014-11-21.
  3. ^ "mame/galaxian.c at master · mamedev/mame · GitHub". GitHub. Archived from the original on 2014-11-21.
  4. ^ "mame/galaxian.c at master · mamedev/mame · GitHub". GitHub. Archived from the original on 2014-11-21.
  5. ^ Springmann, Alessondra. "Atari 2600 Teardown: What's Inside Your Old Console?". The Washington Post. Archived from the original on July 14, 2015. Retrieved July 14, 2015.
  6. ^ "What are the 6502, ANTIC, CTIA/GTIA, POKEY, and FREDDIE chips?". Atari8. Archived from the original on 2016-03-05.
  7. ^ Wiegers, Karl E. (April 1984). "Atari Display List Interrupts". Compute! (47): 161. Archived from the original on 2016-03-04.
  8. ^ Wiegers, Karl E. (December 1985). "Atari Fine Scrolling". Compute! (67): 110. Archived from the original on 2006-02-16.
  9. ^ Hopgood, F. Robert A.; Hubbold, Roger J.; Duce, David A., eds. (1986). Advances in Computer Graphics II. Springer. p. 169. ISBN 9783540169109. Perhaps the best known one is the NEC 7220.
  10. ^ Anderson, Marian (2018-07-18). "Famous Graphics Chips: NEC μPD7220 Graphics Display Controller". IEEE Computer Society. Retrieved 2023-10-17.
  11. ^ Riddle, Sean. "Blitter Information". Archived from the original on 2015-12-22.
  12. ^ Wolf, Mark J. P. (June 2012). Before the Crash: Early Video Game History. Wayne State University Press. p. 185. ISBN 978-0814337226.
  13. ^ Anderson, Marian (2018-10-07). "GPU History: Hitachi ARTC HD63484". IEEE Computer Society. Retrieved 2023-10-17.
  14. ^ "Famous Graphics Chips: TI TMS34010 and VRAM. The first programmable graphics processor chip | IEEE Computer Society". 10 January 2019.
  15. ^ "X68000". Archived from the original on 2014-09-03. Retrieved 2014-09-12.
  16. ^ "museum ~ Sharp X68000". Old-computers. Archived from the original on 2015-02-19. Retrieved 2015-01-28.
  17. ^ "Hardcore Gaming 101: Retro Japanese Computers: Gaming's Final Frontier". hardcoregaming101.net. Archived from the original on 2011-01-13.
  18. ^ "System 16 – Namco System 21 Hardware (Namco)". system16. Archived from the original on 2015-05-18.
  19. ^ "System 16 – Taito Air System Hardware (Taito)". system16. Archived from the original on 2015-03-16.
  20. ^ Brownstein, Mark (November 14, 1988). "NEC Forms Video Standards Group". InfoWorld. Vol. 10, no. 46. p. 3. ISSN 0199-6649. Retrieved May 27, 2016.
  21. ^ "S3 Video Boards". InfoWorld. 14 (20): 62. May 18, 1992. Archived from the original on November 22, 2017. Retrieved July 13, 2015.
  22. ^ "What the numbers mean". PC Magazine. 12: 128. 23 February 1993. Archived from the original on 11 April 2017. Retrieved 29 March 2016.
  23. ^ "System 16 – Namco Magic Edge Hornet Simulator Hardware (Namco)". system16. Archived from the original on 2014-09-12.
  24. ^ "MAME – src/mame/video/model2.c". Archived from the original on 4 January 2013.
  25. ^ "System 16 – Sega Model 2 Hardware (Sega)". system16. Archived from the original on 2010-12-21.
  26. ^ "3D Graphics Processor Chip Set" (PDF). Archived from the original (PDF) on 2016-10-11. Retrieved 2016-08-08.
  27. ^ "Fujitsu Develops World's First Three Dimensional Geometry Processor". fujitsu. Archived from the original on 2014-09-12.
  28. ^ "The Nintendo 64 is one of the greatest gaming devices of all time". xenol. Archived from the original on 2015-11-18.
  29. ^ "Mitsubishi's 3DPro/2mp Chipset Sets New Records for Fastest 3D Graphics Accelerator for Windows NT Systems; 3DPro/2mp grabs Viewperf performance lead; other high-end benchmark tests clearly show that 3DPro's performance outdistances all Windows NT competitors". Archived from the original on 2018-11-15. Retrieved 2022-02-18.
  30. ^ Vlask. "VGA Legacy MKIII – Diamond Fire GL 4000 (Mitsubishi 3DPro/2mp)". Archived from the original on 2015-11-18.
  31. ^ a b "Is it Time to Rename the GPU? | IEEE Computer Society". 17 July 2018.
  32. ^ Dreijer, Søren. "Bump Mapping Using CG (3rd Edition)". Archived from the original on 2010-01-20. Retrieved 2007-05-30.
  33. ^ Raina, Rajat; Madhavan, Anand; Ng, Andrew Y. (2009-06-14). "Large-scale deep unsupervised learning using graphics processors". Proceedings of the 26th Annual International Conference on Machine Learning – ICML '09. Dl.acm.org. pp. 1–8. doi:10.1145/1553374.1553486. ISBN 9781605585161. S2CID 392458.
  34. ^ "Linear algebra operators for GPU implementation of numerical algorithms", Kruger and Westermann, International Conference on Computer Graphics and Interactive Techniques, 2005.
  35. ^ Liepe; et al. (2010). "ABC-SysBio—approximate Bayesian computation in Python with GPU support". Bioinformatics. 26 (14): 1797–1799. doi:10.1093/bioinformatics/btq278. PMC 2894518. PMID 20591907. Archived from the original on 2015-11-05. Retrieved 2010-10-15.
  36. ^ Sanders, Jason; Kandrot, Edward (2010-07-19). CUDA by Example: An Introduction to General-Purpose GPU Programming, Portable Documents. Addison-Wesley Professional. ISBN 9780132180139. Archived from the original on 2017-04-12.
  37. ^ "OpenCL – The open standard for parallel programming of heterogeneous systems". khronos.org. Archived from the original on 2011-08-09.
  38. ^ Handy, Alex (2011-09-28). "AMD helps OpenCL gain ground in HPC space". SD Times. Retrieved 2023-06-04.
  39. ^ Teglet, Traian (8 January 2010). "NVIDIA Tegra Inside Every Audi 2010 Vehicle". Archived from the original on 2016-10-04. Retrieved 2016-08-03.
  40. ^ "School's in session – Nvidia's driverless system learns by watching". 2016-04-30. Archived from the original on 2016-05-01. Retrieved 2016-08-03.
  41. ^ "AMD Radeon HD 6000M series – don't call it ATI!". CNET. Archived from the original on 2016-10-11. Retrieved 2016-08-03.
  42. ^ "Nvidia GeForce GTX 680 2GB Review". Archived from the original on 2016-09-11. Retrieved 2016-08-03.
  43. ^ "Xbox One vs. PlayStation 4: Which game console is best?". ExtremeTech. Retrieved 2019-05-13.
  44. ^ "Kepler TM GK110" (PDF). NVIDIA Corporation. 2012. Archived (PDF) from the original on October 11, 2016. Retrieved August 3, 2016.
  45. ^ "Taiwan Semiconductor Manufacturing Company Limited". tsmc. Archived from the original on 2016-08-10. Retrieved 2016-08-03.
  46. ^ "Building a PC for the HTC Vive". 2016-06-16. Archived from the original on 2016-07-29. Retrieved 2016-08-03.
  47. ^ "VIVE Ready Computers". Vive. Archived from the original on 2016-02-24. Retrieved 2021-07-30.
  48. ^ "Nvidia's monstrous Pascal GPU is packed with cutting-edge tech and 15 billion transistors". 5 April 2016. Archived from the original on 2016-07-31. Retrieved 2016-08-03.
  49. ^ Sarkar, Samit (20 August 2018). "Nvidia RTX 2070, RTX 2080, RTX 2080 Ti GPUs revealed: specs, price, release date". Polygon. Retrieved 11 September 2019.
  50. ^ "AMD RX 480, 470 & 460 Polaris GPUs To Deliver The 'Most Revolutionary Jump In Performance' Yet". 2016-01-16. Archived from the original on 2016-08-01. Retrieved 2016-08-03.
  51. ^ AMD press release: "AMD Announces Next-Generation Leadership Products at Computex 2019 Keynote". AMD. Retrieved October 5, 2019.
  52. ^ "AMD to Introduce New Next-Gen RDNA GPUs in 2020, Not a Typical 'Refresh' of Navi". Tom's Hardware. 2020-01-29. Retrieved 2020-02-08.
  53. ^ "AMD Teases Radeon RX 6000 Card Performance Numbers: Aiming For 3080?". AnandTech. 2020-10-08. Retrieved 2020-10-25.
  54. ^ Judd, Will (October 28, 2020). "AMD unveils three Radeon 6000 graphics cards with ray tracing and RTX-beating performance". Eurogamer. Retrieved October 28, 2020.
  55. ^ Mujtaba, Hassan (2020-11-30). "AMD Radeon RX 6700 XT 'Navi 22 GPU' Custom Models Reportedly Boost Up To 2.95 GHz". Wccftech. Retrieved 2020-12-03.
  56. ^ Funk, Ben (December 12, 2020). "Sony PS5 Gets A Full Teardown Detailing Its RDNA 2 Guts And Glory". Hot Hardware. Archived from the original on December 12, 2020. Retrieved January 3, 2021.
  57. ^ Gartenberg, Chaim (March 18, 2020). "Sony reveals full PS5 hardware specifications". The Verge. Retrieved January 3, 2021.
  58. ^ Smith, Ryan. "Microsoft Drops More Xbox Series X Tech Specs: Zen 2 + RDNA 2, 12 TFLOPs GPU, HDMI 2.1, & a Custom SSD". AnandTech. Retrieved 2020-03-19.
  59. ^ Smith, Ryan. "NVIDIA Volta Unveiled: GV100 GPU and Tesla V100 Accelerator Announced". AnandTech. Retrieved 16 August 2018.
  60. ^ Hill, Brandon (11 August 2017). "AMD's Navi 7nm GPU Architecture to Reportedly Feature Dedicated AI Circuitry". HotHardware. Archived from the original on 17 August 2018. Retrieved 16 August 2018.
  61. ^ "Matrox Graphics – Products – Graphics Cards". Matrox. Archived from the original on 2014-02-05. Retrieved 2014-01-21.
  62. ^ Hruska, Joel (February 10, 2021). "How Do Graphics Cards Work?". Extreme Tech. Retrieved July 17, 2021.
  63. ^ CL-GD5446 64-bit VisualMedia Accelerator Preliminary Data Book (PDF), Cirrus Logic, November 1996, retrieved 30 January 2024 – via The Datasheet Archive
  64. ^ Barron, E. T.; Glorioso, R. M. (September 1973). "A micro controlled peripheral processor". Conference record of the 6th annual workshop on Microprogramming – MICRO 6. pp. 122–128. doi:10.1145/800203.806247. ISBN 9781450377836. S2CID 36942876.
  65. ^ Levine, Ken (August 1978). "Core standard graphic package for the VGI 3400". ACM SIGGRAPH Computer Graphics. 12 (3): 298–300. doi:10.1145/965139.807405.
  66. ^ "NVIDIA Launches the World's First Graphics Processing Unit: GeForce 256". Nvidia. 31 August 1999. Archived from the original on 12 April 2016. Retrieved 28 March 2016.
  67. ^ "Graphics Processing Unit (GPU)". Nvidia. 16 December 2009. Archived from the original on 8 April 2016. Retrieved 29 March 2016.
  68. ^ Pabst, Thomas (18 July 2002). "ATi Takes Over 3D Technology Leadership With Radeon 9700". Tom's Hardware. Retrieved 29 March 2016.
  69. ^ Child, J. (6 April 2023). "AMD Rolls Out 5 nm ASIC-based Accelerator for the Interactive Streaming Era". EETech Media. Retrieved 24 December 2023.
  70. ^ "Help Me Choose: Video Cards". Dell. Archived from the original on 2016-09-09. Retrieved 2016-09-17.
  71. ^ "Nvidia Optimus documentation for Linux device driver". freedesktop. 13 November 2023. Retrieved 24 December 2023.
  72. ^ Abazovic, F. (3 July 2015). "Crossfire and SLI market is just 300.000 units". fudzilla. Retrieved 24 December 2023.
  73. ^ "Is Multi-GPU Dead?". 7 January 2018.
  74. ^ "Nvidia SLI and AMD CrossFire is dead – but should we mourn multi-GPU gaming? | TechRadar". 24 August 2019.
  75. ^ "NVIDIA FFmpeg Transcoding Guide". 24 July 2019.
  76. ^ "Hardware Selection and Configuration Guide DaVinci Resolve 15" (PDF). BlackMagic Design. 2018. Retrieved 31 May 2022.
  77. ^ "Recommended System: Recommended Systems for DaVinci Resolve". Puget Systems.
  78. ^ "V-Ray Next Multi-GPU Performance Scaling". 20 August 2019.
  79. ^ "V-Ray for Nuke – Ray Traced Rendering for Compositors | Chaos Group".
  80. ^ "What about multi-GPU support? – Folding@home".
  81. ^ "Evolution of Intel Graphics: I740 to Iris Pro". 4 February 2017.
  82. ^ "GA-890GPA-UD3H overview". Archived from the original on 2015-04-15. Retrieved 2015-04-15.
  83. ^ Key, Gary. "AnandTech – μATX Part 2: Intel G33 Performance Review". anandtech. Archived from the original on 2008-05-31.
  84. ^ Tscheblockov, Tim. "Xbit Labs: Roundup of 7 Contemporary Integrated Graphics Chipsets for Socket 478 and Socket A Platforms". Archived from the original on 2007-05-26. Retrieved 2007-06-03.
  85. ^ Coelho, Rafael (18 January 2016). "Does dual-channel memory make difference on integrated video performance?". Hardware Secrets. Retrieved 4 January 2019.
  86. ^ Sanford, Bradley. "Integrated Graphics Solutions for Graphics-Intensive Applications" (PDF). Archived (PDF) from the original on 2007-11-28. Retrieved 2007-09-02.
  87. ^ Sanford, Bradley. "Integrated Graphics Solutions for Graphics-Intensive Applications". Archived from the original on 2012-01-07. Retrieved 2007-09-02.
  88. ^ Shimpi, Anand Lal. "AMD Outlines HSA Roadmap: Unified Memory for CPU/GPU in 2013, HSA GPUs in 2014". anandtech. Retrieved 2024-01-08.
  89. ^ Lake, Adam T. "Getting the Most from OpenCL™ 1.2: How to Increase Performance by..." Intel. Retrieved 2024-01-08.
  90. ^ Murph, Darren (29 September 2006). "Stanford University tailors Folding@home to GPUs". Archived from the original on 2007-10-12. Retrieved 2007-10-04.
  91. ^ Houston, Mike. "Folding@Home – GPGPU". Archived from the original on 2007-10-27. Retrieved 2007-10-04.
  92. ^ "Top500 List – June 2012 | TOP500 Supercomputer Sites". Top500.org. Archived from the original on 2014-01-13. Retrieved 2014-01-21.
  93. ^ Nickolls, John. "Stanford Lecture: Scalable Parallel Programming with CUDA on Manycore GPUs". YouTube. Archived from the original on 2016-10-11.
  94. ^ Langdon, W.; Banzhaf, W. "A SIMD interpreter for Genetic Programming on GPU Graphics Cards". Archived from the original on 2008-06-09. Retrieved 2008-05-01.
  95. ^ "eGPU candidate system list". Tech-Inferno Forums. 15 July 2013.
  96. ^ Mohr, Neil. "How to make an external laptop graphics adaptor". TechRadar. Archived from the original on 2017-06-26.
  97. ^ "Best External Graphics Card 2020 (EGPU) [The Complete Guide]". 16 March 2020.
  98. ^ "Use an external graphics processor with your Mac". Apple Support. Retrieved 2018-12-11.
  99. ^ "OMEN Accelerator | HP® Official Site". www8.hp. Retrieved 2018-12-11.
  100. ^ "Alienware Graphics Amplifier | Dell United States". Dell. Retrieved 2018-12-11.
  101. ^ Box, ► Suggestions (2016-11-25). "Build Guides by users". eGPU.io. Retrieved 2018-12-11.
  102. ^ Atwood, Jeff (2006-08-18). "Video Card Power Consumption". Archived from the original on 8 September 2008. Retrieved 26 March 2008.
  103. ^ "Video card power consumption". Xbit Labs. Archived from the original on 2011-09-04.
  104. ^ "GPU Q3'22 biggest quarter-to-quarter drop since the 2009 recession". Jon Peddie Research. 2022-11-20. Retrieved 2023-06-06.
  105. ^ "Graphics chips market is showing some life". TG Daily. August 20, 2014. Archived from the original on August 26, 2014. Retrieved August 22, 2014.

Sources

External links[edit]