                ATI chips hacking
                =================
           Dedicated to ATI's hackers.

Preface
~~~~~~~
This document compares ATI chips only from the point of view of the DAC and
the video overlay. There are lots of differences in 3D, dual-head support,
TV-out support and many other things, but that's a completely different story.
This document doesn't include information about ATI AIW (All In Wonder) chips.

What are the units on modern ATI chips:
DAC   - (Digital to Analog Converter) controls CRTC, LCD, DFP monitor's output
        Consists of:
          PLL  - (Phase-Locked Loop) registers
          CRTC - CRT controller
          LCD/DFP scaler
          surface control
DAC2  - controls CRTC, LCD, DFP monitor's output on second head
TVDAC - controls Composite Video and Super Video output ports
        Consists of:
          TV_PLL
          TV scaler & sync unit
          TV format converter (PAL/NTSC)
TVCAP - controls Video-In port
MPP   - Miscellaneous peripheral port. (includes Macrovision's filter - copy
        protection mechanism)
OV    - Video overlay (YUV BES) (includes subpictures, gamma correction and
        adaptive deinterlacing)
CAP0  - Video capturing
CAP1  - Video capturing (second unit)
RT    - Rage Theater: video encoding and mixing
MUX   - video muxer
MEM   - PCI/AGP bus mastering
2D    - GUI engine
3D    - 3D-OpenGL engine (There is lots of stuff)
I2C   - I2C Bus control

This document is mainly concerned with the OV unit only ;)
Video decoding diagram:

RAM memory:    [ App ]           Copies YUV image to overlay memory
                  |              <-- (It's possible to program DMA here)
overlay memory:[ OV ]            performs scaling and YUVtoRGB conversion
                  /\
RGB memory:      /  \
                /  [ Macrovision ]  performs copy protection filtering
               /       \            (an unneeded thing, but present by default ;)
 [ CRTC/LCD/DFP DAC ]  [ TV DAC ]   convert RGB memory to CRTC and NTSC/PAL signals
          |                |
 [CRTC/LCD/DFP Monitor] [TV-screen]

History
~~~~~~~
 What is the history of ATI's chips? I may be wrong, but below is my view
of this question:

0.
   I don't know of any earlier chips :(
1. Mach8
2. Mach16
3. Mach32

4. Mach64.
   It's the first chip that is supported by open source drivers.
   The set of mach64 chips is:
     mach64GX (ATI888GX00)
     mach64CX (ATI888CX00)
     mach64CT (ATI264CT)
     mach64ET (ATI264ET)
     mach64VTA3 (ATI264VT)
     mach64VTA4 (ATI264VT)
     mach64VTB (ATI264VTB)
     mach64VT4 (ATI264VT4)

5. 3D rage chips.
   It seems that these chips are fully GPU-compatible with Mach64,
   extended with 3D capabilities. The set of 3D rage chips is:
     3D RAGE (GT)
     3D RAGE II+ (GTB)
     3D RAGE IIC (PCI)
     3D RAGE IIC (AGP)
     3D RAGE LT
     3D RAGE LT-G
     3D RAGE PRO (BGA, AGP)
     3D RAGE PRO (BGA, AGP, 1x only)
     3D RAGE PRO (BGA, PCI)
     3D RAGE PRO (PQFP, PCI)
     3D RAGE PRO (PQFP, PCI, limited 3D)
     3D RAGE (XL)
     3D RAGE LT PRO (AGP)
     3D RAGE LT PRO (PCI)
     3D RAGE Mobility (PCI)
     3D RAGE Mobility (AGP)

6. Rage128 chips.
   These chips have a completely new GPU which supports memory mapped IO
   space for accelerating port access (it's the main cause of incompatibility
   with mach64). The set of Rage128 chips is:
     Rage128 GL RE
     Rage128 GL RF
     Rage128 GL RG
     Rage128 GL RH
     Rage128 GL RI
     Rage128 VR RK
     Rage128 VR RL
     Rage128 VR RM
     Rage128 VR RN
     Rage128 VR RO
     Rage128 Mobility M3 LE
     Rage128 Mobility M3 LF
7. Rage128Pro chips.
   These chips are successors of the Rage128 ones.
     Rage128Pro GL PA
     Rage128Pro GL PB
     Rage128Pro GL PC
     Rage128Pro GL PD
     Rage128Pro GL PE
     Rage128Pro GL PF
     Rage128Pro VR PG
     Rage128Pro VR PH
     Rage128Pro VR PI
     Rage128Pro VR PJ
     Rage128Pro VR PK
     Rage128Pro VR PL
     Rage128Pro VR PM
     Rage128Pro VR PN
     Rage128Pro VR PO
     Rage128Pro VR PP
     Rage128Pro VR PQ
     Rage128Pro VR PR
     Rage128Pro VR TR
     Rage128Pro VR PS
     Rage128Pro VR PT
     Rage128Pro VR PU
     Rage128Pro VR PV
     Rage128Pro VR PW
     Rage128Pro VR PX
     Rage128Pro Ultra U1
     Rage128Pro Ultra U2
     Rage128Pro Ultra U3

8. Radeon chips.
   Indeed they could be named Rage256 Pro. (With minor changes they are
   fully compatible with Rage128 chips).
     Radeon QD
     Radeon QE
     Radeon QF
     Radeon QG
     Radeon VE QY
     Radeon VE QZ
     Radeon M6 LY
     Radeon M6 LZ
     Radeon M7 LW
9. Radeon2 chips.
   Indeed they could be named Rage512 Pro.
     Radeon2 8500 QL
     Radeon2 7500 QW

10. Radeon3 and newer chips are coming soon, but I hope that they will be
    fully compatible with Radeon1 chips.

FX chips were also introduced in the Radeon family: Radeon FX and
Radeon2 8700 FX. They probably have the same capabilities as other Radeons,
but that is currently unknown to me.

What about video overlay and DAC?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the only known difference is the one between
Mach64 and Rage128 compatible chips:
- They have different logic for IO port programming!
- They are incompatible by port numbers!
But:
- They use the same programming logic as far as register names are concerned.
(Indeed there is a slight difference even between Radeon and Rage128
chips. AFAIK only Radeon has the OV0_SLICE_CNTL register, which is currently
not used by the driver. But I know only its name ;). There is also
a slight difference in the adjustment of the BES position, but it's
configured by #ifdef blocks).
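
As a reading aid for the two samples that follow: both families pack the
overlay window corners into one 32-bit register word, and only the packing
differs. Below is a minimal sketch in plain C. The helper names are
hypothetical (they are not taken from any driver), the 16-bit field widths
are an assumption, the bit layout and the bit 31 flag are simply copied from
the samples, and the x_adjust parameter stands in for the X_ADJUST #ifdef.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of any driver. */

/* Mach64 style: X in the upper 16 bits, Y in the lower 16 bits.
 * Bit 31 is set exactly as in the sample; its meaning is not covered here. */
static uint32_t mach64_pack_start(uint32_t x, uint32_t y)
{
    assert(x < 0x8000u && y <= 0xffffu);   /* assumed field widths */
    return (x << 16) | y | (1u << 31);
}

static uint32_t mach64_pack_end(uint32_t x, uint32_t y, uint32_t w, uint32_t h)
{
    return ((x + w) << 16) | (y + h);
}

/* Rage128/Radeon style: the halves are swapped (X low, Y high).
 * x_adjust mirrors X_ADJUST: 8 on Radeon, 0 on Rage128. */
static uint32_t rage128_pack_start(uint32_t x, uint32_t y, uint32_t x_adjust)
{
    return (x + x_adjust) | (y << 16);
}

static uint32_t rage128_pack_end(uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                                 uint32_t x_adjust)
{
    return (x + w + x_adjust) | ((y + h) << 16);
}
```

The packed words would then be written with the chip-specific OUTREG macro;
nothing here touches hardware.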

Please compare:

(A piece of Back-End Scaler programming)

        Sample for Mach64 compatible chips:
        ***********************************

#define SPARSE_IO_BASE          0x03fcu
#define SPARSE_IO_SELECT        0xfc00u

#define BLOCK_IO_BASE           0xff00u
#define BLOCK_IO_SELECT         0x00fcu

#define MM_IO_SELECT            0x03fcu
#define BLOCK_SELECT            0x0400u
#define DWORD_SELECT            (BLOCK_SELECT | MM_IO_SELECT)

#define IO_BYTE_SELECT          0x0003u

#define SPARSE_IO_PORT          (SPARSE_IO_BASE | IO_BYTE_SELECT)
#define BLOCK_IO_PORT           (BLOCK_IO_BASE | IO_BYTE_SELECT)

#define IOPortTag(_SparseIOSelect, _BlockIOSelect)      \
        (SetBits(_SparseIOSelect, SPARSE_IO_SELECT) |   \
         SetBits(_BlockIOSelect, BLOCK_SELECT | MM_IO_SELECT))
#define SparseIOTag(_IOSelect)  IOPortTag(_IOSelect, 0)
#define BlockIOTag(_IOSelect)   IOPortTag(0, _IOSelect)

...

#define OVERLAY_Y_X_START       BlockIOTag(0x100u)
#define OVERLAY_Y_X_END         BlockIOTag(0x101u)

...

#define OUTREG(_Register, _Value)                                       \
        MMIO_OUT32(pATI->pBlock[GetBits(_Register, BLOCK_SELECT)],      \
                   (_Register) & MM_IO_SELECT, _Value)

...

/* X goes to the upper 16 bits, Y to the lower 16 bits */
OUTREG(OVERLAY_Y_X_START,((drw_x)<<16)|(drw_y)|(1<<31));
OUTREG(OVERLAY_Y_X_END,((drw_x+drw_w)<<16)|(drw_y+drw_h));


        Sample for Rage128 compatible chips:
        ************************************

#define OV0_Y_X_START           0x0400
#define OV0_Y_X_END             0x0404

...

#define INREG(addr)             readl((rage_mmio_base)+(addr))
#define OUTREG(addr,val)        writel(val, (rage_mmio_base)+(addr))

...

rage_mmio_base = ioremap_nocache(pci_resource_start (dev, 2),RAGE_REGSIZE);

...

#ifdef RADEON
#define X_ADJUST 8
#else /* rage128 */
#define X_ADJUST 0
#endif

/* X goes to the lower 16 bits, Y to the upper 16 bits */
OUTREG(OV0_Y_X_START,(drw_x+X_ADJUST)|(drw_y<<16));
OUTREG(OV0_Y_X_END,(drw_x+drw_w+X_ADJUST)|((drw_y+drw_h)<<16));

Thus these chips have almost the same logic as far as register names go.
(except for the fact that the 16-bit halves are swapped).
Yes - programming Rage128 is much simpler than programming Mach64.


What about other ATI's chips?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I suggest you get the latest copy of GATOS-CVS:
http://www.linuxvideo.org
GATOS was designed and introduced as General ATI TV and Overlay Software.
There you will find a lot of useful hacking utilities
(at location gatos-ati/gatos):
gfxdump  - Program for dumping graphics chip registers on Linux and
           Windows 9X. (it's more useful on Win9x, to hack their values).
xatitv   - For working with tv-in (currently under heavy development)
atitvout - For working with tv-out
and a lot of other stuff.
BUT: After studying the GATOS and X11 code I've found that they are badly
optimized for movie playback.
Please compare:
  radeon_vid    - configures the video overlay only once and provides DGA
                  to it. (doesn't need to be MMX optimized)
  gatos and X11 - configure the video overlay at every slice of a frame,
                  then perform unoptimized copying of the source data to
                  video memory, often using CopyMungedData (it's a C analog
                  of YV12_to_YUY2) since YV12 support is lacking.
                  (is not MMX optimized, which is gladly accepted, but will
                  probably never be optimized for the sake of portability).
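
To make the cost of that extra copy concrete, here is a minimal sketch of
what a plain C YV12 to YUY2 repacking loop looks like. It is not the
CopyMungedData code from GATOS or X11; the function name, the parameters and
the assumption of even width and height are mine. YV12 stores three separate
planes (Y, then V, then U, with chroma subsampled 2x2), while YUY2 interleaves
the bytes as Y0 U Y1 V, so every chroma row is reused for two output rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not the driver's CopyMungedData.
 * y, u, v    - source planes (in YV12 the V plane precedes the U plane)
 * dst        - packed YUY2 destination, 2 bytes per pixel
 * *_pitch    - bytes per row of the corresponding buffer */
static void yv12_to_yuy2(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                         uint8_t *dst, size_t width, size_t height,
                         size_t y_pitch, size_t c_pitch, size_t dst_pitch)
{
    assert(width % 2 == 0 && height % 2 == 0);   /* assumed by this sketch */

    for (size_t row = 0; row < height; row++) {
        const uint8_t *yl = y + row * y_pitch;
        /* one chroma row covers two luma rows */
        const uint8_t *ul = u + (row / 2) * c_pitch;
        const uint8_t *vl = v + (row / 2) * c_pitch;
        uint8_t *d = dst + row * dst_pitch;

        for (size_t x = 0; x < width / 2; x++) {
            d[4 * x + 0] = yl[2 * x];       /* Y0 */
            d[4 * x + 1] = ul[x];           /* U  */
            d[4 * x + 2] = yl[2 * x + 1];   /* Y1 */
            d[4 * x + 3] = vl[x];           /* V  */
        }
    }
}
```

Every byte of every frame passes through the CPU in this loop, which is the
per-frame work that an overlay accepting planar YV12 directly would avoid.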

hardware IDCT support diagram:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        |
[ Video parser ] <---------- [ Transport demuxing ] --> [ Audio ]
       |                                |       |
[ Variable length decoder]              |D      |
       |                                |V      |
[ Inverse quantization ]                |D      |
       |                                |       |
-------|---[ video card ]---------+     |s      |
       |                          |     |u      |
[ Run level decode & de-zigzag ]  |     |b      |
       |                          |     |p      |
[ IDCT ]                          |     |i      |
       |                          |     |c      |
[ Motion compensation ]           |     |t      |
       |                          |     |u      |
[ Advanced deinterlacing ]        |     |r      |
       |                          |     |e      |
[ Filtered X-Y scaling ] [SUBPIC]-|-----+s   [ OSD ]
       |                     |    |     |       |
[ 4-bit alpha blending ] <---+    |     +-------+
       |                          |
[ YUV to RGB conversion ]         |
-------|--------------------------+
TV-screen or CRT-display


Conclusion:
~~~~~~~~~~~

That's all, folks!