                       ATI chips hacking
                       =================
                                                Dedicated to ATI's hackers.

Preface
~~~~~~~
This document compares ATI chips only from the point of view of the DAC and
video overlay. There are many differences in 3D, dual-head support, TV-out
support and many other things, but that is a completely different story.
This document doesn't include information about ATI AIW (All In Wonder) chips.

The units found on modern ATI chips:
DAC   - (Digital to Analog Converter) controls CRT, LCD and DFP monitor output
        Consists of:
                PLL  - (Phase-Locked Loop) registers
                CRTC - CRT controller
                LCD/DFP scaler
                surface control
DAC2  - controls CRT, LCD and DFP monitor output on the second head
TVDAC - controls Composite Video and Super Video output ports
        Consists of:
                TV_PLL
                TV scaler & sync unit
                TV format converter (PAL/NTSC)
TVCAP - controls Video-In port
MPP   - Miscellaneous peripheral port (includes Macrovision's filter, a copy
        protection mechanism)
OV    - Video overlay (YUV BES) (includes subpictures, gamma correction and
        adaptive deinterlacing)
CAP0  - Video capturing
CAP1  - Video capturing (second unit)
RT    - Rage Theater: video encoding and mixing
MUX   - video muxer
MEM   - PCI/AGP bus mastering
2D    - GUI engine
3D    - 3D-OpenGL engine (there is a lot of stuff in it)
I2C   - I2C Bus control

This document is mainly concerned with the OV unit ;)
Video decoding diagram:

RAM memory:   [ App ]    copies YUV image to overlay memory
                 |       <-- (It's possible to program DMA here)
overlay memory:[ OV ]    performs scaling and YUV-to-RGB conversion
                /\
RGB memory:   /    \
            /  [ macrovision ]  performs copy protection filtering
          /            \        (unneeded, but present by default ;)
 [ CRTC/LCD/DFP DAC ] [ TV DAC ] convert RGB memory to CRT and NTSC/PAL signals
          |                |
 [CRTC/LCD/DFP Monitor] [TV-screen]

History
~~~~~~~
  What is the history of ATI's chips? I may be wrong, but below is my view
of it:

0. I don't know of any earlier chips :(
1. Mach8
2. Mach16
3. Mach32

4. Mach64.
        It's the first chip supported by open source drivers.
        The set of mach64 chips is:
                mach64GX (ATI888GX00)
                mach64CX (ATI888CX00)
                mach64CT (ATI264CT)
                mach64ET (ATI264ET)
                mach64VTA3 (ATI264VT)
                mach64VTA4 (ATI264VT)
                mach64VTB (ATI264VTB)
                mach64VT4 (ATI264VT4)

5. 3D rage chips.
        It seems that the GPU of these chips is fully compatible with Mach64,
        extended with 3D capabilities. The set of 3D rage chips is:
                3D RAGE (GT)
                3D RAGE II+ (GTB)
                3D RAGE IIC (PCI)
                3D RAGE IIC (AGP)
                3D RAGE LT
                3D RAGE LT-G
                3D RAGE PRO (BGA, AGP)
                3D RAGE PRO (BGA, AGP, 1x only)
                3D RAGE PRO (BGA, PCI)
                3D RAGE PRO (PQFP, PCI)
                3D RAGE PRO (PQFP, PCI, limited 3D)
                3D RAGE (XL)
                3D RAGE LT PRO (AGP)
                3D RAGE LT PRO (PCI)
                3D RAGE Mobility (PCI)
                3D RAGE Mobility (AGP)

6. Rage128 chips.
        These chips have a completely new GPU which supports a memory mapped
        I/O space for faster port access (this is the main cause of
        incompatibility with mach64). The set of Rage128 chips is:
                Rage128 GL RE
                Rage128 GL RF
                Rage128 GL RG
                Rage128 GL RH
                Rage128 GL RI
                Rage128 VR RK
                Rage128 VR RL
                Rage128 VR RM
                Rage128 VR RN
                Rage128 VR RO
                Rage128 Mobility M3 LE
                Rage128 Mobility M3 LF

7. Rage128Pro chips.
        These chips are the successors of the Rage128 ones.
                Rage128Pro GL PA
                Rage128Pro GL PB
                Rage128Pro GL PC
                Rage128Pro GL PD
                Rage128Pro GL PE
                Rage128Pro GL PF
                Rage128Pro VR PG
                Rage128Pro VR PH
                Rage128Pro VR PI
                Rage128Pro VR PJ
                Rage128Pro VR PK
                Rage128Pro VR PL
                Rage128Pro VR PM
                Rage128Pro VR PN
                Rage128Pro VR PO
                Rage128Pro VR PP
                Rage128Pro VR PQ
                Rage128Pro VR PR
                Rage128Pro VR TR
                Rage128Pro VR PS
                Rage128Pro VR PT
                Rage128Pro VR PU
                Rage128Pro VR PV
                Rage128Pro VR PW
                Rage128Pro VR PX
                Rage128Pro Ultra U1
                Rage128Pro Ultra U2
                Rage128Pro Ultra U3

8. Radeon chips.
        In fact they could have been named Rage256 Pro (with minor changes
        they are fully compatible with Rage128 chips).
                Radeon QD
                Radeon QE
                Radeon QF
                Radeon QG
                Radeon VE QY
                Radeon VE QZ
                Radeon M6 LY
                Radeon M6 LZ
                Radeon M7 LW

9. Radeon2 chips.
        In fact they could have been named Rage512 Pro.
                Radeon2 8500 QL
                Radeon2 7500 QW

10. Radeon3 and newer chips are coming soon, and I hope that they will be
    fully compatible with Radeon1 chips.

The Radeon family also introduced FX chips: Radeon FX and Radeon2 8700 FX.
They probably have the same capabilities as other Radeons, but I don't know
that for sure yet.

What about video overlay and DAC?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the only known difference is between Mach64 compatible and
Rage128 compatible chips:
- They use different logic for programming I/O ports!
- Their port numbers are incompatible!
But:
- They follow the same programming logic as far as register names go.
(In fact there is a slight difference even between Radeon and Rage128
chips. AFAIK only Radeon has the OV0_SLICE_CNTL register, which the driver
currently doesn't use. But I know only its name ;). There is also a slight
difference in the adjustment of the BES position, but that is handled
by #ifdef blocks).

Please compare:

(A piece of Back-End Scaler programming)

 Sample for Mach64 compatible chips:
 ***********************************

#define SPARSE_IO_BASE          0x03fcu
#define SPARSE_IO_SELECT        0xfc00u

#define BLOCK_IO_BASE           0xff00u
#define BLOCK_IO_SELECT         0x00fcu

#define MM_IO_SELECT            0x03fcu
#define BLOCK_SELECT            0x0400u
#define DWORD_SELECT            (BLOCK_SELECT | MM_IO_SELECT)

#define IO_BYTE_SELECT          0x0003u

#define SPARSE_IO_PORT          (SPARSE_IO_BASE | IO_BYTE_SELECT)
#define BLOCK_IO_PORT           (BLOCK_IO_BASE | IO_BYTE_SELECT)

#define IOPortTag(_SparseIOSelect, _BlockIOSelect)      \
        (SetBits(_SparseIOSelect, SPARSE_IO_SELECT) |   \
         SetBits(_BlockIOSelect, BLOCK_SELECT | MM_IO_SELECT))
#define SparseIOTag(_IOSelect)  IOPortTag(_IOSelect, 0)
#define BlockIOTag(_IOSelect)   IOPortTag(0, _IOSelect)

...

#define OVERLAY_Y_X_START       BlockIOTag(0x100u)
#define OVERLAY_Y_X_END         BlockIOTag(0x101u)

...

#define OUTREG(_Register, _Value)                              \
    MMIO_OUT32(pATI->pBlock[GetBits(_Register, BLOCK_SELECT)], \
               (_Register) & MM_IO_SELECT, _Value)

...

OUTREG(OVERLAY_Y_X_START,((drw_x)<<16)|(drw_y)|(1<<31));
OUTREG(OVERLAY_Y_X_END,((drw_x+drw_w)<<16)|(drw_y+drw_h));


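The Mach64 sample uses SetBits and GetBits without showing them. Below is a
minimal sketch of how such register tags could be built and taken apart
again. It is only an illustration: the helper names (lowest_bit, set_bits,
get_bits, block_io_tag, tag_block, tag_offset) are hypothetical, and the
"shift the value to the lowest bit of the mask" behaviour is my assumption
about what SetBits/GetBits do, not a copy of the driver's macros.

```c
#include <assert.h>
#include <stdint.h>

#define MM_IO_SELECT  0x03fcu   /* byte offset inside a block (from the sample) */
#define BLOCK_SELECT  0x0400u   /* block number bit (from the sample) */

/* Isolate the lowest set bit of a mask, e.g. 0x07fc -> 0x0004. */
uint32_t lowest_bit(uint32_t mask)
{
    return mask & ~(mask - 1u);
}

/* ASSUMPTION: SetBits places a value into the field described by mask. */
uint32_t set_bits(uint32_t value, uint32_t mask)
{
    assert(mask != 0u);
    return (value * lowest_bit(mask)) & mask;
}

/* ASSUMPTION: GetBits extracts the field described by mask. */
uint32_t get_bits(uint32_t value, uint32_t mask)
{
    assert(mask != 0u);
    return (value & mask) / lowest_bit(mask);
}

/* Hypothetical function form of BlockIOTag(): a dword index becomes a tag. */
uint32_t block_io_tag(uint32_t dword_index)
{
    return set_bits(dword_index, BLOCK_SELECT | MM_IO_SELECT);
}

/* The two parts that the OUTREG macro in the sample pulls out of a tag. */
uint32_t tag_block(uint32_t tag)  { return get_bits(tag, BLOCK_SELECT); }
uint32_t tag_offset(uint32_t tag) { return tag & MM_IO_SELECT; }
```

Under these assumptions BlockIOTag(0x100u) (OVERLAY_Y_X_START) becomes tag
0x400, that is block 1 with byte offset 0, and BlockIOTag(0x101u)
(OVERLAY_Y_X_END) becomes block 1 with byte offset 4.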
 Sample for Rage128 compatible chips:
 ************************************

#define OV0_Y_X_START                          0x0400
#define OV0_Y_X_END                            0x0404

...

#define INREG(addr)             readl((rage_mmio_base)+(addr))
#define OUTREG(addr,val)        writel(val, (rage_mmio_base)+(addr))

...

rage_mmio_base = ioremap_nocache(pci_resource_start (dev, 2),RAGE_REGSIZE);

...

#ifdef RADEON
#define X_ADJUST 8
#else /* rage128 */
#define X_ADJUST 0
#endif

OUTREG(OV0_Y_X_START,(drw_x+X_ADJUST)|(drw_y<<16));
OUTREG(OV0_Y_X_END,(drw_x+drw_w+X_ADJUST)|((drw_y+drw_h)<<16));

Thus these chips follow almost the same logic as far as register names go
(except that the two 16-bit halves are swapped).
And yes, programming Rage128 is much simpler than programming Mach64.


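To make the "swapped 16-bit halves" remark concrete, here is a small sketch
that packs the same coordinates both ways, following the shifts used in the
two samples. The function names (mach64_pack_xy, rage128_pack_xy,
swap_halves) are hypothetical, the 16-bit range check is my assumption (the
real register fields may be narrower), and the extra (1<<31) bit that the
Mach64 sample sets in OVERLAY_Y_X_START is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Mach64 sample layout: x in the upper 16 bits, y in the lower 16 bits. */
uint32_t mach64_pack_xy(uint32_t x, uint32_t y)
{
    assert(x <= 0xffffu && y <= 0xffffu);   /* ASSUMPTION: 16-bit fields */
    return (x << 16) | y;
}

/* Rage128/Radeon sample layout: x (plus X_ADJUST) low, y in the upper half. */
uint32_t rage128_pack_xy(uint32_t x, uint32_t y, uint32_t x_adjust)
{
    assert(x + x_adjust <= 0xffffu && y <= 0xffffu);
    return (x + x_adjust) | (y << 16);
}

/* Exchange the two 16-bit halves of a 32-bit word. */
uint32_t swap_halves(uint32_t v)
{
    return (v >> 16) | (v << 16);
}
```

With x=10, y=20 and no adjustment, the Mach64 form gives 0x000A0014 and the
Rage128 form gives 0x0014000A, which is the same word with its halves
exchanged.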
What about other ATI's chips?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I suggest you get the latest copy of GATOS-CVS:
http://www.linuxvideo.org
GATOS was designed and introduced as General ATI TV and Overlay Software.
There you will find a lot of useful hacking utilities
(at location gatos-ati/gatos):
gfxdump - Program for dumping graphics chip registers on Linux and Windows 9X.
          (it's more useful on Win9x, to hack their values).
xatitv  - For working with tv-in (currently under heavy development)
atitvout- For working with tv-out
and a lot of other stuff.
BUT: After studying the GATOS and X11 code I've found that they are badly
optimized for movie playback.
Please compare:
 radeon_vid - configures the video overlay only once and provides DGA to it.
              (doesn't need to be MMX optimized)
 gatos and X11 - configure the video overlay at every slice of a frame, then
              perform unoptimized copying of the source data to video memory,
              often using CopyMungedData (a C analog of YV12_to_YUY2)
              because yv12 support is lacking.
              (it is not MMX optimized; that would be gladly accepted, but
               it will probably never be optimized because of portability).

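For readers who haven't seen such a copy loop, here is a plain C sketch of a
planar YV12 to packed YUY2 conversion, the kind of per-frame work mentioned
above. This is not the CopyMungedData code itself; the function name
yv12_to_yuy2 and its parameters are hypothetical, and it assumes even width
and height, 2x2 subsampled chroma planes and the byte order Y0 U Y1 V.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert one planar YV12 frame into packed YUY2.
 * y, u, v   - source planes (u and v are half width, half height)
 * dst       - destination buffer, 2 bytes per pixel
 * *_stride  - distance in bytes between the starts of consecutive rows
 */
void yv12_to_yuy2(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                  uint8_t *dst, int width, int height,
                  int y_stride, int c_stride, int dst_stride)
{
    assert(width % 2 == 0 && height % 2 == 0);

    for (int row = 0; row < height; row++) {
        const uint8_t *yrow = y + row * y_stride;
        /* every chroma row is shared by two luma rows */
        const uint8_t *urow = u + (row / 2) * c_stride;
        const uint8_t *vrow = v + (row / 2) * c_stride;
        uint8_t *out = dst + row * dst_stride;

        for (int col = 0; col < width; col += 2) {
            out[0] = yrow[col];       /* Y0 */
            out[1] = urow[col / 2];   /* U shared by both pixels */
            out[2] = yrow[col + 1];   /* Y1 */
            out[3] = vrow[col / 2];   /* V shared by both pixels */
            out += 4;
        }
    }
}
```

A loop like this touches every byte of every frame on the CPU, which is the
cost a driver avoids when the overlay can take the source format directly.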
hardware IDCT support diagram:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                      |
[ Video parser ] <---------- [ Transport demuxing ] --> [ Audio ]
       |                                |       |
[ Variable length decoder]              |D      |
       |                                |V      |
[ Inverse quantization ]                |D      |
       |                                |       |
-------|---[ video card ]---------+     |s      |
       |                          |     |u      |
[ Run level decode & de-zigzag ]  |     |b      |
       |                          |     |p      |
[    IDCT   ]                     |     |i      |
       |                          |     |c      |
[  Motion compensation  ]         |     |t      |
       |                          |     |u      |
[ Advanced deinterlacing ]        |     |r      |
       |                          |     |e      |
[ Filtered X-Y scaling ] [SUBPIC]-|-----+s   [ OSD ]
       |                     |    |     |       |
[ 4-bit alpha blending ] <---+    |     +-------+
       |                          |
[ YUV to RGB conversion ]         |
-------|--------------------------+
TV-screen or CRT-display


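The last box inside the video card, YUV to RGB conversion, is done by the
hardware. For reference only, here is a software sketch of the same step for
one pixel. It assumes ITU-R BT.601 limited-range input and the common 8-bit
fixed-point coefficients; which matrix the ATI hardware actually applies is
not stated in this document. The names yuv_to_rgb and clamp_shift8 are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Drop the 8 fractional bits and clamp the result to 0..255. */
static uint8_t clamp_shift8(int32_t v)
{
    if (v < 0)
        return 0;
    v >>= 8;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Convert one YUV pixel to RGB (ASSUMPTION: BT.601, Y in 16..235). */
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    assert(rgb != NULL);

    int32_t c = (int32_t)y - 16;    /* luma without its offset */
    int32_t d = (int32_t)u - 128;   /* blue-difference chroma */
    int32_t e = (int32_t)v - 128;   /* red-difference chroma */

    rgb[0] = clamp_shift8(298 * c + 409 * e + 128);            /* R */
    rgb[1] = clamp_shift8(298 * c - 100 * d - 208 * e + 128);  /* G */
    rgb[2] = clamp_shift8(298 * c + 516 * d + 128);            /* B */
}
```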
Conclusion:
~~~~~~~~~~~

That's all folks!
