#! /usr/bin/env python3

"""fixdiv - tool to fix division operators.

To use this tool, first run `python -Qwarnall yourscript.py 2>warnings'.
This runs the script `yourscript.py' while writing warning messages
about all uses of the classic division operator to the file
`warnings'.  The warnings look like this:

  <file>:<line>: DeprecationWarning: classic <type> division

The warnings are written to stderr, so you must use `2>' for the I/O
redirect.  I know of no way to redirect stderr on Windows in a DOS
box, so you will have to modify the script to set sys.stderr to some
kind of log file if you want to do this on Windows.

The warnings are not limited to the script; modules imported by the
script may also trigger warnings.  In fact a useful technique is to
write a test script specifically intended to exercise all code in a
particular module or set of modules.

Then run `python fixdiv.py warnings'.  This first reads the warnings,
looking for classic division warnings, and sorts them by file name and
line number.  Then, for each file that received at least one warning,
it parses the file and tries to match the warnings up to the division
operators found in the source code.  If it is successful, it writes
its findings to stdout, preceded by a line of dashes and a line of the
form:

  Index: <file>

If the only findings found are suggestions to change a / operator into
a // operator, the output is acceptable input for the Unix 'patch'
program.

Here are the possible messages on stdout (N stands for a line number):

- A plain-diff-style change ('NcN', a line marked by '<', a line
  containing '---', and a line marked by '>'):

  A / operator was found that should be changed to //.  This is the
  recommendation when only int and/or long arguments were seen.

- 'True division / operator at line N' and a line marked by '=':

  A / operator was found that can remain unchanged.  This is the
  recommendation when only float and/or complex arguments were seen.

- 'Ambiguous / operator (..., ...) at line N', line marked by '?':

  A / operator was found for which int or long as well as float or
  complex arguments were seen.  This is highly unlikely; if it occurs,
  you may have to restructure the code to keep the classic semantics,
  or maybe you don't care about the classic semantics.

- 'No conclusive evidence on line N', line marked by '*':

  A / operator was found for which no warnings were seen.  This could
  be code that was never executed, or code that was only executed
  with user-defined objects as arguments.  You will have to
  investigate further.  Note that // can be overloaded separately from
  /, using __floordiv__.  True division can also be separately
  overloaded, using __truediv__.  Classic division should be the same
  as either of those.  (XXX should I add a warning for division on
  user-defined objects, to disambiguate this case from code that was
  never executed?)

- 'Phantom ... warnings for line N', line marked by '*':

  A warning was seen for a line not containing a / operator.  The most
  likely cause is a warning about code executed by 'exec' or eval()
  (see note below), or an indirect invocation of the / operator, for
  example via the div() function in the operator module.  It could
  also be caused by a change to the file between the time the test
  script was run to collect warnings and the time fixdiv was run.

- 'More than one / operator in line N'; or
  'More than one / operator per statement in lines N-N':

  The scanner found more than one / operator on a single line, or in a
  statement split across multiple lines.  Because the warnings
  framework doesn't (and can't) show the offset within the line, and
  the code generator doesn't always give the correct line number for
  operations in a multi-line statement, we can't be sure whether all
  operators in the statement were executed.  To be on the safe side,
  by default a warning is issued about this case.  In practice, these
  cases are usually safe, and the -m option suppresses this warning.

- 'Can't find the / operator in line N', line marked by '*':

  This really shouldn't happen.  It means that the tokenize module
  reported a '/' operator but the line it returns didn't contain a '/'
  character at the indicated position.

- 'Bad warning for line N: XYZ', line marked by '*':

  This really shouldn't happen.  It means that a 'classic XYZ
  division' warning was read with XYZ being something other than
  'int', 'long', 'float', or 'complex'.

Notes:

- The augmented assignment operator /= is handled the same way as the
  / operator.

- This tool never looks at the // operator; no warnings are ever
  generated for use of this operator.

- This tool never looks at the / operator when a future division
  statement is in effect; no warnings are generated in this case, and
  because the tool only looks at files for which at least one classic
  division warning was seen, it will never look at files containing a
  future division statement.

- Warnings may be issued for code not read from a file, but executed
  using the exec() or eval() functions.  These may have
  <string> in the filename position, in which case the fixdiv script
  will attempt and fail to open a file named '<string>' and issue a
  warning about this failure; or these may be reported as 'Phantom'
  warnings (see above).  You're on your own to deal with these.  You
  could make all recommended changes and add a future division
  statement to all affected files, and then re-run the test script; it
  should not issue any warnings.  If there are any, and you have a
  hard time tracking down where they are generated, you can use the
  -Werror option to force an error instead of a first warning,
  generating a traceback.

- The tool should be run from the same directory as that from which
  the original script was run, otherwise it won't be able to open
  files given by relative pathnames.
"""

import sys
import getopt
import re
import tokenize

multi_ok = 0

def main():
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hm")
    except getopt.error as msg:
        usage(msg)
        return 2
    for o, a in opts:
        if o == "-h":
            print(__doc__)
            return
        if o == "-m":
            global multi_ok
            multi_ok = 1
    if not args:
        usage("at least one file argument is required")
        return 2
    if args[1:]:
        # write() takes a single string, so format the message with %
        sys.stderr.write("%s: extra file arguments ignored\n" % sys.argv[0])
    warnings = readwarnings(args[0])
    if warnings is None:
        return 1
    files = list(warnings.keys())
    if not files:
        print("No classic division warnings read from", args[0])
        return
    files.sort()
    exit = None
    for filename in files:
        x = process(filename, warnings[filename])
        exit = exit or x
    return exit

def usage(msg):
    sys.stderr.write("%s: %s\n" % (sys.argv[0], msg))
    sys.stderr.write("Usage: %s [-m] warnings\n" % sys.argv[0])
    sys.stderr.write("Try `%s -h' for more information.\n" % sys.argv[0])

PATTERN = (r"^(.+?):(\d+): DeprecationWarning: "
           r"classic (int|long|float|complex) division$")

def readwarnings(warningsfile):
    prog = re.compile(PATTERN)
    try:
        f = open(warningsfile)
    except IOError as msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return
    warnings = {}
    while 1:
        line = f.readline()
        if not line:
            break
        m = prog.match(line)
        if not m:
            if line.find("division") >= 0:
                sys.stderr.write("Warning: ignored input " + line)
            continue
        filename, lineno, what = m.groups()
        list = warnings.get(filename)
        if list is None:
            warnings[filename] = list = []
        list.append((int(lineno), sys.intern(what)))
    f.close()
    return warnings

def process(filename, list):
    print("-"*70)
    assert list # if this fails, readwarnings() is broken
    try:
        fp = open(filename)
    except IOError as msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return 1
    print("Index:", filename)
    f = FileContext(fp)
    list.sort()
    index = 0 # list[:index] has been processed, list[index:] is still to do
    g = tokenize.generate_tokens(f.readline)
    while 1:
        startlineno, endlineno, slashes = lineinfo = scanline(g)
        if startlineno is None:
            break
        assert startlineno <= endlineno is not None
        orphans = []
        while index < len(list) and list[index][0] < startlineno:
            orphans.append(list[index])
            index += 1
        if orphans:
            reportphantomwarnings(orphans, f)
        warnings = []
        while index < len(list) and list[index][0] <= endlineno:
            warnings.append(list[index])
            index += 1
        if not slashes and not warnings:
            pass
        elif slashes and not warnings:
            report(slashes, "No conclusive evidence")
        elif warnings and not slashes:
            reportphantomwarnings(warnings, f)
        else:
            if len(slashes) > 1:
                if not multi_ok:
                    rows = []
                    lastrow = None
                    for (row, col), line in slashes:
                        if row == lastrow:
                            continue
                        rows.append(row)
                        lastrow = row
                    assert rows
                    if len(rows) == 1:
                        print("*** More than one / operator in line", rows[0])
                    else:
                        print("*** More than one / operator per statement", end=' ')
                        print("in lines %d-%d" % (rows[0], rows[-1]))
            intlong = []
            floatcomplex = []
            bad = []
            for lineno, what in warnings:
                if what in ("int", "long"):
                    intlong.append(what)
                elif what in ("float", "complex"):
                    floatcomplex.append(what)
                else:
                    bad.append(what)
            lastrow = None
            for (row, col), line in slashes:
                if row == lastrow:
                    continue
                lastrow = row
                line = chop(line)
                if line[col:col+1] != "/":
                    print("*** Can't find the / operator in line %d:" % row)
                    print("*", line)
                    continue
                if bad:
                    print("*** Bad warning for line %d:" % row, bad)
                    print("*", line)
                elif intlong and not floatcomplex:
                    print("%dc%d" % (row, row))
                    print("<", line)
                    print("---")
                    print(">", line[:col] + "/" + line[col:])
                elif floatcomplex and not intlong:
                    print("True division / operator at line %d:" % row)
                    print("=", line)
                elif intlong and floatcomplex:
                    print("*** Ambiguous / operator (%s, %s) at line %d:" % (
                        "|".join(intlong), "|".join(floatcomplex), row))
                    print("?", line)
    fp.close()

def reportphantomwarnings(warnings, f):
    # Group consecutive warnings for the same row into one block of the
    # form [row, what, what, ...] so each row is reported only once.
    blocks = []
    lastrow = None
    lastblock = None
    for row, what in warnings:
        if row != lastrow:
            lastblock = [row]
            blocks.append(lastblock)
            lastrow = row
        lastblock.append(what)
    for block in blocks:
        row = block[0]
        whats = "/".join(block[1:])
        print("*** Phantom %s warnings for line %d:" % (whats, row))
        f.report(row, mark="*")

def report(slashes, message):
    lastrow = None
    for (row, col), line in slashes:
        if row != lastrow:
            print("*** %s on line %d:" % (message, row))
            print("*", chop(line))
            lastrow = row

class FileContext:
    # Wraps a file object, keeping the lines already read plus a small
    # lookahead window so report() can fetch lines by 1-based line number.
    def __init__(self, fp, window=5, lineno=1):
        self.fp = fp
        self.window = window
        self.lineno = lineno
        self.eoflookahead = 0
        self.lookahead = []
        self.buffer = []
    def fill(self):
        while len(self.lookahead) < self.window and not self.eoflookahead:
            line = self.fp.readline()
            if not line:
                self.eoflookahead = 1
                break
            self.lookahead.append(line)
    def readline(self):
        self.fill()
        if not self.lookahead:
            return ""
        line = self.lookahead.pop(0)
        self.buffer.append(line)
        self.lineno += 1
        return line
    def __getitem__(self, index):
        self.fill()
        bufstart = self.lineno - len(self.buffer)
        lookend = self.lineno + len(self.lookahead)
        if bufstart <= index < self.lineno:
            return self.buffer[index - bufstart]
        if self.lineno <= index < lookend:
            return self.lookahead[index - self.lineno]
        raise KeyError
    def report(self, first, last=None, mark="*"):
        if last is None:
            last = first
        for i in range(first, last+1):
            try:
                line = self[i]
            except KeyError:
                line = "<missing line>"
            print(mark, chop(line))

def scanline(g):
    slashes = []
    startlineno = None
    endlineno = None
    for type, token, start, end, line in g:
        endlineno = end[0]
        if startlineno is None:
            startlineno = endlineno
        if token in ("/", "/="):
            slashes.append((start, line))
        if type == tokenize.NEWLINE:
            break
    return startlineno, endlineno, slashes

def chop(line):
    if line.endswith("\n"):
        return line[:-1]
    else:
        return line

if __name__ == "__main__":
    sys.exit(main())