#! /usr/bin/env python3

"""fixdiv - tool to fix division operators.

To use this tool, first run `python -Qwarnall yourscript.py 2>warnings'.
This runs the script `yourscript.py' while writing warning messages
about all uses of the classic division operator to the file
`warnings'.  The warnings look like this:

  <file>:<line>: DeprecationWarning: classic <type> division

The warnings are written to stderr, so you must use `2>' for the I/O
redirect.  I know of no way to redirect stderr on Windows in a DOS
box, so you will have to modify the script to set sys.stderr to some
kind of log file if you want to do this on Windows.

The warnings are not limited to the script; modules imported by the
script may also trigger warnings.  In fact a useful technique is to
write a test script specifically intended to exercise all code in a
particular module or set of modules.

Then run `python fixdiv.py warnings'.  This first reads the warnings,
looking for classic division warnings, and sorts them by file name and
line number.  Then, for each file that received at least one warning,
it parses the file and tries to match the warnings up to the division
operators found in the source code.  If it is successful, it writes
its findings to stdout, preceded by a line of dashes and a line of the
form:

  Index: <file>

If the only findings are suggestions to change a / operator into
a // operator, the output is acceptable input for the Unix 'patch'
program.

Here are the possible messages on stdout (N stands for a line number):

- A plain-diff-style change ('NcN', a line marked by '<', a line
  containing '---', and a line marked by '>'):

  A / operator was found that should be changed to //.  This is the
  recommendation when only int and/or long arguments were seen.

- 'True division / operator at line N' and a line marked by '=':

  A / operator was found that can remain unchanged.  This is the
  recommendation when only float and/or complex arguments were seen.

- 'Ambiguous / operator (..., ...) at line N', line marked by '?':

  A / operator was found for which int or long as well as float or
  complex arguments were seen.  This is highly unlikely; if it occurs,
  you may have to restructure the code to keep the classic semantics,
  or maybe you don't care about the classic semantics.

- 'No conclusive evidence on line N', line marked by '*':

  A / operator was found for which no warnings were seen.  This could
  be code that was never executed, or code that was only executed
  with user-defined objects as arguments.  You will have to
  investigate further.  Note that // can be overloaded separately from
  /, using __floordiv__.  True division can also be separately
  overloaded, using __truediv__.  Classic division should be the same
  as either of those.  (XXX should I add a warning for division on
  user-defined objects, to disambiguate this case from code that was
  never executed?)

- 'Phantom ... warnings for line N', line marked by '*':

  A warning was seen for a line not containing a / operator.  The most
  likely cause is a warning about code executed by 'exec' or eval()
  (see note below), or an indirect invocation of the / operator, for
  example via the div() function in the operator module.  It could
  also be caused by a change to the file between the time the test
  script was run to collect warnings and the time fixdiv was run.

- 'More than one / operator in line N'; or
  'More than one / operator per statement in lines N-N':

  The scanner found more than one / operator on a single line, or in a
  statement split across multiple lines.  Because the warnings
  framework doesn't (and can't) show the offset within the line, and
  the code generator doesn't always give the correct line number for
  operations in a multi-line statement, we can't be sure whether all
  operators in the statement were executed.  To be on the safe side,
  by default a warning is issued about this case.  In practice, these
  cases are usually safe, and the -m option suppresses these warnings.

- 'Can't find the / operator in line N', line marked by '*':

  This really shouldn't happen.  It means that the tokenize module
  reported a '/' operator but the line it returns didn't contain a '/'
  character at the indicated position.

- 'Bad warning for line N: XYZ', line marked by '*':

  This really shouldn't happen.  It means that a 'classic XYZ
  division' warning was read with XYZ being something other than
  'int', 'long', 'float', or 'complex'.

Notes:

- The augmented assignment operator /= is handled the same way as the
  / operator.

- This tool never looks at the // operator; no warnings are ever
  generated for use of this operator.

- This tool never looks at the / operator when a future division
  statement is in effect; no warnings are generated in this case, and
  because the tool only looks at files for which at least one classic
  division warning was seen, it will never look at files containing a
  future division statement.

- Warnings may be issued for code not read from a file, but executed
  using the exec() or eval() functions.  These may have
  <string> in the filename position, in which case the fixdiv script
  will attempt and fail to open a file named '<string>' and issue a
  warning about this failure; or these may be reported as 'Phantom'
  warnings (see above).  You're on your own to deal with these.  You
  could make all recommended changes and add a future division
  statement to all affected files, and then re-run the test script; it
  should not issue any warnings.  If there are any, and you have a
  hard time tracking down where they are generated, you can use the
  -Werror option to force an error instead of a first warning,
  generating a traceback.

- The tool should be run from the same directory as that from which
  the original script was run, otherwise it won't be able to open
  files given by relative pathnames.
"""

import sys
import getopt
import re
import tokenize

multi_ok = 0

def main():
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hm")
    except getopt.error as msg:
        usage(msg)
        return 2
    for o, a in opts:
        if o == "-h":
            print(__doc__)
            return
        if o == "-m":
            global multi_ok
            multi_ok = 1
    if not args:
        usage("at least one file argument is required")
        return 2
    if args[1:]:
        sys.stderr.write("%s: extra file arguments ignored\n" % sys.argv[0])
    warnings = readwarnings(args[0])
    if warnings is None:
        return 1
    files = list(warnings.keys())
    if not files:
        print("No classic division warnings read from", args[0])
        return
    files.sort()
    exit = None
    for filename in files:
        x = process(filename, warnings[filename])
        exit = exit or x
    return exit

def usage(msg):
    sys.stderr.write("%s: %s\n" % (sys.argv[0], msg))
    sys.stderr.write("Usage: %s [-m] warnings\n" % sys.argv[0])
    sys.stderr.write("Try `%s -h' for more information.\n" % sys.argv[0])

PATTERN = (r"^(.+?):(\d+): DeprecationWarning: "
           r"classic (int|long|float|complex) division$")

def readwarnings(warningsfile):
    prog = re.compile(PATTERN)
    warnings = {}
    try:
        f = open(warningsfile)
    except IOError as msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return
    with f:
        for line in f:
            m = prog.match(line)
            if not m:
                if line.find("division") >= 0:
                    sys.stderr.write("Warning: ignored input " + line)
                continue
            filename, lineno, what = m.groups()
            # Group (lineno, type) pairs by the file they were reported for.
            warnings.setdefault(filename, []).append(
                (int(lineno), sys.intern(what)))
    return warnings

def process(filename, list):
    print("-"*70)
    assert list # if this fails, readwarnings() is broken
    try:
        fp = open(filename)
    except IOError as msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return 1
    with fp:
        print("Index:", filename)
        f = FileContext(fp)
        list.sort()
        index = 0 # list[:index] has been processed, list[index:] is still to do
        g = tokenize.generate_tokens(f.readline)
        while 1:
            startlineno, endlineno, slashes = lineinfo = scanline(g)
            if startlineno is None:
                break
            assert startlineno <= endlineno is not None
            orphans = []
            while index < len(list) and list[index][0] < startlineno:
                orphans.append(list[index])
                index += 1
            if orphans:
                reportphantomwarnings(orphans, f)
            warnings = []
            while index < len(list) and list[index][0] <= endlineno:
                warnings.append(list[index])
                index += 1
            if not slashes and not warnings:
                pass
            elif slashes and not warnings:
                report(slashes, "No conclusive evidence")
            elif warnings and not slashes:
                reportphantomwarnings(warnings, f)
            else:
                if len(slashes) > 1:
                    if not multi_ok:
                        rows = []
                        lastrow = None
                        for (row, col), line in slashes:
                            if row == lastrow:
                                continue
                            rows.append(row)
                            lastrow = row
                        assert rows
                        if len(rows) == 1:
                            print("*** More than one / operator in line", rows[0])
                        else:
                            print("*** More than one / operator per statement", end=' ')
                            print("in lines %d-%d" % (rows[0], rows[-1]))
                intlong = []
                floatcomplex = []
                bad = []
                for lineno, what in warnings:
                    if what in ("int", "long"):
                        intlong.append(what)
                    elif what in ("float", "complex"):
                        floatcomplex.append(what)
                    else:
                        bad.append(what)
                lastrow = None
                for (row, col), line in slashes:
                    if row == lastrow:
                        continue
                    lastrow = row
                    line = chop(line)
                    if line[col:col+1] != "/":
                        print("*** Can't find the / operator in line %d:" % row)
                        print("*", line)
                        continue
                    if bad:
                        print("*** Bad warning for line %d:" % row, bad)
                        print("*", line)
                    elif intlong and not floatcomplex:
                        print("%dc%d" % (row, row))
                        print("<", line)
                        print("---")
                        print(">", line[:col] + "/" + line[col:])
                    elif floatcomplex and not intlong:
                        print("True division / operator at line %d:" % row)
                        print("=", line)
                    elif intlong and floatcomplex:
                        print("*** Ambiguous / operator (%s, %s) at line %d:" %
                            ("|".join(intlong), "|".join(floatcomplex), row))
                        print("?", line)

def reportphantomwarnings(warnings, f):
    blocks = []
    lastrow = None
    lastblock = None
    for row, what in warnings:
        if row != lastrow:
            lastblock = [row]
            blocks.append(lastblock)
            lastrow = row  # remember the row so repeated warnings share a block
        lastblock.append(what)
    for block in blocks:
        row = block[0]
        whats = "/".join(block[1:])
        print("*** Phantom %s warnings for line %d:" % (whats, row))
        f.report(row, mark="*")

def report(slashes, message):
    lastrow = None
    for (row, col), line in slashes:
        if row != lastrow:
            print("*** %s on line %d:" % (message, row))
            print("*", chop(line))
            lastrow = row

class FileContext:
    def __init__(self, fp, window=5, lineno=1):
        self.fp = fp
        self.window = window
        self.lineno = lineno
        self.eoflookahead = 0
        self.lookahead = []
        self.buffer = []
    def fill(self):
        while len(self.lookahead) < self.window and not self.eoflookahead:
            line = self.fp.readline()
            if not line:
                self.eoflookahead = 1
                break
            self.lookahead.append(line)
    def readline(self):
        self.fill()
        if not self.lookahead:
            return ""
        line = self.lookahead.pop(0)
        self.buffer.append(line)
        self.lineno += 1
        return line
    def __getitem__(self, index):
        self.fill()
        bufstart = self.lineno - len(self.buffer)
        lookend = self.lineno + len(self.lookahead)
        if bufstart <= index < self.lineno:
            return self.buffer[index - bufstart]
        if self.lineno <= index < lookend:
            return self.lookahead[index - self.lineno]
        raise KeyError
    def report(self, first, last=None, mark="*"):
        if last is None:
            last = first
        for i in range(first, last+1):
            try:
                line = self[i]
            except KeyError:
                line = "<missing line>"
            print(mark, chop(line))

def scanline(g):
    slashes = []
    startlineno = None
    endlineno = None
    for type, token, start, end, line in g:
        endlineno = end[0]
        if startlineno is None:
            startlineno = endlineno
        if token in ("/", "/="):
            slashes.append((start, line))
        if type == tokenize.NEWLINE:
            break
    return startlineno, endlineno, slashes

def chop(line):
    if line.endswith("\n"):
        return line[:-1]
    else:
        return line

if __name__ == "__main__":
    sys.exit(main())