1{-
2(c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4\section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5-}
6
7{-# LANGUAGE CPP #-}
8{-# LANGUAGE DeriveFunctor #-}
9{-# LANGUAGE ViewPatterns #-}
10module Specialise ( specProgram, specUnfolding ) where
11
12#include "HsVersions.h"
13
14import GhcPrelude
15
16import Id
17import TcType hiding( substTy )
18import Type   hiding( substTy, extendTvSubstList )
19import Predicate
20import Module( Module, HasModule(..) )
21import Coercion( Coercion )
22import CoreMonad
23import qualified CoreSubst as Core
24import CoreUnfold
25import Var              ( isLocalVar )
26import VarSet
27import VarEnv
28import CoreSyn
29import Rules
30import CoreOpt          ( collectBindersPushingCo )
31import CoreUtils        ( exprIsTrivial, mkCast, exprType, getIdFromTrivialExpr_maybe )
32import CoreFVs
33import CoreArity        ( etaExpandToJoinPointRule )
34import UniqSupply
35import Name
36import MkId             ( voidArgId, voidPrimId )
37import TysPrim            ( voidPrimTy )
38import Maybes           ( mapMaybe, maybeToList, isJust )
39import MonadUtils       ( foldlM )
40import BasicTypes
41import HscTypes
42import Bag
43import DynFlags
44import Util
45import Outputable
46import FastString
47import State
48import UniqDFM
49import TyCoRep (TyCoBinder (..))
50
51import Control.Monad
52import qualified Control.Monad.Fail as MonadFail
53
54{-
55************************************************************************
56*                                                                      *
57\subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
58*                                                                      *
59************************************************************************
60
61These notes describe how we implement specialisation to eliminate
62overloading.
63
64The specialisation pass works on Core
65syntax, complete with all the explicit dictionary application,
66abstraction and construction as added by the type checker.  The
67existing type checker remains largely as it is.
68
69One important thought: the {\em types} passed to an overloaded
70function, and the {\em dictionaries} passed are mutually redundant.
71If the same function is applied to the same type(s) then it is sure to
72be applied to the same dictionary(s)---or rather to the same {\em
73values}.  (The arguments might look different but they will evaluate
74to the same value.)
75
76Second important thought: we know that we can make progress by
77treating dictionary arguments as static and worth specialising on.  So
78we can do without binding-time analysis, and instead specialise on
79dictionary arguments and no others.
80
81The basic idea
82~~~~~~~~~~~~~~
83Suppose we have
84
85        let f = <f_rhs>
86        in <body>
87
88and suppose f is overloaded.
89
90STEP 1: CALL-INSTANCE COLLECTION
91
92We traverse <body>, accumulating all applications of f to types and
93dictionaries.
94
95(Might there be partial applications, to just some of its types and
96dictionaries?  In principle yes, but in practice the type checker only
97builds applications of f to all its types and dictionaries, so partial
98applications could only arise as a result of transformation, and even
99then I think it's unlikely.  In any case, we simply don't accumulate such
100partial applications.)
101
102
103STEP 2: EQUIVALENCES
104
105So now we have a collection of calls to f:
106        f t1 t2 d1 d2
107        f t3 t4 d3 d4
108        ...
109Notice that f may take several type arguments.  To avoid ambiguity, we
110say that f is called at type t1/t2 and t3/t4.
111
112We take equivalence classes using equality of the *types* (ignoring
113the dictionary args, which as mentioned previously are redundant).
114
115STEP 3: SPECIALISATION
116
117For each equivalence class, choose a representative (f t1 t2 d1 d2),
118and create a local instance of f, defined thus:
119
120        f@t1/t2 = <f_rhs> t1 t2 d1 d2
121
122f_rhs presumably has some big lambdas and dictionary lambdas, so lots
123of simplification will now result.  However we don't actually *do* that
124simplification.  Rather, we leave it for the simplifier to do.  If we
125*did* do it, though, we'd get more call instances from the specialised
126RHS.  We can work out what they are by instantiating the call-instance
127set from f's RHS with the types t1, t2.
128
129Add this new id to f's IdInfo, to record that f has a specialised version.
130
131Before doing any of this, check that f's IdInfo doesn't already
132tell us about an existing instance of f at the required type/s.
133(This might happen if specialisation was applied more than once, or
134it might arise from user SPECIALIZE pragmas.)
135
136Recursion
137~~~~~~~~~
138Wait a minute!  What if f is recursive?  Then we can't just plug in
139its right-hand side, can we?
140
141But it's ok.  The type checker *always* creates non-recursive definitions
142for overloaded recursive functions.  For example:
143
144        f x = f (x+x)           -- Yes I know its silly
145
146becomes
147
148        f a (d::Num a) = let p = +.sel a d
149                         in
150                         letrec fl (y::a) = fl (p y y)
151                         in
152                         fl
153
154We still have recursion for non-overloaded functions which we
155specialise, but the recursive call should get specialised to the
156same recursive version.
157
158
159Polymorphism 1
160~~~~~~~~~~~~~~
161
162All this is crystal clear when the function is applied to *constant
163types*; that is, types which have no type variables inside.  But what if
164it is applied to non-constant types?  Suppose we find a call of f at type
165t1/t2.  There are two possibilities:
166
167(a) The free type variables of t1, t2 are in scope at the definition point
168of f.  In this case there's no problem, we proceed just as before.  A common
169example is as follows.  Here's the Haskell:
170
171        g y = let f x = x+x
172              in f y + f y
173
174After typechecking we have
175
176        g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
177                                in +.sel a d (f a d y) (f a d y)
178
179Notice that the call to f is at type type "a"; a non-constant type.
180Both calls to f are at the same type, so we can specialise to give:
181
182        g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
183                                in +.sel a d (f@a y) (f@a y)
184
185
186(b) The other case is when the type variables in the instance types
187are *not* in scope at the definition point of f.  The example we are
188working with above is a good case.  There are two instances of (+.sel a d),
189but "a" is not in scope at the definition of +.sel.  Can we do anything?
190Yes, we can "common them up", a sort of limited common sub-expression deal.
191This would give:
192
193        g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
194                                    f@a (x::a) = +.sel@a x x
195                                in +.sel@a (f@a y) (f@a y)
196
197This can save work, and can't be spotted by the type checker, because
198the two instances of +.sel weren't originally at the same type.
199
200Further notes on (b)
201
202* There are quite a few variations here.  For example, the defn of
203  +.sel could be floated ouside the \y, to attempt to gain laziness.
204  It certainly mustn't be floated outside the \d because the d has to
205  be in scope too.
206
207* We don't want to inline f_rhs in this case, because
208that will duplicate code.  Just commoning up the call is the point.
209
210* Nothing gets added to +.sel's IdInfo.
211
212* Don't bother unless the equivalence class has more than one item!
213
214Not clear whether this is all worth it.  It is of course OK to
215simply discard call-instances when passing a big lambda.
216
217Polymorphism 2 -- Overloading
218~~~~~~~~~~~~~~
219Consider a function whose most general type is
220
221        f :: forall a b. Ord a => [a] -> b -> b
222
223There is really no point in making a version of g at Int/Int and another
224at Int/Bool, because it's only instantiating the type variable "a" which
225buys us any efficiency. Since g is completely polymorphic in b there
226ain't much point in making separate versions of g for the different
227b types.
228
229That suggests that we should identify which of g's type variables
230are constrained (like "a") and which are unconstrained (like "b").
231Then when taking equivalence classes in STEP 2, we ignore the type args
232corresponding to unconstrained type variable.  In STEP 3 we make
233polymorphic versions.  Thus:
234
235        f@t1/ = /\b -> <f_rhs> t1 b d1 d2
236
237We do this.
238
239
240Dictionary floating
241~~~~~~~~~~~~~~~~~~~
242Consider this
243
244        f a (d::Num a) = let g = ...
245                         in
246                         ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
247
248Here, g is only called at one type, but the dictionary isn't in scope at the
249definition point for g.  Usually the type checker would build a
250definition for d1 which enclosed g, but the transformation system
251might have moved d1's defn inward.  Solution: float dictionary bindings
252outwards along with call instances.
253
254Consider
255
256        f x = let g p q = p==q
257                  h r s = (r+s, g r s)
258              in
259              h x x
260
261
262Before specialisation, leaving out type abstractions we have
263
264        f df x = let g :: Eq a => a -> a -> Bool
265                     g dg p q = == dg p q
266                     h :: Num a => a -> a -> (a, Bool)
267                     h dh r s = let deq = eqFromNum dh
268                                in (+ dh r s, g deq r s)
269              in
270              h df x x
271
272After specialising h we get a specialised version of h, like this:
273
274                    h' r s = let deq = eqFromNum df
275                             in (+ df r s, g deq r s)
276
277But we can't naively make an instance for g from this, because deq is not in scope
278at the defn of g.  Instead, we have to float out the (new) defn of deq
279to widen its scope.  Notice that this floating can't be done in advance -- it only
280shows up when specialisation is done.
281
282User SPECIALIZE pragmas
283~~~~~~~~~~~~~~~~~~~~~~~
284Specialisation pragmas can be digested by the type checker, and implemented
285by adding extra definitions along with that of f, in the same way as before
286
287        f@t1/t2 = <f_rhs> t1 t2 d1 d2
288
289Indeed the pragmas *have* to be dealt with by the type checker, because
290only it knows how to build the dictionaries d1 and d2!  For example
291
292        g :: Ord a => [a] -> [a]
293        {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
294
295Here, the specialised version of g is an application of g's rhs to the
296Ord dictionary for (Tree Int), which only the type checker can conjure
297up.  There might not even *be* one, if (Tree Int) is not an instance of
298Ord!  (All the other specialision has suitable dictionaries to hand
299from actual calls.)
300
301Problem.  The type checker doesn't have to hand a convenient <f_rhs>, because
302it is buried in a complex (as-yet-un-desugared) binding group.
303Maybe we should say
304
305        f@t1/t2 = f* t1 t2 d1 d2
306
307where f* is the Id f with an IdInfo which says "inline me regardless!".
308Indeed all the specialisation could be done in this way.
309That in turn means that the simplifier has to be prepared to inline absolutely
310any in-scope let-bound thing.
311
312
313Again, the pragma should permit polymorphism in unconstrained variables:
314
315        h :: Ord a => [a] -> b -> b
316        {-# SPECIALIZE h :: [Int] -> b -> b #-}
317
318We *insist* that all overloaded type variables are specialised to ground types,
319(and hence there can be no context inside a SPECIALIZE pragma).
320We *permit* unconstrained type variables to be specialised to
321        - a ground type
322        - or left as a polymorphic type variable
323but nothing in between.  So
324
325        {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
326
327is *illegal*.  (It can be handled, but it adds complication, and gains the
328programmer nothing.)
329
330
331SPECIALISING INSTANCE DECLARATIONS
332~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
333Consider
334
335        instance Foo a => Foo [a] where
336                ...
337        {-# SPECIALIZE instance Foo [Int] #-}
338
339The original instance decl creates a dictionary-function
340definition:
341
342        dfun.Foo.List :: forall a. Foo a -> Foo [a]
343
344The SPECIALIZE pragma just makes a specialised copy, just as for
345ordinary function definitions:
346
347        dfun.Foo.List@Int :: Foo [Int]
348        dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
349
350The information about what instance of the dfun exist gets added to
351the dfun's IdInfo in the same way as a user-defined function too.
352
353
354Automatic instance decl specialisation?
355~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
356Can instance decls be specialised automatically?  It's tricky.
357We could collect call-instance information for each dfun, but
358then when we specialised their bodies we'd get new call-instances
359for ordinary functions; and when we specialised their bodies, we might get
360new call-instances of the dfuns, and so on.  This all arises because of
361the unrestricted mutual recursion between instance decls and value decls.
362
363Still, there's no actual problem; it just means that we may not do all
364the specialisation we could theoretically do.
365
366Furthermore, instance decls are usually exported and used non-locally,
367so we'll want to compile enough to get those specialisations done.
368
369Lastly, there's no such thing as a local instance decl, so we can
370survive solely by spitting out *usage* information, and then reading that
371back in as a pragma when next compiling the file.  So for now,
372we only specialise instance decls in response to pragmas.
373
374
375SPITTING OUT USAGE INFORMATION
376~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
377
378To spit out usage information we need to traverse the code collecting
379call-instance information for all imported (non-prelude?) functions
380and data types. Then we equivalence-class it and spit it out.
381
382This is done at the top-level when all the call instances which escape
383must be for imported functions and data types.
384
385*** Not currently done ***
386
387
388Partial specialisation by pragmas
389~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
390What about partial specialisation:
391
392        k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
393        {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
394
395or even
396
397        {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
398
399Seems quite reasonable.  Similar things could be done with instance decls:
400
401        instance (Foo a, Foo b) => Foo (a,b) where
402                ...
403        {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
404        {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
405
406Ho hum.  Things are complex enough without this.  I pass.
407
408
409Requirements for the simplifier
410~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
411The simplifier has to be able to take advantage of the specialisation.
412
413* When the simplifier finds an application of a polymorphic f, it looks in
414f's IdInfo in case there is a suitable instance to call instead.  This converts
415
416        f t1 t2 d1 d2   ===>   f_t1_t2
417
418Note that the dictionaries get eaten up too!
419
420* Dictionary selection operations on constant dictionaries must be
421  short-circuited:
422
423        +.sel Int d     ===>  +Int
424
425The obvious way to do this is in the same way as other specialised
426calls: +.sel has inside it some IdInfo which tells that if it's applied
427to the type Int then it should eat a dictionary and transform to +Int.
428
429In short, dictionary selectors need IdInfo inside them for constant
430methods.
431
432* Exactly the same applies if a superclass dictionary is being
433  extracted:
434
435        Eq.sel Int d   ===>   dEqInt
436
437* Something similar applies to dictionary construction too.  Suppose
438dfun.Eq.List is the function taking a dictionary for (Eq a) to
439one for (Eq [a]).  Then we want
440
441        dfun.Eq.List Int d      ===> dEq.List_Int
442
443Where does the Eq [Int] dictionary come from?  It is built in
444response to a SPECIALIZE pragma on the Eq [a] instance decl.
445
446In short, dfun Ids need IdInfo with a specialisation for each
447constant instance of their instance declaration.
448
449All this uses a single mechanism: the SpecEnv inside an Id
450
451
452What does the specialisation IdInfo look like?
453~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
454
455The SpecEnv of an Id maps a list of types (the template) to an expression
456
457        [Type]  |->  Expr
458
459For example, if f has this RuleInfo:
460
461        [Int, a]  ->  \d:Ord Int. f' a
462
463it means that we can replace the call
464
465        f Int t  ===>  (\d. f' t)
466
467This chucks one dictionary away and proceeds with the
468specialised version of f, namely f'.
469
470
471What can't be done this way?
472~~~~~~~~~~~~~~~~~~~~~~~~~~~~
473There is no way, post-typechecker, to get a dictionary for (say)
474Eq a from a dictionary for Eq [a].  So if we find
475
476        ==.sel [t] d
477
478we can't transform to
479
480        eqList (==.sel t d')
481
482where
483        eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
484
485Of course, we currently have no way to automatically derive
486eqList, nor to connect it to the Eq [a] instance decl, but you
487can imagine that it might somehow be possible.  Taking advantage
488of this is permanently ruled out.
489
490Still, this is no great hardship, because we intend to eliminate
491overloading altogether anyway!
492
493A note about non-tyvar dictionaries
494~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
495Some Ids have types like
496
497        forall a,b,c. Eq a -> Ord [a] -> tau
498
499This seems curious at first, because we usually only have dictionary
500args whose types are of the form (C a) where a is a type variable.
501But this doesn't hold for the functions arising from instance decls,
502which sometimes get arguments with types of form (C (T a)) for some
503type constructor T.
504
505Should we specialise wrt this compound-type dictionary?  We used to say
506"no", saying:
507        "This is a heuristic judgement, as indeed is the fact that we
508        specialise wrt only dictionaries.  We choose *not* to specialise
509        wrt compound dictionaries because at the moment the only place
510        they show up is in instance decls, where they are simply plugged
511        into a returned dictionary.  So nothing is gained by specialising
512        wrt them."
513
514But it is simpler and more uniform to specialise wrt these dicts too;
515and in future GHC is likely to support full fledged type signatures
516like
517        f :: Eq [(a,b)] => ...
518
519
520************************************************************************
521*                                                                      *
522\subsubsection{The new specialiser}
523*                                                                      *
524************************************************************************
525
526Our basic game plan is this.  For let(rec) bound function
527        f :: (C a, D c) => (a,b,c,d) -> Bool
528
529* Find any specialised calls of f, (f ts ds), where
530  ts are the type arguments t1 .. t4, and
531  ds are the dictionary arguments d1 .. d2.
532
533* Add a new definition for f1 (say):
534
535        f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
536
537  Note that we abstract over the unconstrained type arguments.
538
539* Add the mapping
540
541        [t1,b,t3,d]  |->  \d1 d2 -> f1 b d
542
543  to the specialisations of f.  This will be used by the
544  simplifier to replace calls
545                (f t1 t2 t3 t4) da db
546  by
547                (\d1 d1 -> f1 t2 t4) da db
548
549  All the stuff about how many dictionaries to discard, and what types
550  to apply the specialised function to, are handled by the fact that the
551  SpecEnv contains a template for the result of the specialisation.
552
553We don't build *partial* specialisations for f.  For example:
554
555  f :: Eq a => a -> a -> Bool
556  {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
557
558Here, little is gained by making a specialised copy of f.
559There's a distinct danger that the specialised version would
560first build a dictionary for (Eq b, Eq c), and then select the (==)
561method from it!  Even if it didn't, not a great deal is saved.
562
563We do, however, generate polymorphic, but not overloaded, specialisations:
564
565  f :: Eq a => [a] -> b -> b -> b
566  ... SPECIALISE f :: [Int] -> b -> b -> b ...
567
568Hence, the invariant is this:
569
570        *** no specialised version is overloaded ***
571
572
573************************************************************************
574*                                                                      *
575\subsubsection{The exported function}
576*                                                                      *
577************************************************************************
578-}
579
580-- | Specialise calls to type-class overloaded functions occuring in a program.
581specProgram :: ModGuts -> CoreM ModGuts
582specProgram guts@(ModGuts { mg_module = this_mod
583                          , mg_rules = local_rules
584                          , mg_binds = binds })
585  = do { dflags <- getDynFlags
586
587             -- Specialise the bindings of this module
588       ; (binds', uds) <- runSpecM dflags this_mod (go binds)
589
590       ; (spec_rules, spec_binds) <- specImports dflags this_mod top_env
591                                                 local_rules uds
592
593       ; return (guts { mg_binds = spec_binds ++ binds'
594                      , mg_rules = spec_rules ++ local_rules }) }
595  where
596        -- We need to start with a Subst that knows all the things
597        -- that are in scope, so that the substitution engine doesn't
598        -- accidentally re-use a unique that's already in use
599        -- Easiest thing is to do it all at once, as if all the top-level
600        -- decls were mutually recursive
601    top_env = SE { se_subst = Core.mkEmptySubst $ mkInScopeSet $ mkVarSet $
602                              bindersOfBinds binds
603                 , se_interesting = emptyVarSet }
604
605    go []           = return ([], emptyUDs)
606    go (bind:binds) = do (binds', uds) <- go binds
607                         (bind', uds') <- specBind top_env bind uds
608                         return (bind' ++ binds', uds')
609
610{-
611Note [Wrap bindings returned by specImports]
612~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
613'specImports' returns a set of specialized bindings. However, these are lacking
614necessary floated dictionary bindings, which are returned by
615UsageDetails(ud_binds). These dictionaries need to be brought into scope with
616'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
617for instance, the 'specImports' call in 'specProgram'.
618
619
620Note [Disabling cross-module specialisation]
621~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
622Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
623in modules outside of the current module. This can sometimes uncover user code
624which explodes in size when aggressively optimized. The
625-fno-cross-module-specialise option was introduced to allow users to being
626bitten by such instances to revert to the pre-7.10 behavior.
627
628See #10491
629-}
630
631
632{- *********************************************************************
633*                                                                      *
634                   Specialising imported functions
635*                                                                      *
636********************************************************************* -}
637
638specImports :: DynFlags -> Module -> SpecEnv
639            -> [CoreRule]
640            -> UsageDetails
641            -> CoreM ([CoreRule], [CoreBind])
642specImports dflags this_mod top_env local_rules
643            (MkUD { ud_binds = dict_binds, ud_calls = calls })
644  | not $ gopt Opt_CrossModuleSpecialise dflags
645    -- See Note [Disabling cross-module specialisation]
646  = return ([], wrapDictBinds dict_binds [])
647
648  | otherwise
649  = do { hpt_rules <- getRuleBase
650       ; let rule_base = extendRuleBaseList hpt_rules local_rules
651
652       ; (spec_rules, spec_binds) <- spec_imports dflags this_mod top_env
653                                                  [] rule_base
654                                                  dict_binds calls
655
656             -- Don't forget to wrap the specialized bindings with
657             -- bindings for the needed dictionaries.
658             -- See Note [Wrap bindings returned by specImports]
659             -- and Note [Glom the bindings if imported functions are specialised]
660       ; let final_binds
661               | null spec_binds = wrapDictBinds dict_binds []
662               | otherwise       = [Rec $ flattenBinds $
663                                    wrapDictBinds dict_binds spec_binds]
664
665       ; return (spec_rules, final_binds)
666    }
667
668-- | Specialise a set of calls to imported bindings
669spec_imports :: DynFlags
670             -> Module
671             -> SpecEnv          -- Passed in so that all top-level Ids are in scope
672             -> [Id]             -- Stack of imported functions being specialised
673                                 -- See Note [specImport call stack]
674             -> RuleBase         -- Rules from this module and the home package
675                                 -- (but not external packages, which can change)
676             -> Bag DictBind     -- Dict bindings, used /only/ for filterCalls
677                                 -- See Note [Avoiding loops in specImports]
678             -> CallDetails      -- Calls for imported things
679             -> CoreM ( [CoreRule]   -- New rules
680                      , [CoreBind] ) -- Specialised bindings
681spec_imports dflags this_mod top_env
682             callers rule_base dict_binds calls
683  = do { let import_calls = dVarEnvElts calls
684       -- ; debugTraceMsg (text "specImports {" <+>
685       --                  vcat [ text "calls:" <+> ppr import_calls
686       --                       , text "dict_binds:" <+> ppr dict_binds ])
687       ; (rules, spec_binds) <- go rule_base import_calls
688       -- ; debugTraceMsg (text "End specImports }" <+> ppr import_calls)
689
690       ; return (rules, spec_binds) }
691  where
692    go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
693    go _ [] = return ([], [])
694    go rb (cis : other_calls)
695      = do { -- debugTraceMsg (text "specImport {" <+> ppr cis)
696           ; (rules1, spec_binds1) <- spec_import dflags this_mod top_env
697                                                  callers rb dict_binds cis
698           -- ; debugTraceMsg (text "specImport }" <+> ppr cis)
699
700           ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
701           ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
702
703spec_import :: DynFlags
704            -> Module
705            -> SpecEnv               -- Passed in so that all top-level Ids are in scope
706            -> [Id]                  -- Stack of imported functions being specialised
707                                     -- See Note [specImport call stack]
708            -> RuleBase              -- Rules from this module
709            -> Bag DictBind          -- Dict bindings, used /only/ for filterCalls
710                                     -- See Note [Avoiding loops in specImports]
711            -> CallInfoSet           -- Imported function and calls for it
712            -> CoreM ( [CoreRule]    -- New rules
713                     , [CoreBind] )  -- Specialised bindings
714spec_import dflags this_mod top_env callers
715            rb dict_binds cis@(CIS fn _)
716  | isIn "specImport" fn callers
717  = return ([], [])     -- No warning.  This actually happens all the time
718                        -- when specialising a recursive function, because
719                        -- the RHS of the specialised function contains a recursive
720                        -- call to the original function
721
722  | null good_calls
723  = do { -- debugTraceMsg (text "specImport:no valid calls")
724       ; return ([], []) }
725
726  | wantSpecImport dflags unfolding
727  , Just rhs <- maybeUnfoldingTemplate unfolding
728  = do {     -- Get rules from the external package state
729             -- We keep doing this in case we "page-fault in"
730             -- more rules as we go along
731       ; hsc_env <- getHscEnv
732       ; eps <- liftIO $ hscEPS hsc_env
733       ; vis_orphs <- getVisibleOrphanMods
734       ; let full_rb = unionRuleBase rb (eps_rule_base eps)
735             rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
736
737       ; (rules1, spec_pairs, MkUD { ud_binds = dict_binds1, ud_calls = new_calls })
738             <- do { -- debugTraceMsg (text "specImport1" <+> vcat [ppr fn, ppr good_calls, ppr rhs])
739                   ; runSpecM dflags this_mod $
740                     specCalls (Just this_mod) top_env rules_for_fn good_calls fn rhs }
741       ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
742             -- After the rules kick in we may get recursion, but
743             -- we rely on a global GlomBinds to sort that out later
744             -- See Note [Glom the bindings if imported functions are specialised]
745
746              -- Now specialise any cascaded calls
747       -- ; debugTraceMsg (text "specImport 2" <+> (ppr fn $$ ppr rules1 $$ ppr spec_binds1))
748       ; (rules2, spec_binds2) <- spec_imports dflags this_mod top_env
749                                               (fn:callers)
750                                               (extendRuleBaseList rb rules1)
751                                               (dict_binds `unionBags` dict_binds1)
752                                               new_calls
753
754       ; let final_binds = wrapDictBinds dict_binds1 $
755                           spec_binds2 ++ spec_binds1
756
757       ; return (rules2 ++ rules1, final_binds) }
758
759  | otherwise
760  = do { tryWarnMissingSpecs dflags callers fn good_calls
761       ; return ([], [])}
762
763  where
764    unfolding = realIdUnfolding fn   -- We want to see the unfolding even for loop breakers
765    good_calls = filterCalls cis dict_binds
766       -- SUPER IMPORTANT!  Drop calls that (directly or indirectly) refer to fn
767       -- See Note [Avoiding loops in specImports]
768
769-- | Returns whether or not to show a missed-spec warning.
770-- If -Wall-missed-specializations is on, show the warning.
771-- Otherwise, if -Wmissed-specializations is on, only show a warning
772-- if there is at least one imported function being specialized,
773-- and if all imported functions are marked with an inline pragma
774-- Use the most specific warning as the reason.
775tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM ()
776-- See Note [Warning about missed specialisations]
777tryWarnMissingSpecs dflags callers fn calls_for_fn
778  | wopt Opt_WarnMissedSpecs dflags
779    && not (null callers)
780    && allCallersInlined                  = doWarn $ Reason Opt_WarnMissedSpecs
781  | wopt Opt_WarnAllMissedSpecs dflags    = doWarn $ Reason Opt_WarnAllMissedSpecs
782  | otherwise                             = return ()
783  where
784    allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers
785    doWarn reason =
786      warnMsg reason
787        (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn))
788                2 (vcat [ text "when specialising" <+> quotes (ppr caller)
789                        | caller <- callers])
790          , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
791          , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
792
793wantSpecImport :: DynFlags -> Unfolding -> Bool
794-- See Note [Specialise imported INLINABLE things]
795wantSpecImport dflags unf
796 = case unf of
797     NoUnfolding      -> False
798     BootUnfolding    -> False
799     OtherCon {}      -> False
800     DFunUnfolding {} -> True
801     CoreUnfolding { uf_src = src, uf_guidance = _guidance }
802       | gopt Opt_SpecialiseAggressively dflags -> True
803       | isStableSource src -> True
804               -- Specialise even INLINE things; it hasn't inlined yet,
805               -- so perhaps it never will.  Moreover it may have calls
806               -- inside it that we want to specialise
807       | otherwise -> False    -- Stable, not INLINE, hence INLINABLE
808
809{- Note [Avoiding loops in specImports]
810~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
811We must take great care when specialising instance declarations
812(functions like $fOrdList) lest we accidentally build a recursive
813dictionary. See Note [Avoiding loops].
814
815The basic strategy of Note [Avoiding loops] is to use filterCalls
816to discard loopy specialisations.  But to do that we must ensure
817that the in-scope dict-binds (passed to filterCalls) contains
818all the needed dictionary bindings.  In particular, in the recursive
819call to spec_imorpts in spec_import, we must include the dict-binds
820from the parent.  Lacking this caused #17151, a really nasty bug.
821
822Here is what happened.
823* Class struture:
824    Source is a superclass of Mut
825    Index is a superclass of Source
826
827* We started with these dict binds
828    dSource = $fSourcePix @Int $fIndexInt
829    dIndex  = sc_sel dSource
830    dMut    = $fMutPix @Int dIndex
831  and these calls to specialise
832    $fMutPix @Int dIndex
833    $fSourcePix @Int $fIndexInt
834
835* We specialised the call ($fMutPix @Int dIndex)
836  ==> new call ($fSourcePix @Int dIndex)
837      (because Source is a superclass of Mut)
838
839* We specialised ($fSourcePix @Int dIndex)
840  ==> produces specialised dict $s$fSourcePix,
841      a record with dIndex as a field
842      plus RULE forall d. ($fSourcePix @Int d) = $s$fSourcePix
843  *** This is the bogus step ***
844
845* Now we decide not to specialise the call
846    $fSourcePix @Int $fIndexInt
847  because we alredy have a RULE that matches it
848
849* Finally the simplifer rewrites
850    dSource = $fSourcePix @Int $fIndexInt
851    ==>  dSource = $s$fSourcePix
852
853Disaster. Now we have
854
855Rewrite dSource's RHS to $s$fSourcePix   Disaster
856    dSource = $s$fSourcePix
857    dIndex  = sc_sel dSource
858    $s$fSourcePix = MkSource dIndex ...
859
860Solution: filterCalls should have stopped the bogus step,
861by seeing that dIndex transitively uses $fSourcePix. But
862it can only do that if it sees all the dict_binds.  Wow.
863
864--------------
865Here's another example (#13429).  Suppose we have
866  class Monoid v => C v a where ...
867
868We start with a call
869   f @ [Integer] @ Integer $fC[]Integer
870
871Specialising call to 'f' gives dict bindings
872   $dMonoid_1 :: Monoid [Integer]
873   $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
874
875   $dC_1 :: C [Integer] (Node [Integer] Integer)
876   $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
877
878...plus a recursive call to
879   f @ [Integer] @ (Node [Integer] Integer) $dC_1
880
881Specialising that call gives
882   $dMonoid_2  :: Monoid [Integer]
883   $dMonoid_2  = M.$p1C @ [Integer] $dC_1
884
885   $dC_2 :: C [Integer] (Node [Integer] Integer)
886   $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
887
888Now we have two calls to the imported function
889  M.$fCvNode :: Monoid v => C v a
890  M.$fCvNode @v @a m = C m some_fun
891
892But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
893for specialisation, else we get:
894
895  $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
896  $dMonoid_2 = M.$p1C @ [Integer] $dC_1
897  $s$fCvNode = C $dMonoid_2 ...
898    RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
899
900Now use the rule to rewrite the call in the RHS of $dC_1
901and we get a loop!
902
903
904Note [specImport call stack]
905~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
906When specialising an imports function 'f', we may get new calls
907of an imported fuction 'g', which we want to specialise in turn,
908and similarly specialising 'g' might expose a new call to 'h'.
909
910We track the stack of enclosing functions. So when specialising 'h' we
911haev a specImport call stack of [g,f]. We do this for two reasons:
912* Note [Warning about missed specialisations]
913* Note [Avoiding recursive specialisation]
914
915Note [Warning about missed specialisations]
916~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
917Suppose
918 * In module Lib, you carefully mark a function 'foo' INLINABLE
919 * Import Lib(foo) into another module M
920 * Call 'foo' at some specialised type in M
921Then you jolly well expect it to be specialised in M.  But what if
922'foo' calls another function 'Lib.bar'.  Then you'd like 'bar' to be
923specialised too.  But if 'bar' is not marked INLINABLE it may well
924not be specialised.  The warning Opt_WarnMissedSpecs warns about this.
925
926It's more noisy to warning about a missed specialisation opportunity
927for /every/ overloaded imported function, but sometimes useful. That
928is what Opt_WarnAllMissedSpecs does.
929
930ToDo: warn about missed opportunities for local functions.
931
932Note [Avoiding recursive specialisation]
933~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
934When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
935'f's RHS.  So we want to specialise g,h.  But we don't want to
936specialise f any more!  It's possible that f's RHS might have a
937recursive yet-more-specialised call, so we'd diverge in that case.
938And if the call is to the same type, one specialisation is enough.
939Avoiding this recursive specialisation loop is one reason for the
940'callers' stack passed to specImports and specImport.
941
942Note [Specialise imported INLINABLE things]
943~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
944What imported functions do we specialise?  The basic set is
945 * DFuns and things with INLINABLE pragmas.
946but with -fspecialise-aggressively we add
947 * Anything with an unfolding template
948
949#8874 has a good example of why we want to auto-specialise DFuns.
950
951We have the -fspecialise-aggressively flag (usually off), because we
952risk lots of orphan modules from over-vigorous specialisation.
953However it's not a big deal: anything non-recursive with an
954unfolding-template will probably have been inlined already.
955
956Note [Glom the bindings if imported functions are specialised]
957~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
958Suppose we have an imported, *recursive*, INLINABLE function
959   f :: Eq a => a -> a
960   f = /\a \d x. ...(f a d)...
961In the module being compiled we have
962   g x = f (x::Int)
963Now we'll make a specialised function
964   f_spec :: Int -> Int
965   f_spec = \x -> ...(f Int dInt)...
966   {-# RULE  f Int _ = f_spec #-}
967   g = \x. f Int dInt x
968Note that f_spec doesn't look recursive
969After rewriting with the RULE, we get
970   f_spec = \x -> ...(f_spec)...
971BUT since f_spec was non-recursive before it'll *stay* non-recursive.
972The occurrence analyser never turns a NonRec into a Rec.  So we must
973make sure that f_spec is recursive.  Easiest thing is to make all
974the specialisations for imported bindings recursive.
975
976
977
978************************************************************************
979*                                                                      *
980\subsubsection{@specExpr@: the main function}
981*                                                                      *
982************************************************************************
983-}
984
985data SpecEnv
986  = SE { se_subst :: Core.Subst
987             -- We carry a substitution down:
988             -- a) we must clone any binding that might float outwards,
989             --    to avoid name clashes
990             -- b) we carry a type substitution to use when analysing
991             --    the RHS of specialised bindings (no type-let!)
992
993
994       , se_interesting :: VarSet
995             -- Dict Ids that we know something about
996             -- and hence may be worth specialising against
997             -- See Note [Interesting dictionary arguments]
998     }
999
1000instance Outputable SpecEnv where
1001  ppr (SE { se_subst = subst, se_interesting = interesting })
1002    = text "SE" <+> braces (sep $ punctuate comma
1003        [ text "subst =" <+> ppr subst
1004        , text "interesting =" <+> ppr interesting ])
1005
1006specVar :: SpecEnv -> Id -> CoreExpr
1007specVar env v = Core.lookupIdSubst (text "specVar") (se_subst env) v
1008
1009specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
1010
1011---------------- First the easy cases --------------------
1012specExpr env (Type ty)     = return (Type     (substTy env ty), emptyUDs)
1013specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
1014specExpr env (Var v)       = return (specVar env v, emptyUDs)
1015specExpr _   (Lit lit)     = return (Lit lit,       emptyUDs)
1016specExpr env (Cast e co)
1017  = do { (e', uds) <- specExpr env e
1018       ; return ((mkCast e' (substCo env co)), uds) }
1019specExpr env (Tick tickish body)
1020  = do { (body', uds) <- specExpr env body
1021       ; return (Tick (specTickish env tickish) body', uds) }
1022
1023---------------- Applications might generate a call instance --------------------
1024specExpr env expr@(App {})
1025  = go expr []
1026  where
1027    go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
1028                               (fun', uds_app) <- go fun (arg':args)
1029                               return (App fun' arg', uds_arg `plusUDs` uds_app)
1030
1031    go (Var f)       args = case specVar env f of
1032                                Var f' -> return (Var f', mkCallUDs env f' args)
1033                                e'     -> return (e', emptyUDs) -- I don't expect this!
1034    go other         _    = specExpr env other
1035
1036---------------- Lambda/case require dumping of usage details --------------------
1037specExpr env e@(Lam _ _) = do
1038    (body', uds) <- specExpr env' body
1039    let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
1040    return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
1041  where
1042    (bndrs, body) = collectBinders e
1043    (env', bndrs') = substBndrs env bndrs
1044        -- More efficient to collect a group of binders together all at once
1045        -- and we don't want to split a lambda group with dumped bindings
1046
1047specExpr env (Case scrut case_bndr ty alts)
1048  = do { (scrut', scrut_uds) <- specExpr env scrut
1049       ; (scrut'', case_bndr', alts', alts_uds)
1050             <- specCase env scrut' case_bndr alts
1051       ; return (Case scrut'' case_bndr' (substTy env ty) alts'
1052                , scrut_uds `plusUDs` alts_uds) }
1053
1054---------------- Finally, let is the interesting case --------------------
1055specExpr env (Let bind body)
1056  = do { -- Clone binders
1057         (rhs_env, body_env, bind') <- cloneBindSM env bind
1058
1059         -- Deal with the body
1060       ; (body', body_uds) <- specExpr body_env body
1061
1062        -- Deal with the bindings
1063      ; (binds', uds) <- specBind rhs_env bind' body_uds
1064
1065        -- All done
1066      ; return (foldr Let body' binds', uds) }
1067
1068--------------
1069specLam :: SpecEnv -> [OutBndr] -> InExpr -> SpecM (OutExpr, UsageDetails)
1070-- The binders have been substituted, but the body has not
1071specLam env bndrs body
1072  | null bndrs
1073  = specExpr env body
1074  | otherwise
1075  = do { (body', uds) <- specExpr env body
1076       ; let (free_uds, dumped_dbs) = dumpUDs bndrs uds
1077       ; return (mkLams bndrs (wrapDictBindsE dumped_dbs body'), free_uds) }
1078
1079--------------
1080specTickish :: SpecEnv -> Tickish Id -> Tickish Id
1081specTickish env (Breakpoint ix ids)
1082  = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
1083  -- drop vars from the list if they have a non-variable substitution.
1084  -- should never happen, but it's harmless to drop them anyway.
1085specTickish _ other_tickish = other_tickish
1086
1087--------------
1088specCase :: SpecEnv
1089         -> CoreExpr            -- Scrutinee, already done
1090         -> Id -> [CoreAlt]
1091         -> SpecM ( CoreExpr    -- New scrutinee
1092                  , Id
1093                  , [CoreAlt]
1094                  , UsageDetails)
1095specCase env scrut' case_bndr [(con, args, rhs)]
1096  | isDictId case_bndr           -- See Note [Floating dictionaries out of cases]
1097  , interestingDict env scrut'
1098  , not (isDeadBinder case_bndr && null sc_args')
1099  = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
1100
1101       ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
1102                              [(con, args', Var sc_arg')]
1103                       | sc_arg' <- sc_args' ]
1104
1105             -- Extend the substitution for RHS to map the *original* binders
1106             -- to their floated versions.
1107             mb_sc_flts :: [Maybe DictId]
1108             mb_sc_flts = map (lookupVarEnv clone_env) args'
1109             clone_env  = zipVarEnv sc_args' sc_args_flt
1110             subst_prs  = (case_bndr, Var case_bndr_flt)
1111                        : [ (arg, Var sc_flt)
1112                          | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
1113             env_rhs' = env_rhs { se_subst = Core.extendIdSubstList (se_subst env_rhs) subst_prs
1114                                , se_interesting = se_interesting env_rhs `extendVarSetList`
1115                                                   (case_bndr_flt : sc_args_flt) }
1116
1117       ; (rhs', rhs_uds)   <- specExpr env_rhs' rhs
1118       ; let scrut_bind    = mkDB (NonRec case_bndr_flt scrut')
1119             case_bndr_set = unitVarSet case_bndr_flt
1120             sc_binds      = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
1121                             | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
1122             flt_binds     = scrut_bind : sc_binds
1123             (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
1124             all_uds = flt_binds `addDictBinds` free_uds
1125             alt'    = (con, args', wrapDictBindsE dumped_dbs rhs')
1126       ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
1127  where
1128    (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
1129    sc_args' = filter is_flt_sc_arg args'
1130
1131    clone_me bndr = do { uniq <- getUniqueM
1132                       ; return (mkUserLocalOrCoVar occ uniq ty loc) }
1133       where
1134         name = idName bndr
1135         ty   = idType bndr
1136         occ  = nameOccName name
1137         loc  = getSrcSpan name
1138
1139    arg_set = mkVarSet args'
1140    is_flt_sc_arg var =  isId var
1141                      && not (isDeadBinder var)
1142                      && isDictTy var_ty
1143                      && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
1144       where
1145         var_ty = idType var
1146
1147
1148specCase env scrut case_bndr alts
1149  = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
1150       ; return (scrut, case_bndr', alts', uds_alts) }
1151  where
1152    (env_alt, case_bndr') = substBndr env case_bndr
1153    spec_alt (con, args, rhs) = do
1154          (rhs', uds) <- specExpr env_rhs rhs
1155          let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
1156          return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
1157        where
1158          (env_rhs, args') = substBndrs env_alt args
1159
1160{-
1161Note [Floating dictionaries out of cases]
1162~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1163Consider
1164   g = \d. case d of { MkD sc ... -> ...(f sc)... }
1165Naively we can't float d2's binding out of the case expression,
1166because 'sc' is bound by the case, and that in turn means we can't
1167specialise f, which seems a pity.
1168
1169So we invert the case, by floating out a binding
1170for 'sc_flt' thus:
1171    sc_flt = case d of { MkD sc ... -> sc }
1172Now we can float the call instance for 'f'.  Indeed this is just
1173what'll happen if 'sc' was originally bound with a let binding,
1174but case is more efficient, and necessary with equalities. So it's
1175good to work with both.
1176
1177You might think that this won't make any difference, because the
1178call instance will only get nuked by the \d.  BUT if 'g' itself is
1179specialised, then transitively we should be able to specialise f.
1180
1181In general, given
1182   case e of cb { MkD sc ... -> ...(f sc)... }
1183we transform to
1184   let cb_flt = e
1185       sc_flt = case cb_flt of { MkD sc ... -> sc }
1186   in
1187   case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1188
1189The "_flt" things are the floated binds; we use the current substitution
1190to substitute sc -> sc_flt in the RHS
1191
1192************************************************************************
1193*                                                                      *
1194                     Dealing with a binding
1195*                                                                      *
1196************************************************************************
1197-}
1198
1199specBind :: SpecEnv                     -- Use this for RHSs
1200         -> CoreBind                    -- Binders are already cloned by cloneBindSM,
1201                                        -- but RHSs are un-processed
1202         -> UsageDetails                -- Info on how the scope of the binding
1203         -> SpecM ([CoreBind],          -- New bindings
1204                   UsageDetails)        -- And info to pass upstream
1205
1206-- Returned UsageDetails:
1207--    No calls for binders of this bind
1208specBind rhs_env (NonRec fn rhs) body_uds
1209  = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1210
1211        ; let zapped_fn = zapIdDemandInfo fn
1212              -- We zap the demand info because the binding may float,
1213              -- which would invaidate the demand info (see #17810 for example).
1214              -- Destroying demand info is not terrible; specialisation is
1215              -- always followed soon by demand analysis.
1216      ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds zapped_fn rhs
1217
1218       ; let pairs = spec_defns ++ [(fn', rhs')]
1219                        -- fn' mentions the spec_defns in its rules,
1220                        -- so put the latter first
1221
1222             combined_uds = body_uds1 `plusUDs` rhs_uds
1223
1224             (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1225
1226             final_binds :: [DictBind]
1227             -- See Note [From non-recursive to recursive]
1228             final_binds
1229               | not (isEmptyBag dump_dbs)
1230               , not (null spec_defns)
1231               = [recWithDumpedDicts pairs dump_dbs]
1232               | otherwise
1233               = [mkDB $ NonRec b r | (b,r) <- pairs]
1234                 ++ bagToList dump_dbs
1235
1236       ; if float_all then
1237             -- Rather than discard the calls mentioning the bound variables
1238             -- we float this (dictionary) binding along with the others
1239              return ([], free_uds `snocDictBinds` final_binds)
1240         else
1241             -- No call in final_uds mentions bound variables,
1242             -- so we can just leave the binding here
1243              return (map fst final_binds, free_uds) }
1244
1245
1246specBind rhs_env (Rec pairs) body_uds
1247       -- Note [Specialising a recursive group]
1248  = do { let (bndrs,rhss) = unzip pairs
1249       ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1250       ; let scope_uds = body_uds `plusUDs` rhs_uds
1251                       -- Includes binds and calls arising from rhss
1252
1253       ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1254
1255       ; (bndrs3, spec_defns3, uds3)
1256             <- if null spec_defns1  -- Common case: no specialisation
1257                then return (bndrs1, [], uds1)
1258                else do {            -- Specialisation occurred; do it again
1259                          (bndrs2, spec_defns2, uds2)
1260                              <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1261                        ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1262
1263       ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1264             final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1265                                             dumped_dbs
1266
1267       ; if float_all then
1268              return ([], final_uds `snocDictBind` final_bind)
1269         else
1270              return ([fst final_bind], final_uds) }
1271
1272
1273---------------------------
1274specDefns :: SpecEnv
1275          -> UsageDetails               -- Info on how it is used in its scope
1276          -> [(OutId,InExpr)]           -- The things being bound and their un-processed RHS
1277          -> SpecM ([OutId],            -- Original Ids with RULES added
1278                    [(OutId,OutExpr)],  -- Extra, specialised bindings
1279                    UsageDetails)       -- Stuff to fling upwards from the specialised versions
1280
1281-- Specialise a list of bindings (the contents of a Rec), but flowing usages
1282-- upwards binding by binding.  Example: { f = ...g ...; g = ...f .... }
1283-- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1284-- in turn generates a specialised call for 'f', we catch that in this one sweep.
1285-- But not vice versa (it's a fixpoint problem).
1286
1287specDefns _env uds []
1288  = return ([], [], uds)
1289specDefns env uds ((bndr,rhs):pairs)
1290  = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1291       ; (bndr1, spec_defns2, uds2)  <- specDefn env uds1 bndr rhs
1292       ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1293
1294---------------------------
1295specDefn :: SpecEnv
1296         -> UsageDetails                -- Info on how it is used in its scope
1297         -> OutId -> InExpr             -- The thing being bound and its un-processed RHS
1298         -> SpecM (Id,                  -- Original Id with added RULES
1299                   [(Id,CoreExpr)],     -- Extra, specialised bindings
1300                   UsageDetails)        -- Stuff to fling upwards from the specialised versions
1301
1302specDefn env body_uds fn rhs
1303  = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1304             rules_for_me = idCoreRules fn
1305       ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1306                                                    calls_for_me fn rhs
1307       ; return ( fn `addIdSpecialisations` rules
1308                , spec_defns
1309                , body_uds_without_me `plusUDs` spec_uds) }
1310                -- It's important that the `plusUDs` is this way
1311                -- round, because body_uds_without_me may bind
1312                -- dictionaries that are used in calls_for_me passed
1313                -- to specDefn.  So the dictionary bindings in
1314                -- spec_uds may mention dictionaries bound in
1315                -- body_uds_without_me
1316
1317---------------------------
1318specCalls :: Maybe Module      -- Just this_mod  =>  specialising imported fn
1319                               -- Nothing        =>  specialising local fn
1320          -> SpecEnv
1321          -> [CoreRule]        -- Existing RULES for the fn
1322          -> [CallInfo]
1323          -> OutId -> InExpr
1324          -> SpecM SpecInfo    -- New rules, specialised bindings, and usage details
1325
1326-- This function checks existing rules, and does not create
1327-- duplicate ones. So the caller does not need to do this filtering.
1328-- See 'already_covered'
1329
1330type SpecInfo = ( [CoreRule]       -- Specialisation rules
1331                , [(Id,CoreExpr)]  -- Specialised definition
1332                , UsageDetails )   -- Usage details from specialised RHSs
1333
1334specCalls mb_mod env existing_rules calls_for_me fn rhs
1335        -- The first case is the interesting one
1336  |  notNull calls_for_me               -- And there are some calls to specialise
1337  && not (isNeverActive (idInlineActivation fn))
1338        -- Don't specialise NOINLINE things
1339        -- See Note [Auto-specialisation and RULES]
1340
1341--   && not (certainlyWillInline (idUnfolding fn))      -- And it's not small
1342--      See Note [Inline specialisation] for why we do not
1343--      switch off specialisation for inline functions
1344
1345  = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1346    foldlM spec_call ([], [], emptyUDs) calls_for_me
1347
1348  | otherwise   -- No calls or RHS doesn't fit our preconceptions
1349  = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1350          text "Missed specialisation opportunity for"
1351                                 <+> ppr fn $$ _trace_doc )
1352          -- Note [Specialisation shape]
1353    -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1354    return ([], [], emptyUDs)
1355  where
1356    _trace_doc = sep [ ppr rhs_bndrs, ppr (idInlineActivation fn) ]
1357
1358    fn_type   = idType fn
1359    fn_arity  = idArity fn
1360    fn_unf    = realIdUnfolding fn  -- Ignore loop-breaker-ness here
1361    inl_prag  = idInlinePragma fn
1362    inl_act   = inlinePragmaActivation inl_prag
1363    is_local  = isLocalId fn
1364    is_dfun   = isDFunId fn
1365
1366        -- Figure out whether the function has an INLINE pragma
1367        -- See Note [Inline specialisations]
1368
1369    (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1370                            -- See Note [Account for casts in binding]
1371
1372    in_scope = Core.substInScope (se_subst env)
1373
1374    already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool
1375    already_covered dflags new_rules args      -- Note [Specialisations already covered]
1376       = isJust (lookupRule dflags (in_scope, realIdUnfolding)
1377                            (const True) fn args
1378                            (new_rules ++ existing_rules))
1379         -- NB: we look both in the new_rules (generated by this invocation
1380         --     of specCalls), and in existing_rules (passed in to specCalls)
1381
1382    ----------------------------------------------------------
1383        -- Specialise to one particular call pattern
1384    spec_call :: SpecInfo                         -- Accumulating parameter
1385              -> CallInfo                         -- Call instance
1386              -> SpecM SpecInfo
1387    spec_call spec_acc@(rules_acc, pairs_acc, uds_acc) _ci@(CI { ci_key = call_args })
1388      = -- See Note [Specialising Calls]
1389        do { let all_call_args | is_dfun   = call_args ++ repeat UnspecArg
1390                               | otherwise = call_args
1391                               -- See Note [Specialising DFuns]
1392           ; ( useful, rhs_env2, leftover_bndrs
1393             , rule_bndrs, rule_lhs_args
1394             , spec_bndrs1, dx_binds, spec_args) <- specHeader env rhs_bndrs all_call_args
1395
1396--           ; pprTrace "spec_call" (vcat [ text "call info: " <+> ppr _ci
1397--                                        , text "useful:    " <+> ppr useful
1398--                                        , text "rule_bndrs:" <+> ppr rule_bndrs
1399--                                        , text "lhs_args:  " <+> ppr rule_lhs_args
1400--                                        , text "spec_bndrs:" <+> ppr spec_bndrs1
1401--                                        , text "spec_args: " <+> ppr spec_args
1402--                                        , text "dx_binds:  " <+> ppr dx_binds
1403--                                        , text "rhs_env2:  " <+> ppr (se_subst rhs_env2)
1404--                                        , ppr dx_binds ]) $
1405--             return ()
1406
1407           ; dflags <- getDynFlags
1408           ; if not useful  -- No useful specialisation
1409                || already_covered dflags rules_acc rule_lhs_args
1410             then return spec_acc
1411             else
1412        do { -- Run the specialiser on the specialised RHS
1413             -- The "1" suffix is before we maybe add the void arg
1414           ; (spec_rhs1, rhs_uds) <- specLam rhs_env2 (spec_bndrs1 ++ leftover_bndrs) rhs_body
1415           ; let spec_fn_ty1 = exprType spec_rhs1
1416
1417                 -- Maybe add a void arg to the specialised function,
1418                 -- to avoid unlifted bindings
1419                 -- See Note [Specialisations Must Be Lifted]
1420                 -- C.f. GHC.Core.Op.WorkWrap.Lib.mkWorkerArgs
1421                 add_void_arg = isUnliftedType spec_fn_ty1 && not (isJoinId fn)
1422                 (spec_bndrs, spec_rhs, spec_fn_ty)
1423                   | add_void_arg = ( voidPrimId : spec_bndrs1
1424                                    , Lam        voidArgId  spec_rhs1
1425                                    , mkVisFunTy voidPrimTy spec_fn_ty1)
1426                   | otherwise   = (spec_bndrs1, spec_rhs1, spec_fn_ty1)
1427
1428                 join_arity_decr = length rule_lhs_args - length spec_bndrs
1429                 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1430                                 = Just (orig_join_arity - join_arity_decr)
1431                                 | otherwise
1432                                 = Nothing
1433
1434           ; spec_fn <- newSpecIdSM fn spec_fn_ty spec_join_arity
1435           ; this_mod <- getModule
1436           ; let
1437                -- The rule to put in the function's specialisation is:
1438                --      forall x @b d1' d2'.
1439                --          f x @T1 @b @T2 d1' d2' = f1 x @b
1440                -- See Note [Specialising Calls]
1441                herald = case mb_mod of
1442                           Nothing        -- Specialising local fn
1443                               -> text "SPEC"
1444                           Just this_mod  -- Specialising imported fn
1445                               -> text "SPEC/" <> ppr this_mod
1446
1447                rule_name = mkFastString $ showSDoc dflags $
1448                            herald <+> ftext (occNameFS (getOccName fn))
1449                                   <+> hsep (mapMaybe ppr_call_key_ty call_args)
1450                            -- This name ends up in interface files, so use occNameString.
1451                            -- Otherwise uniques end up there, making builds
1452                            -- less deterministic (See #4012 comment:61 ff)
1453
1454                rule_wout_eta = mkRule
1455                                  this_mod
1456                                  True {- Auto generated -}
1457                                  is_local
1458                                  rule_name
1459                                  inl_act       -- Note [Auto-specialisation and RULES]
1460                                  (idName fn)
1461                                  rule_bndrs
1462                                  rule_lhs_args
1463                                  (mkVarApps (Var spec_fn) spec_bndrs)
1464
1465                spec_rule
1466                  = case isJoinId_maybe fn of
1467                      Just join_arity -> etaExpandToJoinPointRule join_arity rule_wout_eta
1468                      Nothing -> rule_wout_eta
1469
1470                -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1471                -- See Note [Specialising Calls]
1472                spec_uds = foldr consDictBind rhs_uds dx_binds
1473
1474                --------------------------------------
1475                -- Add a suitable unfolding if the spec_inl_prag says so
1476                -- See Note [Inline specialisations]
1477                (spec_inl_prag, spec_unf)
1478                  | not is_local && isStrongLoopBreaker (idOccInfo fn)
1479                  = (neverInlinePragma, noUnfolding)
1480                        -- See Note [Specialising imported functions] in OccurAnal
1481
1482                  | InlinePragma { inl_inline = Inlinable } <- inl_prag
1483                  = (inl_prag { inl_inline = NoUserInline }, noUnfolding)
1484
1485                  | otherwise
1486                  = (inl_prag, specUnfolding dflags spec_bndrs (`mkApps` spec_args)
1487                                             rule_lhs_args fn_unf)
1488
1489                --------------------------------------
1490                -- Adding arity information just propagates it a bit faster
1491                --      See Note [Arity decrease] in Simplify
1492                -- Copy InlinePragma information from the parent Id.
1493                -- So if f has INLINE[1] so does spec_fn
1494                arity_decr     = count isValArg rule_lhs_args - count isId spec_bndrs
1495                spec_f_w_arity = spec_fn `setIdArity`      max 0 (fn_arity - arity_decr)
1496                                         `setInlinePragma` spec_inl_prag
1497                                         `setIdUnfolding`  spec_unf
1498                                         `asJoinId_maybe`  spec_join_arity
1499
1500                _rule_trace_doc = vcat [ ppr fn <+> dcolon <+> ppr fn_type
1501                                       , ppr spec_fn  <+> dcolon <+> ppr spec_fn_ty
1502                                       , ppr rhs_bndrs, ppr call_args
1503                                       , ppr spec_rule
1504                                       ]
1505
1506           ; -- pprTrace "spec_call: rule" _rule_trace_doc
1507             return ( spec_rule                  : rules_acc
1508                    , (spec_f_w_arity, spec_rhs) : pairs_acc
1509                    , spec_uds           `plusUDs` uds_acc
1510                    ) } }
1511
1512{- Note [Specialising DFuns]
1513~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1514DFuns have a special sort of unfolding (DFunUnfolding), and these are
1515hard to specialise a DFunUnfolding to give another DFunUnfolding
1516unless the DFun is fully applied (#18120).  So, in the case of DFunIds
1517we simply extend the CallKey with trailing UnspecArgs, so we'll
1518generate a rule that completely saturates the DFun.
1519
1520There is an ASSERT that checks this, in the DFunUnfolding case of
1521GHC.Core.Unfold.specUnfolding.
1522
1523Note [Specialisation Must Preserve Sharing]
1524~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1525Consider a function:
1526
1527    f :: forall a. Eq a => a -> blah
1528    f =
1529      if expensive
1530         then f1
1531         else f2
1532
1533As written, all calls to 'f' will share 'expensive'. But if we specialise 'f'
1534at 'Int', eg:
1535
1536    $sfInt = SUBST[a->Int,dict->dEqInt] (if expensive then f1 else f2)
1537
1538    RULE "SPEC f"
1539      forall (d :: Eq Int).
1540        f Int _ = $sfIntf
1541
1542We've now lost sharing between 'f' and '$sfInt' for 'expensive'. Yikes!
1543
1544To avoid this, we only generate specialisations for functions whose arity is
1545enough to bind all of the arguments we need to specialise.  This ensures our
1546specialised functions don't do any work before receiving all of their dicts,
1547and thus avoids the 'f' case above.
1548
1549Note [Specialisations Must Be Lifted]
1550~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1551Consider a function 'f':
1552
1553    f = forall a. Eq a => Array# a
1554
1555used like
1556
1557    case x of
1558      True -> ...f @Int dEqInt...
1559      False -> 0
1560
1561Naively, we might generate an (expensive) specialisation
1562
1563    $sfInt :: Array# Int
1564
1565even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to
1566the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to
1567preserve laziness.
1568
1569Note [Specialising Calls]
1570~~~~~~~~~~~~~~~~~~~~~~~~~
1571Suppose we have a function with a complicated type:
1572
1573    f :: forall a b c. Int -> Eq a => Show b => c -> Blah
1574    f @a @b @c i dEqA dShowA x = blah
1575
1576and suppose it is called at:
1577
1578    f 7 @T1 @T2 @T3 dEqT1 ($dfShow dShowT2) t3
1579
1580This call is described as a 'CallInfo' whose 'ci_key' is:
1581
1582    [ SpecType T1, SpecType T2, UnspecType, UnspecArg, SpecDict dEqT1
1583    , SpecDict ($dfShow dShowT2), UnspecArg ]
1584
1585Why are 'a' and 'b' identified as 'SpecType', while 'c' is 'UnspecType'?
1586Because we must specialise the function on type variables that appear
1587free in its *dictionary* arguments; but not on type variables that do not
1588appear in any dictionaries, i.e. are fully polymorphic.
1589
1590Because this call has dictionaries applied, we'd like to specialise
1591the call on any type argument that appears free in those dictionaries.
1592In this case, those are [a :-> T1, b :-> T2].
1593
1594We also need to substitute the dictionary binders with their
1595specialised dictionaries. The simplest substitution would be
1596[dEqA :-> dEqT1, dShowA :-> $dfShow dShowT2], but this duplicates
1597work, since `$dfShow dShowT2` is a function application. Therefore, we
1598also want to *float the dictionary out* (via bindAuxiliaryDict),
1599creating a new dict binding
1600
1601    dShow1 = $dfShow dShowT2
1602
1603and the substitution [dEqA :-> dEqT1, dShowA :-> dShow1].
1604
1605With the substitutions in hand, we can generate a specialised function:
1606
1607    $sf :: forall c. Int -> c -> Blah
1608    $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
1609
1610Note that the substitution is applied to the whole thing.  This is
1611convenient, but just slightly fragile.  Notably:
1612  * There had better be no name clashes in a/b/c
1613
1614We must construct a rewrite rule:
1615
1616    RULE "SPEC f @T1 @T2 _"
1617      forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
1618        f @T1 @T2 @c i d1 d2 = $sf @c i
1619
1620In the rule, d1 and d2 are just wildcards, not used in the RHS.  Note
1621additionally that 'x' isn't captured by this rule --- we bind only
1622enough etas in order to capture all of the *specialised* arguments.
1623
1624Note [Drop dead args from specialisations]
1625~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1626When specialising a function, it’s possible some of the arguments may
1627actually be dead. For example, consider:
1628
1629    f :: forall a. () -> Show a => a -> String
1630    f x y = show y ++ "!"
1631
1632We might generate the following CallInfo for `f @Int`:
1633
1634    [SpecType Int, UnspecArg, SpecDict $dShowInt, UnspecArg]
1635
1636Normally we’d include both the x and y arguments in the
1637specialisation, since we’re not specialising on either of them. But
1638that’s silly, since x is actually unused! So we might as well drop it
1639in the specialisation:
1640
1641    $sf :: Int -> String
1642    $sf y = show y ++ "!"
1643
1644    {-# RULE "SPEC f @Int" forall x. f @Int x $dShow = $sf #-}
1645
1646This doesn’t save us much, since the arg would be removed later by
1647worker/wrapper, anyway, but it’s easy to do. Note, however, that we
1648only drop dead arguments if:
1649
1650  1. We don’t specialise on them.
1651  2. They come before an argument we do specialise on.
1652
1653Doing the latter would require eta-expanding the RULE, which could
1654make it match less often, so it’s not worth it. Doing the former could
1655be more useful --- it would stop us from generating pointless
1656specialisations --- but it’s more involved to implement and unclear if
1657it actually provides much benefit in practice.
1658
1659Note [Zap occ info in rule binders]
1660~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1661When we generate a specialisation RULE, we need to drop occurrence
1662info on the binders. If we don’t, things go wrong when we specialise a
1663function like
1664
1665    f :: forall a. () -> Show a => a -> String
1666    f x y = show y ++ "!"
1667
1668since we’ll generate a RULE like
1669
1670    RULE "SPEC f @Int" forall x [Occ=Dead].
1671      f @Int x $dShow = $sf
1672
1673and Core Lint complains, even though x only appears on the LHS (due to
1674Note [Drop dead args from specialisations]).
1675
1676Why is that a Lint error? Because the arguments on the LHS of a rule
1677are syntactically expressions, not patterns, so Lint treats the
1678appearance of x as a use rather than a binding. Fortunately, the
1679solution is simple: we just make sure to zap the occ info before
1680using ids as wildcard binders in a rule.
1681
1682Note [Account for casts in binding]
1683~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1684Consider
1685   f :: Eq a => a -> IO ()
1686   {-# INLINABLE f
1687       StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1688     #-}
1689   f = ...
1690
1691In f's stable unfolding we have done some modest simplification which
1692has pushed the cast to the outside.  (I wonder if this is the Right
1693Thing, but it's what happens now; see SimplUtils Note [Casts and
1694lambdas].)  Now that stable unfolding must be specialised, so we want
1695to push the cast back inside. It would be terrible if the cast
1696defeated specialisation!  Hence the use of collectBindersPushingCo.
1697
1698Note [Evidence foralls]
1699~~~~~~~~~~~~~~~~~~~~~~~~~~
1700Suppose (#12212) that we are specialising
1701   f :: forall a b. (Num a, F a ~ F b) => blah
1702with a=b=Int. Then the RULE will be something like
1703   RULE forall (d:Num Int) (g :: F Int ~ F Int).
1704        f Int Int d g = f_spec
1705But both varToCoreExpr (when constructing the LHS args), and the
1706simplifier (when simplifying the LHS args), will transform to
1707   RULE forall (d:Num Int) (g :: F Int ~ F Int).
1708        f Int Int d <F Int> = f_spec
1709by replacing g with Refl.  So now 'g' is unbound, which results in a later
1710crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1711 * varToCoreExpr generates a Refl
1712 * exprsFreeIdsList returns the Ids bound by the args,
1713   which won't include g
1714
1715You might wonder if this will match as often, but the simplifier replaces
1716complicated Refl coercions with Refl pretty aggressively.
1717
1718Note [Orphans and auto-generated rules]
1719~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1720When we specialise an INLINABLE function, or when we have
1721-fspecialise-aggressively, we auto-generate RULES that are orphans.
1722We don't want to warn about these, or we'd generate a lot of warnings.
1723Thus, we only warn about user-specified orphan rules.
1724
1725Indeed, we don't even treat the module as an orphan module if it has
1726auto-generated *rule* orphans.  Orphan modules are read every time we
1727compile, so they are pretty obtrusive and slow down every compilation,
1728even non-optimised ones.  (Reason: for type class instances it's a
1729type correctness issue.)  But specialisation rules are strictly for
1730*optimisation* only so it's fine not to read the interface.
1731
1732What this means is that a SPEC rules from auto-specialisation in
1733module M will be used in other modules only if M.hi has been read for
1734some other reason, which is actually pretty likely.
1735
1736Note [From non-recursive to recursive]
1737~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1738Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1739have built a recursive knot
1740
1741      f a d x = <blah>
1742      MkUD { ud_binds = NonRec d7  (MkD ..f..)
1743           , ud_calls = ...(f T d7)... }
1744
1745The we generate
1746
1747     Rec { fs x = <blah>[T/a, d7/d]
1748           f a d x = <blah>
1749               RULE f T _ = fs
1750           d7 = ...f... }
1751
1752Here the recursion is only through the RULE.
1753
1754However we definitely should /not/ make the Rec in this wildly common
1755case:
1756      d = ...
1757      MkUD { ud_binds = NonRec d7 (...d...)
1758           , ud_calls = ...(f T d7)... }
1759
1760Here we want simply to add d to the floats, giving
1761      MkUD { ud_binds = NonRec d (...)
1762                        NonRec d7 (...d...)
1763           , ud_calls = ...(f T d7)... }
1764
1765In general, we need only make this Rec if
1766  - there are some specialisations (spec_binds non-empty)
1767  - there are some dict_binds that depend on f (dump_dbs non-empty)
1768
1769Note [Avoiding loops]
1770~~~~~~~~~~~~~~~~~~~~~
1771When specialising /dictionary functions/ we must be very careful to
1772avoid building loops. Here is an example that bit us badly: #3591
1773
1774     class Eq a => C a
1775     instance Eq [a] => C [a]
1776
1777This translates to
1778     dfun :: Eq [a] -> C [a]
1779     dfun a d = MkD a d (meth d)
1780
1781     d4 :: Eq [T] = <blah>
1782     d2 ::  C [T] = dfun T d4
1783     d1 :: Eq [T] = $p1 d2
1784     d3 ::  C [T] = dfun T d1
1785
1786None of these definitions is recursive. What happened was that we
1787generated a specialisation:
1788     RULE forall d. dfun T d = dT  :: C [T]
1789     dT = (MkD a d (meth d)) [T/a, d1/d]
1790        = MkD T d1 (meth d1)
1791
1792But now we use the RULE on the RHS of d2, to get
1793    d2 = dT = MkD d1 (meth d1)
1794    d1 = $p1 d2
1795
1796and now d1 is bottom!  The problem is that when specialising 'dfun' we
1797should first dump "below" the binding all floated dictionary bindings
1798that mention 'dfun' itself.  So d2 and d3 (and hence d1) must be
1799placed below 'dfun', and thus unavailable to it when specialising
1800'dfun'.  That in turn means that the call (dfun T d1) must be
1801discarded.  On the other hand, the call (dfun T d4) is fine, assuming
1802d4 doesn't mention dfun.
1803
1804Solution:
1805  Discard all calls that mention dictionaries that depend
1806  (directly or indirectly) on the dfun we are specialising.
1807  This is done by 'filterCalls'
1808
1809--------------
1810Here's yet another example
1811
1812  class C a where { foo,bar :: [a] -> [a] }
1813
1814  instance C Int where
1815     foo x = r_bar x
1816     bar xs = reverse xs
1817
1818  r_bar :: C a => [a] -> [a]
1819  r_bar xs = bar (xs ++ xs)
1820
1821That translates to:
1822
1823    r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1824
1825    Rec { $fCInt :: C Int = MkC foo_help reverse
1826          foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1827
1828The call (r_bar $fCInt) mentions $fCInt,
1829                        which mentions foo_help,
1830                        which mentions r_bar
1831But we DO want to specialise r_bar at Int:
1832
1833    Rec { $fCInt :: C Int = MkC foo_help reverse
1834          foo_help (xs::[Int]) = r_bar Int $fCInt xs
1835
1836          r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1837            RULE r_bar Int _ = r_bar_Int
1838
1839          r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1840           }
1841
1842Note that, because of its RULE, r_bar joins the recursive
1843group.  (In this case it'll unravel a short moment later.)
1844
1845
1846Note [Specialising a recursive group]
1847~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1848Consider
1849    let rec { f x = ...g x'...
1850            ; g y = ...f y'.... }
1851    in f 'a'
1852Here we specialise 'f' at Char; but that is very likely to lead to
1853a specialisation of 'g' at Char.  We must do the latter, else the
1854whole point of specialisation is lost.
1855
1856But we do not want to keep iterating to a fixpoint, because in the
1857presence of polymorphic recursion we might generate an infinite number
1858of specialisations.
1859
1860So we use the following heuristic:
1861  * Arrange the rec block in dependency order, so far as possible
1862    (the occurrence analyser already does this)
1863
1864  * Specialise it much like a sequence of lets
1865
1866  * Then go through the block a second time, feeding call-info from
1867    the RHSs back in the bottom, as it were
1868
1869In effect, the ordering maxmimises the effectiveness of each sweep,
1870and we do just two sweeps.   This should catch almost every case of
1871monomorphic recursion -- the exception could be a very knotted-up
1872recursion with multiple cycles tied up together.
1873
1874This plan is implemented in the Rec case of specBindItself.
1875
1876Note [Specialisations already covered]
1877~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1878We obviously don't want to generate two specialisations for the same
1879argument pattern.  There are two wrinkles
1880
18811. We do the already-covered test in specDefn, not when we generate
1882the CallInfo in mkCallUDs.  We used to test in the latter place, but
1883we now iterate the specialiser somewhat, and the Id at the call site
1884might therefore not have all the RULES that we can see in specDefn
1885
18862. What about two specialisations where the second is an *instance*
1887of the first?  If the more specific one shows up first, we'll generate
1888specialisations for both.  If the *less* specific one shows up first,
1889we *don't* currently generate a specialisation for the more specific
1890one.  (See the call to lookupRule in already_covered.)  Reasons:
1891  (a) lookupRule doesn't say which matches are exact (bad reason)
1892  (b) if the earlier specialisation is user-provided, it's
1893      far from clear that we should auto-specialise further
1894
1895Note [Auto-specialisation and RULES]
1896~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1897Consider:
1898   g :: Num a => a -> a
1899   g = ...
1900
1901   f :: (Int -> Int) -> Int
1902   f w = ...
1903   {-# RULE f g = 0 #-}
1904
1905Suppose that auto-specialisation makes a specialised version of
1906g::Int->Int That version won't appear in the LHS of the RULE for f.
1907So if the specialisation rule fires too early, the rule for f may
1908never fire.
1909
1910It might be possible to add new rules, to "complete" the rewrite system.
1911Thus when adding
1912        RULE forall d. g Int d = g_spec
1913also add
1914        RULE f g_spec = 0
1915
1916But that's a bit complicated.  For now we ask the programmer's help,
1917by *copying the INLINE activation pragma* to the auto-specialised
1918rule.  So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1919will also not be active until phase 2.  And that's what programmers
1920should jolly well do anyway, even aside from specialisation, to ensure
1921that g doesn't inline too early.
1922
1923This in turn means that the RULE would never fire for a NOINLINE
1924thing so not much point in generating a specialisation at all.
1925
1926Note [Specialisation shape]
1927~~~~~~~~~~~~~~~~~~~~~~~~~~~
1928We only specialise a function if it has visible top-level lambdas
1929corresponding to its overloading.  E.g. if
1930        f :: forall a. Eq a => ....
1931then its body must look like
1932        f = /\a. \d. ...
1933
1934Reason: when specialising the body for a call (f ty dexp), we want to
1935substitute dexp for d, and pick up specialised calls in the body of f.
1936
1937This doesn't always work.  One example I came across was this:
1938        newtype Gen a = MkGen{ unGen :: Int -> a }
1939
1940        choose :: Eq a => a -> Gen a
1941        choose n = MkGen (\r -> n)
1942
1943        oneof = choose (1::Int)
1944
1945It's a silly example, but we get
1946        choose = /\a. g `cast` co
1947where choose doesn't have any dict arguments.  Thus far I have not
1948tried to fix this (wait till there's a real example).
1949
1950Mind you, then 'choose' will be inlined (since RHS is trivial) so
1951it doesn't matter.  This comes up with single-method classes
1952
1953   class C a where { op :: a -> a }
1954   instance C a => C [a] where ....
1955==>
1956   $fCList :: C a => C [a]
1957   $fCList = $copList |> (...coercion>...)
1958   ....(uses of $fCList at particular types)...
1959
1960So we suppress the WARN if the rhs is trivial.
1961
1962Note [Inline specialisations]
1963~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1964Here is what we do with the InlinePragma of the original function
1965  * Activation/RuleMatchInfo: both transferred to the
1966                              specialised function
1967  * InlineSpec:
1968       (a) An INLINE pragma is transferred
1969       (b) An INLINABLE pragma is *not* transferred
1970
1971Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1972specialise the function at its call site, and arguably that's not so
1973important for the specialised copies.  BUT *pragma-directed*
1974specialisation now takes place in the typechecker/desugarer, with
1975manually specified INLINEs.  The specialisation here is automatic.
1976It'd be very odd if a function marked INLINE was specialised (because
1977of some local use), and then forever after (including importing
1978modules) the specialised version wasn't INLINEd.  After all, the
1979programmer said INLINE!
1980
1981You might wonder why we specialise INLINE functions at all.  After
1982all they should be inlined, right?  Two reasons:
1983
1984 * Even INLINE functions are sometimes not inlined, when they aren't
1985   applied to interesting arguments.  But perhaps the type arguments
1986   alone are enough to specialise (even though the args are too boring
1987   to trigger inlining), and it's certainly better to call the
1988   specialised version.
1989
1990 * The RHS of an INLINE function might call another overloaded function,
1991   and we'd like to generate a specialised version of that function too.
1992   This actually happens a lot. Consider
1993      replicateM_ :: (Monad m) => Int -> m a -> m ()
1994      {-# INLINABLE replicateM_ #-}
1995      replicateM_ d x ma = ...
1996   The strictness analyser may transform to
1997      replicateM_ :: (Monad m) => Int -> m a -> m ()
1998      {-# INLINE replicateM_ #-}
1999      replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
2000
2001      $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
2002      {-# INLINABLE $wreplicateM_ #-}
2003      $wreplicateM_ = ...
2004   Now an importing module has a specialised call to replicateM_, say
2005   (replicateM_ dMonadIO).  We certainly want to specialise $wreplicateM_!
2006   This particular example had a huge effect on the call to replicateM_
2007   in nofib/shootout/n-body.
2008
2009Why (b): discard INLINABLE pragmas? See #4874 for persuasive examples.
2010Suppose we have
2011    {-# INLINABLE f #-}
2012    f :: Ord a => [a] -> Int
2013    f xs = letrec f' = ...f'... in f'
2014Then, when f is specialised and optimised we might get
2015    wgo :: [Int] -> Int#
2016    wgo = ...wgo...
2017    f_spec :: [Int] -> Int
2018    f_spec xs = case wgo xs of { r -> I# r }
2019and we clearly want to inline f_spec at call sites.  But if we still
2020have the big, un-optimised of f (albeit specialised) captured in an
2021INLINABLE pragma for f_spec, we won't get that optimisation.
2022
2023So we simply drop INLINABLE pragmas when specialising. It's not really
2024a complete solution; ignoring specialisation for now, INLINABLE functions
2025don't get properly strictness analysed, for example. But it works well
2026for examples involving specialisation, which is the dominant use of
2027INLINABLE.  See #4874.
2028-}
2029
2030{- *********************************************************************
2031*                                                                      *
2032                   SpecArg, and specHeader
2033*                                                                      *
2034********************************************************************* -}
2035
2036-- | An argument that we might want to specialise.
2037-- See Note [Specialising Calls] for the nitty gritty details.
2038data SpecArg
2039  =
2040    -- | Type arguments that should be specialised, due to appearing
2041    -- free in the type of a 'SpecDict'.
2042    SpecType Type
2043
2044    -- | Type arguments that should remain polymorphic.
2045  | UnspecType
2046
2047    -- | Dictionaries that should be specialised. mkCallUDs ensures
2048    -- that only "interesting" dictionary arguments get a SpecDict;
2049    -- see Note [Interesting dictionary arguments]
2050  | SpecDict DictExpr
2051
2052    -- | Value arguments that should not be specialised.
2053  | UnspecArg
2054
2055instance Outputable SpecArg where
2056  ppr (SpecType t) = text "SpecType" <+> ppr t
2057  ppr UnspecType   = text "UnspecType"
2058  ppr (SpecDict d) = text "SpecDict" <+> ppr d
2059  ppr UnspecArg    = text "UnspecArg"
2060
2061specArgFreeVars :: SpecArg -> VarSet
2062specArgFreeVars (SpecType ty) = tyCoVarsOfType ty
2063specArgFreeVars (SpecDict dx) = exprFreeVars dx
2064specArgFreeVars UnspecType    = emptyVarSet
2065specArgFreeVars UnspecArg     = emptyVarSet
2066
2067isSpecDict :: SpecArg -> Bool
2068isSpecDict (SpecDict {}) = True
2069isSpecDict _             = False
2070
2071-- | Given binders from an original function 'f', and the 'SpecArg's
2072-- corresponding to its usage, compute everything necessary to build
2073-- a specialisation.
2074--
2075-- We will use the running example from Note [Specialising Calls]:
2076--
2077--     f :: forall a b c. Int -> Eq a => Show b => c -> Blah
2078--     f @a @b @c i dEqA dShowA x = blah
2079--
2080-- Suppose we decide to specialise it at the following pattern:
2081--
2082--     [ SpecType T1, SpecType T2, UnspecType, UnspecArg
2083--     , SpecDict dEqT1, SpecDict ($dfShow dShowT2), UnspecArg ]
2084--
2085-- We'd eventually like to build the RULE
2086--
2087--     RULE "SPEC f @T1 @T2 _"
2088--       forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
2089--         f @T1 @T2 @c i d1 d2 = $sf @c i
2090--
2091-- and the specialisation '$sf'
2092--
2093--     $sf :: forall c. Int -> c -> Blah
2094--     $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
2095--
2096-- where dShow1 is a floated binding created by bindAuxiliaryDict.
2097--
2098-- The cases for 'specHeader' below are presented in the same order as this
2099-- running example. The result of 'specHeader' for this example is as follows:
2100--
2101--    ( -- Returned arguments
2102--      env + [a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1]
2103--    , [x]
2104--
2105--      -- RULE helpers
2106--    , [c, i, d1, d2]
2107--    , [T1, T2, c, i, d1, d2]
2108--
2109--      -- Specialised function helpers
2110--    , [c, i, x]
2111--    , [dShow1 = $dfShow dShowT2]
2112--    , [T1, T2, c, i, dEqT1, dShow1]
2113--    )
2114specHeader
2115     :: SpecEnv
2116     -> [InBndr]    -- The binders from the original function 'f'
2117     -> [SpecArg]   -- From the CallInfo
2118     -> SpecM ( Bool     -- True <=> some useful specialisation happened
2119                         -- Not the same as any (isSpecDict args) because
2120                         -- the args might be longer than bndrs
2121
2122                -- Returned arguments
2123              , SpecEnv      -- Substitution to apply to the body of 'f'
2124              , [OutBndr]    -- Leftover binders from the original function 'f'
2125                             --   that don’t have a corresponding SpecArg
2126
2127                -- RULE helpers
2128              , [OutBndr]    -- Binders for the RULE
2129              , [OutExpr]    -- Args for the LHS of the rule
2130
2131                -- Specialised function helpers
2132              , [OutBndr]    -- Binders for $sf
2133              , [DictBind]   -- Auxiliary dictionary bindings
2134              , [OutExpr]    -- Specialised arguments for unfolding
2135                             -- Same length as "args for LHS of rule"
2136              )
2137
2138-- We want to specialise on type 'T1', and so we must construct a substitution
2139-- 'a->T1', as well as a LHS argument for the resulting RULE and unfolding
2140-- details.
2141specHeader env (bndr : bndrs) (SpecType t : args)
2142  = do { let env' = extendTvSubstList env [(bndr, t)]
2143       ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2144            <- specHeader env' bndrs args
2145       ; pure ( useful
2146              , env''
2147              , leftover_bndrs
2148              , rule_bs
2149              , Type t : rule_es
2150              , bs'
2151              , dx
2152              , Type t : spec_args
2153              )
2154       }
2155
2156-- Next we have a type that we don't want to specialise. We need to perform
2157-- a substitution on it (in case the type refers to 'a'). Additionally, we need
2158-- to produce a binder, LHS argument and RHS argument for the resulting rule,
2159-- /and/ a binder for the specialised body.
2160specHeader env (bndr : bndrs) (UnspecType : args)
2161  = do { let (env', bndr') = substBndr env bndr
2162       ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2163            <- specHeader env' bndrs args
2164       ; pure ( useful
2165              , env''
2166              , leftover_bndrs
2167              , bndr' : rule_bs
2168              , varToCoreExpr bndr' : rule_es
2169              , bndr' : bs'
2170              , dx
2171              , varToCoreExpr bndr' : spec_args
2172              )
2173       }
2174
2175-- Next we want to specialise the 'Eq a' dict away. We need to construct
2176-- a wildcard binder to match the dictionary (See Note [Specialising Calls] for
2177-- the nitty-gritty), as a LHS rule and unfolding details.
2178specHeader env (bndr : bndrs) (SpecDict d : args)
2179  = do { bndr' <- newDictBndr env bndr -- See Note [Zap occ info in rule binders]
2180       ; let (env', dx_bind, spec_dict) = bindAuxiliaryDict env bndr bndr' d
2181       ; (_, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2182             <- specHeader env' bndrs args
2183       ; pure ( True      -- Ha!  A useful specialisation!
2184              , env''
2185              , leftover_bndrs
2186              -- See Note [Evidence foralls]
2187              , exprFreeIdsList (varToCoreExpr bndr') ++ rule_bs
2188              , varToCoreExpr bndr' : rule_es
2189              , bs'
2190              , maybeToList dx_bind ++ dx
2191              , spec_dict : spec_args
2192              )
2193       }
2194
2195-- Finally, we have the unspecialised argument 'i'. We need to produce
2196-- a binder, LHS and RHS argument for the RULE, and a binder for the
2197-- specialised body.
2198--
2199-- NB: Calls to 'specHeader' will trim off any trailing 'UnspecArg's, which is
2200-- why 'i' doesn't appear in our RULE above. But we have no guarantee that
2201-- there aren't 'UnspecArg's which come /before/ all of the dictionaries, so
2202-- this case must be here.
2203specHeader env (bndr : bndrs) (UnspecArg : args)
2204  = do { -- see Note [Zap occ info in rule binders]
2205         let (env', bndr') = substBndr env (zapIdOccInfo bndr)
2206       ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2207             <- specHeader env' bndrs args
2208       ; pure ( useful
2209              , env''
2210              , leftover_bndrs
2211              , bndr' : rule_bs
2212              , varToCoreExpr bndr' : rule_es
2213              , if isDeadBinder bndr
2214                  then bs' -- see Note [Drop dead args from specialisations]
2215                  else bndr' : bs'
2216              , dx
2217              , varToCoreExpr bndr' : spec_args
2218              )
2219       }
2220
2221-- If we run out of binders, stop immediately
2222-- See Note [Specialisation Must Preserve Sharing]
2223specHeader env [] _ = pure (False, env, [], [], [], [], [], [])
2224
2225-- Return all remaining binders from the original function. These have the
2226-- invariant that they should all correspond to unspecialised arguments, so
2227-- it's safe to stop processing at this point.
2228specHeader env bndrs []
2229  = pure (False, env', bndrs', [], [], [], [], [])
2230  where
2231    (env', bndrs') = substBndrs env bndrs
2232
2233
2234-- | Binds a dictionary argument to a fresh name, to preserve sharing
2235bindAuxiliaryDict
2236  :: SpecEnv
2237  -> InId -> OutId -> OutExpr -- Original dict binder, and the witnessing expression
2238  -> ( SpecEnv        -- Substitute for orig_dict_id
2239     , Maybe DictBind -- Auxiliary dict binding, if any
2240     , OutExpr)        -- Witnessing expression (always trivial)
2241bindAuxiliaryDict env@(SE { se_subst = subst, se_interesting = interesting })
2242                  orig_dict_id fresh_dict_id dict_expr
2243
2244  -- If the dictionary argument is trivial,
2245  -- don’t bother creating a new dict binding; just substitute
2246  | Just dict_id <- getIdFromTrivialExpr_maybe dict_expr
2247  = let env' = env { se_subst = Core.extendSubst subst orig_dict_id dict_expr
2248                                `Core.extendInScope` dict_id
2249                          -- See Note [Keep the old dictionaries interesting]
2250                   , se_interesting = interesting `extendVarSet` dict_id }
2251    in (env', Nothing, dict_expr)
2252
2253  | otherwise  -- Non-trivial dictionary arg; make an auxiliary binding
2254  = let dict_bind = mkDB (NonRec fresh_dict_id dict_expr)
2255        env' = env { se_subst = Core.extendSubst subst orig_dict_id (Var fresh_dict_id)
2256                                `Core.extendInScope` fresh_dict_id
2257                      -- See Note [Make the new dictionaries interesting]
2258                   , se_interesting = interesting `extendVarSet` fresh_dict_id }
2259    in (env', Just dict_bind, Var fresh_dict_id)
2260
2261{-
2262Note [Make the new dictionaries interesting]
2263~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2264Important!  We're going to substitute dx_id1 for d
2265and we want it to look "interesting", else we won't gather *any*
2266consequential calls. E.g.
2267    f d = ...g d....
2268If we specialise f for a call (f (dfun dNumInt)), we'll get
2269a consequent call (g d') with an auxiliary definition
2270    d' = df dNumInt
2271We want that consequent call to look interesting
2272
2273Note [Keep the old dictionaries interesting]
2274~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2275In bindAuxiliaryDict, we don’t bother creating a new dict binding if
2276the dict expression is trivial. For example, if we have
2277
2278    f = \ @m1 (d1 :: Monad m1) -> ...
2279
2280and we specialize it at the pattern
2281
2282    [SpecType IO, SpecArg $dMonadIO]
2283
2284it would be silly to create a new binding for $dMonadIO; it’s already
2285a binding! So we just extend the substitution directly:
2286
2287    m1 :-> IO
2288    d1 :-> $dMonadIO
2289
2290But this creates a new subtlety: the dict expression might be a dict
2291binding we floated out while specializing another function. For
2292example, we might have
2293
2294    d2 = $p1Monad $dMonadIO -- floated out by bindAuxiliaryDict
2295    $sg = h @IO d2
2296    h = \ @m2 (d2 :: Applicative m2) -> ...
2297
2298and end up specializing h at the following pattern:
2299
2300    [SpecType IO, SpecArg d2]
2301
2302When we created the d2 binding in the first place, we locally marked
2303it as interesting while specializing g as described above by
2304Note [Make the new dictionaries interesting]. But when we go to
2305specialize h, it isn’t in the SpecEnv anymore, so we’ve lost the
2306knowledge that we should specialize on it.
2307
2308To fix this, we have to explicitly add d2 *back* to the interesting
2309set. That way, it will still be considered interesting while
2310specializing the body of h. See !2913.
2311-}
2312
2313
2314{- *********************************************************************
2315*                                                                      *
2316            UsageDetails and suchlike
2317*                                                                      *
2318********************************************************************* -}
2319
2320data UsageDetails
2321  = MkUD {
2322      ud_binds :: !(Bag DictBind),
2323               -- See Note [Floated dictionary bindings]
2324               -- The order is important;
2325               -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
2326               -- (Remember, Bags preserve order in GHC.)
2327
2328      ud_calls :: !CallDetails
2329
2330      -- INVARIANT: suppose bs = bindersOf ud_binds
2331      -- Then 'calls' may *mention* 'bs',
2332      -- but there should be no calls *for* bs
2333    }
2334
2335-- | A 'DictBind' is a binding along with a cached set containing its free
2336-- variables (both type variables and dictionaries)
2337type DictBind = (CoreBind, VarSet)
2338
2339{- Note [Floated dictionary bindings]
2340~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2341We float out dictionary bindings for the reasons described under
2342"Dictionary floating" above.  But not /just/ dictionary bindings.
2343Consider
2344
2345   f :: Eq a => blah
2346   f a d = rhs
2347
2348   $c== :: T -> T -> Bool
2349   $c== x y = ...
2350
2351   $df :: Eq T
2352   $df = Eq $c== ...
2353
2354   gurgle = ...(f @T $df)...
2355
2356We gather the call info for (f @T $df), and we don't want to drop it
2357when we come across the binding for $df.  So we add $df to the floats
2358and continue.  But then we have to add $c== to the floats, and so on.
2359These all float above the binding for 'f', and and now we can
2360successfully specialise 'f'.
2361
2362So the DictBinds in (ud_binds :: Bag DictBind) may contain
2363non-dictionary bindings too.
2364-}
2365
2366instance Outputable UsageDetails where
2367  ppr (MkUD { ud_binds = dbs, ud_calls = calls })
2368        = text "MkUD" <+> braces (sep (punctuate comma
2369                [text "binds" <+> equals <+> ppr dbs,
2370                 text "calls" <+> equals <+> ppr calls]))
2371
2372emptyUDs :: UsageDetails
2373emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
2374
2375------------------------------------------------------------
2376type CallDetails  = DIdEnv CallInfoSet
2377  -- The order of specialized binds and rules depends on how we linearize
2378  -- CallDetails, so to get determinism we must use a deterministic set here.
2379  -- See Note [Deterministic UniqFM] in UniqDFM
2380
2381data CallInfoSet = CIS Id (Bag CallInfo)
2382  -- The list of types and dictionaries is guaranteed to
2383  -- match the type of f
2384  -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
2385  -- These dups are eliminated by already_covered in specCalls
2386
2387data CallInfo
2388  = CI { ci_key  :: [SpecArg]   -- All arguments
2389       , ci_fvs  :: VarSet      -- Free vars of the ci_key
2390                                -- call (including tyvars)
2391                                -- [*not* include the main id itself, of course]
2392    }
2393
2394type DictExpr = CoreExpr
2395
2396ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
2397ciSetFilter p (CIS id a) = CIS id (filterBag p a)
2398
2399instance Outputable CallInfoSet where
2400  ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
2401                        2 (ppr map)
2402
2403pprCallInfo :: Id -> CallInfo -> SDoc
2404pprCallInfo fn (CI { ci_key = key })
2405  = ppr fn <+> ppr key
2406
2407ppr_call_key_ty :: SpecArg -> Maybe SDoc
2408ppr_call_key_ty (SpecType ty) = Just $ char '@' <+> pprParendType ty
2409ppr_call_key_ty UnspecType    = Just $ char '_'
2410ppr_call_key_ty (SpecDict _)  = Nothing
2411ppr_call_key_ty UnspecArg     = Nothing
2412
2413instance Outputable CallInfo where
2414  ppr (CI { ci_key = key, ci_fvs = fvs })
2415    = text "CI" <> braces (hsep [ fsep (mapMaybe ppr_call_key_ty key), ppr fvs ])
2416
2417unionCalls :: CallDetails -> CallDetails -> CallDetails
2418unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
2419
2420unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
2421unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
2422  CIS f (calls1 `unionBags` calls2)
2423
2424callDetailsFVs :: CallDetails -> VarSet
2425callDetailsFVs calls =
2426  nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
2427  -- It's OK to use nonDetFoldUDFM here because we forget the ordering
2428  -- immediately by converting to a nondeterministic set.
2429
2430callInfoFVs :: CallInfoSet -> VarSet
2431callInfoFVs (CIS _ call_info) =
2432  foldr (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
2433
2434getTheta :: [TyCoBinder] -> [PredType]
2435getTheta = fmap tyBinderType . filter isInvisibleBinder . filter (not . isNamedBinder)
2436
2437
2438------------------------------------------------------------
2439singleCall :: Id -> [SpecArg] -> UsageDetails
2440singleCall id args
2441  = MkUD {ud_binds = emptyBag,
2442          ud_calls = unitDVarEnv id $ CIS id $
2443                     unitBag (CI { ci_key  = args -- used to be tys
2444                                 , ci_fvs  = call_fvs }) }
2445  where
2446    call_fvs = foldr (unionVarSet . specArgFreeVars) emptyVarSet args
2447        -- The type args (tys) are guaranteed to be part of the dictionary
2448        -- types, because they are just the constrained types,
2449        -- and the dictionary is therefore sure to be bound
2450        -- inside the binding for any type variables free in the type;
2451        -- hence it's safe to neglect tyvars free in tys when making
2452        -- the free-var set for this call
2453        -- BUT I don't trust this reasoning; play safe and include tys_fvs
2454        --
2455        -- We don't include the 'id' itself.
2456
2457mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
2458mkCallUDs env f args
2459  = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
2460    res
2461  where
2462    res = mkCallUDs' env f args
2463
2464mkCallUDs' env f args
2465  |  not (want_calls_for f)  -- Imported from elsewhere
2466  || null ci_key             -- No useful specialisation
2467   -- See also Note [Specialisations already covered]
2468  = -- pprTrace "mkCallUDs: discarding" _trace_doc
2469    emptyUDs
2470
2471  | otherwise
2472  = -- pprTrace "mkCallUDs: keeping" _trace_doc
2473    singleCall f ci_key
2474  where
2475    _trace_doc = vcat [ppr f, ppr args, ppr ci_key]
2476    pis                = fst $ splitPiTys $ idType f
2477    constrained_tyvars = tyCoVarsOfTypes $ getTheta pis
2478
2479    ci_key :: [SpecArg]
2480    ci_key = dropWhileEndLE (not . isSpecDict) $
2481             zipWith mk_spec_arg args pis
2482             -- Drop trailing args until we get to a SpecDict
2483             -- In this way the RULE has as few args as possible,
2484             -- which broadens its applicability, since rules only
2485             -- fire when saturated
2486
2487    mk_spec_arg :: CoreExpr -> TyCoBinder -> SpecArg
2488    mk_spec_arg arg (Named bndr)
2489      |  binderVar bndr `elemVarSet` constrained_tyvars
2490      = case arg of
2491          Type ty -> SpecType ty
2492          _       -> pprPanic "ci_key" $ ppr arg
2493      |  otherwise = UnspecType
2494
2495    -- For "InvisArg", which are the type-class dictionaries,
2496    -- we decide on a case by case basis if we want to specialise
2497    -- on this argument; if so, SpecDict, if not UnspecArg
2498    mk_spec_arg arg (Anon InvisArg pred)
2499      | type_determines_value pred
2500      , interestingDict env arg -- Note [Interesting dictionary arguments]
2501      = SpecDict arg
2502      | otherwise = UnspecArg
2503
2504    mk_spec_arg _ (Anon VisArg _)
2505      = UnspecArg
2506
2507    want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
2508         -- For imported things, we gather call instances if
2509         -- there is an unfolding that we could in principle specialise
2510         -- We might still decide not to use it (consulting dflags)
2511         -- in specImports
2512         -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
2513
2514    type_determines_value pred    -- See Note [Type determines value]
2515        = case classifyPredType pred of
2516            ClassPred cls _ -> not (isIPClass cls)  -- Superclasses can't be IPs
2517            EqPred {}       -> True
2518            IrredPred {}    -> True   -- Things like (D []) where D is a
2519                                      -- Constraint-ranged family; #7785
2520            ForAllPred {}   -> True
2521
2522{-
2523Note [Type determines value]
2524~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2525Only specialise on non-IP *class* params, because these are the ones
2526whose *type* determines their *value*.  In particular, with implicit
2527params, the type args *don't* say what the value of the implicit param
2528is!  See #7101.
2529
2530So we treat implicit params just like ordinary arguments for the
2531purposes of specialisation.  Note that we still want to specialise
2532functions with implicit params if they have *other* dicts which are
2533class params; see #17930.
2534
2535One apparent additional complexity involves type families. For
2536example, consider
2537         type family D (v::*->*) :: Constraint
2538         type instance D [] = ()
2539         f :: D v => v Char -> Int
2540If we see a call (f "foo"), we'll pass a "dictionary"
2541  () |> (g :: () ~ D [])
2542and it's good to specialise f at this dictionary.
2543
2544So the question is: can an implicit parameter "hide inside" a
2545type-family constraint like (D a).  Well, no.  We don't allow
2546        type instance D Maybe = ?x:Int
2547Hence the IrredPred case in type_determines_value.  See #7785.
2548
2549Note [Interesting dictionary arguments]
2550~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2551Consider this
2552         \a.\d:Eq a.  let f = ... in ...(f d)...
2553There really is not much point in specialising f wrt the dictionary d,
2554because the code for the specialised f is not improved at all, because
2555d is lambda-bound.  We simply get junk specialisations.
2556
2557What is "interesting"?  Just that it has *some* structure.  But what about
2558variables?
2559
2560 * A variable might be imported, in which case its unfolding
2561   will tell us whether it has useful structure
2562
2563 * Local variables are cloned on the way down (to avoid clashes when
2564   we float dictionaries), and cloning drops the unfolding
2565   (cloneIdBndr).  Moreover, we make up some new bindings, and it's a
2566   nuisance to give them unfoldings.  So we keep track of the
2567   "interesting" dictionaries as a VarSet in SpecEnv.
2568   We have to take care to put any new interesting dictionary
2569   bindings in the set.
2570
2571We accidentally lost accurate tracking of local variables for a long
2572time, because cloned variables don't have unfoldings. But makes a
2573massive difference in a few cases, eg #5113. For nofib as a
2574whole it's only a small win: 2.2% improvement in allocation for ansi,
25751.2% for bspt, but mostly 0.0!  Average 0.1% increase in binary size.
2576-}
2577
2578interestingDict :: SpecEnv -> CoreExpr -> Bool
2579-- A dictionary argument is interesting if it has *some* structure
2580-- NB: "dictionary" arguments include constraints of all sorts,
2581--     including equality constraints; hence the Coercion case
2582interestingDict env (Var v) =  hasSomeUnfolding (idUnfolding v)
2583                            || isDataConWorkId v
2584                            || v `elemVarSet` se_interesting env
2585interestingDict _ (Type _)                = False
2586interestingDict _ (Coercion _)            = False
2587interestingDict env (App fn (Type _))     = interestingDict env fn
2588interestingDict env (App fn (Coercion _)) = interestingDict env fn
2589interestingDict env (Tick _ a)            = interestingDict env a
2590interestingDict env (Cast e _)            = interestingDict env e
2591interestingDict _ _                       = True
2592
2593plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2594plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2595        (MkUD {ud_binds = db2, ud_calls = calls2})
2596  = MkUD { ud_binds = db1    `unionBags`   db2
2597         , ud_calls = calls1 `unionCalls`  calls2 }
2598
2599-----------------------------
2600_dictBindBndrs :: Bag DictBind -> [Id]
2601_dictBindBndrs dbs = foldr ((++) . bindersOf . fst) [] dbs
2602
2603-- | Construct a 'DictBind' from a 'CoreBind'
2604mkDB :: CoreBind -> DictBind
2605mkDB bind = (bind, bind_fvs bind)
2606
2607-- | Identify the free variables of a 'CoreBind'
2608bind_fvs :: CoreBind -> VarSet
2609bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2610bind_fvs (Rec prs)         = foldl' delVarSet rhs_fvs bndrs
2611                           where
2612                             bndrs = map fst prs
2613                             rhs_fvs = unionVarSets (map pair_fvs prs)
2614
2615pair_fvs :: (Id, CoreExpr) -> VarSet
2616pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2617                       `unionVarSet` idFreeVars bndr
2618        -- idFreeVars: don't forget variables mentioned in
2619        -- the rules of the bndr.  C.f. OccAnal.addRuleUsage
2620        -- Also tyvars mentioned in its type; they may not appear
2621        -- in the RHS
2622        --      type T a = Int
2623        --      x :: T a = 3
2624  where
2625    interesting :: InterestingVarFun
2626    interesting v = isLocalVar v || (isId v && isDFunId v)
2627        -- Very important: include DFunIds /even/ if it is imported
2628        -- Reason: See Note [Avoiding loops], the second exmaple
2629        --         involving an imported dfun.  We must know whether
2630        --         a dictionary binding depends on an imported dfun,
2631        --         in case we try to specialise that imported dfun
2632        --         #13429 illustrates
2633
2634-- | Flatten a set of "dumped" 'DictBind's, and some other binding
2635-- pairs, into a single recursive binding.
2636recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind
2637recWithDumpedDicts pairs dbs
2638  = (Rec bindings, fvs)
2639  where
2640    (bindings, fvs) = foldr add
2641                               ([], emptyVarSet)
2642                               (dbs `snocBag` mkDB (Rec pairs))
2643    add (NonRec b r, fvs') (pairs, fvs) =
2644      ((b,r) : pairs, fvs `unionVarSet` fvs')
2645    add (Rec prs1,   fvs') (pairs, fvs) =
2646      (prs1 ++ pairs, fvs `unionVarSet` fvs')
2647
2648snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2649-- Add ud_binds to the tail end of the bindings in uds
2650snocDictBinds uds dbs
2651  = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2652
2653consDictBind :: DictBind -> UsageDetails -> UsageDetails
2654consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2655
2656addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2657addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2658
2659snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2660snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2661
2662wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2663wrapDictBinds dbs binds
2664  = foldr add binds dbs
2665  where
2666    add (bind,_) binds = bind : binds
2667
2668wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2669wrapDictBindsE dbs expr
2670  = foldr add expr dbs
2671  where
2672    add (bind,_) expr = Let bind expr
2673
2674----------------------
2675dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2676-- Used at a lambda or case binder; just dump anything mentioning the binder
2677dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2678  | null bndrs = (uds, emptyBag)  -- Common in case alternatives
2679  | otherwise  = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2680                 (free_uds, dump_dbs)
2681  where
2682    free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2683    bndr_set = mkVarSet bndrs
2684    (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2685    free_calls = deleteCallsMentioning dump_set $   -- Drop calls mentioning bndr_set on the floor
2686                 deleteCallsFor bndrs orig_calls    -- Discard calls for bndr_set; there should be
2687                                                    -- no calls for any of the dicts in dump_dbs
2688
2689dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2690-- Used at a let(rec) binding.
2691-- We return a boolean indicating whether the binding itself is mentioned,
2692-- directly or indirectly, by any of the ud_calls; in that case we want to
2693-- float the binding itself;
2694-- See Note [Floated dictionary bindings]
2695dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2696  = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2697    (free_uds, dump_dbs, float_all)
2698  where
2699    free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2700    bndr_set = mkVarSet bndrs
2701    (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2702    free_calls = deleteCallsFor bndrs orig_calls
2703    float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2704
2705callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2706callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2707  = -- pprTrace ("callsForMe")
2708    --          (vcat [ppr fn,
2709    --                 text "Orig dbs ="     <+> ppr (_dictBindBndrs orig_dbs),
2710    --                 text "Orig calls ="   <+> ppr orig_calls,
2711    --                 text "Dep set ="      <+> ppr dep_set,
2712    --                 text "Calls for me =" <+> ppr calls_for_me]) $
2713    (uds_without_me, calls_for_me)
2714  where
2715    uds_without_me = MkUD { ud_binds = orig_dbs
2716                          , ud_calls = delDVarEnv orig_calls fn }
2717    calls_for_me = case lookupDVarEnv orig_calls fn of
2718                        Nothing -> []
2719                        Just cis -> filterCalls cis orig_dbs
2720         -- filterCalls: drop calls that (directly or indirectly)
2721         -- refer to fn.  See Note [Avoiding loops]
2722
2723----------------------
2724filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2725-- See Note [Avoiding loops]
2726filterCalls (CIS fn call_bag) dbs
2727  = filter ok_call (bagToList call_bag)
2728  where
2729    dump_set = foldl' go (unitVarSet fn) dbs
2730      -- This dump-set could also be computed by splitDictBinds
2731      --   (_,_,dump_set) = splitDictBinds dbs {fn}
2732      -- But this variant is shorter
2733
2734    go so_far (db,fvs) | fvs `intersectsVarSet` so_far
2735                       = extendVarSetList so_far (bindersOf db)
2736                       | otherwise = so_far
2737
2738    ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set)
2739
2740----------------------
2741splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2742-- splitDictBinds dbs bndrs returns
2743--   (free_dbs, dump_dbs, dump_set)
2744-- where
2745--   * dump_dbs depends, transitively on bndrs
2746--   * free_dbs does not depend on bndrs
2747--   * dump_set = bndrs `union` bndrs(dump_dbs)
2748splitDictBinds dbs bndr_set
2749   = foldl' split_db (emptyBag, emptyBag, bndr_set) dbs
2750                -- Important that it's foldl' not foldr;
2751                -- we're accumulating the set of dumped ids in dump_set
2752   where
2753    split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2754        | dump_idset `intersectsVarSet` fvs     -- Dump it
2755        = (free_dbs, dump_dbs `snocBag` db,
2756           extendVarSetList dump_idset (bindersOf bind))
2757
2758        | otherwise     -- Don't dump it
2759        = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2760
2761
2762----------------------
2763deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2764-- Remove calls *mentioning* bs in any way
2765deleteCallsMentioning bs calls
2766  = mapDVarEnv (ciSetFilter keep_call) calls
2767  where
2768    keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs)
2769
2770deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2771-- Remove calls *for* bs
2772deleteCallsFor bs calls = delDVarEnvList calls bs
2773
2774{-
2775************************************************************************
2776*                                                                      *
2777\subsubsection{Boring helper functions}
2778*                                                                      *
2779************************************************************************
2780-}
2781
2782newtype SpecM a = SpecM (State SpecState a) deriving (Functor)
2783
2784data SpecState = SpecState {
2785                     spec_uniq_supply :: UniqSupply,
2786                     spec_module :: Module,
2787                     spec_dflags :: DynFlags
2788                 }
2789
2790instance Applicative SpecM where
2791    pure x = SpecM $ return x
2792    (<*>) = ap
2793
2794instance Monad SpecM where
2795    SpecM x >>= f = SpecM $ do y <- x
2796                               case f y of
2797                                   SpecM z ->
2798                                       z
2799#if !MIN_VERSION_base(4,13,0)
2800    fail = MonadFail.fail
2801#endif
2802
2803instance MonadFail.MonadFail SpecM where
2804   fail str = SpecM $ error str
2805
2806instance MonadUnique SpecM where
2807    getUniqueSupplyM
2808        = SpecM $ do st <- get
2809                     let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2810                     put $ st { spec_uniq_supply = us2 }
2811                     return us1
2812
2813    getUniqueM
2814        = SpecM $ do st <- get
2815                     let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2816                     put $ st { spec_uniq_supply = us' }
2817                     return u
2818
2819instance HasDynFlags SpecM where
2820    getDynFlags = SpecM $ liftM spec_dflags get
2821
2822instance HasModule SpecM where
2823    getModule = SpecM $ liftM spec_module get
2824
2825runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2826runSpecM dflags this_mod (SpecM spec)
2827    = do us <- getUniqueSupplyM
2828         let initialState = SpecState {
2829                                spec_uniq_supply = us,
2830                                spec_module = this_mod,
2831                                spec_dflags = dflags
2832                            }
2833         return $ evalState spec initialState
2834
2835mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2836mapAndCombineSM _ []     = return ([], emptyUDs)
2837mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2838                              (ys, uds2) <- mapAndCombineSM f xs
2839                              return (y:ys, uds1 `plusUDs` uds2)
2840
2841extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2842extendTvSubstList env tv_binds
2843  = env { se_subst = Core.extendTvSubstList (se_subst env) tv_binds }
2844
2845substTy :: SpecEnv -> Type -> Type
2846substTy env ty = Core.substTy (se_subst env) ty
2847
2848substCo :: SpecEnv -> Coercion -> Coercion
2849substCo env co = Core.substCo (se_subst env) co
2850
2851substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2852substBndr env bs = case Core.substBndr (se_subst env) bs of
2853                      (subst', bs') -> (env { se_subst = subst' }, bs')
2854
2855substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2856substBndrs env bs = case Core.substBndrs (se_subst env) bs of
2857                      (subst', bs') -> (env { se_subst = subst' }, bs')
2858
2859cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2860-- Clone the binders of the bind; return new bind with the cloned binders
2861-- Return the substitution to use for RHSs, and the one to use for the body
2862cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2863  = do { us <- getUniqueSupplyM
2864       ; let (subst', bndr') = Core.cloneIdBndr subst us bndr
2865             interesting' | interestingDict env rhs
2866                          = interesting `extendVarSet` bndr'
2867                          | otherwise = interesting
2868       ; return (env, env { se_subst = subst', se_interesting = interesting' }
2869                , NonRec bndr' rhs) }
2870
2871cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2872  = do { us <- getUniqueSupplyM
2873       ; let (subst', bndrs') = Core.cloneRecIdBndrs subst us (map fst pairs)
2874             env' = env { se_subst = subst'
2875                        , se_interesting = interesting `extendVarSetList`
2876                                           [ v | (v,r) <- pairs, interestingDict env r ] }
2877       ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2878
2879newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2880-- Make up completely fresh binders for the dictionaries
2881-- Their bindings are going to float outwards
2882newDictBndr env b = do { uniq <- getUniqueM
2883                        ; let n   = idName b
2884                              ty' = substTy env (idType b)
2885                        ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2886
2887newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2888    -- Give the new Id a similar occurrence name to the old one
2889newSpecIdSM old_id new_ty join_arity_maybe
2890  = do  { uniq <- getUniqueM
2891        ; let name    = idName old_id
2892              new_occ = mkSpecOcc (nameOccName name)
2893              new_id  = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2894                          `asJoinId_maybe` join_arity_maybe
2895        ; return new_id }
2896
2897{-
2898                Old (but interesting) stuff about unboxed bindings
2899                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2900
2901What should we do when a value is specialised to a *strict* unboxed value?
2902
2903        map_*_* f (x:xs) = let h = f x
2904                               t = map f xs
2905                           in h:t
2906
2907Could convert let to case:
2908
2909        map_*_Int# f (x:xs) = case f x of h# ->
2910                              let t = map f xs
2911                              in h#:t
2912
2913This may be undesirable since it forces evaluation here, but the value
2914may not be used in all branches of the body. In the general case this
2915transformation is impossible since the mutual recursion in a letrec
2916cannot be expressed as a case.
2917
2918There is also a problem with top-level unboxed values, since our
2919implementation cannot handle unboxed values at the top level.
2920
2921Solution: Lift the binding of the unboxed value and extract it when it
2922is used:
2923
2924        map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2925                                  t = map f xs
2926                              in case h of
2927                                 _Lift h# -> h#:t
2928
2929Now give it to the simplifier and the _Lifting will be optimised away.
2930
2931The benefit is that we have given the specialised "unboxed" values a
2932very simple lifted semantics and then leave it up to the simplifier to
2933optimise it --- knowing that the overheads will be removed in nearly
2934all cases.
2935
2936In particular, the value will only be evaluated in the branches of the
2937program which use it, rather than being forced at the point where the
2938value is bound. For example:
2939
2940        filtermap_*_* p f (x:xs)
2941          = let h = f x
2942                t = ...
2943            in case p x of
2944                True  -> h:t
2945                False -> t
2946   ==>
2947        filtermap_*_Int# p f (x:xs)
2948          = let h = case (f x) of h# -> _Lift h#
2949                t = ...
2950            in case p x of
2951                True  -> case h of _Lift h#
2952                           -> h#:t
2953                False -> t
2954
2955The binding for h can still be inlined in the one branch and the
2956_Lifting eliminated.
2957
2958
2959Question: When won't the _Lifting be eliminated?
2960
2961Answer: When they at the top-level (where it is necessary) or when
2962inlining would duplicate work (or possibly code depending on
2963options). However, the _Lifting will still be eliminated if the
2964strictness analyser deems the lifted binding strict.
2965-}
2966