1{- 2(c) The GRASP/AQUA Project, Glasgow University, 1993-1998 3 4\section[Specialise]{Stamping out overloading, and (optionally) polymorphism} 5-} 6 7{-# LANGUAGE CPP #-} 8{-# LANGUAGE DeriveFunctor #-} 9{-# LANGUAGE ViewPatterns #-} 10module Specialise ( specProgram, specUnfolding ) where 11 12#include "HsVersions.h" 13 14import GhcPrelude 15 16import Id 17import TcType hiding( substTy ) 18import Type hiding( substTy, extendTvSubstList ) 19import Predicate 20import Module( Module, HasModule(..) ) 21import Coercion( Coercion ) 22import CoreMonad 23import qualified CoreSubst as Core 24import CoreUnfold 25import Var ( isLocalVar ) 26import VarSet 27import VarEnv 28import CoreSyn 29import Rules 30import CoreOpt ( collectBindersPushingCo ) 31import CoreUtils ( exprIsTrivial, mkCast, exprType, getIdFromTrivialExpr_maybe ) 32import CoreFVs 33import CoreArity ( etaExpandToJoinPointRule ) 34import UniqSupply 35import Name 36import MkId ( voidArgId, voidPrimId ) 37import TysPrim ( voidPrimTy ) 38import Maybes ( mapMaybe, maybeToList, isJust ) 39import MonadUtils ( foldlM ) 40import BasicTypes 41import HscTypes 42import Bag 43import DynFlags 44import Util 45import Outputable 46import FastString 47import State 48import UniqDFM 49import TyCoRep (TyCoBinder (..)) 50 51import Control.Monad 52import qualified Control.Monad.Fail as MonadFail 53 54{- 55************************************************************************ 56* * 57\subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]} 58* * 59************************************************************************ 60 61These notes describe how we implement specialisation to eliminate 62overloading. 63 64The specialisation pass works on Core 65syntax, complete with all the explicit dictionary application, 66abstraction and construction as added by the type checker. The 67existing type checker remains largely as it is. 68 69One important thought: the {\em types} passed to an overloaded 70function, and the {\em dictionaries} passed are mutually redundant. 71If the same function is applied to the same type(s) then it is sure to 72be applied to the same dictionary(s)---or rather to the same {\em 73values}. (The arguments might look different but they will evaluate 74to the same value.) 75 76Second important thought: we know that we can make progress by 77treating dictionary arguments as static and worth specialising on. So 78we can do without binding-time analysis, and instead specialise on 79dictionary arguments and no others. 80 81The basic idea 82~~~~~~~~~~~~~~ 83Suppose we have 84 85 let f = <f_rhs> 86 in <body> 87 88and suppose f is overloaded. 89 90STEP 1: CALL-INSTANCE COLLECTION 91 92We traverse <body>, accumulating all applications of f to types and 93dictionaries. 94 95(Might there be partial applications, to just some of its types and 96dictionaries? In principle yes, but in practice the type checker only 97builds applications of f to all its types and dictionaries, so partial 98applications could only arise as a result of transformation, and even 99then I think it's unlikely. In any case, we simply don't accumulate such 100partial applications.) 101 102 103STEP 2: EQUIVALENCES 104 105So now we have a collection of calls to f: 106 f t1 t2 d1 d2 107 f t3 t4 d3 d4 108 ... 109Notice that f may take several type arguments. To avoid ambiguity, we 110say that f is called at type t1/t2 and t3/t4. 111 112We take equivalence classes using equality of the *types* (ignoring 113the dictionary args, which as mentioned previously are redundant). 114 115STEP 3: SPECIALISATION 116 117For each equivalence class, choose a representative (f t1 t2 d1 d2), 118and create a local instance of f, defined thus: 119 120 f@t1/t2 = <f_rhs> t1 t2 d1 d2 121 122f_rhs presumably has some big lambdas and dictionary lambdas, so lots 123of simplification will now result. However we don't actually *do* that 124simplification. Rather, we leave it for the simplifier to do. If we 125*did* do it, though, we'd get more call instances from the specialised 126RHS. We can work out what they are by instantiating the call-instance 127set from f's RHS with the types t1, t2. 128 129Add this new id to f's IdInfo, to record that f has a specialised version. 130 131Before doing any of this, check that f's IdInfo doesn't already 132tell us about an existing instance of f at the required type/s. 133(This might happen if specialisation was applied more than once, or 134it might arise from user SPECIALIZE pragmas.) 135 136Recursion 137~~~~~~~~~ 138Wait a minute! What if f is recursive? Then we can't just plug in 139its right-hand side, can we? 140 141But it's ok. The type checker *always* creates non-recursive definitions 142for overloaded recursive functions. For example: 143 144 f x = f (x+x) -- Yes I know its silly 145 146becomes 147 148 f a (d::Num a) = let p = +.sel a d 149 in 150 letrec fl (y::a) = fl (p y y) 151 in 152 fl 153 154We still have recursion for non-overloaded functions which we 155specialise, but the recursive call should get specialised to the 156same recursive version. 157 158 159Polymorphism 1 160~~~~~~~~~~~~~~ 161 162All this is crystal clear when the function is applied to *constant 163types*; that is, types which have no type variables inside. But what if 164it is applied to non-constant types? Suppose we find a call of f at type 165t1/t2. There are two possibilities: 166 167(a) The free type variables of t1, t2 are in scope at the definition point 168of f. In this case there's no problem, we proceed just as before. A common 169example is as follows. Here's the Haskell: 170 171 g y = let f x = x+x 172 in f y + f y 173 174After typechecking we have 175 176 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x 177 in +.sel a d (f a d y) (f a d y) 178 179Notice that the call to f is at type type "a"; a non-constant type. 180Both calls to f are at the same type, so we can specialise to give: 181 182 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x 183 in +.sel a d (f@a y) (f@a y) 184 185 186(b) The other case is when the type variables in the instance types 187are *not* in scope at the definition point of f. The example we are 188working with above is a good case. There are two instances of (+.sel a d), 189but "a" is not in scope at the definition of +.sel. Can we do anything? 190Yes, we can "common them up", a sort of limited common sub-expression deal. 191This would give: 192 193 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d 194 f@a (x::a) = +.sel@a x x 195 in +.sel@a (f@a y) (f@a y) 196 197This can save work, and can't be spotted by the type checker, because 198the two instances of +.sel weren't originally at the same type. 199 200Further notes on (b) 201 202* There are quite a few variations here. For example, the defn of 203 +.sel could be floated ouside the \y, to attempt to gain laziness. 204 It certainly mustn't be floated outside the \d because the d has to 205 be in scope too. 206 207* We don't want to inline f_rhs in this case, because 208that will duplicate code. Just commoning up the call is the point. 209 210* Nothing gets added to +.sel's IdInfo. 211 212* Don't bother unless the equivalence class has more than one item! 213 214Not clear whether this is all worth it. It is of course OK to 215simply discard call-instances when passing a big lambda. 216 217Polymorphism 2 -- Overloading 218~~~~~~~~~~~~~~ 219Consider a function whose most general type is 220 221 f :: forall a b. Ord a => [a] -> b -> b 222 223There is really no point in making a version of g at Int/Int and another 224at Int/Bool, because it's only instantiating the type variable "a" which 225buys us any efficiency. Since g is completely polymorphic in b there 226ain't much point in making separate versions of g for the different 227b types. 228 229That suggests that we should identify which of g's type variables 230are constrained (like "a") and which are unconstrained (like "b"). 231Then when taking equivalence classes in STEP 2, we ignore the type args 232corresponding to unconstrained type variable. In STEP 3 we make 233polymorphic versions. Thus: 234 235 f@t1/ = /\b -> <f_rhs> t1 b d1 d2 236 237We do this. 238 239 240Dictionary floating 241~~~~~~~~~~~~~~~~~~~ 242Consider this 243 244 f a (d::Num a) = let g = ... 245 in 246 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)... 247 248Here, g is only called at one type, but the dictionary isn't in scope at the 249definition point for g. Usually the type checker would build a 250definition for d1 which enclosed g, but the transformation system 251might have moved d1's defn inward. Solution: float dictionary bindings 252outwards along with call instances. 253 254Consider 255 256 f x = let g p q = p==q 257 h r s = (r+s, g r s) 258 in 259 h x x 260 261 262Before specialisation, leaving out type abstractions we have 263 264 f df x = let g :: Eq a => a -> a -> Bool 265 g dg p q = == dg p q 266 h :: Num a => a -> a -> (a, Bool) 267 h dh r s = let deq = eqFromNum dh 268 in (+ dh r s, g deq r s) 269 in 270 h df x x 271 272After specialising h we get a specialised version of h, like this: 273 274 h' r s = let deq = eqFromNum df 275 in (+ df r s, g deq r s) 276 277But we can't naively make an instance for g from this, because deq is not in scope 278at the defn of g. Instead, we have to float out the (new) defn of deq 279to widen its scope. Notice that this floating can't be done in advance -- it only 280shows up when specialisation is done. 281 282User SPECIALIZE pragmas 283~~~~~~~~~~~~~~~~~~~~~~~ 284Specialisation pragmas can be digested by the type checker, and implemented 285by adding extra definitions along with that of f, in the same way as before 286 287 f@t1/t2 = <f_rhs> t1 t2 d1 d2 288 289Indeed the pragmas *have* to be dealt with by the type checker, because 290only it knows how to build the dictionaries d1 and d2! For example 291 292 g :: Ord a => [a] -> [a] 293 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-} 294 295Here, the specialised version of g is an application of g's rhs to the 296Ord dictionary for (Tree Int), which only the type checker can conjure 297up. There might not even *be* one, if (Tree Int) is not an instance of 298Ord! (All the other specialision has suitable dictionaries to hand 299from actual calls.) 300 301Problem. The type checker doesn't have to hand a convenient <f_rhs>, because 302it is buried in a complex (as-yet-un-desugared) binding group. 303Maybe we should say 304 305 f@t1/t2 = f* t1 t2 d1 d2 306 307where f* is the Id f with an IdInfo which says "inline me regardless!". 308Indeed all the specialisation could be done in this way. 309That in turn means that the simplifier has to be prepared to inline absolutely 310any in-scope let-bound thing. 311 312 313Again, the pragma should permit polymorphism in unconstrained variables: 314 315 h :: Ord a => [a] -> b -> b 316 {-# SPECIALIZE h :: [Int] -> b -> b #-} 317 318We *insist* that all overloaded type variables are specialised to ground types, 319(and hence there can be no context inside a SPECIALIZE pragma). 320We *permit* unconstrained type variables to be specialised to 321 - a ground type 322 - or left as a polymorphic type variable 323but nothing in between. So 324 325 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-} 326 327is *illegal*. (It can be handled, but it adds complication, and gains the 328programmer nothing.) 329 330 331SPECIALISING INSTANCE DECLARATIONS 332~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 333Consider 334 335 instance Foo a => Foo [a] where 336 ... 337 {-# SPECIALIZE instance Foo [Int] #-} 338 339The original instance decl creates a dictionary-function 340definition: 341 342 dfun.Foo.List :: forall a. Foo a -> Foo [a] 343 344The SPECIALIZE pragma just makes a specialised copy, just as for 345ordinary function definitions: 346 347 dfun.Foo.List@Int :: Foo [Int] 348 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt 349 350The information about what instance of the dfun exist gets added to 351the dfun's IdInfo in the same way as a user-defined function too. 352 353 354Automatic instance decl specialisation? 355~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 356Can instance decls be specialised automatically? It's tricky. 357We could collect call-instance information for each dfun, but 358then when we specialised their bodies we'd get new call-instances 359for ordinary functions; and when we specialised their bodies, we might get 360new call-instances of the dfuns, and so on. This all arises because of 361the unrestricted mutual recursion between instance decls and value decls. 362 363Still, there's no actual problem; it just means that we may not do all 364the specialisation we could theoretically do. 365 366Furthermore, instance decls are usually exported and used non-locally, 367so we'll want to compile enough to get those specialisations done. 368 369Lastly, there's no such thing as a local instance decl, so we can 370survive solely by spitting out *usage* information, and then reading that 371back in as a pragma when next compiling the file. So for now, 372we only specialise instance decls in response to pragmas. 373 374 375SPITTING OUT USAGE INFORMATION 376~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 377 378To spit out usage information we need to traverse the code collecting 379call-instance information for all imported (non-prelude?) functions 380and data types. Then we equivalence-class it and spit it out. 381 382This is done at the top-level when all the call instances which escape 383must be for imported functions and data types. 384 385*** Not currently done *** 386 387 388Partial specialisation by pragmas 389~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 390What about partial specialisation: 391 392 k :: (Ord a, Eq b) => [a] -> b -> b -> [a] 393 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-} 394 395or even 396 397 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-} 398 399Seems quite reasonable. Similar things could be done with instance decls: 400 401 instance (Foo a, Foo b) => Foo (a,b) where 402 ... 403 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-} 404 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-} 405 406Ho hum. Things are complex enough without this. I pass. 407 408 409Requirements for the simplifier 410~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 411The simplifier has to be able to take advantage of the specialisation. 412 413* When the simplifier finds an application of a polymorphic f, it looks in 414f's IdInfo in case there is a suitable instance to call instead. This converts 415 416 f t1 t2 d1 d2 ===> f_t1_t2 417 418Note that the dictionaries get eaten up too! 419 420* Dictionary selection operations on constant dictionaries must be 421 short-circuited: 422 423 +.sel Int d ===> +Int 424 425The obvious way to do this is in the same way as other specialised 426calls: +.sel has inside it some IdInfo which tells that if it's applied 427to the type Int then it should eat a dictionary and transform to +Int. 428 429In short, dictionary selectors need IdInfo inside them for constant 430methods. 431 432* Exactly the same applies if a superclass dictionary is being 433 extracted: 434 435 Eq.sel Int d ===> dEqInt 436 437* Something similar applies to dictionary construction too. Suppose 438dfun.Eq.List is the function taking a dictionary for (Eq a) to 439one for (Eq [a]). Then we want 440 441 dfun.Eq.List Int d ===> dEq.List_Int 442 443Where does the Eq [Int] dictionary come from? It is built in 444response to a SPECIALIZE pragma on the Eq [a] instance decl. 445 446In short, dfun Ids need IdInfo with a specialisation for each 447constant instance of their instance declaration. 448 449All this uses a single mechanism: the SpecEnv inside an Id 450 451 452What does the specialisation IdInfo look like? 453~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 454 455The SpecEnv of an Id maps a list of types (the template) to an expression 456 457 [Type] |-> Expr 458 459For example, if f has this RuleInfo: 460 461 [Int, a] -> \d:Ord Int. f' a 462 463it means that we can replace the call 464 465 f Int t ===> (\d. f' t) 466 467This chucks one dictionary away and proceeds with the 468specialised version of f, namely f'. 469 470 471What can't be done this way? 472~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 473There is no way, post-typechecker, to get a dictionary for (say) 474Eq a from a dictionary for Eq [a]. So if we find 475 476 ==.sel [t] d 477 478we can't transform to 479 480 eqList (==.sel t d') 481 482where 483 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool 484 485Of course, we currently have no way to automatically derive 486eqList, nor to connect it to the Eq [a] instance decl, but you 487can imagine that it might somehow be possible. Taking advantage 488of this is permanently ruled out. 489 490Still, this is no great hardship, because we intend to eliminate 491overloading altogether anyway! 492 493A note about non-tyvar dictionaries 494~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 495Some Ids have types like 496 497 forall a,b,c. Eq a -> Ord [a] -> tau 498 499This seems curious at first, because we usually only have dictionary 500args whose types are of the form (C a) where a is a type variable. 501But this doesn't hold for the functions arising from instance decls, 502which sometimes get arguments with types of form (C (T a)) for some 503type constructor T. 504 505Should we specialise wrt this compound-type dictionary? We used to say 506"no", saying: 507 "This is a heuristic judgement, as indeed is the fact that we 508 specialise wrt only dictionaries. We choose *not* to specialise 509 wrt compound dictionaries because at the moment the only place 510 they show up is in instance decls, where they are simply plugged 511 into a returned dictionary. So nothing is gained by specialising 512 wrt them." 513 514But it is simpler and more uniform to specialise wrt these dicts too; 515and in future GHC is likely to support full fledged type signatures 516like 517 f :: Eq [(a,b)] => ... 518 519 520************************************************************************ 521* * 522\subsubsection{The new specialiser} 523* * 524************************************************************************ 525 526Our basic game plan is this. For let(rec) bound function 527 f :: (C a, D c) => (a,b,c,d) -> Bool 528 529* Find any specialised calls of f, (f ts ds), where 530 ts are the type arguments t1 .. t4, and 531 ds are the dictionary arguments d1 .. d2. 532 533* Add a new definition for f1 (say): 534 535 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2 536 537 Note that we abstract over the unconstrained type arguments. 538 539* Add the mapping 540 541 [t1,b,t3,d] |-> \d1 d2 -> f1 b d 542 543 to the specialisations of f. This will be used by the 544 simplifier to replace calls 545 (f t1 t2 t3 t4) da db 546 by 547 (\d1 d1 -> f1 t2 t4) da db 548 549 All the stuff about how many dictionaries to discard, and what types 550 to apply the specialised function to, are handled by the fact that the 551 SpecEnv contains a template for the result of the specialisation. 552 553We don't build *partial* specialisations for f. For example: 554 555 f :: Eq a => a -> a -> Bool 556 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-} 557 558Here, little is gained by making a specialised copy of f. 559There's a distinct danger that the specialised version would 560first build a dictionary for (Eq b, Eq c), and then select the (==) 561method from it! Even if it didn't, not a great deal is saved. 562 563We do, however, generate polymorphic, but not overloaded, specialisations: 564 565 f :: Eq a => [a] -> b -> b -> b 566 ... SPECIALISE f :: [Int] -> b -> b -> b ... 567 568Hence, the invariant is this: 569 570 *** no specialised version is overloaded *** 571 572 573************************************************************************ 574* * 575\subsubsection{The exported function} 576* * 577************************************************************************ 578-} 579 580-- | Specialise calls to type-class overloaded functions occuring in a program. 581specProgram :: ModGuts -> CoreM ModGuts 582specProgram guts@(ModGuts { mg_module = this_mod 583 , mg_rules = local_rules 584 , mg_binds = binds }) 585 = do { dflags <- getDynFlags 586 587 -- Specialise the bindings of this module 588 ; (binds', uds) <- runSpecM dflags this_mod (go binds) 589 590 ; (spec_rules, spec_binds) <- specImports dflags this_mod top_env 591 local_rules uds 592 593 ; return (guts { mg_binds = spec_binds ++ binds' 594 , mg_rules = spec_rules ++ local_rules }) } 595 where 596 -- We need to start with a Subst that knows all the things 597 -- that are in scope, so that the substitution engine doesn't 598 -- accidentally re-use a unique that's already in use 599 -- Easiest thing is to do it all at once, as if all the top-level 600 -- decls were mutually recursive 601 top_env = SE { se_subst = Core.mkEmptySubst $ mkInScopeSet $ mkVarSet $ 602 bindersOfBinds binds 603 , se_interesting = emptyVarSet } 604 605 go [] = return ([], emptyUDs) 606 go (bind:binds) = do (binds', uds) <- go binds 607 (bind', uds') <- specBind top_env bind uds 608 return (bind' ++ binds', uds') 609 610{- 611Note [Wrap bindings returned by specImports] 612~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 613'specImports' returns a set of specialized bindings. However, these are lacking 614necessary floated dictionary bindings, which are returned by 615UsageDetails(ud_binds). These dictionaries need to be brought into scope with 616'wrapDictBinds' before the bindings returned by 'specImports' can be used. See, 617for instance, the 'specImports' call in 'specProgram'. 618 619 620Note [Disabling cross-module specialisation] 621~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 622Since GHC 7.10 we have performed specialisation of INLINABLE bindings living 623in modules outside of the current module. This can sometimes uncover user code 624which explodes in size when aggressively optimized. The 625-fno-cross-module-specialise option was introduced to allow users to being 626bitten by such instances to revert to the pre-7.10 behavior. 627 628See #10491 629-} 630 631 632{- ********************************************************************* 633* * 634 Specialising imported functions 635* * 636********************************************************************* -} 637 638specImports :: DynFlags -> Module -> SpecEnv 639 -> [CoreRule] 640 -> UsageDetails 641 -> CoreM ([CoreRule], [CoreBind]) 642specImports dflags this_mod top_env local_rules 643 (MkUD { ud_binds = dict_binds, ud_calls = calls }) 644 | not $ gopt Opt_CrossModuleSpecialise dflags 645 -- See Note [Disabling cross-module specialisation] 646 = return ([], wrapDictBinds dict_binds []) 647 648 | otherwise 649 = do { hpt_rules <- getRuleBase 650 ; let rule_base = extendRuleBaseList hpt_rules local_rules 651 652 ; (spec_rules, spec_binds) <- spec_imports dflags this_mod top_env 653 [] rule_base 654 dict_binds calls 655 656 -- Don't forget to wrap the specialized bindings with 657 -- bindings for the needed dictionaries. 658 -- See Note [Wrap bindings returned by specImports] 659 -- and Note [Glom the bindings if imported functions are specialised] 660 ; let final_binds 661 | null spec_binds = wrapDictBinds dict_binds [] 662 | otherwise = [Rec $ flattenBinds $ 663 wrapDictBinds dict_binds spec_binds] 664 665 ; return (spec_rules, final_binds) 666 } 667 668-- | Specialise a set of calls to imported bindings 669spec_imports :: DynFlags 670 -> Module 671 -> SpecEnv -- Passed in so that all top-level Ids are in scope 672 -> [Id] -- Stack of imported functions being specialised 673 -- See Note [specImport call stack] 674 -> RuleBase -- Rules from this module and the home package 675 -- (but not external packages, which can change) 676 -> Bag DictBind -- Dict bindings, used /only/ for filterCalls 677 -- See Note [Avoiding loops in specImports] 678 -> CallDetails -- Calls for imported things 679 -> CoreM ( [CoreRule] -- New rules 680 , [CoreBind] ) -- Specialised bindings 681spec_imports dflags this_mod top_env 682 callers rule_base dict_binds calls 683 = do { let import_calls = dVarEnvElts calls 684 -- ; debugTraceMsg (text "specImports {" <+> 685 -- vcat [ text "calls:" <+> ppr import_calls 686 -- , text "dict_binds:" <+> ppr dict_binds ]) 687 ; (rules, spec_binds) <- go rule_base import_calls 688 -- ; debugTraceMsg (text "End specImports }" <+> ppr import_calls) 689 690 ; return (rules, spec_binds) } 691 where 692 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind]) 693 go _ [] = return ([], []) 694 go rb (cis : other_calls) 695 = do { -- debugTraceMsg (text "specImport {" <+> ppr cis) 696 ; (rules1, spec_binds1) <- spec_import dflags this_mod top_env 697 callers rb dict_binds cis 698 -- ; debugTraceMsg (text "specImport }" <+> ppr cis) 699 700 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls 701 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) } 702 703spec_import :: DynFlags 704 -> Module 705 -> SpecEnv -- Passed in so that all top-level Ids are in scope 706 -> [Id] -- Stack of imported functions being specialised 707 -- See Note [specImport call stack] 708 -> RuleBase -- Rules from this module 709 -> Bag DictBind -- Dict bindings, used /only/ for filterCalls 710 -- See Note [Avoiding loops in specImports] 711 -> CallInfoSet -- Imported function and calls for it 712 -> CoreM ( [CoreRule] -- New rules 713 , [CoreBind] ) -- Specialised bindings 714spec_import dflags this_mod top_env callers 715 rb dict_binds cis@(CIS fn _) 716 | isIn "specImport" fn callers 717 = return ([], []) -- No warning. This actually happens all the time 718 -- when specialising a recursive function, because 719 -- the RHS of the specialised function contains a recursive 720 -- call to the original function 721 722 | null good_calls 723 = do { -- debugTraceMsg (text "specImport:no valid calls") 724 ; return ([], []) } 725 726 | wantSpecImport dflags unfolding 727 , Just rhs <- maybeUnfoldingTemplate unfolding 728 = do { -- Get rules from the external package state 729 -- We keep doing this in case we "page-fault in" 730 -- more rules as we go along 731 ; hsc_env <- getHscEnv 732 ; eps <- liftIO $ hscEPS hsc_env 733 ; vis_orphs <- getVisibleOrphanMods 734 ; let full_rb = unionRuleBase rb (eps_rule_base eps) 735 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn 736 737 ; (rules1, spec_pairs, MkUD { ud_binds = dict_binds1, ud_calls = new_calls }) 738 <- do { -- debugTraceMsg (text "specImport1" <+> vcat [ppr fn, ppr good_calls, ppr rhs]) 739 ; runSpecM dflags this_mod $ 740 specCalls (Just this_mod) top_env rules_for_fn good_calls fn rhs } 741 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs] 742 -- After the rules kick in we may get recursion, but 743 -- we rely on a global GlomBinds to sort that out later 744 -- See Note [Glom the bindings if imported functions are specialised] 745 746 -- Now specialise any cascaded calls 747 -- ; debugTraceMsg (text "specImport 2" <+> (ppr fn $$ ppr rules1 $$ ppr spec_binds1)) 748 ; (rules2, spec_binds2) <- spec_imports dflags this_mod top_env 749 (fn:callers) 750 (extendRuleBaseList rb rules1) 751 (dict_binds `unionBags` dict_binds1) 752 new_calls 753 754 ; let final_binds = wrapDictBinds dict_binds1 $ 755 spec_binds2 ++ spec_binds1 756 757 ; return (rules2 ++ rules1, final_binds) } 758 759 | otherwise 760 = do { tryWarnMissingSpecs dflags callers fn good_calls 761 ; return ([], [])} 762 763 where 764 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers 765 good_calls = filterCalls cis dict_binds 766 -- SUPER IMPORTANT! Drop calls that (directly or indirectly) refer to fn 767 -- See Note [Avoiding loops in specImports] 768 769-- | Returns whether or not to show a missed-spec warning. 770-- If -Wall-missed-specializations is on, show the warning. 771-- Otherwise, if -Wmissed-specializations is on, only show a warning 772-- if there is at least one imported function being specialized, 773-- and if all imported functions are marked with an inline pragma 774-- Use the most specific warning as the reason. 775tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM () 776-- See Note [Warning about missed specialisations] 777tryWarnMissingSpecs dflags callers fn calls_for_fn 778 | wopt Opt_WarnMissedSpecs dflags 779 && not (null callers) 780 && allCallersInlined = doWarn $ Reason Opt_WarnMissedSpecs 781 | wopt Opt_WarnAllMissedSpecs dflags = doWarn $ Reason Opt_WarnAllMissedSpecs 782 | otherwise = return () 783 where 784 allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers 785 doWarn reason = 786 warnMsg reason 787 (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn)) 788 2 (vcat [ text "when specialising" <+> quotes (ppr caller) 789 | caller <- callers]) 790 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) 791 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ]) 792 793wantSpecImport :: DynFlags -> Unfolding -> Bool 794-- See Note [Specialise imported INLINABLE things] 795wantSpecImport dflags unf 796 = case unf of 797 NoUnfolding -> False 798 BootUnfolding -> False 799 OtherCon {} -> False 800 DFunUnfolding {} -> True 801 CoreUnfolding { uf_src = src, uf_guidance = _guidance } 802 | gopt Opt_SpecialiseAggressively dflags -> True 803 | isStableSource src -> True 804 -- Specialise even INLINE things; it hasn't inlined yet, 805 -- so perhaps it never will. Moreover it may have calls 806 -- inside it that we want to specialise 807 | otherwise -> False -- Stable, not INLINE, hence INLINABLE 808 809{- Note [Avoiding loops in specImports] 810~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 811We must take great care when specialising instance declarations 812(functions like $fOrdList) lest we accidentally build a recursive 813dictionary. See Note [Avoiding loops]. 814 815The basic strategy of Note [Avoiding loops] is to use filterCalls 816to discard loopy specialisations. But to do that we must ensure 817that the in-scope dict-binds (passed to filterCalls) contains 818all the needed dictionary bindings. In particular, in the recursive 819call to spec_imorpts in spec_import, we must include the dict-binds 820from the parent. Lacking this caused #17151, a really nasty bug. 821 822Here is what happened. 823* Class struture: 824 Source is a superclass of Mut 825 Index is a superclass of Source 826 827* We started with these dict binds 828 dSource = $fSourcePix @Int $fIndexInt 829 dIndex = sc_sel dSource 830 dMut = $fMutPix @Int dIndex 831 and these calls to specialise 832 $fMutPix @Int dIndex 833 $fSourcePix @Int $fIndexInt 834 835* We specialised the call ($fMutPix @Int dIndex) 836 ==> new call ($fSourcePix @Int dIndex) 837 (because Source is a superclass of Mut) 838 839* We specialised ($fSourcePix @Int dIndex) 840 ==> produces specialised dict $s$fSourcePix, 841 a record with dIndex as a field 842 plus RULE forall d. ($fSourcePix @Int d) = $s$fSourcePix 843 *** This is the bogus step *** 844 845* Now we decide not to specialise the call 846 $fSourcePix @Int $fIndexInt 847 because we alredy have a RULE that matches it 848 849* Finally the simplifer rewrites 850 dSource = $fSourcePix @Int $fIndexInt 851 ==> dSource = $s$fSourcePix 852 853Disaster. Now we have 854 855Rewrite dSource's RHS to $s$fSourcePix Disaster 856 dSource = $s$fSourcePix 857 dIndex = sc_sel dSource 858 $s$fSourcePix = MkSource dIndex ... 859 860Solution: filterCalls should have stopped the bogus step, 861by seeing that dIndex transitively uses $fSourcePix. But 862it can only do that if it sees all the dict_binds. Wow. 863 864-------------- 865Here's another example (#13429). Suppose we have 866 class Monoid v => C v a where ... 867 868We start with a call 869 f @ [Integer] @ Integer $fC[]Integer 870 871Specialising call to 'f' gives dict bindings 872 $dMonoid_1 :: Monoid [Integer] 873 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer 874 875 $dC_1 :: C [Integer] (Node [Integer] Integer) 876 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1 877 878...plus a recursive call to 879 f @ [Integer] @ (Node [Integer] Integer) $dC_1 880 881Specialising that call gives 882 $dMonoid_2 :: Monoid [Integer] 883 $dMonoid_2 = M.$p1C @ [Integer] $dC_1 884 885 $dC_2 :: C [Integer] (Node [Integer] Integer) 886 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2 887 888Now we have two calls to the imported function 889 M.$fCvNode :: Monoid v => C v a 890 M.$fCvNode @v @a m = C m some_fun 891 892But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2) 893for specialisation, else we get: 894 895 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1 896 $dMonoid_2 = M.$p1C @ [Integer] $dC_1 897 $s$fCvNode = C $dMonoid_2 ... 898 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode 899 900Now use the rule to rewrite the call in the RHS of $dC_1 901and we get a loop! 902 903 904Note [specImport call stack] 905~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 906When specialising an imports function 'f', we may get new calls 907of an imported fuction 'g', which we want to specialise in turn, 908and similarly specialising 'g' might expose a new call to 'h'. 909 910We track the stack of enclosing functions. So when specialising 'h' we 911haev a specImport call stack of [g,f]. We do this for two reasons: 912* Note [Warning about missed specialisations] 913* Note [Avoiding recursive specialisation] 914 915Note [Warning about missed specialisations] 916~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 917Suppose 918 * In module Lib, you carefully mark a function 'foo' INLINABLE 919 * Import Lib(foo) into another module M 920 * Call 'foo' at some specialised type in M 921Then you jolly well expect it to be specialised in M. But what if 922'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be 923specialised too. But if 'bar' is not marked INLINABLE it may well 924not be specialised. The warning Opt_WarnMissedSpecs warns about this. 925 926It's more noisy to warning about a missed specialisation opportunity 927for /every/ overloaded imported function, but sometimes useful. That 928is what Opt_WarnAllMissedSpecs does. 929 930ToDo: warn about missed opportunities for local functions. 931 932Note [Avoiding recursive specialisation] 933~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 934When we specialise 'f' we may find new overloaded calls to 'g', 'h' in 935'f's RHS. So we want to specialise g,h. But we don't want to 936specialise f any more! It's possible that f's RHS might have a 937recursive yet-more-specialised call, so we'd diverge in that case. 938And if the call is to the same type, one specialisation is enough. 939Avoiding this recursive specialisation loop is one reason for the 940'callers' stack passed to specImports and specImport. 941 942Note [Specialise imported INLINABLE things] 943~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 944What imported functions do we specialise? The basic set is 945 * DFuns and things with INLINABLE pragmas. 946but with -fspecialise-aggressively we add 947 * Anything with an unfolding template 948 949#8874 has a good example of why we want to auto-specialise DFuns. 950 951We have the -fspecialise-aggressively flag (usually off), because we 952risk lots of orphan modules from over-vigorous specialisation. 953However it's not a big deal: anything non-recursive with an 954unfolding-template will probably have been inlined already. 955 956Note [Glom the bindings if imported functions are specialised] 957~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 958Suppose we have an imported, *recursive*, INLINABLE function 959 f :: Eq a => a -> a 960 f = /\a \d x. ...(f a d)... 961In the module being compiled we have 962 g x = f (x::Int) 963Now we'll make a specialised function 964 f_spec :: Int -> Int 965 f_spec = \x -> ...(f Int dInt)... 966 {-# RULE f Int _ = f_spec #-} 967 g = \x. f Int dInt x 968Note that f_spec doesn't look recursive 969After rewriting with the RULE, we get 970 f_spec = \x -> ...(f_spec)... 971BUT since f_spec was non-recursive before it'll *stay* non-recursive. 972The occurrence analyser never turns a NonRec into a Rec. So we must 973make sure that f_spec is recursive. Easiest thing is to make all 974the specialisations for imported bindings recursive. 975 976 977 978************************************************************************ 979* * 980\subsubsection{@specExpr@: the main function} 981* * 982************************************************************************ 983-} 984 985data SpecEnv 986 = SE { se_subst :: Core.Subst 987 -- We carry a substitution down: 988 -- a) we must clone any binding that might float outwards, 989 -- to avoid name clashes 990 -- b) we carry a type substitution to use when analysing 991 -- the RHS of specialised bindings (no type-let!) 992 993 994 , se_interesting :: VarSet 995 -- Dict Ids that we know something about 996 -- and hence may be worth specialising against 997 -- See Note [Interesting dictionary arguments] 998 } 999 1000instance Outputable SpecEnv where 1001 ppr (SE { se_subst = subst, se_interesting = interesting }) 1002 = text "SE" <+> braces (sep $ punctuate comma 1003 [ text "subst =" <+> ppr subst 1004 , text "interesting =" <+> ppr interesting ]) 1005 1006specVar :: SpecEnv -> Id -> CoreExpr 1007specVar env v = Core.lookupIdSubst (text "specVar") (se_subst env) v 1008 1009specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails) 1010 1011---------------- First the easy cases -------------------- 1012specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs) 1013specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs) 1014specExpr env (Var v) = return (specVar env v, emptyUDs) 1015specExpr _ (Lit lit) = return (Lit lit, emptyUDs) 1016specExpr env (Cast e co) 1017 = do { (e', uds) <- specExpr env e 1018 ; return ((mkCast e' (substCo env co)), uds) } 1019specExpr env (Tick tickish body) 1020 = do { (body', uds) <- specExpr env body 1021 ; return (Tick (specTickish env tickish) body', uds) } 1022 1023---------------- Applications might generate a call instance -------------------- 1024specExpr env expr@(App {}) 1025 = go expr [] 1026 where 1027 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg 1028 (fun', uds_app) <- go fun (arg':args) 1029 return (App fun' arg', uds_arg `plusUDs` uds_app) 1030 1031 go (Var f) args = case specVar env f of 1032 Var f' -> return (Var f', mkCallUDs env f' args) 1033 e' -> return (e', emptyUDs) -- I don't expect this! 1034 go other _ = specExpr env other 1035 1036---------------- Lambda/case require dumping of usage details -------------------- 1037specExpr env e@(Lam _ _) = do 1038 (body', uds) <- specExpr env' body 1039 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds 1040 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds) 1041 where 1042 (bndrs, body) = collectBinders e 1043 (env', bndrs') = substBndrs env bndrs 1044 -- More efficient to collect a group of binders together all at once 1045 -- and we don't want to split a lambda group with dumped bindings 1046 1047specExpr env (Case scrut case_bndr ty alts) 1048 = do { (scrut', scrut_uds) <- specExpr env scrut 1049 ; (scrut'', case_bndr', alts', alts_uds) 1050 <- specCase env scrut' case_bndr alts 1051 ; return (Case scrut'' case_bndr' (substTy env ty) alts' 1052 , scrut_uds `plusUDs` alts_uds) } 1053 1054---------------- Finally, let is the interesting case -------------------- 1055specExpr env (Let bind body) 1056 = do { -- Clone binders 1057 (rhs_env, body_env, bind') <- cloneBindSM env bind 1058 1059 -- Deal with the body 1060 ; (body', body_uds) <- specExpr body_env body 1061 1062 -- Deal with the bindings 1063 ; (binds', uds) <- specBind rhs_env bind' body_uds 1064 1065 -- All done 1066 ; return (foldr Let body' binds', uds) } 1067 1068-------------- 1069specLam :: SpecEnv -> [OutBndr] -> InExpr -> SpecM (OutExpr, UsageDetails) 1070-- The binders have been substituted, but the body has not 1071specLam env bndrs body 1072 | null bndrs 1073 = specExpr env body 1074 | otherwise 1075 = do { (body', uds) <- specExpr env body 1076 ; let (free_uds, dumped_dbs) = dumpUDs bndrs uds 1077 ; return (mkLams bndrs (wrapDictBindsE dumped_dbs body'), free_uds) } 1078 1079-------------- 1080specTickish :: SpecEnv -> Tickish Id -> Tickish Id 1081specTickish env (Breakpoint ix ids) 1082 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]] 1083 -- drop vars from the list if they have a non-variable substitution. 1084 -- should never happen, but it's harmless to drop them anyway. 1085specTickish _ other_tickish = other_tickish 1086 1087-------------- 1088specCase :: SpecEnv 1089 -> CoreExpr -- Scrutinee, already done 1090 -> Id -> [CoreAlt] 1091 -> SpecM ( CoreExpr -- New scrutinee 1092 , Id 1093 , [CoreAlt] 1094 , UsageDetails) 1095specCase env scrut' case_bndr [(con, args, rhs)] 1096 | isDictId case_bndr -- See Note [Floating dictionaries out of cases] 1097 , interestingDict env scrut' 1098 , not (isDeadBinder case_bndr && null sc_args') 1099 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args') 1100 1101 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg') 1102 [(con, args', Var sc_arg')] 1103 | sc_arg' <- sc_args' ] 1104 1105 -- Extend the substitution for RHS to map the *original* binders 1106 -- to their floated versions. 1107 mb_sc_flts :: [Maybe DictId] 1108 mb_sc_flts = map (lookupVarEnv clone_env) args' 1109 clone_env = zipVarEnv sc_args' sc_args_flt 1110 subst_prs = (case_bndr, Var case_bndr_flt) 1111 : [ (arg, Var sc_flt) 1112 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ] 1113 env_rhs' = env_rhs { se_subst = Core.extendIdSubstList (se_subst env_rhs) subst_prs 1114 , se_interesting = se_interesting env_rhs `extendVarSetList` 1115 (case_bndr_flt : sc_args_flt) } 1116 1117 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs 1118 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut') 1119 case_bndr_set = unitVarSet case_bndr_flt 1120 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set) 1121 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ] 1122 flt_binds = scrut_bind : sc_binds 1123 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds 1124 all_uds = flt_binds `addDictBinds` free_uds 1125 alt' = (con, args', wrapDictBindsE dumped_dbs rhs') 1126 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) } 1127 where 1128 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args) 1129 sc_args' = filter is_flt_sc_arg args' 1130 1131 clone_me bndr = do { uniq <- getUniqueM 1132 ; return (mkUserLocalOrCoVar occ uniq ty loc) } 1133 where 1134 name = idName bndr 1135 ty = idType bndr 1136 occ = nameOccName name 1137 loc = getSrcSpan name 1138 1139 arg_set = mkVarSet args' 1140 is_flt_sc_arg var = isId var 1141 && not (isDeadBinder var) 1142 && isDictTy var_ty 1143 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set) 1144 where 1145 var_ty = idType var 1146 1147 1148specCase env scrut case_bndr alts 1149 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts 1150 ; return (scrut, case_bndr', alts', uds_alts) } 1151 where 1152 (env_alt, case_bndr') = substBndr env case_bndr 1153 spec_alt (con, args, rhs) = do 1154 (rhs', uds) <- specExpr env_rhs rhs 1155 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds 1156 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds) 1157 where 1158 (env_rhs, args') = substBndrs env_alt args 1159 1160{- 1161Note [Floating dictionaries out of cases] 1162~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1163Consider 1164 g = \d. case d of { MkD sc ... -> ...(f sc)... } 1165Naively we can't float d2's binding out of the case expression, 1166because 'sc' is bound by the case, and that in turn means we can't 1167specialise f, which seems a pity. 1168 1169So we invert the case, by floating out a binding 1170for 'sc_flt' thus: 1171 sc_flt = case d of { MkD sc ... -> sc } 1172Now we can float the call instance for 'f'. Indeed this is just 1173what'll happen if 'sc' was originally bound with a let binding, 1174but case is more efficient, and necessary with equalities. So it's 1175good to work with both. 1176 1177You might think that this won't make any difference, because the 1178call instance will only get nuked by the \d. BUT if 'g' itself is 1179specialised, then transitively we should be able to specialise f. 1180 1181In general, given 1182 case e of cb { MkD sc ... -> ...(f sc)... } 1183we transform to 1184 let cb_flt = e 1185 sc_flt = case cb_flt of { MkD sc ... -> sc } 1186 in 1187 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... } 1188 1189The "_flt" things are the floated binds; we use the current substitution 1190to substitute sc -> sc_flt in the RHS 1191 1192************************************************************************ 1193* * 1194 Dealing with a binding 1195* * 1196************************************************************************ 1197-} 1198 1199specBind :: SpecEnv -- Use this for RHSs 1200 -> CoreBind -- Binders are already cloned by cloneBindSM, 1201 -- but RHSs are un-processed 1202 -> UsageDetails -- Info on how the scope of the binding 1203 -> SpecM ([CoreBind], -- New bindings 1204 UsageDetails) -- And info to pass upstream 1205 1206-- Returned UsageDetails: 1207-- No calls for binders of this bind 1208specBind rhs_env (NonRec fn rhs) body_uds 1209 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs 1210 1211 ; let zapped_fn = zapIdDemandInfo fn 1212 -- We zap the demand info because the binding may float, 1213 -- which would invaidate the demand info (see #17810 for example). 1214 -- Destroying demand info is not terrible; specialisation is 1215 -- always followed soon by demand analysis. 1216 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds zapped_fn rhs 1217 1218 ; let pairs = spec_defns ++ [(fn', rhs')] 1219 -- fn' mentions the spec_defns in its rules, 1220 -- so put the latter first 1221 1222 combined_uds = body_uds1 `plusUDs` rhs_uds 1223 1224 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds 1225 1226 final_binds :: [DictBind] 1227 -- See Note [From non-recursive to recursive] 1228 final_binds 1229 | not (isEmptyBag dump_dbs) 1230 , not (null spec_defns) 1231 = [recWithDumpedDicts pairs dump_dbs] 1232 | otherwise 1233 = [mkDB $ NonRec b r | (b,r) <- pairs] 1234 ++ bagToList dump_dbs 1235 1236 ; if float_all then 1237 -- Rather than discard the calls mentioning the bound variables 1238 -- we float this (dictionary) binding along with the others 1239 return ([], free_uds `snocDictBinds` final_binds) 1240 else 1241 -- No call in final_uds mentions bound variables, 1242 -- so we can just leave the binding here 1243 return (map fst final_binds, free_uds) } 1244 1245 1246specBind rhs_env (Rec pairs) body_uds 1247 -- Note [Specialising a recursive group] 1248 = do { let (bndrs,rhss) = unzip pairs 1249 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss 1250 ; let scope_uds = body_uds `plusUDs` rhs_uds 1251 -- Includes binds and calls arising from rhss 1252 1253 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs 1254 1255 ; (bndrs3, spec_defns3, uds3) 1256 <- if null spec_defns1 -- Common case: no specialisation 1257 then return (bndrs1, [], uds1) 1258 else do { -- Specialisation occurred; do it again 1259 (bndrs2, spec_defns2, uds2) 1260 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss) 1261 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) } 1262 1263 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3 1264 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss') 1265 dumped_dbs 1266 1267 ; if float_all then 1268 return ([], final_uds `snocDictBind` final_bind) 1269 else 1270 return ([fst final_bind], final_uds) } 1271 1272 1273--------------------------- 1274specDefns :: SpecEnv 1275 -> UsageDetails -- Info on how it is used in its scope 1276 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS 1277 -> SpecM ([OutId], -- Original Ids with RULES added 1278 [(OutId,OutExpr)], -- Extra, specialised bindings 1279 UsageDetails) -- Stuff to fling upwards from the specialised versions 1280 1281-- Specialise a list of bindings (the contents of a Rec), but flowing usages 1282-- upwards binding by binding. Example: { f = ...g ...; g = ...f .... } 1283-- Then if the input CallDetails has a specialised call for 'g', whose specialisation 1284-- in turn generates a specialised call for 'f', we catch that in this one sweep. 1285-- But not vice versa (it's a fixpoint problem). 1286 1287specDefns _env uds [] 1288 = return ([], [], uds) 1289specDefns env uds ((bndr,rhs):pairs) 1290 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs 1291 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs 1292 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) } 1293 1294--------------------------- 1295specDefn :: SpecEnv 1296 -> UsageDetails -- Info on how it is used in its scope 1297 -> OutId -> InExpr -- The thing being bound and its un-processed RHS 1298 -> SpecM (Id, -- Original Id with added RULES 1299 [(Id,CoreExpr)], -- Extra, specialised bindings 1300 UsageDetails) -- Stuff to fling upwards from the specialised versions 1301 1302specDefn env body_uds fn rhs 1303 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds 1304 rules_for_me = idCoreRules fn 1305 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me 1306 calls_for_me fn rhs 1307 ; return ( fn `addIdSpecialisations` rules 1308 , spec_defns 1309 , body_uds_without_me `plusUDs` spec_uds) } 1310 -- It's important that the `plusUDs` is this way 1311 -- round, because body_uds_without_me may bind 1312 -- dictionaries that are used in calls_for_me passed 1313 -- to specDefn. So the dictionary bindings in 1314 -- spec_uds may mention dictionaries bound in 1315 -- body_uds_without_me 1316 1317--------------------------- 1318specCalls :: Maybe Module -- Just this_mod => specialising imported fn 1319 -- Nothing => specialising local fn 1320 -> SpecEnv 1321 -> [CoreRule] -- Existing RULES for the fn 1322 -> [CallInfo] 1323 -> OutId -> InExpr 1324 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details 1325 1326-- This function checks existing rules, and does not create 1327-- duplicate ones. So the caller does not need to do this filtering. 1328-- See 'already_covered' 1329 1330type SpecInfo = ( [CoreRule] -- Specialisation rules 1331 , [(Id,CoreExpr)] -- Specialised definition 1332 , UsageDetails ) -- Usage details from specialised RHSs 1333 1334specCalls mb_mod env existing_rules calls_for_me fn rhs 1335 -- The first case is the interesting one 1336 | notNull calls_for_me -- And there are some calls to specialise 1337 && not (isNeverActive (idInlineActivation fn)) 1338 -- Don't specialise NOINLINE things 1339 -- See Note [Auto-specialisation and RULES] 1340 1341-- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small 1342-- See Note [Inline specialisation] for why we do not 1343-- switch off specialisation for inline functions 1344 1345 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $ 1346 foldlM spec_call ([], [], emptyUDs) calls_for_me 1347 1348 | otherwise -- No calls or RHS doesn't fit our preconceptions 1349 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me, 1350 text "Missed specialisation opportunity for" 1351 <+> ppr fn $$ _trace_doc ) 1352 -- Note [Specialisation shape] 1353 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $ 1354 return ([], [], emptyUDs) 1355 where 1356 _trace_doc = sep [ ppr rhs_bndrs, ppr (idInlineActivation fn) ] 1357 1358 fn_type = idType fn 1359 fn_arity = idArity fn 1360 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here 1361 inl_prag = idInlinePragma fn 1362 inl_act = inlinePragmaActivation inl_prag 1363 is_local = isLocalId fn 1364 is_dfun = isDFunId fn 1365 1366 -- Figure out whether the function has an INLINE pragma 1367 -- See Note [Inline specialisations] 1368 1369 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs 1370 -- See Note [Account for casts in binding] 1371 1372 in_scope = Core.substInScope (se_subst env) 1373 1374 already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool 1375 already_covered dflags new_rules args -- Note [Specialisations already covered] 1376 = isJust (lookupRule dflags (in_scope, realIdUnfolding) 1377 (const True) fn args 1378 (new_rules ++ existing_rules)) 1379 -- NB: we look both in the new_rules (generated by this invocation 1380 -- of specCalls), and in existing_rules (passed in to specCalls) 1381 1382 ---------------------------------------------------------- 1383 -- Specialise to one particular call pattern 1384 spec_call :: SpecInfo -- Accumulating parameter 1385 -> CallInfo -- Call instance 1386 -> SpecM SpecInfo 1387 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc) _ci@(CI { ci_key = call_args }) 1388 = -- See Note [Specialising Calls] 1389 do { let all_call_args | is_dfun = call_args ++ repeat UnspecArg 1390 | otherwise = call_args 1391 -- See Note [Specialising DFuns] 1392 ; ( useful, rhs_env2, leftover_bndrs 1393 , rule_bndrs, rule_lhs_args 1394 , spec_bndrs1, dx_binds, spec_args) <- specHeader env rhs_bndrs all_call_args 1395 1396-- ; pprTrace "spec_call" (vcat [ text "call info: " <+> ppr _ci 1397-- , text "useful: " <+> ppr useful 1398-- , text "rule_bndrs:" <+> ppr rule_bndrs 1399-- , text "lhs_args: " <+> ppr rule_lhs_args 1400-- , text "spec_bndrs:" <+> ppr spec_bndrs1 1401-- , text "spec_args: " <+> ppr spec_args 1402-- , text "dx_binds: " <+> ppr dx_binds 1403-- , text "rhs_env2: " <+> ppr (se_subst rhs_env2) 1404-- , ppr dx_binds ]) $ 1405-- return () 1406 1407 ; dflags <- getDynFlags 1408 ; if not useful -- No useful specialisation 1409 || already_covered dflags rules_acc rule_lhs_args 1410 then return spec_acc 1411 else 1412 do { -- Run the specialiser on the specialised RHS 1413 -- The "1" suffix is before we maybe add the void arg 1414 ; (spec_rhs1, rhs_uds) <- specLam rhs_env2 (spec_bndrs1 ++ leftover_bndrs) rhs_body 1415 ; let spec_fn_ty1 = exprType spec_rhs1 1416 1417 -- Maybe add a void arg to the specialised function, 1418 -- to avoid unlifted bindings 1419 -- See Note [Specialisations Must Be Lifted] 1420 -- C.f. GHC.Core.Op.WorkWrap.Lib.mkWorkerArgs 1421 add_void_arg = isUnliftedType spec_fn_ty1 && not (isJoinId fn) 1422 (spec_bndrs, spec_rhs, spec_fn_ty) 1423 | add_void_arg = ( voidPrimId : spec_bndrs1 1424 , Lam voidArgId spec_rhs1 1425 , mkVisFunTy voidPrimTy spec_fn_ty1) 1426 | otherwise = (spec_bndrs1, spec_rhs1, spec_fn_ty1) 1427 1428 join_arity_decr = length rule_lhs_args - length spec_bndrs 1429 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn 1430 = Just (orig_join_arity - join_arity_decr) 1431 | otherwise 1432 = Nothing 1433 1434 ; spec_fn <- newSpecIdSM fn spec_fn_ty spec_join_arity 1435 ; this_mod <- getModule 1436 ; let 1437 -- The rule to put in the function's specialisation is: 1438 -- forall x @b d1' d2'. 1439 -- f x @T1 @b @T2 d1' d2' = f1 x @b 1440 -- See Note [Specialising Calls] 1441 herald = case mb_mod of 1442 Nothing -- Specialising local fn 1443 -> text "SPEC" 1444 Just this_mod -- Specialising imported fn 1445 -> text "SPEC/" <> ppr this_mod 1446 1447 rule_name = mkFastString $ showSDoc dflags $ 1448 herald <+> ftext (occNameFS (getOccName fn)) 1449 <+> hsep (mapMaybe ppr_call_key_ty call_args) 1450 -- This name ends up in interface files, so use occNameString. 1451 -- Otherwise uniques end up there, making builds 1452 -- less deterministic (See #4012 comment:61 ff) 1453 1454 rule_wout_eta = mkRule 1455 this_mod 1456 True {- Auto generated -} 1457 is_local 1458 rule_name 1459 inl_act -- Note [Auto-specialisation and RULES] 1460 (idName fn) 1461 rule_bndrs 1462 rule_lhs_args 1463 (mkVarApps (Var spec_fn) spec_bndrs) 1464 1465 spec_rule 1466 = case isJoinId_maybe fn of 1467 Just join_arity -> etaExpandToJoinPointRule join_arity rule_wout_eta 1468 Nothing -> rule_wout_eta 1469 1470 -- Add the { d1' = dx1; d2' = dx2 } usage stuff 1471 -- See Note [Specialising Calls] 1472 spec_uds = foldr consDictBind rhs_uds dx_binds 1473 1474 -------------------------------------- 1475 -- Add a suitable unfolding if the spec_inl_prag says so 1476 -- See Note [Inline specialisations] 1477 (spec_inl_prag, spec_unf) 1478 | not is_local && isStrongLoopBreaker (idOccInfo fn) 1479 = (neverInlinePragma, noUnfolding) 1480 -- See Note [Specialising imported functions] in OccurAnal 1481 1482 | InlinePragma { inl_inline = Inlinable } <- inl_prag 1483 = (inl_prag { inl_inline = NoUserInline }, noUnfolding) 1484 1485 | otherwise 1486 = (inl_prag, specUnfolding dflags spec_bndrs (`mkApps` spec_args) 1487 rule_lhs_args fn_unf) 1488 1489 -------------------------------------- 1490 -- Adding arity information just propagates it a bit faster 1491 -- See Note [Arity decrease] in Simplify 1492 -- Copy InlinePragma information from the parent Id. 1493 -- So if f has INLINE[1] so does spec_fn 1494 arity_decr = count isValArg rule_lhs_args - count isId spec_bndrs 1495 spec_f_w_arity = spec_fn `setIdArity` max 0 (fn_arity - arity_decr) 1496 `setInlinePragma` spec_inl_prag 1497 `setIdUnfolding` spec_unf 1498 `asJoinId_maybe` spec_join_arity 1499 1500 _rule_trace_doc = vcat [ ppr fn <+> dcolon <+> ppr fn_type 1501 , ppr spec_fn <+> dcolon <+> ppr spec_fn_ty 1502 , ppr rhs_bndrs, ppr call_args 1503 , ppr spec_rule 1504 ] 1505 1506 ; -- pprTrace "spec_call: rule" _rule_trace_doc 1507 return ( spec_rule : rules_acc 1508 , (spec_f_w_arity, spec_rhs) : pairs_acc 1509 , spec_uds `plusUDs` uds_acc 1510 ) } } 1511 1512{- Note [Specialising DFuns] 1513~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1514DFuns have a special sort of unfolding (DFunUnfolding), and these are 1515hard to specialise a DFunUnfolding to give another DFunUnfolding 1516unless the DFun is fully applied (#18120). So, in the case of DFunIds 1517we simply extend the CallKey with trailing UnspecArgs, so we'll 1518generate a rule that completely saturates the DFun. 1519 1520There is an ASSERT that checks this, in the DFunUnfolding case of 1521GHC.Core.Unfold.specUnfolding. 1522 1523Note [Specialisation Must Preserve Sharing] 1524~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1525Consider a function: 1526 1527 f :: forall a. Eq a => a -> blah 1528 f = 1529 if expensive 1530 then f1 1531 else f2 1532 1533As written, all calls to 'f' will share 'expensive'. But if we specialise 'f' 1534at 'Int', eg: 1535 1536 $sfInt = SUBST[a->Int,dict->dEqInt] (if expensive then f1 else f2) 1537 1538 RULE "SPEC f" 1539 forall (d :: Eq Int). 1540 f Int _ = $sfIntf 1541 1542We've now lost sharing between 'f' and '$sfInt' for 'expensive'. Yikes! 1543 1544To avoid this, we only generate specialisations for functions whose arity is 1545enough to bind all of the arguments we need to specialise. This ensures our 1546specialised functions don't do any work before receiving all of their dicts, 1547and thus avoids the 'f' case above. 1548 1549Note [Specialisations Must Be Lifted] 1550~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1551Consider a function 'f': 1552 1553 f = forall a. Eq a => Array# a 1554 1555used like 1556 1557 case x of 1558 True -> ...f @Int dEqInt... 1559 False -> 0 1560 1561Naively, we might generate an (expensive) specialisation 1562 1563 $sfInt :: Array# Int 1564 1565even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to 1566the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to 1567preserve laziness. 1568 1569Note [Specialising Calls] 1570~~~~~~~~~~~~~~~~~~~~~~~~~ 1571Suppose we have a function with a complicated type: 1572 1573 f :: forall a b c. Int -> Eq a => Show b => c -> Blah 1574 f @a @b @c i dEqA dShowA x = blah 1575 1576and suppose it is called at: 1577 1578 f 7 @T1 @T2 @T3 dEqT1 ($dfShow dShowT2) t3 1579 1580This call is described as a 'CallInfo' whose 'ci_key' is: 1581 1582 [ SpecType T1, SpecType T2, UnspecType, UnspecArg, SpecDict dEqT1 1583 , SpecDict ($dfShow dShowT2), UnspecArg ] 1584 1585Why are 'a' and 'b' identified as 'SpecType', while 'c' is 'UnspecType'? 1586Because we must specialise the function on type variables that appear 1587free in its *dictionary* arguments; but not on type variables that do not 1588appear in any dictionaries, i.e. are fully polymorphic. 1589 1590Because this call has dictionaries applied, we'd like to specialise 1591the call on any type argument that appears free in those dictionaries. 1592In this case, those are [a :-> T1, b :-> T2]. 1593 1594We also need to substitute the dictionary binders with their 1595specialised dictionaries. The simplest substitution would be 1596[dEqA :-> dEqT1, dShowA :-> $dfShow dShowT2], but this duplicates 1597work, since `$dfShow dShowT2` is a function application. Therefore, we 1598also want to *float the dictionary out* (via bindAuxiliaryDict), 1599creating a new dict binding 1600 1601 dShow1 = $dfShow dShowT2 1602 1603and the substitution [dEqA :-> dEqT1, dShowA :-> dShow1]. 1604 1605With the substitutions in hand, we can generate a specialised function: 1606 1607 $sf :: forall c. Int -> c -> Blah 1608 $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah) 1609 1610Note that the substitution is applied to the whole thing. This is 1611convenient, but just slightly fragile. Notably: 1612 * There had better be no name clashes in a/b/c 1613 1614We must construct a rewrite rule: 1615 1616 RULE "SPEC f @T1 @T2 _" 1617 forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2). 1618 f @T1 @T2 @c i d1 d2 = $sf @c i 1619 1620In the rule, d1 and d2 are just wildcards, not used in the RHS. Note 1621additionally that 'x' isn't captured by this rule --- we bind only 1622enough etas in order to capture all of the *specialised* arguments. 1623 1624Note [Drop dead args from specialisations] 1625~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1626When specialising a function, it’s possible some of the arguments may 1627actually be dead. For example, consider: 1628 1629 f :: forall a. () -> Show a => a -> String 1630 f x y = show y ++ "!" 1631 1632We might generate the following CallInfo for `f @Int`: 1633 1634 [SpecType Int, UnspecArg, SpecDict $dShowInt, UnspecArg] 1635 1636Normally we’d include both the x and y arguments in the 1637specialisation, since we’re not specialising on either of them. But 1638that’s silly, since x is actually unused! So we might as well drop it 1639in the specialisation: 1640 1641 $sf :: Int -> String 1642 $sf y = show y ++ "!" 1643 1644 {-# RULE "SPEC f @Int" forall x. f @Int x $dShow = $sf #-} 1645 1646This doesn’t save us much, since the arg would be removed later by 1647worker/wrapper, anyway, but it’s easy to do. Note, however, that we 1648only drop dead arguments if: 1649 1650 1. We don’t specialise on them. 1651 2. They come before an argument we do specialise on. 1652 1653Doing the latter would require eta-expanding the RULE, which could 1654make it match less often, so it’s not worth it. Doing the former could 1655be more useful --- it would stop us from generating pointless 1656specialisations --- but it’s more involved to implement and unclear if 1657it actually provides much benefit in practice. 1658 1659Note [Zap occ info in rule binders] 1660~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1661When we generate a specialisation RULE, we need to drop occurrence 1662info on the binders. If we don’t, things go wrong when we specialise a 1663function like 1664 1665 f :: forall a. () -> Show a => a -> String 1666 f x y = show y ++ "!" 1667 1668since we’ll generate a RULE like 1669 1670 RULE "SPEC f @Int" forall x [Occ=Dead]. 1671 f @Int x $dShow = $sf 1672 1673and Core Lint complains, even though x only appears on the LHS (due to 1674Note [Drop dead args from specialisations]). 1675 1676Why is that a Lint error? Because the arguments on the LHS of a rule 1677are syntactically expressions, not patterns, so Lint treats the 1678appearance of x as a use rather than a binding. Fortunately, the 1679solution is simple: we just make sure to zap the occ info before 1680using ids as wildcard binders in a rule. 1681 1682Note [Account for casts in binding] 1683~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1684Consider 1685 f :: Eq a => a -> IO () 1686 {-# INLINABLE f 1687 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g 1688 #-} 1689 f = ... 1690 1691In f's stable unfolding we have done some modest simplification which 1692has pushed the cast to the outside. (I wonder if this is the Right 1693Thing, but it's what happens now; see SimplUtils Note [Casts and 1694lambdas].) Now that stable unfolding must be specialised, so we want 1695to push the cast back inside. It would be terrible if the cast 1696defeated specialisation! Hence the use of collectBindersPushingCo. 1697 1698Note [Evidence foralls] 1699~~~~~~~~~~~~~~~~~~~~~~~~~~ 1700Suppose (#12212) that we are specialising 1701 f :: forall a b. (Num a, F a ~ F b) => blah 1702with a=b=Int. Then the RULE will be something like 1703 RULE forall (d:Num Int) (g :: F Int ~ F Int). 1704 f Int Int d g = f_spec 1705But both varToCoreExpr (when constructing the LHS args), and the 1706simplifier (when simplifying the LHS args), will transform to 1707 RULE forall (d:Num Int) (g :: F Int ~ F Int). 1708 f Int Int d <F Int> = f_spec 1709by replacing g with Refl. So now 'g' is unbound, which results in a later 1710crash. So we use Refl right off the bat, and do not forall-quantify 'g': 1711 * varToCoreExpr generates a Refl 1712 * exprsFreeIdsList returns the Ids bound by the args, 1713 which won't include g 1714 1715You might wonder if this will match as often, but the simplifier replaces 1716complicated Refl coercions with Refl pretty aggressively. 1717 1718Note [Orphans and auto-generated rules] 1719~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1720When we specialise an INLINABLE function, or when we have 1721-fspecialise-aggressively, we auto-generate RULES that are orphans. 1722We don't want to warn about these, or we'd generate a lot of warnings. 1723Thus, we only warn about user-specified orphan rules. 1724 1725Indeed, we don't even treat the module as an orphan module if it has 1726auto-generated *rule* orphans. Orphan modules are read every time we 1727compile, so they are pretty obtrusive and slow down every compilation, 1728even non-optimised ones. (Reason: for type class instances it's a 1729type correctness issue.) But specialisation rules are strictly for 1730*optimisation* only so it's fine not to read the interface. 1731 1732What this means is that a SPEC rules from auto-specialisation in 1733module M will be used in other modules only if M.hi has been read for 1734some other reason, which is actually pretty likely. 1735 1736Note [From non-recursive to recursive] 1737~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1738Even in the non-recursive case, if any dict-binds depend on 'fn' we might 1739have built a recursive knot 1740 1741 f a d x = <blah> 1742 MkUD { ud_binds = NonRec d7 (MkD ..f..) 1743 , ud_calls = ...(f T d7)... } 1744 1745The we generate 1746 1747 Rec { fs x = <blah>[T/a, d7/d] 1748 f a d x = <blah> 1749 RULE f T _ = fs 1750 d7 = ...f... } 1751 1752Here the recursion is only through the RULE. 1753 1754However we definitely should /not/ make the Rec in this wildly common 1755case: 1756 d = ... 1757 MkUD { ud_binds = NonRec d7 (...d...) 1758 , ud_calls = ...(f T d7)... } 1759 1760Here we want simply to add d to the floats, giving 1761 MkUD { ud_binds = NonRec d (...) 1762 NonRec d7 (...d...) 1763 , ud_calls = ...(f T d7)... } 1764 1765In general, we need only make this Rec if 1766 - there are some specialisations (spec_binds non-empty) 1767 - there are some dict_binds that depend on f (dump_dbs non-empty) 1768 1769Note [Avoiding loops] 1770~~~~~~~~~~~~~~~~~~~~~ 1771When specialising /dictionary functions/ we must be very careful to 1772avoid building loops. Here is an example that bit us badly: #3591 1773 1774 class Eq a => C a 1775 instance Eq [a] => C [a] 1776 1777This translates to 1778 dfun :: Eq [a] -> C [a] 1779 dfun a d = MkD a d (meth d) 1780 1781 d4 :: Eq [T] = <blah> 1782 d2 :: C [T] = dfun T d4 1783 d1 :: Eq [T] = $p1 d2 1784 d3 :: C [T] = dfun T d1 1785 1786None of these definitions is recursive. What happened was that we 1787generated a specialisation: 1788 RULE forall d. dfun T d = dT :: C [T] 1789 dT = (MkD a d (meth d)) [T/a, d1/d] 1790 = MkD T d1 (meth d1) 1791 1792But now we use the RULE on the RHS of d2, to get 1793 d2 = dT = MkD d1 (meth d1) 1794 d1 = $p1 d2 1795 1796and now d1 is bottom! The problem is that when specialising 'dfun' we 1797should first dump "below" the binding all floated dictionary bindings 1798that mention 'dfun' itself. So d2 and d3 (and hence d1) must be 1799placed below 'dfun', and thus unavailable to it when specialising 1800'dfun'. That in turn means that the call (dfun T d1) must be 1801discarded. On the other hand, the call (dfun T d4) is fine, assuming 1802d4 doesn't mention dfun. 1803 1804Solution: 1805 Discard all calls that mention dictionaries that depend 1806 (directly or indirectly) on the dfun we are specialising. 1807 This is done by 'filterCalls' 1808 1809-------------- 1810Here's yet another example 1811 1812 class C a where { foo,bar :: [a] -> [a] } 1813 1814 instance C Int where 1815 foo x = r_bar x 1816 bar xs = reverse xs 1817 1818 r_bar :: C a => [a] -> [a] 1819 r_bar xs = bar (xs ++ xs) 1820 1821That translates to: 1822 1823 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs) 1824 1825 Rec { $fCInt :: C Int = MkC foo_help reverse 1826 foo_help (xs::[Int]) = r_bar Int $fCInt xs } 1827 1828The call (r_bar $fCInt) mentions $fCInt, 1829 which mentions foo_help, 1830 which mentions r_bar 1831But we DO want to specialise r_bar at Int: 1832 1833 Rec { $fCInt :: C Int = MkC foo_help reverse 1834 foo_help (xs::[Int]) = r_bar Int $fCInt xs 1835 1836 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs) 1837 RULE r_bar Int _ = r_bar_Int 1838 1839 r_bar_Int xs = bar Int $fCInt (xs ++ xs) 1840 } 1841 1842Note that, because of its RULE, r_bar joins the recursive 1843group. (In this case it'll unravel a short moment later.) 1844 1845 1846Note [Specialising a recursive group] 1847~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1848Consider 1849 let rec { f x = ...g x'... 1850 ; g y = ...f y'.... } 1851 in f 'a' 1852Here we specialise 'f' at Char; but that is very likely to lead to 1853a specialisation of 'g' at Char. We must do the latter, else the 1854whole point of specialisation is lost. 1855 1856But we do not want to keep iterating to a fixpoint, because in the 1857presence of polymorphic recursion we might generate an infinite number 1858of specialisations. 1859 1860So we use the following heuristic: 1861 * Arrange the rec block in dependency order, so far as possible 1862 (the occurrence analyser already does this) 1863 1864 * Specialise it much like a sequence of lets 1865 1866 * Then go through the block a second time, feeding call-info from 1867 the RHSs back in the bottom, as it were 1868 1869In effect, the ordering maxmimises the effectiveness of each sweep, 1870and we do just two sweeps. This should catch almost every case of 1871monomorphic recursion -- the exception could be a very knotted-up 1872recursion with multiple cycles tied up together. 1873 1874This plan is implemented in the Rec case of specBindItself. 1875 1876Note [Specialisations already covered] 1877~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1878We obviously don't want to generate two specialisations for the same 1879argument pattern. There are two wrinkles 1880 18811. We do the already-covered test in specDefn, not when we generate 1882the CallInfo in mkCallUDs. We used to test in the latter place, but 1883we now iterate the specialiser somewhat, and the Id at the call site 1884might therefore not have all the RULES that we can see in specDefn 1885 18862. What about two specialisations where the second is an *instance* 1887of the first? If the more specific one shows up first, we'll generate 1888specialisations for both. If the *less* specific one shows up first, 1889we *don't* currently generate a specialisation for the more specific 1890one. (See the call to lookupRule in already_covered.) Reasons: 1891 (a) lookupRule doesn't say which matches are exact (bad reason) 1892 (b) if the earlier specialisation is user-provided, it's 1893 far from clear that we should auto-specialise further 1894 1895Note [Auto-specialisation and RULES] 1896~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1897Consider: 1898 g :: Num a => a -> a 1899 g = ... 1900 1901 f :: (Int -> Int) -> Int 1902 f w = ... 1903 {-# RULE f g = 0 #-} 1904 1905Suppose that auto-specialisation makes a specialised version of 1906g::Int->Int That version won't appear in the LHS of the RULE for f. 1907So if the specialisation rule fires too early, the rule for f may 1908never fire. 1909 1910It might be possible to add new rules, to "complete" the rewrite system. 1911Thus when adding 1912 RULE forall d. g Int d = g_spec 1913also add 1914 RULE f g_spec = 0 1915 1916But that's a bit complicated. For now we ask the programmer's help, 1917by *copying the INLINE activation pragma* to the auto-specialised 1918rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule 1919will also not be active until phase 2. And that's what programmers 1920should jolly well do anyway, even aside from specialisation, to ensure 1921that g doesn't inline too early. 1922 1923This in turn means that the RULE would never fire for a NOINLINE 1924thing so not much point in generating a specialisation at all. 1925 1926Note [Specialisation shape] 1927~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1928We only specialise a function if it has visible top-level lambdas 1929corresponding to its overloading. E.g. if 1930 f :: forall a. Eq a => .... 1931then its body must look like 1932 f = /\a. \d. ... 1933 1934Reason: when specialising the body for a call (f ty dexp), we want to 1935substitute dexp for d, and pick up specialised calls in the body of f. 1936 1937This doesn't always work. One example I came across was this: 1938 newtype Gen a = MkGen{ unGen :: Int -> a } 1939 1940 choose :: Eq a => a -> Gen a 1941 choose n = MkGen (\r -> n) 1942 1943 oneof = choose (1::Int) 1944 1945It's a silly example, but we get 1946 choose = /\a. g `cast` co 1947where choose doesn't have any dict arguments. Thus far I have not 1948tried to fix this (wait till there's a real example). 1949 1950Mind you, then 'choose' will be inlined (since RHS is trivial) so 1951it doesn't matter. This comes up with single-method classes 1952 1953 class C a where { op :: a -> a } 1954 instance C a => C [a] where .... 1955==> 1956 $fCList :: C a => C [a] 1957 $fCList = $copList |> (...coercion>...) 1958 ....(uses of $fCList at particular types)... 1959 1960So we suppress the WARN if the rhs is trivial. 1961 1962Note [Inline specialisations] 1963~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1964Here is what we do with the InlinePragma of the original function 1965 * Activation/RuleMatchInfo: both transferred to the 1966 specialised function 1967 * InlineSpec: 1968 (a) An INLINE pragma is transferred 1969 (b) An INLINABLE pragma is *not* transferred 1970 1971Why (a): transfer INLINE pragmas? The point of INLINE was precisely to 1972specialise the function at its call site, and arguably that's not so 1973important for the specialised copies. BUT *pragma-directed* 1974specialisation now takes place in the typechecker/desugarer, with 1975manually specified INLINEs. The specialisation here is automatic. 1976It'd be very odd if a function marked INLINE was specialised (because 1977of some local use), and then forever after (including importing 1978modules) the specialised version wasn't INLINEd. After all, the 1979programmer said INLINE! 1980 1981You might wonder why we specialise INLINE functions at all. After 1982all they should be inlined, right? Two reasons: 1983 1984 * Even INLINE functions are sometimes not inlined, when they aren't 1985 applied to interesting arguments. But perhaps the type arguments 1986 alone are enough to specialise (even though the args are too boring 1987 to trigger inlining), and it's certainly better to call the 1988 specialised version. 1989 1990 * The RHS of an INLINE function might call another overloaded function, 1991 and we'd like to generate a specialised version of that function too. 1992 This actually happens a lot. Consider 1993 replicateM_ :: (Monad m) => Int -> m a -> m () 1994 {-# INLINABLE replicateM_ #-} 1995 replicateM_ d x ma = ... 1996 The strictness analyser may transform to 1997 replicateM_ :: (Monad m) => Int -> m a -> m () 1998 {-# INLINE replicateM_ #-} 1999 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma 2000 2001 $wreplicateM_ :: (Monad m) => Int# -> m a -> m () 2002 {-# INLINABLE $wreplicateM_ #-} 2003 $wreplicateM_ = ... 2004 Now an importing module has a specialised call to replicateM_, say 2005 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_! 2006 This particular example had a huge effect on the call to replicateM_ 2007 in nofib/shootout/n-body. 2008 2009Why (b): discard INLINABLE pragmas? See #4874 for persuasive examples. 2010Suppose we have 2011 {-# INLINABLE f #-} 2012 f :: Ord a => [a] -> Int 2013 f xs = letrec f' = ...f'... in f' 2014Then, when f is specialised and optimised we might get 2015 wgo :: [Int] -> Int# 2016 wgo = ...wgo... 2017 f_spec :: [Int] -> Int 2018 f_spec xs = case wgo xs of { r -> I# r } 2019and we clearly want to inline f_spec at call sites. But if we still 2020have the big, un-optimised of f (albeit specialised) captured in an 2021INLINABLE pragma for f_spec, we won't get that optimisation. 2022 2023So we simply drop INLINABLE pragmas when specialising. It's not really 2024a complete solution; ignoring specialisation for now, INLINABLE functions 2025don't get properly strictness analysed, for example. But it works well 2026for examples involving specialisation, which is the dominant use of 2027INLINABLE. See #4874. 2028-} 2029 2030{- ********************************************************************* 2031* * 2032 SpecArg, and specHeader 2033* * 2034********************************************************************* -} 2035 2036-- | An argument that we might want to specialise. 2037-- See Note [Specialising Calls] for the nitty gritty details. 2038data SpecArg 2039 = 2040 -- | Type arguments that should be specialised, due to appearing 2041 -- free in the type of a 'SpecDict'. 2042 SpecType Type 2043 2044 -- | Type arguments that should remain polymorphic. 2045 | UnspecType 2046 2047 -- | Dictionaries that should be specialised. mkCallUDs ensures 2048 -- that only "interesting" dictionary arguments get a SpecDict; 2049 -- see Note [Interesting dictionary arguments] 2050 | SpecDict DictExpr 2051 2052 -- | Value arguments that should not be specialised. 2053 | UnspecArg 2054 2055instance Outputable SpecArg where 2056 ppr (SpecType t) = text "SpecType" <+> ppr t 2057 ppr UnspecType = text "UnspecType" 2058 ppr (SpecDict d) = text "SpecDict" <+> ppr d 2059 ppr UnspecArg = text "UnspecArg" 2060 2061specArgFreeVars :: SpecArg -> VarSet 2062specArgFreeVars (SpecType ty) = tyCoVarsOfType ty 2063specArgFreeVars (SpecDict dx) = exprFreeVars dx 2064specArgFreeVars UnspecType = emptyVarSet 2065specArgFreeVars UnspecArg = emptyVarSet 2066 2067isSpecDict :: SpecArg -> Bool 2068isSpecDict (SpecDict {}) = True 2069isSpecDict _ = False 2070 2071-- | Given binders from an original function 'f', and the 'SpecArg's 2072-- corresponding to its usage, compute everything necessary to build 2073-- a specialisation. 2074-- 2075-- We will use the running example from Note [Specialising Calls]: 2076-- 2077-- f :: forall a b c. Int -> Eq a => Show b => c -> Blah 2078-- f @a @b @c i dEqA dShowA x = blah 2079-- 2080-- Suppose we decide to specialise it at the following pattern: 2081-- 2082-- [ SpecType T1, SpecType T2, UnspecType, UnspecArg 2083-- , SpecDict dEqT1, SpecDict ($dfShow dShowT2), UnspecArg ] 2084-- 2085-- We'd eventually like to build the RULE 2086-- 2087-- RULE "SPEC f @T1 @T2 _" 2088-- forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2). 2089-- f @T1 @T2 @c i d1 d2 = $sf @c i 2090-- 2091-- and the specialisation '$sf' 2092-- 2093-- $sf :: forall c. Int -> c -> Blah 2094-- $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah) 2095-- 2096-- where dShow1 is a floated binding created by bindAuxiliaryDict. 2097-- 2098-- The cases for 'specHeader' below are presented in the same order as this 2099-- running example. The result of 'specHeader' for this example is as follows: 2100-- 2101-- ( -- Returned arguments 2102-- env + [a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] 2103-- , [x] 2104-- 2105-- -- RULE helpers 2106-- , [c, i, d1, d2] 2107-- , [T1, T2, c, i, d1, d2] 2108-- 2109-- -- Specialised function helpers 2110-- , [c, i, x] 2111-- , [dShow1 = $dfShow dShowT2] 2112-- , [T1, T2, c, i, dEqT1, dShow1] 2113-- ) 2114specHeader 2115 :: SpecEnv 2116 -> [InBndr] -- The binders from the original function 'f' 2117 -> [SpecArg] -- From the CallInfo 2118 -> SpecM ( Bool -- True <=> some useful specialisation happened 2119 -- Not the same as any (isSpecDict args) because 2120 -- the args might be longer than bndrs 2121 2122 -- Returned arguments 2123 , SpecEnv -- Substitution to apply to the body of 'f' 2124 , [OutBndr] -- Leftover binders from the original function 'f' 2125 -- that don’t have a corresponding SpecArg 2126 2127 -- RULE helpers 2128 , [OutBndr] -- Binders for the RULE 2129 , [OutExpr] -- Args for the LHS of the rule 2130 2131 -- Specialised function helpers 2132 , [OutBndr] -- Binders for $sf 2133 , [DictBind] -- Auxiliary dictionary bindings 2134 , [OutExpr] -- Specialised arguments for unfolding 2135 -- Same length as "args for LHS of rule" 2136 ) 2137 2138-- We want to specialise on type 'T1', and so we must construct a substitution 2139-- 'a->T1', as well as a LHS argument for the resulting RULE and unfolding 2140-- details. 2141specHeader env (bndr : bndrs) (SpecType t : args) 2142 = do { let env' = extendTvSubstList env [(bndr, t)] 2143 ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args) 2144 <- specHeader env' bndrs args 2145 ; pure ( useful 2146 , env'' 2147 , leftover_bndrs 2148 , rule_bs 2149 , Type t : rule_es 2150 , bs' 2151 , dx 2152 , Type t : spec_args 2153 ) 2154 } 2155 2156-- Next we have a type that we don't want to specialise. We need to perform 2157-- a substitution on it (in case the type refers to 'a'). Additionally, we need 2158-- to produce a binder, LHS argument and RHS argument for the resulting rule, 2159-- /and/ a binder for the specialised body. 2160specHeader env (bndr : bndrs) (UnspecType : args) 2161 = do { let (env', bndr') = substBndr env bndr 2162 ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args) 2163 <- specHeader env' bndrs args 2164 ; pure ( useful 2165 , env'' 2166 , leftover_bndrs 2167 , bndr' : rule_bs 2168 , varToCoreExpr bndr' : rule_es 2169 , bndr' : bs' 2170 , dx 2171 , varToCoreExpr bndr' : spec_args 2172 ) 2173 } 2174 2175-- Next we want to specialise the 'Eq a' dict away. We need to construct 2176-- a wildcard binder to match the dictionary (See Note [Specialising Calls] for 2177-- the nitty-gritty), as a LHS rule and unfolding details. 2178specHeader env (bndr : bndrs) (SpecDict d : args) 2179 = do { bndr' <- newDictBndr env bndr -- See Note [Zap occ info in rule binders] 2180 ; let (env', dx_bind, spec_dict) = bindAuxiliaryDict env bndr bndr' d 2181 ; (_, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args) 2182 <- specHeader env' bndrs args 2183 ; pure ( True -- Ha! A useful specialisation! 2184 , env'' 2185 , leftover_bndrs 2186 -- See Note [Evidence foralls] 2187 , exprFreeIdsList (varToCoreExpr bndr') ++ rule_bs 2188 , varToCoreExpr bndr' : rule_es 2189 , bs' 2190 , maybeToList dx_bind ++ dx 2191 , spec_dict : spec_args 2192 ) 2193 } 2194 2195-- Finally, we have the unspecialised argument 'i'. We need to produce 2196-- a binder, LHS and RHS argument for the RULE, and a binder for the 2197-- specialised body. 2198-- 2199-- NB: Calls to 'specHeader' will trim off any trailing 'UnspecArg's, which is 2200-- why 'i' doesn't appear in our RULE above. But we have no guarantee that 2201-- there aren't 'UnspecArg's which come /before/ all of the dictionaries, so 2202-- this case must be here. 2203specHeader env (bndr : bndrs) (UnspecArg : args) 2204 = do { -- see Note [Zap occ info in rule binders] 2205 let (env', bndr') = substBndr env (zapIdOccInfo bndr) 2206 ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args) 2207 <- specHeader env' bndrs args 2208 ; pure ( useful 2209 , env'' 2210 , leftover_bndrs 2211 , bndr' : rule_bs 2212 , varToCoreExpr bndr' : rule_es 2213 , if isDeadBinder bndr 2214 then bs' -- see Note [Drop dead args from specialisations] 2215 else bndr' : bs' 2216 , dx 2217 , varToCoreExpr bndr' : spec_args 2218 ) 2219 } 2220 2221-- If we run out of binders, stop immediately 2222-- See Note [Specialisation Must Preserve Sharing] 2223specHeader env [] _ = pure (False, env, [], [], [], [], [], []) 2224 2225-- Return all remaining binders from the original function. These have the 2226-- invariant that they should all correspond to unspecialised arguments, so 2227-- it's safe to stop processing at this point. 2228specHeader env bndrs [] 2229 = pure (False, env', bndrs', [], [], [], [], []) 2230 where 2231 (env', bndrs') = substBndrs env bndrs 2232 2233 2234-- | Binds a dictionary argument to a fresh name, to preserve sharing 2235bindAuxiliaryDict 2236 :: SpecEnv 2237 -> InId -> OutId -> OutExpr -- Original dict binder, and the witnessing expression 2238 -> ( SpecEnv -- Substitute for orig_dict_id 2239 , Maybe DictBind -- Auxiliary dict binding, if any 2240 , OutExpr) -- Witnessing expression (always trivial) 2241bindAuxiliaryDict env@(SE { se_subst = subst, se_interesting = interesting }) 2242 orig_dict_id fresh_dict_id dict_expr 2243 2244 -- If the dictionary argument is trivial, 2245 -- don’t bother creating a new dict binding; just substitute 2246 | Just dict_id <- getIdFromTrivialExpr_maybe dict_expr 2247 = let env' = env { se_subst = Core.extendSubst subst orig_dict_id dict_expr 2248 `Core.extendInScope` dict_id 2249 -- See Note [Keep the old dictionaries interesting] 2250 , se_interesting = interesting `extendVarSet` dict_id } 2251 in (env', Nothing, dict_expr) 2252 2253 | otherwise -- Non-trivial dictionary arg; make an auxiliary binding 2254 = let dict_bind = mkDB (NonRec fresh_dict_id dict_expr) 2255 env' = env { se_subst = Core.extendSubst subst orig_dict_id (Var fresh_dict_id) 2256 `Core.extendInScope` fresh_dict_id 2257 -- See Note [Make the new dictionaries interesting] 2258 , se_interesting = interesting `extendVarSet` fresh_dict_id } 2259 in (env', Just dict_bind, Var fresh_dict_id) 2260 2261{- 2262Note [Make the new dictionaries interesting] 2263~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2264Important! We're going to substitute dx_id1 for d 2265and we want it to look "interesting", else we won't gather *any* 2266consequential calls. E.g. 2267 f d = ...g d.... 2268If we specialise f for a call (f (dfun dNumInt)), we'll get 2269a consequent call (g d') with an auxiliary definition 2270 d' = df dNumInt 2271We want that consequent call to look interesting 2272 2273Note [Keep the old dictionaries interesting] 2274~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2275In bindAuxiliaryDict, we don’t bother creating a new dict binding if 2276the dict expression is trivial. For example, if we have 2277 2278 f = \ @m1 (d1 :: Monad m1) -> ... 2279 2280and we specialize it at the pattern 2281 2282 [SpecType IO, SpecArg $dMonadIO] 2283 2284it would be silly to create a new binding for $dMonadIO; it’s already 2285a binding! So we just extend the substitution directly: 2286 2287 m1 :-> IO 2288 d1 :-> $dMonadIO 2289 2290But this creates a new subtlety: the dict expression might be a dict 2291binding we floated out while specializing another function. For 2292example, we might have 2293 2294 d2 = $p1Monad $dMonadIO -- floated out by bindAuxiliaryDict 2295 $sg = h @IO d2 2296 h = \ @m2 (d2 :: Applicative m2) -> ... 2297 2298and end up specializing h at the following pattern: 2299 2300 [SpecType IO, SpecArg d2] 2301 2302When we created the d2 binding in the first place, we locally marked 2303it as interesting while specializing g as described above by 2304Note [Make the new dictionaries interesting]. But when we go to 2305specialize h, it isn’t in the SpecEnv anymore, so we’ve lost the 2306knowledge that we should specialize on it. 2307 2308To fix this, we have to explicitly add d2 *back* to the interesting 2309set. That way, it will still be considered interesting while 2310specializing the body of h. See !2913. 2311-} 2312 2313 2314{- ********************************************************************* 2315* * 2316 UsageDetails and suchlike 2317* * 2318********************************************************************* -} 2319 2320data UsageDetails 2321 = MkUD { 2322 ud_binds :: !(Bag DictBind), 2323 -- See Note [Floated dictionary bindings] 2324 -- The order is important; 2325 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1 2326 -- (Remember, Bags preserve order in GHC.) 2327 2328 ud_calls :: !CallDetails 2329 2330 -- INVARIANT: suppose bs = bindersOf ud_binds 2331 -- Then 'calls' may *mention* 'bs', 2332 -- but there should be no calls *for* bs 2333 } 2334 2335-- | A 'DictBind' is a binding along with a cached set containing its free 2336-- variables (both type variables and dictionaries) 2337type DictBind = (CoreBind, VarSet) 2338 2339{- Note [Floated dictionary bindings] 2340~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2341We float out dictionary bindings for the reasons described under 2342"Dictionary floating" above. But not /just/ dictionary bindings. 2343Consider 2344 2345 f :: Eq a => blah 2346 f a d = rhs 2347 2348 $c== :: T -> T -> Bool 2349 $c== x y = ... 2350 2351 $df :: Eq T 2352 $df = Eq $c== ... 2353 2354 gurgle = ...(f @T $df)... 2355 2356We gather the call info for (f @T $df), and we don't want to drop it 2357when we come across the binding for $df. So we add $df to the floats 2358and continue. But then we have to add $c== to the floats, and so on. 2359These all float above the binding for 'f', and and now we can 2360successfully specialise 'f'. 2361 2362So the DictBinds in (ud_binds :: Bag DictBind) may contain 2363non-dictionary bindings too. 2364-} 2365 2366instance Outputable UsageDetails where 2367 ppr (MkUD { ud_binds = dbs, ud_calls = calls }) 2368 = text "MkUD" <+> braces (sep (punctuate comma 2369 [text "binds" <+> equals <+> ppr dbs, 2370 text "calls" <+> equals <+> ppr calls])) 2371 2372emptyUDs :: UsageDetails 2373emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv } 2374 2375------------------------------------------------------------ 2376type CallDetails = DIdEnv CallInfoSet 2377 -- The order of specialized binds and rules depends on how we linearize 2378 -- CallDetails, so to get determinism we must use a deterministic set here. 2379 -- See Note [Deterministic UniqFM] in UniqDFM 2380 2381data CallInfoSet = CIS Id (Bag CallInfo) 2382 -- The list of types and dictionaries is guaranteed to 2383 -- match the type of f 2384 -- The Bag may contain duplicate calls (i.e. f @T and another f @T) 2385 -- These dups are eliminated by already_covered in specCalls 2386 2387data CallInfo 2388 = CI { ci_key :: [SpecArg] -- All arguments 2389 , ci_fvs :: VarSet -- Free vars of the ci_key 2390 -- call (including tyvars) 2391 -- [*not* include the main id itself, of course] 2392 } 2393 2394type DictExpr = CoreExpr 2395 2396ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet 2397ciSetFilter p (CIS id a) = CIS id (filterBag p a) 2398 2399instance Outputable CallInfoSet where 2400 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn) 2401 2 (ppr map) 2402 2403pprCallInfo :: Id -> CallInfo -> SDoc 2404pprCallInfo fn (CI { ci_key = key }) 2405 = ppr fn <+> ppr key 2406 2407ppr_call_key_ty :: SpecArg -> Maybe SDoc 2408ppr_call_key_ty (SpecType ty) = Just $ char '@' <+> pprParendType ty 2409ppr_call_key_ty UnspecType = Just $ char '_' 2410ppr_call_key_ty (SpecDict _) = Nothing 2411ppr_call_key_ty UnspecArg = Nothing 2412 2413instance Outputable CallInfo where 2414 ppr (CI { ci_key = key, ci_fvs = fvs }) 2415 = text "CI" <> braces (hsep [ fsep (mapMaybe ppr_call_key_ty key), ppr fvs ]) 2416 2417unionCalls :: CallDetails -> CallDetails -> CallDetails 2418unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2 2419 2420unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet 2421unionCallInfoSet (CIS f calls1) (CIS _ calls2) = 2422 CIS f (calls1 `unionBags` calls2) 2423 2424callDetailsFVs :: CallDetails -> VarSet 2425callDetailsFVs calls = 2426 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls 2427 -- It's OK to use nonDetFoldUDFM here because we forget the ordering 2428 -- immediately by converting to a nondeterministic set. 2429 2430callInfoFVs :: CallInfoSet -> VarSet 2431callInfoFVs (CIS _ call_info) = 2432 foldr (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info 2433 2434getTheta :: [TyCoBinder] -> [PredType] 2435getTheta = fmap tyBinderType . filter isInvisibleBinder . filter (not . isNamedBinder) 2436 2437 2438------------------------------------------------------------ 2439singleCall :: Id -> [SpecArg] -> UsageDetails 2440singleCall id args 2441 = MkUD {ud_binds = emptyBag, 2442 ud_calls = unitDVarEnv id $ CIS id $ 2443 unitBag (CI { ci_key = args -- used to be tys 2444 , ci_fvs = call_fvs }) } 2445 where 2446 call_fvs = foldr (unionVarSet . specArgFreeVars) emptyVarSet args 2447 -- The type args (tys) are guaranteed to be part of the dictionary 2448 -- types, because they are just the constrained types, 2449 -- and the dictionary is therefore sure to be bound 2450 -- inside the binding for any type variables free in the type; 2451 -- hence it's safe to neglect tyvars free in tys when making 2452 -- the free-var set for this call 2453 -- BUT I don't trust this reasoning; play safe and include tys_fvs 2454 -- 2455 -- We don't include the 'id' itself. 2456 2457mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails 2458mkCallUDs env f args 2459 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ]) 2460 res 2461 where 2462 res = mkCallUDs' env f args 2463 2464mkCallUDs' env f args 2465 | not (want_calls_for f) -- Imported from elsewhere 2466 || null ci_key -- No useful specialisation 2467 -- See also Note [Specialisations already covered] 2468 = -- pprTrace "mkCallUDs: discarding" _trace_doc 2469 emptyUDs 2470 2471 | otherwise 2472 = -- pprTrace "mkCallUDs: keeping" _trace_doc 2473 singleCall f ci_key 2474 where 2475 _trace_doc = vcat [ppr f, ppr args, ppr ci_key] 2476 pis = fst $ splitPiTys $ idType f 2477 constrained_tyvars = tyCoVarsOfTypes $ getTheta pis 2478 2479 ci_key :: [SpecArg] 2480 ci_key = dropWhileEndLE (not . isSpecDict) $ 2481 zipWith mk_spec_arg args pis 2482 -- Drop trailing args until we get to a SpecDict 2483 -- In this way the RULE has as few args as possible, 2484 -- which broadens its applicability, since rules only 2485 -- fire when saturated 2486 2487 mk_spec_arg :: CoreExpr -> TyCoBinder -> SpecArg 2488 mk_spec_arg arg (Named bndr) 2489 | binderVar bndr `elemVarSet` constrained_tyvars 2490 = case arg of 2491 Type ty -> SpecType ty 2492 _ -> pprPanic "ci_key" $ ppr arg 2493 | otherwise = UnspecType 2494 2495 -- For "InvisArg", which are the type-class dictionaries, 2496 -- we decide on a case by case basis if we want to specialise 2497 -- on this argument; if so, SpecDict, if not UnspecArg 2498 mk_spec_arg arg (Anon InvisArg pred) 2499 | type_determines_value pred 2500 , interestingDict env arg -- Note [Interesting dictionary arguments] 2501 = SpecDict arg 2502 | otherwise = UnspecArg 2503 2504 mk_spec_arg _ (Anon VisArg _) 2505 = UnspecArg 2506 2507 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f)) 2508 -- For imported things, we gather call instances if 2509 -- there is an unfolding that we could in principle specialise 2510 -- We might still decide not to use it (consulting dflags) 2511 -- in specImports 2512 -- Use 'realIdUnfolding' to ignore the loop-breaker flag! 2513 2514 type_determines_value pred -- See Note [Type determines value] 2515 = case classifyPredType pred of 2516 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs 2517 EqPred {} -> True 2518 IrredPred {} -> True -- Things like (D []) where D is a 2519 -- Constraint-ranged family; #7785 2520 ForAllPred {} -> True 2521 2522{- 2523Note [Type determines value] 2524~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2525Only specialise on non-IP *class* params, because these are the ones 2526whose *type* determines their *value*. In particular, with implicit 2527params, the type args *don't* say what the value of the implicit param 2528is! See #7101. 2529 2530So we treat implicit params just like ordinary arguments for the 2531purposes of specialisation. Note that we still want to specialise 2532functions with implicit params if they have *other* dicts which are 2533class params; see #17930. 2534 2535One apparent additional complexity involves type families. For 2536example, consider 2537 type family D (v::*->*) :: Constraint 2538 type instance D [] = () 2539 f :: D v => v Char -> Int 2540If we see a call (f "foo"), we'll pass a "dictionary" 2541 () |> (g :: () ~ D []) 2542and it's good to specialise f at this dictionary. 2543 2544So the question is: can an implicit parameter "hide inside" a 2545type-family constraint like (D a). Well, no. We don't allow 2546 type instance D Maybe = ?x:Int 2547Hence the IrredPred case in type_determines_value. See #7785. 2548 2549Note [Interesting dictionary arguments] 2550~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2551Consider this 2552 \a.\d:Eq a. let f = ... in ...(f d)... 2553There really is not much point in specialising f wrt the dictionary d, 2554because the code for the specialised f is not improved at all, because 2555d is lambda-bound. We simply get junk specialisations. 2556 2557What is "interesting"? Just that it has *some* structure. But what about 2558variables? 2559 2560 * A variable might be imported, in which case its unfolding 2561 will tell us whether it has useful structure 2562 2563 * Local variables are cloned on the way down (to avoid clashes when 2564 we float dictionaries), and cloning drops the unfolding 2565 (cloneIdBndr). Moreover, we make up some new bindings, and it's a 2566 nuisance to give them unfoldings. So we keep track of the 2567 "interesting" dictionaries as a VarSet in SpecEnv. 2568 We have to take care to put any new interesting dictionary 2569 bindings in the set. 2570 2571We accidentally lost accurate tracking of local variables for a long 2572time, because cloned variables don't have unfoldings. But makes a 2573massive difference in a few cases, eg #5113. For nofib as a 2574whole it's only a small win: 2.2% improvement in allocation for ansi, 25751.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size. 2576-} 2577 2578interestingDict :: SpecEnv -> CoreExpr -> Bool 2579-- A dictionary argument is interesting if it has *some* structure 2580-- NB: "dictionary" arguments include constraints of all sorts, 2581-- including equality constraints; hence the Coercion case 2582interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v) 2583 || isDataConWorkId v 2584 || v `elemVarSet` se_interesting env 2585interestingDict _ (Type _) = False 2586interestingDict _ (Coercion _) = False 2587interestingDict env (App fn (Type _)) = interestingDict env fn 2588interestingDict env (App fn (Coercion _)) = interestingDict env fn 2589interestingDict env (Tick _ a) = interestingDict env a 2590interestingDict env (Cast e _) = interestingDict env e 2591interestingDict _ _ = True 2592 2593plusUDs :: UsageDetails -> UsageDetails -> UsageDetails 2594plusUDs (MkUD {ud_binds = db1, ud_calls = calls1}) 2595 (MkUD {ud_binds = db2, ud_calls = calls2}) 2596 = MkUD { ud_binds = db1 `unionBags` db2 2597 , ud_calls = calls1 `unionCalls` calls2 } 2598 2599----------------------------- 2600_dictBindBndrs :: Bag DictBind -> [Id] 2601_dictBindBndrs dbs = foldr ((++) . bindersOf . fst) [] dbs 2602 2603-- | Construct a 'DictBind' from a 'CoreBind' 2604mkDB :: CoreBind -> DictBind 2605mkDB bind = (bind, bind_fvs bind) 2606 2607-- | Identify the free variables of a 'CoreBind' 2608bind_fvs :: CoreBind -> VarSet 2609bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs) 2610bind_fvs (Rec prs) = foldl' delVarSet rhs_fvs bndrs 2611 where 2612 bndrs = map fst prs 2613 rhs_fvs = unionVarSets (map pair_fvs prs) 2614 2615pair_fvs :: (Id, CoreExpr) -> VarSet 2616pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs 2617 `unionVarSet` idFreeVars bndr 2618 -- idFreeVars: don't forget variables mentioned in 2619 -- the rules of the bndr. C.f. OccAnal.addRuleUsage 2620 -- Also tyvars mentioned in its type; they may not appear 2621 -- in the RHS 2622 -- type T a = Int 2623 -- x :: T a = 3 2624 where 2625 interesting :: InterestingVarFun 2626 interesting v = isLocalVar v || (isId v && isDFunId v) 2627 -- Very important: include DFunIds /even/ if it is imported 2628 -- Reason: See Note [Avoiding loops], the second exmaple 2629 -- involving an imported dfun. We must know whether 2630 -- a dictionary binding depends on an imported dfun, 2631 -- in case we try to specialise that imported dfun 2632 -- #13429 illustrates 2633 2634-- | Flatten a set of "dumped" 'DictBind's, and some other binding 2635-- pairs, into a single recursive binding. 2636recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind 2637recWithDumpedDicts pairs dbs 2638 = (Rec bindings, fvs) 2639 where 2640 (bindings, fvs) = foldr add 2641 ([], emptyVarSet) 2642 (dbs `snocBag` mkDB (Rec pairs)) 2643 add (NonRec b r, fvs') (pairs, fvs) = 2644 ((b,r) : pairs, fvs `unionVarSet` fvs') 2645 add (Rec prs1, fvs') (pairs, fvs) = 2646 (prs1 ++ pairs, fvs `unionVarSet` fvs') 2647 2648snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails 2649-- Add ud_binds to the tail end of the bindings in uds 2650snocDictBinds uds dbs 2651 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs } 2652 2653consDictBind :: DictBind -> UsageDetails -> UsageDetails 2654consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds } 2655 2656addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails 2657addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds } 2658 2659snocDictBind :: UsageDetails -> DictBind -> UsageDetails 2660snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind } 2661 2662wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind] 2663wrapDictBinds dbs binds 2664 = foldr add binds dbs 2665 where 2666 add (bind,_) binds = bind : binds 2667 2668wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr 2669wrapDictBindsE dbs expr 2670 = foldr add expr dbs 2671 where 2672 add (bind,_) expr = Let bind expr 2673 2674---------------------- 2675dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind) 2676-- Used at a lambda or case binder; just dump anything mentioning the binder 2677dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls }) 2678 | null bndrs = (uds, emptyBag) -- Common in case alternatives 2679 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $ 2680 (free_uds, dump_dbs) 2681 where 2682 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls } 2683 bndr_set = mkVarSet bndrs 2684 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set 2685 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor 2686 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be 2687 -- no calls for any of the dicts in dump_dbs 2688 2689dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool) 2690-- Used at a let(rec) binding. 2691-- We return a boolean indicating whether the binding itself is mentioned, 2692-- directly or indirectly, by any of the ud_calls; in that case we want to 2693-- float the binding itself; 2694-- See Note [Floated dictionary bindings] 2695dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls }) 2696 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $ 2697 (free_uds, dump_dbs, float_all) 2698 where 2699 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls } 2700 bndr_set = mkVarSet bndrs 2701 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set 2702 free_calls = deleteCallsFor bndrs orig_calls 2703 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls 2704 2705callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo]) 2706callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls }) 2707 = -- pprTrace ("callsForMe") 2708 -- (vcat [ppr fn, 2709 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs), 2710 -- text "Orig calls =" <+> ppr orig_calls, 2711 -- text "Dep set =" <+> ppr dep_set, 2712 -- text "Calls for me =" <+> ppr calls_for_me]) $ 2713 (uds_without_me, calls_for_me) 2714 where 2715 uds_without_me = MkUD { ud_binds = orig_dbs 2716 , ud_calls = delDVarEnv orig_calls fn } 2717 calls_for_me = case lookupDVarEnv orig_calls fn of 2718 Nothing -> [] 2719 Just cis -> filterCalls cis orig_dbs 2720 -- filterCalls: drop calls that (directly or indirectly) 2721 -- refer to fn. See Note [Avoiding loops] 2722 2723---------------------- 2724filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo] 2725-- See Note [Avoiding loops] 2726filterCalls (CIS fn call_bag) dbs 2727 = filter ok_call (bagToList call_bag) 2728 where 2729 dump_set = foldl' go (unitVarSet fn) dbs 2730 -- This dump-set could also be computed by splitDictBinds 2731 -- (_,_,dump_set) = splitDictBinds dbs {fn} 2732 -- But this variant is shorter 2733 2734 go so_far (db,fvs) | fvs `intersectsVarSet` so_far 2735 = extendVarSetList so_far (bindersOf db) 2736 | otherwise = so_far 2737 2738 ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set) 2739 2740---------------------- 2741splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet) 2742-- splitDictBinds dbs bndrs returns 2743-- (free_dbs, dump_dbs, dump_set) 2744-- where 2745-- * dump_dbs depends, transitively on bndrs 2746-- * free_dbs does not depend on bndrs 2747-- * dump_set = bndrs `union` bndrs(dump_dbs) 2748splitDictBinds dbs bndr_set 2749 = foldl' split_db (emptyBag, emptyBag, bndr_set) dbs 2750 -- Important that it's foldl' not foldr; 2751 -- we're accumulating the set of dumped ids in dump_set 2752 where 2753 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs) 2754 | dump_idset `intersectsVarSet` fvs -- Dump it 2755 = (free_dbs, dump_dbs `snocBag` db, 2756 extendVarSetList dump_idset (bindersOf bind)) 2757 2758 | otherwise -- Don't dump it 2759 = (free_dbs `snocBag` db, dump_dbs, dump_idset) 2760 2761 2762---------------------- 2763deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails 2764-- Remove calls *mentioning* bs in any way 2765deleteCallsMentioning bs calls 2766 = mapDVarEnv (ciSetFilter keep_call) calls 2767 where 2768 keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs) 2769 2770deleteCallsFor :: [Id] -> CallDetails -> CallDetails 2771-- Remove calls *for* bs 2772deleteCallsFor bs calls = delDVarEnvList calls bs 2773 2774{- 2775************************************************************************ 2776* * 2777\subsubsection{Boring helper functions} 2778* * 2779************************************************************************ 2780-} 2781 2782newtype SpecM a = SpecM (State SpecState a) deriving (Functor) 2783 2784data SpecState = SpecState { 2785 spec_uniq_supply :: UniqSupply, 2786 spec_module :: Module, 2787 spec_dflags :: DynFlags 2788 } 2789 2790instance Applicative SpecM where 2791 pure x = SpecM $ return x 2792 (<*>) = ap 2793 2794instance Monad SpecM where 2795 SpecM x >>= f = SpecM $ do y <- x 2796 case f y of 2797 SpecM z -> 2798 z 2799#if !MIN_VERSION_base(4,13,0) 2800 fail = MonadFail.fail 2801#endif 2802 2803instance MonadFail.MonadFail SpecM where 2804 fail str = SpecM $ error str 2805 2806instance MonadUnique SpecM where 2807 getUniqueSupplyM 2808 = SpecM $ do st <- get 2809 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st 2810 put $ st { spec_uniq_supply = us2 } 2811 return us1 2812 2813 getUniqueM 2814 = SpecM $ do st <- get 2815 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st 2816 put $ st { spec_uniq_supply = us' } 2817 return u 2818 2819instance HasDynFlags SpecM where 2820 getDynFlags = SpecM $ liftM spec_dflags get 2821 2822instance HasModule SpecM where 2823 getModule = SpecM $ liftM spec_module get 2824 2825runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a 2826runSpecM dflags this_mod (SpecM spec) 2827 = do us <- getUniqueSupplyM 2828 let initialState = SpecState { 2829 spec_uniq_supply = us, 2830 spec_module = this_mod, 2831 spec_dflags = dflags 2832 } 2833 return $ evalState spec initialState 2834 2835mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails) 2836mapAndCombineSM _ [] = return ([], emptyUDs) 2837mapAndCombineSM f (x:xs) = do (y, uds1) <- f x 2838 (ys, uds2) <- mapAndCombineSM f xs 2839 return (y:ys, uds1 `plusUDs` uds2) 2840 2841extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv 2842extendTvSubstList env tv_binds 2843 = env { se_subst = Core.extendTvSubstList (se_subst env) tv_binds } 2844 2845substTy :: SpecEnv -> Type -> Type 2846substTy env ty = Core.substTy (se_subst env) ty 2847 2848substCo :: SpecEnv -> Coercion -> Coercion 2849substCo env co = Core.substCo (se_subst env) co 2850 2851substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr) 2852substBndr env bs = case Core.substBndr (se_subst env) bs of 2853 (subst', bs') -> (env { se_subst = subst' }, bs') 2854 2855substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr]) 2856substBndrs env bs = case Core.substBndrs (se_subst env) bs of 2857 (subst', bs') -> (env { se_subst = subst' }, bs') 2858 2859cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind) 2860-- Clone the binders of the bind; return new bind with the cloned binders 2861-- Return the substitution to use for RHSs, and the one to use for the body 2862cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs) 2863 = do { us <- getUniqueSupplyM 2864 ; let (subst', bndr') = Core.cloneIdBndr subst us bndr 2865 interesting' | interestingDict env rhs 2866 = interesting `extendVarSet` bndr' 2867 | otherwise = interesting 2868 ; return (env, env { se_subst = subst', se_interesting = interesting' } 2869 , NonRec bndr' rhs) } 2870 2871cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs) 2872 = do { us <- getUniqueSupplyM 2873 ; let (subst', bndrs') = Core.cloneRecIdBndrs subst us (map fst pairs) 2874 env' = env { se_subst = subst' 2875 , se_interesting = interesting `extendVarSetList` 2876 [ v | (v,r) <- pairs, interestingDict env r ] } 2877 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) } 2878 2879newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr 2880-- Make up completely fresh binders for the dictionaries 2881-- Their bindings are going to float outwards 2882newDictBndr env b = do { uniq <- getUniqueM 2883 ; let n = idName b 2884 ty' = substTy env (idType b) 2885 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) } 2886 2887newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id 2888 -- Give the new Id a similar occurrence name to the old one 2889newSpecIdSM old_id new_ty join_arity_maybe 2890 = do { uniq <- getUniqueM 2891 ; let name = idName old_id 2892 new_occ = mkSpecOcc (nameOccName name) 2893 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name) 2894 `asJoinId_maybe` join_arity_maybe 2895 ; return new_id } 2896 2897{- 2898 Old (but interesting) stuff about unboxed bindings 2899 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2900 2901What should we do when a value is specialised to a *strict* unboxed value? 2902 2903 map_*_* f (x:xs) = let h = f x 2904 t = map f xs 2905 in h:t 2906 2907Could convert let to case: 2908 2909 map_*_Int# f (x:xs) = case f x of h# -> 2910 let t = map f xs 2911 in h#:t 2912 2913This may be undesirable since it forces evaluation here, but the value 2914may not be used in all branches of the body. In the general case this 2915transformation is impossible since the mutual recursion in a letrec 2916cannot be expressed as a case. 2917 2918There is also a problem with top-level unboxed values, since our 2919implementation cannot handle unboxed values at the top level. 2920 2921Solution: Lift the binding of the unboxed value and extract it when it 2922is used: 2923 2924 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h# 2925 t = map f xs 2926 in case h of 2927 _Lift h# -> h#:t 2928 2929Now give it to the simplifier and the _Lifting will be optimised away. 2930 2931The benefit is that we have given the specialised "unboxed" values a 2932very simple lifted semantics and then leave it up to the simplifier to 2933optimise it --- knowing that the overheads will be removed in nearly 2934all cases. 2935 2936In particular, the value will only be evaluated in the branches of the 2937program which use it, rather than being forced at the point where the 2938value is bound. For example: 2939 2940 filtermap_*_* p f (x:xs) 2941 = let h = f x 2942 t = ... 2943 in case p x of 2944 True -> h:t 2945 False -> t 2946 ==> 2947 filtermap_*_Int# p f (x:xs) 2948 = let h = case (f x) of h# -> _Lift h# 2949 t = ... 2950 in case p x of 2951 True -> case h of _Lift h# 2952 -> h#:t 2953 False -> t 2954 2955The binding for h can still be inlined in the one branch and the 2956_Lifting eliminated. 2957 2958 2959Question: When won't the _Lifting be eliminated? 2960 2961Answer: When they at the top-level (where it is necessary) or when 2962inlining would duplicate work (or possibly code depending on 2963options). However, the _Lifting will still be eliminated if the 2964strictness analyser deems the lifted binding strict. 2965-} 2966