Date: 8/10/03
Author: Scott McPeak


             Convertibility and Operator Overloading


--- Introduction ---

This is a characterization of the (standard) convertibility relation
(C++ standard, section 4), and its role in operator overload analysis.
The first part builds a partial order on types, R, to describe
convertibility.  The second part uses R to assist in the implementation
of operator overload resolution.  (I refrain from saying "subtyping"
here because in C++ subtyping is context-dependent; convertibility is a
particular form of subtyping.)

What we want is a (reflexive) partial order:

  - reflexive       R(a,a)
  - transitive      R(a,b) & R(b,c) => R(a,c)
  - anti-symmetric  R(a,b) & R(b,a) => a = b
                    (equivalently, for distinct a and b,
                     R(a,b) => not R(b,a))

In this context, R(a,b) means a is convertible to b.

The reason we want a partial order is that it will help simplify the
analysis.  Among other things, transitivity is useful when trying to
select a "best" candidate from an infinite set, and anti-symmetry
avoids ambiguity through circularity (more on this below).

In what follows, I make reference to the C++ standard:

  International Organization for Standardization.
  ISO/IEC 14882:1998: Programming languages -- C++.
  International Organization for Standardization, Geneva, Switzerland,
  September 1998.

--- Atomic Types ---

The atomic types are:
  - simple types
    - char, unsigned char, signed char, wchar_t
    - bool
    - short int, unsigned short int
    - int, unsigned int
    - long int, unsigned long int
    - long long int, unsigned long long int (GNU, C99)
    - float, double, long double
    - void
  - bitfields
  - enums
  - classes (by which I mean class, struct, or union)

First, unify all the arithmetic types into a single representative, AR.
AR includes chars, bool, ints, floats, and bitfields.  All the types
inside AR are inter-convertible, so keeping them separate would defy
making convertibility a partial order.  Note that I do *not* unify them
when under a pointer constructor, e.g. int* != float*.

Nothing is convertible to or from void.  But in this context, nothing
has type void anyway, so it doesn't much matter.

Every enum type is convertible to AR, and to itself.  Nothing is
convertible to an enum besides that enum itself.

Every class type is convertible only to and from itself.

Thus, the partial order on atomic types looks like this:

                                           AR
                                         /  |  \
  class1 class2 ... classN    void    enum1 enum2 ... enumN

(Note that inheritance isn't relevant until we start talking about
pointers or references.)

--- Constructed Types ---

The constructed types are:
  - pointer
  - function
  - array
  - pointer to member
  - reference

While there is a standard conversion from function types to pointer to
function types, we can eliminate that conversion from consideration by:
  - normalizing all parameters to accept pointers to functions rather
    than functions
  - noting that conversion operators cannot yield functions
    (12.3.2 paragraph 4)

Therefore, we can regard each function type as only convertible to and
from itself.

Similarly, array types can be normalized out of parameters, and cannot
be returned from conversion operators, and so are isolated.

--- Pointers ---

Two things factor into pointer convertibility: inheritance, and
cv-qualification.

Inheritance lets C* convert to P* when C is a child of P.  (It does not
matter whether P is private or ambiguous; those concerns come after all
decisions regarding convertibility and overloading.)

cv-qualification lets const and volatile be added anywhere within a
pointer type, provided that the *resulting* type has 'const' at all
levels above the lowest changed qualifier.
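As a concrete sketch of that restriction: the check below is a
hypothetical illustration (the names CVFlags and canAddQualifiers are
not Elsa's).  The vectors hold the cv-qualifiers at each level of two
pointer types that are otherwise identical, with index 0 being the
ultimate pointed-to level; the topmost pointer's own qualifiers are not
represented, since they do not participate.

  #include <cstddef>
  #include <vector>

  enum CVFlags { CV_NONE = 0, CV_CONST = 1, CV_VOLATILE = 2 };

  // can 'src' be converted to 'dst' purely by adding qualifiers?
  bool canAddQualifiers(std::vector<int> const &src,
                        std::vector<int> const &dst)
  {
    if (src.size() != dst.size()) return false;   // not structurally similar

    for (std::size_t i = 0; i < src.size(); i++) {
      // qualifiers may be added, never removed
      if (src[i] & ~dst[i]) return false;

      // a change at level i forces 'const' at every level above it
      if (src[i] != dst[i]) {
        for (std::size_t j = i+1; j < dst.size(); j++) {
          if (!(dst[j] & CV_CONST)) return false;
        }
      }
    }
    return true;
  }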
For example, the following conversions are allowed:

  int *           ->  int const *
  int volatile *  ->  int volatile *
  int **          ->  int const * const volatile *
  int **          ->  int volatile * const *
  int ****        ->  int volatile * const * const * const *

but not

  int **          ->  int const **

These two factors combine, but inheritance only works when there is
exactly one level of pointer.  For example,

  C *        ->  P const *
  C const *  ->  P const volatile *

but not

  C **  ->  P **

or even

  C **  ->  P const * const *

even though this last example would not violate soundness of the type
system, were it allowed in the language and given the obvious
semantics.  (Perhaps a design oversight, or maybe just an intentional
simplification of an already complex language.)

The order induced by the conversion is straightforward.  For one-level
pointers, it is the product of the inheritance order and the cv subset
lattice:

  R(A cv *, B cv' *)  iff
    A inherits from B and
    cv is a subset of cv'

For two- or more-level pointers, it's the union of all cartesian
products of the cv subset lattice, plus the restriction ($) about
adding const above changed qualifiers:

  R(T cv1 * cv2 * ... cvN *,  T cv1' * cv2' * ... cvN' *)  iff
    cv1 is a subset of cv1' and
    cv2 is a subset of cv2' and
    ...
    cvN is a subset of cvN' and
    for all I in 1..N,
      ($) if cvI != cvI', then for all J in I+1..N, cvJ' contains const

Ignoring the restriction ($), we have the union of products of partial
orders, and the sets of objects they relate are disjoint, so the result
is also a partial order.  The effect of ($) is to make the relation
smaller, but it won't break transitivity anywhere (it monotonically
adds const), and reflexivity and anti-symmetry are obviously preserved.

Finally (4.10 para 2), for any T,

  R(T cv *, void cv *)

(i.e., any pointer can be converted to a void pointer).  This doesn't
compromise the partial order; it's a one-way trip to void-land.

Also note that pointers cannot convert to or from any of the atomic
types.

(TODO: What about convertibility from the null literal?  I probably
need to treat that like a class with conversions to int and T* (for any
T), though Elsa at the moment does not.  Hmm...)

--- Pointers to members ---

Pointers to members behave very much like pointers, except the order
induced by inheritance is opposite what it is for pointers:
T P::* (pointer to member of P with type T) can convert to T C::*
when C is a child of P.  The interaction with the cv-lattice is
analogous to the pointer case.

Furthermore, types constructed from a mix of pointer and pointer to
member also have this analogous behavior.  For example,

  int C::*           ->  int const C::*
  int C::* D::*      ->  int const C::* const D::*
  int C::* *         ->  int volatile C::* const *
  int * C::* * D::*  ->  int volatile * const C::* const * const D::*

For one-level pointers to members, we have the order:

  R(T cv A::*, T cv' B::*)  iff
    B inherits from A and      // reversed from the pointer case
    cv is a subset of cv'

and for multi-level and mixed we have:

  R(T cv1 P1 cv2 P2 ... cvN PN,  T cv1' P1 cv2' P2 ... cvN' PN)  iff
    cv1 is a subset of cv1' and
    cv2 is a subset of cv2' and
    ...
    cvN is a subset of cvN'

where every Pi is either * (ordinary pointer) or C::* (pointer to
member) for some class C (not necessarily the same class for every i),
and the restriction ($) about adding const above changed qualifiers
applies as before.  (This order includes the order described above for
multi-level ordinary pointers.)

Again, it's clear that we have a partial order.

--- References ---

References act like ordinary pointers.  The order on references is
obtained from the order on pointers by substituting the word
"reference" for "pointer" when "pointer" is the topmost type
constructor.
For example:

  int &           ->  int const &
  int volatile &  ->  int volatile &
  int *&          ->  int const * const &
  int *&          ->  int volatile * const &
  int ***&        ->  int volatile * const * const * const &
  C &             ->  P const &

There's no conversion to 'void &', however.  (TODO: I'm not 100% sure
about this, but it's what my implementation does for now.)

--- Order on conversions ---

Now, the above partial order on types is a key component of the
analysis, but it is not enough.  The rules for overload resolution do
not talk about comparing *types*, but rather comparing *conversions*
between types.  For example, consider 13.3.3.2, paragraph 3 (excerpt):

  "Standard conversion sequence S1 is a better conversion sequence than
  S2 if ... S1 and S2 differ only in their qualification conversion and
  yield similar types T1 and T2 (4.4), respectively, and the
  cv-qualification signature of type T1 is a proper subset of the
  cv-qualification signature of T2".

What we need is an order on conversions that takes advantage of the
order on types.  Luckily, the needed relation, call it S, appears to be
equal to the order given in 13.3.3.2, paragraphs 3 and 4:

  S("A->B", "A->C")  iff  R(B,C)                  ($$)

That is, "A->B" (the standard conversion from A to B) is better than
"A->C" if and only if B is convertible to C (using the convertibility
relation R defined above, where the atomic types are unified, etc.).
For the excerpt quoted, the order is the same: that induced by the cv
subset lattice.  (Technical point: I want S to be nonreflexive, so I
really mean the nonreflexive version of R.)

Now, proving this for all types A, B and C requires a detailed analysis
of the specific rules provided in 13.3.3.2.  I think it's right, but
I'm not completely sure.  I haven't so far been able to adequately
formalize those rules in a way that would allow a formal proof (not
that I've tried much).  In any case, it's close enough that I'm
confident if a discrepancy is found, it won't be difficult to hack
around it in the implementation.

Update: Here's one counterexample.  If the hierarchy is

      A
     /
    B
   /
  C

then the conversion "B* -> A*" is better than "C* -> A*" (13.3.3.2
para 4), but (A*,A*) is not in the nonreflexive R.  This doesn't appear
to compromise the analysis, but I'm not sure how to build the argument.

--- Application to operator overload resolution ---

Where this analysis becomes useful is when we try to implement operator
overload resolution.  The difficulty arises from section 13.6, which
says (paragraph 14, for example):

  "For every T, where T is a pointer to object type, there exist
  candidate operator functions of the form

    ptrdiff_t operator- (T, T);"

Two things make this difficult.  First, this is an infinite set of
candidate functions; we can't straightforwardly materialize all of the
candidates and hand them to overload resolution.  Second, the types of
the arguments are *correlated*; in fact, they have to be equal.

If they were not correlated (as in e.g. 13.6 paragraph 12), then we
could simply analyze the arguments separately: ambiguity in either
argument would imply ambiguity for the entire analysis.  But for
operator-, it is in fact possible to have (naive) ambiguity in *both*
arguments but still an unambiguous call:

  struct A {
    operator int* ();
    operator float* ();
  };
  struct B {
    operator float* ();
    operator char* ();
  };
  int f()
  {
    A a;
    B b;
    return a-b;        // overloaded operator call
  }

Both arguments can convert to "pointer to object type", but only one
such type (float*) can be converted-to by both arguments.
So the candidate

  ptrdiff_t operator- (float*, float*);

is the unambiguous best candidate, and the call site has implicit calls
to A::operator float*() and B::operator float*().

These examples can get more complicated, where the winning candidate
instantiates the pattern function with a type that does not explicitly
appear anywhere in the program.  For example:

  struct C {
    operator int const **();
  };
  struct D {
    operator int volatile **();
  };
  int g()
  {
    C c;
    D d;
    return c-d;        // overloaded operator call
  }

The winning candidate is

  ptrdiff_t operator- (int const volatile * const *,
                       int const volatile * const *);

even though that parameter type doesn't appear in the program.  In
fact, current versions of GCC (through 3.3 at least) mistakenly believe
the expression above is illegal, because GCC only tries (a subset of)
types that do appear in the program.

So what's the answer?  Well, consider any pair of types to which the
two arguments can (directly) convert.  If the ultimate resolution picks
those two conversions, what would the instantiation of the operator
function be?  Suppose the first argument to operator- has a conversion
to U, and the second has a conversion to V.  We need a type T such that
both U and V can convert to T (existence).  We also need (for
uniqueness) that for any T' not equal to T, such that U and V can
convert to T':

  U->T' is not better than U->T, and
  V->T' is not better than V->T, and
  either U->T is better than U->T',
      or V->T is better than V->T' (or both are better)

(These rules paraphrase the relevant parts of 13.3.3 paragraph 1.)

Applying equation ($$) above, rewrite this as

  not R(T',T), and
  not R(T',T), and
  either R(T,T') or R(T,T')

(the conditions for U and for V collapse into the same formulas).
Applying antisymmetry and simplifying, we have

  R(T,T')

Ok!  Now this is a familiar algebraic question: do U and V have a least
upper bound (LUB) w.r.t. the partial order R?  By definition, LUB(U,V)
is the T that satisfies the existence and uniqueness requirements
above.  Consequently, it is the only possible instantiation of T that
could yield a winning candidate function, if the arguments are
converted to U and V.

Based on the characterization of R given above, computing the LUB is
fairly easy.  For single-level pointers, we need the LUB in the class
hierarchy, which is straightforward to compute.  For single-level
pointers to members, we need the greatest lower bound (GLB); but see
the note about this at the end.  Then in all cases, we simply union the
cv flags (only down to the first non-pointer, non-pointer-to-member
type constructor), adding const at all levels above the deepest
difference.  That is, the LUB in R is essentially the product of the
underlying LUBs, just as the order itself is the product of underlying
orders.

But not all pairs of types have a LUB.  First, there might not be any
types to which both can convert; 'int*' and 'int' are such a pair.  In
that case, the pair cannot possibly be the pair selected for the
winning candidate; the set of viable candidates to which that
conversion pair corresponds is empty.

Second, there may not be a least element for the set of types to which
U and V can convert.  For example, if classes C1 and C2 both inherit
from P1 and P2, but P1 and P2 are not related by inheritance, then
'C1*' and 'C2*' can both convert to 'P1*' and 'P2*', but these latter
two types are not comparable under R.  When the upper bound set is not
empty, but there is no least element, resolution among the set of
candidates generated by the pattern is guaranteed to be ambiguous.
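For concreteness, here is a program exhibiting that ambiguity, using
the hierarchy just described; the wrapper classes E1 and E2 and the
function h are hypothetical names, in the style of the earlier
examples:

  struct P1 { int x; };
  struct P2 { int y; };
  struct C1 : P1, P2 { };
  struct C2 : P1, P2 { };

  struct E1 { operator C1* (); };
  struct E2 { operator C2* (); };

  int h()
  {
    E1 e1;
    E2 e2;

    // Both arguments can convert to P1* and to P2*, but P1* and P2*
    // are not comparable under R, so neither of the built-in candidates
    //   ptrdiff_t operator- (P1*, P1*);
    //   ptrdiff_t operator- (P2*, P2*);
    // beats the other.
    return e1 - e2;    // error: ambiguous overloaded operator call
  }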
To see why, select any two elements T and T' from the upper bound set,
such that T and T' do not convert to one another.  (Such a pair must
exist: the upper bound set is finite, so if all of its elements were
pairwise comparable, it would have a least element.)  There will be
candidates

  ptrdiff_t operator- (T, T);
  ptrdiff_t operator- (T', T');

and to compare the candidates, we'll have to compare conversions

  U->T  with  U->T'

and/or

  V->T  with  V->T'

but by ($$), this won't be possible.  (Just do the original reduction
to R(T,T') backwards.)

Furthermore, no other pair of conversions (U',V') can yield a candidate
that can beat either of these candidates, because it will use different
user-defined conversion functions, and hence be incomparable as
implicit conversion sequences (13.3.3.2, para 3, final bullet).  (Note
that the reason we *can* choose among user-defined conversion functions
when instantiating for a particular pair is that it's the situation
covered by 13.3.3 para 1, final bullet.  But once we're looking across
pairs, that provision no longer applies.)

However, though there cannot be a clear winner from among the pattern
(built-in) candidates if the LUB is ambiguous, there might be a
non-built-in (i.e. user-defined) candidate that is better than any of
the built-in candidates, so the algorithm must take that into account.

--- An algorithm for candidate instantiation ---

Thus, to summarize, the algorithm for generating a finite set of
built-in operator candidates, when the infinite set has correlated
types, is as follows (a rough code sketch appears below, under
"Performance, etc."):

  - consider every pair (U,V) of types to which the arguments can
    convert
  - compute LUB(U,V) w.r.t. the order R
    - if the LUB exists, instantiate the pattern with it
    - if the upper bound set is empty, do nothing with that pair
    - if the upper bound set is nonempty but has no least element,
      generate a fake candidate, both of whose arguments are converted
      by the ambiguous conversion sequence (13.3.3.1 para 10); this
      effectively summarizes the built-in candidate set

Then, run the usual overload resolution algorithm on this finite
candidate set (plus member and non-member user-defined candidates; see
13.3.1.2 paragraph 3).  If the fake candidate is chosen, report an
ambiguity.  Otherwise, if a winner arises, it is certain that that
winner is better than any of the pattern candidates, even those that
were not instantiated, because of (among other things) the transitivity
of R.

Two comments are in order regarding instantiation.  First, if
LUB(U,V) = LUB(U',V') for distinct pairs (U,V) and (U',V'), do not
instantiate the pattern twice with the same type.  The standard is
clear about there being only one candidate with a given signature;
we're computing an instantiation sub*set*, not a multiset.

Second, a given LUB type might not fit the requirements of the pattern.
Taking the operator- example, it says T is a "pointer to object type".
That means (1.8 para 1) that T is not a pointer to function type, nor a
pointer to void.  This can be implemented simply by filtering types
before instantiating, because those types are at the top of the
hierarchy; they represent the best type from a (small) set of upper
bound types, none of which are admissible to the pattern.  (The
admissibility rules always respect R, in the sense that if T is not
admissible, then neither is any T' for which R(T,T').)

--- Performance, etc. ---

The algorithm given above requires O(n*m) LUB computations, where n and
m are the number of types to which the arguments can (directly)
convert.  That is, it can be at least quadratic in the size of the
program.  In practice, memoization is a very good way to ensure this
does not become a problem.
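For concreteness, here is a rough sketch of the generation loop with
the LUB queries memoized.  All of the names (Type, Candidate,
computeLUB, upperBoundSetNonempty, isPointerToObject,
instantiateMinusPattern, makeAmbiguousCandidate) are hypothetical
stand-ins for the type checker's real machinery, and the sketch ignores
the complication, discussed below, that the pointer-to-member (GLB)
portion of the computation can change as new classes appear.

  #include <cstddef>
  #include <map>
  #include <set>
  #include <utility>
  #include <vector>

  struct Type;        // assumed canonicalized: pointer equality = type equality
  struct Candidate;

  Type const *computeLUB(Type const *u, Type const *v);   // NULL if no LUB
  bool upperBoundSetNonempty(Type const *u, Type const *v);
  bool isPointerToObject(Type const *t);                   // pattern admissibility
  Candidate *instantiateMinusPattern(Type const *t);       // ptrdiff_t operator-(T,T)
  Candidate *makeAmbiguousCandidate();

  Type const *memoizedLUB(Type const *u, Type const *v)
  {
    typedef std::pair<Type const*, Type const*> Pair;
    static std::map<Pair, Type const*> cache;              // persists across calls

    Pair key(u, v);
    std::map<Pair, Type const*>::iterator it = cache.find(key);
    if (it == cache.end()) {
      it = cache.insert(std::make_pair(key, computeLUB(u, v))).first;
    }
    return it->second;
  }

  // 'lhs' and 'rhs' are the types to which the two arguments can
  // (directly) convert; generated candidates are appended to 'out'
  void generateBuiltinMinusCandidates(std::vector<Type const*> const &lhs,
                                      std::vector<Type const*> const &rhs,
                                      std::vector<Candidate*> &out)
  {
    std::set<Type const*> used;     // never instantiate the same signature twice
    bool needFakeCandidate = false;

    for (std::size_t i = 0; i < lhs.size(); i++) {
      for (std::size_t j = 0; j < rhs.size(); j++) {
        Type const *lub = memoizedLUB(lhs[i], rhs[j]);
        if (lub) {
          if (isPointerToObject(lub) && used.insert(lub).second) {
            out.push_back(instantiateMinusPattern(lub));
          }
        }
        else if (upperBoundSetNonempty(lhs[i], rhs[j])) {
          // nonempty upper bound set with no least element: summarize
          // the whole built-in candidate set with one fake candidate
          needFakeCandidate = true;
        }
      }
    }
    if (needFakeCandidate) {
      out.push_back(makeAmbiguousCandidate());
    }
  }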
Even if a program contains classes with many conversion functions, the
program is very likely to use the same (pairs of) classes over and over
in expressions.

But the question remains: could this algorithm be improved to use fewer
LUB calls?  I don't know.  I certainly don't see how, but I also don't
see an argument (such as a constructive adversary) to say that it could
not be done.

Another interesting question: GCC's (current) algorithm appears to be
linear, but it is also wrong.  Does that matter?  Do users need or want
the correct analysis?  Might there be hidden ambiguities or other
problems due to the incomplete analysis?  Or is this a good practical
corner to cut?  I have relatively little experience with programs that
make heavy use of conversion operators, so I just don't know.

--- GLB in the class hierarchy ---

The greatest lower bound (GLB) is a bit of a pain to compute, for two
reasons.

First, this is the only place that the GLB is relevant to type checking
in C++, so the necessary map (parent -> children) doesn't already exist
in the type checker implementation.  Adding this map is not hard, but
care should be taken to ensure it isn't mistaken for the whole-program
version: any one translation unit need not contain all classes.

Second, because classes may be added at any time, GLB queries cannot be
memoized in the obvious way.  The query memoization infrastructure
needs to be able to partly invalidate its old results when new classes
are added.  (This is essentially a single-translation-unit version of
the problem identified in the previous paragraph: the GLB changes as
you see more code.)

Finally, I think the need for the GLB is evidence of a language design
mistake.  Consider this example that requires the GLB:

  struct A { int x; };            //  A   B  .
  struct B { int y; };            //   \ /   .
  struct C : A,B { int z; };      //    C    .

  // use some operator functions so that the compiler's
  // static-semantic decisions have run-time consequences
  struct PTMA {
    operator int A::* ();         // return pointer to member of A
  };
  struct PTMB {
    operator int B::* ();         // return pointer to member of B
  };

  bool f()
  {
    PTMA a;
    PTMB b;

    // equivalent to:
    //   return a.operator int A::*() == b.operator int B::*();
    // where the two arguments are converted to int C::* for comparison,
    // i.e. the chosen candidate operator function is
    //   bool operator== (int C::*, int C::*);
    return a==b;
  }

In the first place, note that the equality could never be true (at
least for non-null values) when the original classes are not related by
inheritance.  But then suppose that we follow this code (in the same
translation unit) with:

  //  A B  .
  //  |\/| .
  //  |/\| .
  //  C D  .
  struct D : A,B { int w; };

  bool g()
  {
    PTMA a;
    PTMB b;

    // error!  cannot decide whether to convert the arguments to
    // int C::* or int D::*; that is, these candidates conflict:
    //   bool operator== (int C::*, int C::*);
    //   bool operator== (int D::*, int D::*);
    return a==b;
  }

Now, function g() is no more bizarre than f() was--its comparison is
also guaranteed to be false--and yet the compiler must reject it.  The
programmer is forced to be aware of the *entire* class hierarchy
visible at each program point where the GLB might come into play, and
not just those classes (and their superclasses) that the programmer is
actually using.

It's not clear to me what the best language fix is.  I think that
pointer-to-member (with its current semantics) should not be in the
language at all (it's the one feature I would remove if I could).  But
given its presence, I don't see a quick fix to (e.g.) 13.6.  To attack
the problem directly, e.g. by ruling that
"you don't need GLB" or "you don't need to instantiate the pattern with types outside such-and-such finite set", could easily make things worse. To require that pointer-to-member conversions down the hierarchy be done with (e.g.) an explicit cast would go against the philosophy of minimal annotation for natural subtyping. It's a dark, angry corner of the language, and it does not want to behave. Side note: In Elsa, I don't implement the GLB analysis. If the classes in a single-level pointer-to-member pair aren't equal, I reject the pair (the pair does not lead to an instantiation). I pity the fool whose code this breaks. I'd be surprised if such code even exists, outside C++ test suites.