Up

C++ Entities and Relationships

This page contains data model diagrams useful for understanding the C++ language. They show the various concepts in C++ static (compile-time) semantics, and their interrelationships.

One use of these data models is to help understand the C++ Standard document. It collects together the essential notions in the language, what components they have, how they are related, etc. The data model is intended to be sufficiently complete that it could serve as the basis for a formal definition of the static semantics.

Another use is to understand the Elsa implementation. Not surprisingly, many concepts in the data models correspond to prominent classes in the Elsa implementation. However, the correspondence is not exact; in some cases, there are deviations due to known bugs in Elsa, while in others Elsa simply uses an implementation technique that may obscure a given concept. Nevertheless, the intent is that Elsa should use a data model that could (in principle) be mapped back and forth to the model presented here.

That the names used in these diagrams do not correspond exactly with either the terminology in the standard (which is often unnecessarily vague) nor that used in Elsa (which makes fewer distinctions, for example among kinds of Variables). However, there is an overall attempt at terminological consistency.

Notation

The following diagrams use a notation that it is a mixture of traditional Entity-Relationship diagrams and UML, with some of my own ideas thrown in:

Diagram 1: Scopes, Variables and Types

The following diagram (xfig sources: er.fig, postscript: er.ps) shows the essential static semantic concepts, such as scopes, variables and types. It omits templates, and it omits executable AST fragments like statements and expressions.

C++ Static Semantics ER diagram

This diagram has four primary dimensions:

Name

Names [3p4] denote Variables. As explained in [3p7], there are three kinds of Names: identifiers, operators, and conversions. Names are extensional: two Names are the same if they have the same attributes and named relationships, and they are immutable once created.

Identifier

An Identifier is a Name consisting of a nonempty sequence of source characters.

OperatorName

An OperatorName is a Name consisting of the keyword operator followed by one of the overloadable operators [13.5p1].

ConversionName

A ConversionName is a Name consisting of the keyword operator followed by a Type [12.3.2].

Scope

A Scope ("declarative region") [3.3] is a set of Variables. Name lookup [3.4] searches for a Name amongst a certain sequence of Scopes, yielding the set of Variables that have the given Name. There are rules [3.3p4] that restrict the set of Variables that may have the same Name and Scope; the data model itself does not enforce them.

The Standard's notion of the "potential scope of a variable" is induced by the timing of the sequence of updates to Scopes corresponding to processing declarations. I have chosen not to include an explicit notion of time or point of declaration in this data model, but I may reconsider.

(Currently missing from data model: function scopes, which contain the labels for goto statements.)

Namespace

A Namespace [3.3.5, 7.3] is a scope outside any class or function. They include the GlobalScope. Namespaces can be connected by UsingDirectives [7.3.4].

BlockScope

A BlockScope ("local scope") [3.3.2] contains local variables in a function. FunctionDefinitions include a BlockScope that spans their body.

PrototypeScope

A PrototypeScope [3.3.3] contains the parameters of a function declarator that is not the declarator of a function definition.

ClassType (as a Scope)

A ClassType [9] includes the scope [3.3.6] that contains the class members. It is described further in another entry below.

Variable

A Variable (a mixture of Standard concepts "entity" [3p3] and "declaration" [3p5]) is something that can be found by lookup. Every Variable is a member of exactly one Scope.

A Variable's name is both optional and mutable. It is optional because function and template parameters may be declared without names [8.3.5p8, 14.1p1]. It is mutable because subsequent declarations may add or change a name. Variables without names are not subject to the rules of [3.3p4] that restrict same-named Variables in a Scope.

NamespaceAlias

A NamespaceAlias [7.3.2] provides an alternate name for a Namespace.

Function

A Function (Variable) is a unique name for a FunctionDefinition, which is a fragment of executable AST. A Function may or may not have a definition, and that aspect may change during processing of a translation unit.

Enumerator

An Enumerator [7.2] is one of the elements of an Enumeration.

UsingAlias

A UsingAlias is introduced by a using-declaration [7.3.3]. When a lookup finds a UsingAlias, it transparently follows the "target" field.

DataVariable

A DataVariable denotes a piece of named run-time data. It can have any type other than a FunctionType (or void). DataVariables are partitioned:

ClassMember

A ClassMember is any Variable other than an Enumerator that is a member of a ClassType Scope. Such Variables have an "access" attribute that is public, protected or private. The access attribute of an Enumerator in a ClassType Scope can be determined by examining its type.

ImplicitlyDeclaredMemberFunction

An ImplicitlyDeclaredMemberFunction is a constructor, destructor or copy assignment operator that is implicitly declared [12p1].

TypeName

A TypeName names a type for purposes of lookup.

Typedef

A Typedef [7.1.3] is a type name in a scope, introduced via the typedef keyword.

InjectedClassName

An InjectedClassName [9p2] is an alias for a ClassType within the scope of that ClassType.

NamedAtomicType

A NamedAtomicType is both a TypeName and an AtomicType. That is, it can be found by lookup, and it can be used to form constructed ("compound") types and be cv-qualified.

ClassType

A ClassType is a class or struct or union; the "keyword" field distinguishes these. It has a sequence of base classes, each of which has an "access" specifier and may be virtual. It may also have a ClassDefinition, which is an AST fragment. The presence of a ClassDefinition distinguishes forward declarations [3.4.4, 7.1.5.3, 9.2p2], and is also useful for implementing templates.

A ClassType has a set of constructors and a destructor. They are found by direct examination of the ClassType, not by lookup, because they do not have names [12.1p1, 12.4(?)]. The data model allows both to be missing to accomodate the analysis of the class itself, which does not know in advance how to make them.

Currently, ClassType is called "CompoundType" in Elsa.

Enumeration

An Enumeration [7.2] is a type consisting of a set of named constants, called enumerators.

Type

A Type [3.9, 8.3] is an entity or concept that is used within the static semantics to approximate the dynamic semantics. In the data model here, "Type" denotes those types constructed by applying type constructors and/or cv-qualifiers [3.9.2, 8.3] to other Types or AtomicTypes. Types are extensional: two Types are equivalent if they have the same structure (attributes and relationships), and they are immutable once created.

FunctionType

A FunctionType [8.3.5] is the type of a function. It has a return type, parameters, and an optional exception specification. Parameter types are equivalent if they have the same cv-unqualified type; that is, the name and the top-level cv-qualification of parameter types are not significant for type equivalence.

Even though the syntax of function declarators includes an exception specification, such specifications are only allowed when part of the certain declarations [15.4p1].

MethodType

The type of a nonstatic member function, what I call a method, is a FunctionType that has a receiver parameter. The receiver is always a reference to a (possibly cv-qualified) class type; the cv-qualification of the receiver records the cv-qualification of the member function declarator [8.3.5p4].

DataType

An DataType is a Type that is not a FunctionType. Note that the related Standard term "object type" [3.9p9] does not include references or void.

CVAtomicType

A CVAtomicType is a leaf in a Type tree. It refers to an AtomicType, and possibly cv-qualifies [3.9.3] it.

IndirectionType

IndirectionType is the common ancestor for the types that are constructed primarily from one other type.

PointerType

A PointerType [8.3.1] is the type of a pointer to some other type, possibly with cv-qualification. The cv-qualifiers apply to the pointer itself, not the referent (the type pointed to).

PointerToMemberType

A PointerToMemberType [8.3.3] is the type of a pointer to a nonstatic class member. In addition to cv-qualifiers, it names the class of which the referent is a member.

ReferenceType

A ReferenceType [8.3.2] is the type of a reference to some other type. It cannot be cv-qualified.

ArrayType

An ArrayType [8.3.4] is the type of an array. It does not (necessarily) have a known size.

SizedArrayType

A SizedArrayType is the type of an array with known size.

AtomicType

An AtomicType is a type which is not built from type constructors. Their identity is physical, not structural: two AtomicTypes can be different even when they have the same attributes and relationships.

NamedAtomicType (as an AtomicType)

NamedAtomicType was already discussed above. However, in this context it is worth explaining that even though class types are in some sense built by applying a "type constructor", this data model and the C++ language regard the class { ... } syntax as creating a new type, not equivalent to another type built with the same syntax.

FundamentalType

A FundamentalType [3.9.1] is one of the following:

Note that the data model does not imply any orthogonality between modifiers like signed and "basic" types like int.

The FundamentalTypes are further subdivided into categories [3.9.1] not reflected here.

Diagram 2: Templates

This diagram (xfig sources: template_er.fig, postscript: template_er.ps) is an extension of the previous diagram, showing all the new and extended concepts needed to represent templates.

C++ Templates diagram

One point of terminology: whereas the Standard refers to "template parameters" (etc.), I refer to "meta-parameters". I use the term "template" only for templates themselves, and use the prefix "meta-" (or "Meta") for the parameters and values involved in defining and using templates.

This diagram has five primary dimensions. (They are all colored black because assigning color seemed unnecessary, due to minimal overlap, and difficult to keep compatible with the first diagram.)

This diagram makes use of what I call non-orthogonal inheritance products. This means that a set of entities is derived by multiplying two inheritance hierarchies together, but then giving attributes and relationships to the product tuples independent of their factors. For example, an IntegerMetaParameter has an "AtomicType", attribute/relationship, but this is not common to MetaParameters nor IntegerMetaEntities.

Though non-orthogonal product tuples are individually named, their derivation as product tuples is still available (though implicit), and this is used to define the concept of corresponding relationships, shown with dashed arrows. For example, every NameMetaParameter has an optional "val" relationship with a NameMetaValue. There are corresponding "val" relationships for the other kinds of MetaValues. These relationships are consistent with respect to the meta-entity-kind axis, so it is well-defined to introduce a "val" relationship between MetaParameters and MetaValues: given a MetaParameter, find out which kind it is, and retrieve its "val", which must be a MetaValue. This makes it possible to generically refer to the default value (when present) of a MetaParameter as being a MetaValue, while at the same time having the "val" relationships be well-typed w.r.t. the meta-entity-kind axis. A similar technique is used for the specializations/template relationship.

The diagram has essentially two parts. The central part, comprising the five major dimensions above, is the data model required to represent templates and specializations themselves. Then, sprinkled around the edges, is the second part which consists of miscellanous extensions and refinements of Diagram 1 necessary to integrate template concepts into the core data model.

Terminology:

MetaParameter

A MetaParameter is a parameter of a template [14.1]. By supplying a MetaValue for the MetaParameter, the template can be instantiated to produce a concrete entity. Note that it is not a full meta-variable; its value cannot be changed once set [14.1p6].

A MetaParameter has an optional default value. As explained above, the kind of MetaValue is dependent on the kind of MetaParameter, according to a parallel decomposition of each along the meta-entity-kind axis.

MetaBinding

A MetaParameter is a parameter, while a MetaBinding is a binding of a MetaParameter to a MetaValue. Unlike a MetaParameter, the "val" relationship of a MetaBinding is not optional, because it carries the binding. Furthermore, the MetaValue to which it is bound must be concrete; either a concrete Type, or one of the three ConcreteMetaValues. (Neither of these constraints is depicted graphically.)

TypeNameMetaParameter

The type branch of the MetaParameter hierarchy [14.1p3] is represented by the TypeNameMetaParameter. It is a descendant of NamedAtomicType, and as such can be looked up as a type, and used to build constructed types.

TypeMetaParameter

A TypeMetaParameter, such as the ubiquitous "<class T>", is a parameter that can be bound to concrete Type. (Despite the idiomatic use of the class keyword to introduce such parameters, they need not be bound to ClassTypes.)

TemplateMetaParameter

A TemplateMetaParameter is syntactically introduced with something like

  template <template <class S> class T> /*...*/
and known in the Standard as a "template template parameter". It can be bound to a ClassTemplate, enabling one template to use multiple specializations of another (class) template.

Note that it is not possible to pass a template function or a template data member as the argument to a TemplateMetaParameter, though the effect could be simulated by creating an appropriate wrapper ClassTemplate.

NontypeMetaParameter

A NontypeMetaParameter is a descendant of Variable, and can be used like an object variable.

IntegerMetaParameter

An IntegerMetaParameter may be bound to an integer value [14.3.2p1]. It has an AtomicType, but only integral types [3.9p7] and Enumerations are permitted [14.1p4]. This data model does not record the cv-qualification on the type of an IntegerMetaParameter because they are irrelevant [14.1p5].

NameMetaParameter

A NameMetaParameter may be bound to a Function, GlobalVariable or DataMember [14.3.2p1]. It has an IndirectionType, which must be a PointerType, PointerToMemberType or ReferenceType [14.1p4].

MetaValue

A MetaValue is something that can fill a hole of a MetaParameter. Syntactically, they are denoted with template arguments [14.3]. They are decomposed along the meta-entity-kind axis, just like MetaParameters, though the diagram omits the names TypeNameMetaValue and NontypeMetaValue for reasons of space.

TypeMetaValue

A TypeMetaValue is a Type. It may be concrete or abstract, depending on the Type.

TemplateMetaValue

A TemplateMetaValue is either a ConcreteTemplateMetaValue, which is a specific ClassTemplate, or an AbstractTemplateMetaValue, which is a reference to a TemplateMetaParameter. AbstractTemplateMetaValues only occur as the default values of TemplateMetaParameters.

IntegerMetaValue

An IntegerMetaValue is either a ConcreteIntegerMetaValue, which is an integer, or an AbstractIntegerMetaValue, which is an Expression that refers to MetaParameters.

NameMetaValue

A NameMetaValue is either a ConcreteNameMetaValue, which is a specific concrete Variable, or is an AbstractNameMetaValue, which is a reference to a NameMetaParameter. AbstractNameMetaValues only occur as the default values of NameMetaParameters.

The name of a template function may be passed as a syntactic template argument, but overload resolution will select a specific specialization for the binding [14.3.2p5b4,5].

Template

A Template is an entity with named holes (parameters) that can be filled by supplying values. Templates are distinct from (say) ordinary parameterized functions in that template parameters are are filled at compile time (only).

A Template is associated with a set of Specializations, instances of that template.

Note that members of ClassTemplates are not necessarily called Templates in this data model, even if they have holes due to the parameters of the class. Only member templates (where the "template </*...*/>" syntax is repeated inside the class body) are both ClassMembers and Templates.

Note further, however, that currently (2005-08-19) Elsa takes the opposite approach, regarding every member of a class template as being a template in its own right. I now regard this design decision as a mistake, and hope to eventually rewrite that part of Elsa using the data model shown here.

MetaParameterList

A Template is/has a MetaParameterList, a sequence of parameters.

Specialization

A Specialization is a Template with the holes filled with MetaValue arguments. The arguments are stored in a sequence that corresponds to the parameter sequence. Except in the case of a ClassTemplatePartialSpecialization, a Specialization's arguments are always concrete.

A Specialization may be either implicit or explicit. An implicit specialization (also called an "instantiation") is generated automatically by the compiler by filling the holes of the Template definition with the supplied arguments. An explicit specialization is supplied by the programmer, and effectively overrides implicit instantiation for a specific argument sequence.

Note that the "explicit" attribute is a mutable boolean field. This is because it is possible for the program to refer to a specialization prior to declaring it as an explicit specialization, as long as the compiler has no need to implicitly instantiate it in the meantime [14.7.3p6]. Thus, the static semantics and this data model allow for a specialization to be initially assumed implicit, but later determined to be explicit.

ClassTemplate

A ClassTemplate is a Template of a ClassType. It also is a ClassType: it can be looked up as a type (in which case template arguments would then have to be supplied), has members, etc.

ClassTemplateSpecialization

A ClassTemplateSpecialization is a Specialization of a ClassTemplate.

ClassTemplatePartialSpecialization

A ClassTemplatePartialSpecialization [14.5.4] is a Specialization of a ClassTemplate, whose template arguments ("args[]") are not all concrete. The holes in its arguments are declared in its MetaParameterList; i.e., it is itself a ClassTemplate.

FunctionTemplate

A FunctionTemplate is a Template of a Function.

FunctionTemplateSpecialization

A FunctionTemplateSpecialization is a Specialization of a FunctionTemplate.

Note that, unlike for ClassTemplates, a Function cannot be both a FunctionTemplate and a FunctionTemplateSpecialization (there is no such thing as a function template partial specialization).

Abstract NamedAtomicTypes

Abstract types are supported by several extensions of NamedAtomicType.

TypeNameMetaParameter (as a NamedAtomicType)

Discussed above, a TypeNameMetaParameter is the leaf of any abstract type. Note that a TypeNameMetaParameter need not be abstract (it could be a TypeNameMetaBinding).

AbstractTemplateId

An AbstractTemplateId is a template-id (template name plus arguments) where either the template or the arguments are abstract. The template is indicated as a NamedAtomicType, but it must always be either a ClassTemplate or a TemplateMetaParameter. For example:

  template <class T>
  struct A;
  
  template <class T>
  struct B {
    A<T> a;
  };
The B::a member has type A<T>, but (in the context of the template B) T is abstract, so A<T> is too.

Here is an example with an abstract template portion of an AbstractTemplateId:

  template <template <class S> class T, class U>
  struct A {
    T<U> t;
  };
Here, A::t has type T<U>, in which both T and U are abstract.

In Elsa, this is called a "PseudoInstantiation".

AbstractQualifiedType

An AbstractQualifiedType arises when a qualified name [3.4.3, 5.1p7] is used, but the qualifier is an abstract type name. The "first" part (the abstract qualifier) is indicated as a NamedAtomicType, but must be a TypeMetaParameter or an AbstractTemplateId. The "rest" part is a PossiblyQualifiedName, which is a sequence of identifiers and template-ids (to be shown on the as-yet-nonexistent Diagram 3).

Example:

  template <class T>
  struct A {
    typename T::SomeType s;
  };
The type of A::s is T::SomeType, which is abstract because T is.

In Elsa, this is called a "DependentQualifiedType" or "DQT".

AbstractSizedArrayType

An AbstractSizedArrayType is an ArrayType whose size is specified by an abstract Expression. This is a necessary concept to permit matching declarations with definitions in some cases, for example (variant of ../in/t0435.cc):

  template <int n>
  struct A {
    int foo(int (*p)[n]);
    int foo(int (*p)[n+1]);
  };

  template <int n>
  int A<n>::foo(int (*p)[n]) { /*...*/ }

  template <int n>
  int A<n>::foo(int (*p)[n+1]) { /*...*/ }
Here, the array size expressions must be recorded in the data model to enable the declarations of A::foo to be matched with their definitions.

MetaEntity Scopes

Two extensions of Scope are used for template support.

MetaParameterScope

MetaParameters are stored in a MetaParameterScope. When this scope is searched by lookup, MetaParameters can be found.

MetaBindingScope

When an implicit specialization is instantiated, the static semantics require analyzing the specialization in light of its arguments. The arguments are bound in a MetaBindingScope for this purpose.

(At the moment I cannot justify distinguishing a MetaParameterScope from a MetaBindingScope, but I know that Elsa needs to be able to tell them apart somewhere, so I'm reasonably confident that they do in fact need to be distinguished.)

AbstractEnumerator

An AbstractEnumerator is like an Enumerator, but the precise value is unknown because its initializing expression is abstract. It is a refinement of the Enumerator entity presented in Diagram 1.

It is not clear whether it is adequate to simply record that its value is abstract, as done here, or if it must actually carry the expression. The testcase ../in/t0578.cc would be valid if AbstractEnumerators carried their value, and invalid otherwise. Both ICC and GCC reject it, so at least the data model here is consistent their their interpretation. If the Standard had a clear data model, such as the one presented in these diagrams, such a question would have a clear answer.

OutOfLineTemplateMemberDefinition (abbreviated: OOLTMDefn)

A ClassMember of a ClassTemplate may have an out-of-line definition. If so, it has an OutOfLineTemplateMemberDefinition, which carries a sequence of MetaParameterLists. These parameter lists override the nominal parameter names coming from the enclosing template class(es). For example, in the following syntax:

  template <class S1>
  struct A {
    template <class T2>
    struct B {
      int f();
    };
  };

  template <class S2>
  template <class T2>
  int A<S2>::B<T2>::f() { /*...*/ }
the ClassMember for A::B::f will have enclosing parameter lists <S2> and <T2>. The definition of A::B::f may refer to the parameters in its enclosing parameter lists (S2 and T2), but not to the parameters in the enclosing TemplateClasses (S and T).

The order of the enclosingParams[] sequence is significant. The first parameter list corresponds to the outermost enclosing template class, then proceeding towards the next innermost in sequence, until the last parameter list corresponds to the innermost enclosing template class. In other words, the enclosingParams[] sequence order is the same as the order in which they appear syntactically [14.7.3p17(?)].

Diagram 3: C++ Abstract Syntax

This is work in progress, so no pretty pictures yet.

  TranslationUnit: 
    decls: Declaration[]
    
  Declaration
    - BlockDeclaration
    | - SimpleDeclaration
    | |   declSpecifier
    | |   initDecl[]
    | - AsmDefinition
    | |   string
    | - NamespaceAliasDefinition
    | |   src: Identifier
    | |   tgt: PQName
    | - UsingDeclaration
    | |   t?
    | |   PQName
    | - UsingDirective
    |     PQName
    - FunctionDefinition
    |   declSpecifier
    |   declarator
    |   ctorInit[]
    |   handlers[]
    - TemplateDeclaration
    |   e?
    |   params[+]
    |   d: Declaration
    - ExplicitInstantiation
    |   d: Declaration
    - ExplicitSpecialization
    |   d: Declaration
    - LinkageSpecification
    | | string
    | - LSSeq
    | |   decls[]
    | - LSOne
    |     decl: Declaration
    - NamespaceDefinition
        n: Id?
        decls[]
        
  InitDeclarator
    declarator
    init?
    
  Initializer
    - IN_expr
    |   expr
    - IN_compound
    |   initializer[]
    - IN_ctor
        args: Expression[+]
        
  DeclSpecifier(Seq)
    declflag*
    typeSpec
    
  DeclFlag
    - StorageClass
      - auto
      - register
      - static
      - extern
      - mutable
    - FunctionSpecifier
      - inline
      - virtual
      - explicit
    - typedef
    - friend
    - CVFlag
      - const
      - volatile
      
  TypeSpecifier
    - TS_name
    |   t?
    |   PQName
    - TS_classSpec
    |   keyword
    |   PQName?
    |   BaseClause[]
    |   Member[]
    - TS_enumSpec
    |   Id?
    |   EnumeratorDefn[]
    - TS_elaborated
    |   keyword
    |   PQName
    - TS_fundamental
        FundamentalType
        
  Declarator           <----+
    - D_name                |
    |   PQName?             |
    - D_indirect            |
      | base: Declarator ---+
      - D_pointer
      |   cv
      - D_reference
      - D_ptm
      |   cv
      |   class: PQName
      - D_func
      |   cv
      |   params: ParameterDeclaration[]
      |   exnSpec?
      - D_array
          size: Expression

  PQName     <----+
    - QName       |
    | | rest -----+
    | - QN_global
    | - QN_nested
    |     qual -------+
    - PQ_nameId   <---+
    | | n: Id
    | - PQ_name
    | - PQ_template
    |     args: TemplateArg[]
    - PQ_operator
    |   op
    - PQ_conversion
    |   t: Type
    - PQ_dtor
        c: PQ_nameId

Valid HTML 4.01!

Up  
 
 
 
 
 
 
 
 
 
 
 

This space left intentionally blank.