FOSD origami

Feature-oriented programming or feature-oriented software development (FOSD) is a general paradigm for program synthesis in software product lines. The feature-oriented programming page is recommended, it explains how an FOSD model of a domain is a tuple of 0-ary functions (called values) and a set of 1-ary (unary) functions called features. This page discusses multidimensional generalizations of FOSD models, which are important for compact specifications of complex programs.

Origami

A fundamental generalization of metamodels is origami. The essential idea is that a program's design need not be represented by a single expression; multiple expressions can be used.[1][2][3] This involves the use of multiple orthogonal GenVoca models.

Example: Let T be a tool model, which has features P (parse), H (harvest), D (doclet), and J (translate to Java). P is a value and the rest are unary-functions. A tool T1 that parses a file written in a Java dialect language and translates it to pure Java is modeled by: T1 = J•P. And a javadoc-like tool T2 parses a file in a Java dialect, harvests comments, and translates harvested comments into an HTML page is: T2 = D•H•P. So tools T1 and T2 are among the products of the product line of T.
A language model L describes a family (product line) of Java dialects. It includes the features: B (Java 1.4), G (generics), S (State machines). B is a value, and the rest are unary functions. So a dialect of Java L1 that has generics (i.e., Java 1.5) is: L1 = G•B. And a dialect of Java L2 that has language support for state machines is: L2 = S•B. So dialects L1 and L2 are among the products of the product line of L.
To describe a javadoc like tool (E) for the dialect of Java with state machines requires two expressions: one that defines the tool functionality for E (using the T model) and its Java dialect (using the L model):
  E = D•H•P    -- tool equation
  E = S•B      -- language equation
Models L and T are orthogonal GenVoca models: one expresses the feature-based structure of the E tool, and the other the feature-based structure of its input language. Note that models T and L really are abstract in the following sense: the implementation of any feature of T really depends on the tool's dialect (expressed by L), and (symmetrically) the implementation of any feature of L really depends on the tool's functionality (expressed by T). So the only way one could implement E is by knowing both T and L equations.

Let U=[U1,U2,...,Un] be a GenVoca model of n features, and W=[W1,...Wm] be a GenVoca model of m features. The relationship between two orthogonal models U and W is a matrix UW, called an Origami matrix, where each row corresponds to a feature in U and each column corresponds to a feature in W. Entry UWij is a function that implements the combination of features Ui and Wj.

Note: UW is the tensor product of U and W (i.e., UW=U×W).
U W = U × W = [ U W 11 U W 12 U W 1 n U W m 1 U W m 2 U W m n ] {\displaystyle UW=U\times W={\begin{bmatrix}UW_{11}&UW_{12}&\cdots &UW_{1n}\\\vdots &\vdots &\ddots &\vdots \\UW_{m1}&UW_{m2}&\cdots &UW_{mn}\end{bmatrix}}}
Example. Recall models T=[P,H,D,J] and L=[B,G,S]. The Origami matrix TL is:
T L = T × L = [ P B P G P S H B H G H S D B D G D S J B J G J S ] {\displaystyle TL=T\times L={\begin{bmatrix}PB&PG&PS\\HB&HG&HS\\DB&DG&DS\\JB&JG&JS\end{bmatrix}}}
where PB is a value that implements a parser for Java, PG is a unary-function that extends a Java parser to parse generics, and PS is a unary-function that extends a Java parser to parse state machine specifications. HB is a unary-function that implements a harvester of comments on Java code. HG is a unary-function that implements a harvester of comments on generic code, and HS is a unary-function that implements a harvester of comments on state machine specifications, and so on.

To see how multiple equations are used to synthesize a program, again consider models U and W. A program F is described by two equations, one per model. We can write an equation for F in two different ways: referencing features by name or by their index position, such as:

F = U 1 U 2 U 4 = i = 1 , 2 , 4 U i {\displaystyle F=U_{1}\cdot U_{2}\cdot U_{4}=\sum _{i=1,2,4}U_{i}} — U expression of F
F = W 1 W 3 = j = 1 , 3 W i {\displaystyle F=W_{1}\cdot W_{3}=\sum _{j=1,3}W_{i}} — W expression of F

The UW model defines how models U and W are implemented. Synthesizing program F involves projecting UW of unneeded columns and rows, and aggregating (a.k.a. tensor contraction):

F = U W 11 U W 21 . . . U W 33 = i = 1 , 2 , 3 j = 1 , 3 U W i , j = j = 1 , 3 i = 1 , 2 , 3 U W i , j {\displaystyle F=UW_{11}\cdot UW_{21}\cdot ...\cdot UW_{33}=\sum _{i=1,2,3}\sum _{j=1,3}UW_{i,j}=\sum _{j=1,3}\sum _{i=1,2,3}UW_{i,j}}

A fundamental property of origami matrices, called orthogonality, is that the order in which dimensions are contracted does not matter. In the above equation, summing across the U dimension (index i) first or the W dimension (index j) first does not matter. Of course, orthogonality is a property that must be verified. Efficient (linear) algorithms have been developed to verify that origami matrices (or tensors/n-dimensional arrays) are orthogonal.[4] The significance of orthogonality is one of view consistency. Aggregating (contracting) along a particular dimension offers a 'view' of a program. Different views should be consistent: if one repairs the program's code in one view (or proves properties about a program in one view), the correctness of those repairs or properties should hold in all views.

In general, a product of a product line may be represented by n expressions, from n orthogonal and abstract GenVoca models G1 ... Gn. The Origami matrix (or cube or tensor) is an n-dimensional array A:

A = G 1 × . . . × G n = k = 1 n G k {\displaystyle A=G_{1}\times ...\times G_{n}=\prod _{k=1}^{n}G_{k}}

A product H of this product line is formed by eliminating unnecessary rows, columns, etc. from A, and aggregating (contracting) the n-cube into a scalar:

H = i 1 i 2 . . . i n G i 1 , i 2 . . . i n {\displaystyle H=\sum _{i_{1}}\sum _{i_{2}}...\sum _{i_{n}}G_{i_{1},i_{2}...i_{n}}}
Example. Recall program E and model T=[P,H,D,J]. E=D•H•P=T2•T1•T0. Similarly, E's representation in model L=[B,G,S] is E=S•B=L2•L0. Synthesizing E given Origami model TL is evaluating the following expression: E = i = 2 , 0 j = 2 , 0 T L i , j = j = 2 , 0 i = 2 , 0 T L i , j {\displaystyle E=\sum _{i=2,0}\sum _{j=2,0}TL_{i,j}=\sum _{j=2,0}\sum _{i=2,0}TL_{i,j}} .

Applications

There are several of product line applications developed using Origami. Among them include:

  • AHEAD Tool Suite and Extensible Java Preprocessors
  • Expression Problem or the Extensibility Problem
  • Refinements and Multi-Dimensional Separation of Concerns

More applications to be supplied.

See also

  • Feature-oriented programming
  • FOSD metamodels

References

  1. ^ "Generating Product-Lines of Product-Families" (PDF).
  2. ^ "Refinements and Multi-Dimensional Separation of Concerns" (PDF).
  3. ^ "Evaluating Support for Features in Advanced Modularization Technologies" (PDF).
  4. ^ "Design and Analysis of Multidimensional Program Structures" (PDF).