RascalTutor
Synopsis

Overview of important terms and concepts.

Description

Rascalopedia gives a quick overview of the most important terms and concepts that are relevant for metaprogrammers in general and metaprogrammers using Rascal in particular.

Rascalopedia is work in progress. Please send us your suggestions for new concepts.

These are the currently covered topics:

  • AbstractDataType: A definition of a data type.

  • Abstract Syntax Tree: Representation of the abstract syntactic structure of a program.

  • Compiler: Tranform source code to an executable form.

  • Domain Specific Language: Programming language targeted for a particular application domain.

  • Dynamic Semantics: Description of the execution behaviour of a program.

  • Grammar: A synonym for Syntax.

  • Interpreter: Directly execute the statements of a program.

  • Language: The set of strings defined by a Grammar.

  • Language Definition: Description of all aspects of a language.

  • List: An ordered sequence of values.

  • MetaProgramming: Analysis or transformation of one program by another program.

  • ParseTree: Detailed represention of the concrete syntactic structure of a program.

  • Parser: Check that a text adheres to the rules of a language (and return a ParseTree).

  • Prettyprinter: Transform an Abstract Syntax Tree into a formatted string.

  • Refactoring: Restructuring source code to improve its internal structure without changing its external behaviour.

  • Relation: An unordered set of tuples.

  • Scope: The visibility and accessibility of names in a program.

  • Set: An unordered collection of values without duplicates.

  • Software Engineering: Discpline of design, building and maintaining software.

  • Software Evolution: Understanding and managing the continuous change of software.

  • Software Metric: A metric to measure a source code property.

  • Static Semantics: Description of the properties of a program that can be determined/checked before it is executed.

  • Syntax: The rules that describe correctly structured programs in a language.

  • Testing: Determine that the quality and functionality of software.

  • Tuple: An ordered, fixed length, sequence of values of possibly different type.

  • Typechecker: Checks the type rules for a source language.

  • Visualization: Visual presentation of scientific or abstract data.

1. AbstractDataType

Synopsis

A definition of a data type.

Description

An Abstract Data Type is a mathematical description of a structure that can be implemented in various ways. For instance, a stack data type can be characterized by empty (the empty stack), two functions push and pop and axioms that define them. At the implementation level, a stack can be implemented using a list, array or something else.

In functional languages, and also in Rascal, abstract datatypes (or ADTs for short) are used to define new data types. Well-known examples are stack and tree.

1.1. Abstract Data Types in Daily Life

Examples

1.2. Abstract Data Types in computer science

  • The run-time stack of a programming language interpreter.

  • A search tree.

  • An ontology.

1.3. Abstract Data Types in Rascal

  • A tree data type:

data MyTree = leaf(int n) | tree(str name, MyTree left, MyTree right);

2. Abstract Syntax Tree

Synopsis

Representation of the abstract syntactic structure of a program.

Description

A ParseTree is a detailed and very precise represention of the concrete syntactic structure of a program. It may even be so detailed that it contains every space, comment and parenthesis in the original source text. In many cases a less detailed representation is sufficient and an abstract syntax tree (or AST for short) is used.

Examples

For the input sentence

example-text

the parse tree (left) and abstract syntax tree (right) may look as follows:

parse-ast

Note that the parse tree on the left did not preserve the spaces in the original text but there are parse tree formats (including the one used by Rascal) that preserve all textual information.

3. Compiler

Synopsis

Tranform source code to an executable form.

Description

A compiler transforms the source code of a program (in a source langue) to an executable form (in a target language) and consists of the following phases:

  • Parser: read the source code and build an Abstract Syntax Tree.

  • Typechecker: perform a semantic analysis of the code, resolve all names and verify that the program is type correct.

  • Optimisation: perform optimisations (e.g., constant folding, dead code elimination, call unfolding). This can be seen as a form of Refactoring.

  • Code generation: generate the final code, this can be asembly language or directly executable code.

4. Domain Specific Language

Synopsis

Programming language targeted for a particular application domain.

Examples
  • SQL (querying data bases).

  • HTML (web pages).

  • CSS (presentation of HTML pages).

  • Excel (spreadsheets).

  • BibTex (bibliography).

  • BNF (grammars).

  • Regular expressions (text matching).

5. Dynamic Semantics

Synopsis

Description of the execution behaviour of a program.

Description

Dynamic semantics describes the execution behaviour of a program and includes:

  • Treatment of declarations, names, variables and Scopes.

  • Execution of procedures, statements and expressions.

Contrast with Static Semantics that describes pre-execution behaviour.

6. Grammar

Synopsis

A synonym for Syntax.

7. Interpreter

Synopsis

Directly execute the statements of a program.

Description

There are two methods to execute a program that is written in some source language:

  • An Interpreter directly executes the source statements (but see the variations below).

  • A Compiler translates the source program to some efficient executable form. That executable form is then executed by a hardware processor.

Interpreters exist in many flavours:

  1. Direct execution of the source.

  2. First parse the source text and build an Abstract Syntax Tree that is then interpreted.

  3. As (2), but convert the AST first to an intermediate form that is more suitable for execution. Then interpret that intermediate form.

  4. As (2), but compile frequently executed parts of the the AST to executable code.

Clearly, going down this list, the interpreter more and more starts resembling a compiler.

The advantages of interpreters versus compiler are:

  • Interpreter:

    • Pro: simpler than compiler, faster development loop, better debugging facilities, better error messages.

    • Con: slower.

  • Compiler:

    • Pro: fast execution.

    • Con: complex, optimizations are error-prone.

8. Language

Synopsis

The set of strings defined by a Grammar.

Description

A Grammar or Syntax defines the formation rules for a language. A language is the (possible infinite) set of strings that are defined by a grammar.

Examples
  • The language of strings of at most 5 a 's: the finite set {"a", "aa", "aaa", "aaaa", "aaaaa"}.

  • The language of strings that correspond to even numbers: the infinite set {"0", "2", "4", "6", …​}

  • The Java language: the infinite set of syntactically correct Java programs.

9. Language Definition

Synopsis

Description of all aspects of a language.

Description

A language definition defines all relevant aspects of a programming language or Domain Specific Language and includes:

  • A Grammar (including lexical and contect-free syntax).

  • Rules to describe the textual formatting of a language. These rules are sufficient to generate a Prettyprinter for it.

  • Rules that describe the Static Semantics of a language. These rules are sufficient to generate a Typechecker.

  • Rules that describe the Dynamic Semantics of a language. These rules are sufficient to generate an Interpreter for it.

  • Rules how to generate code.

Other aspects of a language definition may include editor behaviour, highlighting, debugging, outlining, auto-completion and more.

10. List

Synopsis

An ordered sequence of values.

Description

A list is a sequence of values with the following properties:

  • The list maybe empty.

  • The values in the list are ordered.

  • The same value may occur more than once.

  • The list has a size that is equal to the number of values in the list.

  • Each element in a list L has an index. The first element has index 0. The last element has index size(L)-1.

Formally, a list can be defined as follows. Given the domains ELEM (elements) and LIST (lists) and the functions:

nil :             -> LIST
cons: ELEM x LIST -> LIST
head: LIST        -> ELEM
tail: LIST        -> LIST

nil and cons are so-called constructor functions that define the values in LIST. They can be paraphrased as:

  • The empty list nil is an element of LIST.

  • If e is an element of ELEM and l is an element of LIST, then cons(e, l) is also an element in LIST.

head (take the first element) and tail (take the remainder of a list) are defined functions characterized by the axioms:

head(cons(e, l)) = e
tail(cons(e, l)) = l

The cases head(nil) and tail(nil) are left undefined (and usually correspond to a runtime error in a programming language).

In Rascal, lists are surrounded by brackets [ and ] and the elements are separated by commas. Each list has a type of the form list[T], where T is the smallest common type of all list elements. Read the description of lists and their operators and of library functions on lists.

10.1. Lists in Daily Life

Examples
  • A line of people waiting for the super market checkout or bus stop. bustop credit

  • The wagons of a train.

  • The Top 100 Music Charts. hot100.png credit

  • Twitter users ordered according to number of followers.

  • A to do list.

10.2. Lists in computer science

  • The locations in a computer memory.

  • The list of processes that use most cpu time.

  • The list of procedures that are called by a given procedure.

10.3. Lists in Rascal

  • The empty list: []. Its type is list[void].

  • A list of integers: [3, 1, 4]. Its type is list[int].

  • A list of mixed-type values: [3, "a", 4]. Its type is list[value].

11. MetaProgramming

Synopsis

Analysis or transformation of one program by another program.

Description

All programs have data as input and produce other data as output. A typical example is a desktop calculator program: after entering some numbers and operators, the program displays an answer. For most programs the data are numeric (calculator, spreadsheet) or textual (text editor, word processor).

A metaprogram is a program that uses programs as data. Writing metaprograms is called metaprogramming.

A metaprogram has to be written in some programming language itself. This is called the metalanguage.

The program that is manipulated by the metaprogram is called the source program (also: object program) and is written in the source language (also: object language).

In some cases the metaprogram transforms the source program into a target program in a target language.

Examples

A Refactoring tool for restructuring Java code:

  • Metaprogram: the refactoring tool.

  • Metalanguage: in most cases Java.

  • Source program: the user’s Java program to be refactored.

  • Source language: Java.

  • Target program: the refactored user’s program.

  • Target language: Java.

A Java Compiler:

  • Metaprogram: the Java compiler.

  • Metalanguage: in most cases Java.

  • Source program: the user’s Java program to be compiled.

  • Source language: Java.

  • Target program: the code that is generated by the compiler.

  • Target language: instructions for the JVM (Java Virtual Machine) or machine code, depending on the hardware platform.

A tool to compute Software Metrics of Java programs

  • Metaprogram: the metrics tool.

  • Metalanguage: varies per tool: Java, Rascal.

  • Source program: the user’s Java program for metrics will be computed.

  • Source language Java.

  • Target program: the value of the computed metric.

  • Target language: number.

12. ParseTree

Synopsis

Detailed represention of the concrete syntactic structure of a program.

Description

A parse tree is a detailed and very precise represention of the concrete syntactic structure of a program. It may even be so detailed that it contains every space, comment and parenthesis in the original source text.

Examples

A parse tree for the sentence example

parse-tree

13. Parser

Synopsis

Check that a text adheres to the rules of a language (and return a ParseTree).

Description

A parser checks that a text in language L indeed adheres to the syntax rules of language L. There are two possible answers:

  • Yes. A ParseTree is returned that shows how the text adheres to the syntax rules.

  • No. Error messages pin point the location where the text deviates from the syntax rules.

This is shown below:

parser

14. Prettyprinter

Synopsis

Transform an Abstract Syntax Tree into a formatted string.

Description

A pretty printer formats the source code of programs. Alternative names are formatter or beautifier. Pretty printers differ in the inputs they accept:

  • The source text itself.

  • A ParseTree that corresponds to the source text. This variant is also called unparser.

  • An Abstract Syntax Tree that corresponds to the source text.

Pretty printers also differ in flexibility. They differ in:

  • The source language(s) they can accept.

  • The adaptability of the formatting rules.

Examples

The program fragment

if(x > 10) { System.err.println("x > 10"); } else { System.err.println("x <= 10"); }

can be pretty printed in many different ways. Here are two variants examples:

if(x > 10) {
   System.err.println("x > 10");
} else {
   System.err.println("x <= 10");
}
if( x > 10 )
{
  System.err.println("x > 10");
} else
{
   System.err.println("x <= 10");
}

15. Refactoring

Synopsis

Restructuring source code to improve its internal structure without changing its external behaviour.

Description

Refactoring was popularized by Martin Fowler and aims at improving source code quality. The basic philosophy is to identify small, atomic, refactoring steps that improve the internal structure of the code but do not change its external behaviour. The supposed simplicity of these steps must guarantee their correctness.

Atomic steps can be combined to create large and complex refactorings. The major Interactive Development Environements — Eclipse, IntelliJ, Visual Studio — provide interactive support for refactoring.

Examples

Some well-known refactorings are:

16. Relation

Synopsis

An unordered set of tuples.

Description

In mathematics, given sets D1, D2, …​ Dn, a n-ary relation R is characterized by RD1 × D2 × …​ × Dn. In other words, R consists of a set of tuples < V1, …​, Vn > where each Vi is an element of the set Di. When n = 2, we call the relation a binary relation.

In database theory, a relation is a table with a heading and an unordered set of tuples:

D1 Name1 D2 Name2 …​ Dn Namen

V11

V12

…​

V1n

V21

V22

…​

V2n

V31

V32

…​

V3n

…​

…​

…​

In Rascal, a relation is a set of tuples and is characterized by the type: rel[D1 Name1, D2 Name2, …​, Dn Namen] See Relation Values and for a description of relations and their operators (since relations are sets all set operators also apply to them, see Set Values) and functions on relations (and here again, since relations are sets all set operators also apply to them, see functions on sets).

16.1. Relations in Daily Life

Examples
  • The parent-of or friend-of relation between people. image::/Rascalopedia/Relation/char-relation.jpg[width="350px" ,alt="char-relation"] credit

  • A character relation map, describing the relations between the characters in a play or soap series.

  • A listing of the top 2000 songs of all times including the position, artist name, song title, the year the song was published. top2000-2010 credit

16.2. Relations in computer science

  • A relational data base.

  • Login information including user name, password, home directory, etc.

16.3. Relations in Rascal

  • A parent child relation:

rel[str parent, str child] = {
<"Paul", "Eva">,
<"Paul", "Thomas">,
<"Jurgen", "Simon">,
<"Jurgen", "David">,
<"Tijs", "Mats">
};
  • A fragment of the top 2000 relation:

rel[int position, str artist, str title, int year] Top2000 = {
<1, "Eagles", "Hotel California",1977>,
<2, "Queen", "Bohemian rhapsody", 1975>,
<3, "Boudewijn de Groot", "Avond", 1997>,
...
};

17. Scope

Synopsis

The visibility and accessibility of names in a program.

18. Set

Synopsis

An unordered collection of values without duplicates.

Description

A set is a collection of values with the following properties:

  • The set maybe empty.

  • The values in the list are unordered.

  • A value can only occur once.

  • The set has a size that is equal to the number of values in the set.

In Rascal, sets are surrounded by braces { and } and the elements are separated by commas. Each set has a type of the form set[T], where T is the smallest common type of all set elements. Read the description of sets and their operators and of library functions on sets.

18.1. Sets in Daily Life

Examples
  • A cutlery set consisting of knife, fork and the like. cutlery-set credit

  • A crowd of people.

  • A stamp collection (but be aware that the duplicates will disappear!) stamp-collecting credit

18.2. Sets in computer science

  • The files in a directory. Of course, when you order them (by name, modification date) you need a List to represent them.

  • The set of moves an opponent can play in a game.

  • The set of nodes in a network.

18.3. Sets in Rascal

  • The empty set: {}. Its type is set[void].

  • A set of integers: {3, 1, 4}. Its type is set[int].

  • A set of mixed-type values: {3, "a", 4}. Its type is set[value].

19. Software Engineering

Synopsis

Discpline of design, building and maintaining software.

Description

Software engineering is the discipline that encompasses all aspects of creating software and encompasses:

  • Requirements engineering: determine what the future owners and users of a software system expect.

  • Software Design: design the global architecture as well as as the technical details.

  • Software Construction: build software according to its specification.

  • Software Testing: test that software works according to its specifications.

  • Software Deployment: distribute software to its users.

  • Software Maintenance: maintain software after it has been deployed.

There are various models to organize the above activities. The classical waterfall model organizes them sequentially. Variations are more iterative and allow to go back to earlier phases. Waterfall-based methods follow solid engineering practices but may lead to much bureacracy and an inflexible process that cannot easily cope with changing requirements.

Other approaches promote agile development and are characterized by very short iterations that include all the above activities. Agile methods aim to produce prototypes as early as possible and this makes it easier for future users to assess the prototype and suggest changes.

20. Software Evolution

Synopsis

Understanding and managing the continuous change of software.

Description

Meir M. Lehman was one of the first scientist to observe that software evolves over its lifetime. He formulated several laws about software evolution. Here are three examples of his laws (slightly paraphrased):

  • Continuing Change: Programs must be continually adapted or they become progressively less usefull.

  • Increasing Complexity: When a program evolves, its complexity increases unless work is done to maintain or reduce it.

  • Continuing Growth: The functional content of programs must be continually increased to maintain user satisfaction over their lifetime.

Software evolution is a specialisation in Software Engineering that address the following:

  • Understanding the reasons for software evolution.

  • Understanding the impact of software evolution on the structure and quality of source code.

  • Developing Software Metrics and tools to measure the impact of software evolution.

  • Developing methods and tools for the better understanding of source code.

  • Developing Refactoring tools to counter the effects of software evolution.

21. Software Metric

Synopsis

A metric to measure a source code property.

Description

A software metric is a quantitative measure about source code. A combination of one or more metrics can be used to quantitatively characterize aspects of software quality. Various quality aspects are of interest such as size, reliability, maintainability and so on.

Examples

Examples of software metrics are:

  • Source lines of code (SLOC) measures the size of software. The larger the size, the more is needed to build and maintain it.

  • Cyclomatic complexity measures logical complexity of code. Software components with a high cyclomatic complexity are hard to understand and maintain.

  • Coupling measures the coupling between software components. High coupling indicates problems in the structure of a system.

22. Static Semantics

Synopsis

Description of the properties of a program that can be determined/checked before it is executed.

Description

The static semantics of a program describe all properties that can be determined before the program is executed. A Typechecker is a tool that checks the properties of a program as described by its static semantics.

Static semantics describes properties that are relevant before a program is executed and differs from Dynamic Semantics that describes the execution behaviour itself.

Examples

Examples of static semantic properties include:

  • The proper use of types.

  • The proper use of names.

Language with substantial static semantics: Java, Haskell, Rascal. Languages with only dynamic semantics: Python, Ruby.

23. Syntax

Synopsis

The rules that describe correctly structured programs in a language.

Description

According to the Merriam-Webster dictionary syntax means

  • the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses);

  • the part of grammar dealing with this.

Dictionary.com is more elaborate and defines syntax as:

  • Linguistics:

    • a. the study of the rules for the formation of grammatical sentences in a language.

    • b. the study of the patterns of formation of sentences and phrases from words.

    • c. the rules or patterns so studied: English syntax.

    • d. a presentation of these: a syntax of English.

    • e. an instance of these: the syntax of a sentence.

  • Computers: the grammatical rules and structural patterns governing the ordered use of appropriate words and symbols for issuing commands, writing code, etc., in a particular software application or programming language.

Wikipedia says: the syntax of a programming language is the set of rules that define the combinations of symbols that are considered to be correctly structured programs in that language.

In linguistics, a Grammar is a concept that includes syntax. However, in the cases that are relevant for meta-programming they can be used interchangeably. We will use them as synonyms.

In programming languages a further subdivision can be made:

  • Lexical syntax defines the form of the lowest level textual items such as keywords, numeric constants, and string constants.

  • Context-free syntax defines the global structure of statements, procedures and modules.

A Parser checks that a text in language L indeed adheres to the syntax rules of language L. There are two possible answers:

  • Yes. A ParseTree is returned that shows how the text adheres to the syntax rules.

  • No. Error messages pin point the location where the text deviates from the syntax rules.

24. Testing

Synopsis

Determine that the quality and functionality of software.

Description

Software testing is the process to determine that a software system meets its specifications and works as expected. This is done by manually or automatically executing test cases and observe the result.

25. Tuple

Synopsis

An ordered, fixed length, sequence of values of possibly different type.

Description

A tuple is an ordered fixed length sequence of values of possibly different type.

In Rascal a tuple is written as < V1, …​, Vn > and a tuple type has the form type[T1, …​, Tn], Ti represents the type of element i. Tuple have two major applications:

  • As tuples in a Relation.

  • For ad-hoc packaging of values, for instance, to return multiple-values from a function.

See Tuple Values for the operations on tuples.

26. Typechecker

Synopsis

Checks the type rules for a source language.

Description

A type system is a set of rules that defines how values, variables and functions may be used in a given programming languages.

A type checker, checks that these rules are enforced. The moment that type checking can be done differs per type system, but two extremes exist:

  • Static type checking: all checking is done before the program is executed.

  • Dynamic type checking: all checking is done during execution of the program.

  • Hybrid type checking: when possible checks are done before execution, the remaining checks are done during execution.

These different styles of type checking have different trade offs:

  • Static typechecking:

    • Pro: most errors are found before execution.

    • Con: more type declarations have to be written by the programmer and in some situations the type systems limits what can be expressed.

  • Dynamic checking:

    • Pro: most flexible and expressive.

    • Con: errors can only be found during execution.

  • Hybrid (or gradual) type checking:

    • Pro: a reasonable compromise.

    • Con not be as safe as full static typechecking.

Examples
  • If in Java a variable has been declared as bool it cannot be added to an integer.

  • If in Java a method has three formal parameters, it cannot be called with four actual parameters.

  • In Python, a variable can first get a string value assigned and later on an integer value.

27. Visualization

Synopsis

Visual presentation of scientific or abstract data.

Description

Visualization is the activity of presenting scientific data or abstract structures in a visual form. There are several subareas:

The Visualization Library library provides a framework for interactive visualization. Simple examples can be found in visualization recipes.

27.1. Scientific visualization

Examples

Robert vam Liere and Wim de Leeuw have visualized liquid flows. There is also an animated version.

flow

27.2. Information Visualization

Facebook’s Friend Wheel shows the connection between friends:

friends

27.3. Software Visualization

Stephen Eick visualizes the frequency of execution for each line in all source files of a software system.

frequency

Daniel Bierwirth shows the connections (colored lines) between system components (at outer circle).

bundle