Rahul Goma Phulore
on March 16th, 2012.
Posted in Programming, Scala
, Tagged with dsl, excel-parser
This is the third and last post in our DSL series. In this post, we will define some operators for our DSL, and then implement a way to invoke arbitrary functions on expressions. You might want to skim through the last two posts before you proceed.
A: Defining the operators:
The operators we will be implementing are all binary, and have the following structure:
a op b
where a and b are both of type Expr[_]. Therefore the method op should go inside the Expr[_] trait, as shown below:
trait Expr[+A] { a =>
def eval(st: SymbolTable): A
def op[B, C](b: Expr[B]): Expr[C] = /* impl */
}
Now let us consider operator +. +, in our DSL, requires that the two expressions on its sides have type Real. (Real is just a type alias for BigDecimal.) Modifying the method signature to def +(b: Expr[Real]): Expr[Real] takes care of the right hand side. But how do we enforce that A be Real? This is where generalized type constraints come in. Using them, we can enforce that A be Real as shown below: (Please refer to this stackoverflow thread to understand how this works.)
def +(b: Expr[Real])(implicit ev: A <:< Real): Expr[Real]
Let’s now try implementing +.
trait Expr[+A] { a =>
def eval(st: SymbolTable): A
def +(b: Expr[Real])(implicit ev: A <:< Real): Expr[Real] = {
BinOp[Real, Real, Real](a, b, _ + _, "+")
}
}
Oops, that does not work. The compiler complains that a has type Expr[A], but what it requires is a value of type Expr[Real]. You might say, “but wait, didn’t we just provide the compiler with evidence that A <: Real? Since Expr[_] is a covariant type constructor, it should follow that Expr[A] <: Expr[Real].” Yes, you are right. However, the Scala type system does not perform this lifting automatically, and so we have to do it explicitly. Here we run into another problem: the class <:< provides no method for such lifting.
Scalaz to the rescue! Scalaz provides a class named Liskov (with type alias <~<) which, as its authors put it, is a better <:< than <:<. It provides a method named co with which we can perform such lifting. Here is a working implementation of + operator on Expr[_]:
trait Expr[+A] { a =>
def eval(st: SymbolTable): A
def +(b: Expr[Real])(implicit ev: A <~< Real): Expr[Real] = {
BinOp[Real, Real, Real](co[Expr, A, Real](ev) apply a, b, _ + _, "+")
}
}
The operators like -, *, / can be defined in the same manner.
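If pulling in Scalaz only for this lifting feels heavy, one possible alternative is to sidestep the lifting altogether: the evidence ev is itself a function from A to Real, so it can be applied to the evaluated value rather than to the expression. The following is a minimal sketch under that assumption, not the implementation used in this series. PlusSketch and the value field of Lit are names made up for the example, and BinOp is skipped for brevity.

```scala
object PlusSketch {
  type Real = BigDecimal
  type SymbolTable = Map[Symbol, Any]

  trait Expr[+A] { a =>
    def eval(st: SymbolTable): A

    // `ev` is a function A => Real, so we convert the evaluated value
    // instead of lifting Expr[A] to Expr[Real].
    def +(b: Expr[Real])(implicit ev: A <:< Real): Expr[Real] = new Expr[Real] {
      def eval(st: SymbolTable): Real = ev(a.eval(st)) + b.eval(st)
    }
  }

  // Simplified literal, only here so the sketch can be exercised.
  case class Lit[+A](value: A) extends Expr[A] {
    def eval(st: SymbolTable): A = value
  }
}
```

With this in place, Lit(BigDecimal(20)) + Lit(BigDecimal(11)) compiles, while calling + on a Lit[String] is rejected at compile time because no evidence String <:< Real can be found. Newer Scala versions (2.13 onwards, as far as I know) also offer a liftCo method on <:<, which may cover the same need as Scalaz's co.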
The implementation of comparison operators is pretty straightforward. We use scala.Ordering[_] type class to specify the constraint that the types in question must be comparable.
Code:
trait Expr[+A] { a =>
def =:=[B](b: Expr[B]): Expr[Boolean] = BinOp[Boolean, A, B](a, b, _ == _, "=:=")
def <[B >: A](b: Expr[B])(implicit ord: scala.Ordering[B]): Expr[Boolean] = {
BinOp[Boolean, B, B](a, b, ord.lt, "<")
}
}
<=, >, and >= can be defined in the same manner as <.
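To make "in the same manner" concrete, here is a small sketch of all four comparison operators sharing one helper. It is only an illustration: it builds anonymous Expr[Boolean] instances instead of going through BinOp, and the names CompareSketch, cmp, and the simplified Lit are assumptions made for this example.

```scala
object CompareSketch {
  type SymbolTable = Map[Symbol, Any]

  trait Expr[+A] { a =>
    def eval(st: SymbolTable): A

    // Hypothetical helper: evaluates both sides and applies the predicate.
    private def cmp[B >: A](b: Expr[B])(p: (B, B) => Boolean): Expr[Boolean] =
      new Expr[Boolean] {
        def eval(st: SymbolTable): Boolean = p(a.eval(st), b.eval(st))
      }

    def <[B >: A](b: Expr[B])(implicit ord: Ordering[B]): Expr[Boolean] = cmp[B](b)(ord.lt)
    def <=[B >: A](b: Expr[B])(implicit ord: Ordering[B]): Expr[Boolean] = cmp[B](b)(ord.lteq)
    def >[B >: A](b: Expr[B])(implicit ord: Ordering[B]): Expr[Boolean] = cmp[B](b)(ord.gt)
    def >=[B >: A](b: Expr[B])(implicit ord: Ordering[B]): Expr[Boolean] = cmp[B](b)(ord.gteq)
  }

  // Simplified literal, only here so the sketch can be exercised.
  case class Lit[+A](value: A) extends Expr[A] {
    def eval(st: SymbolTable): A = value
  }
}
```

Because the constraint is expressed through Ordering[B], comparing two expressions whose common type has no ordering in scope fails to compile.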
The last remaining operator is :=, used for assignment in computations. We require that the expression on the left hand side be an Atom[_]. So it makes sense to define := on Atom[_].
case class Atom[+A](sym: Symbol) extends Expr[A] {
def :=[B](b: Expr[B]) = Computation[B](sym, b)
def eval(st: SymbolTable): A = st(sym).asInstanceOf[A]
override def toString = sym.toString
}
We already saw in the last post what the definition of Computation[_] looks like. As I said before, evaluation of computations is beyond the scope of this post.
B: Arbitrary function invocation:
With the operators we implemented above, we can express most computations and verifications we will need to perform. However, it may happen that Expr[_] does not have an operator for a certain operation. In other words, you may need to invoke arbitrary functions, such as the cases we marked with /* arbitrary function invocation */ before. In such a case, it would make sense to be able to lift an ordinary Scala function into the context Expr[_].
For this, we will be using abstractions called Functor and Applicative Functor (often contracted to Applicative). (These are fairly advanced concepts, and the explanation that follows will be mostly opaque if these concepts are new to you. If that’s the case, I’d advise you to skim this section for now, not caring to understand every little detail, and revisit it after you have understood these concepts. Here is a nice article that may help you understand them.)
The general idea is this:
If you have a function with arity 1 with type A => B, then you can lift it to context F[_], and obtain a function F[A] => F[B], provided that F is a Functor.
If you have a function with some arity n with type (A, B, C, ..) => Z, then you can lift it to context F[_], and obtain a function (F[A], F[B], F[C], …) => F[Z], provided that F is an Applicative. From this it follows that, Applicative is a generalization of Functor, and every Applicative therefore is also a Functor.
So we need to make Expr an Applicative. That is done by defining an implicit of type Applicative[Expr]. Here is the full definition:
implicit object ExprApplicative extends Applicative[Expr] {
def pure[A](a: => A): Expr[A] = Lit(a)
override def fmap[A, B](e: Expr[A], f: A => B): Expr[B] = new Expr[B] {
def eval(st: SymbolTable): B = f apply e.eval(st)
}
override def apply[A, B](f: Expr[A => B], e: Expr[A]): Expr[B] = new Expr[B] {
def eval(st: SymbolTable): B = f.eval(st) apply e.eval(st)
}
}
In this code:
- The method pure takes a value and puts it in some sort of default (or pure) context – a minimal context that still yields that value (explanation taken from LYAH). In our case, that happens to be a literal i.e. Lit[_].
- In fmap, we return a new Expr[_] which yields a value that we obtain from applying a function to the value of the expression on which mapping is being performed.
- apply works similarly except that function is also in the Expr[_] context and thus needs to be evaluated first.
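For readers who want to see the lifting idea without the Scalaz machinery, the following sketch writes the arity-1 and arity-2 cases by hand. It is an illustration only: lift1, lift2, and LiftSketch are hypothetical names, and the Expr, Lit, and Atom definitions are trimmed copies of the ones from this series.

```scala
object LiftSketch {
  type SymbolTable = Map[Symbol, Any]

  trait Expr[+A] {
    def eval(st: SymbolTable): A
  }

  case class Lit[+A](value: A) extends Expr[A] {
    def eval(st: SymbolTable): A = value
  }

  case class Atom[+A](sym: Symbol) extends Expr[A] {
    def eval(st: SymbolTable): A = st(sym).asInstanceOf[A]
  }

  // Functor-style lifting: A => B becomes Expr[A] => Expr[B].
  def lift1[A, B](f: A => B): Expr[A] => Expr[B] =
    e => new Expr[B] {
      def eval(st: SymbolTable): B = f(e.eval(st))
    }

  // Applicative-style lifting for arity 2:
  // (A, B) => Z becomes (Expr[A], Expr[B]) => Expr[Z].
  def lift2[A, B, Z](f: (A, B) => Z): (Expr[A], Expr[B]) => Expr[Z] =
    (ea, eb) => new Expr[Z] {
      def eval(st: SymbolTable): Z = f(ea.eval(st), eb.eval(st))
    }
}
```

Nothing is evaluated when a function is lifted; the function only runs once eval is called with a symbol table, which is the same behaviour fmap and apply have in ExprApplicative.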
The arbitrary function invocations can now be performed as follows:
'dummy =:= 'code.as[String] ° h
'x := ('y.as[Int] ⊛ 'z.as[Int])((y, z) => f(y) + g(z))
° (ASCII equivalent: map) corresponds to fmap. ⊛ (ASCII equivalent: |@|) is a syntactic sugar from Scalaz for easing the use of applicative pattern in Scala. Regarding .as[Type], as I mentioned before, we need to provide type information at places where it’s not possible for the type inferer to infer it.
And that ends our implementation of the DSL.
A small test case that showcases the DSL in action:
val commands = Seq(
real_('a) := Real(20.0),
real_('b) := Real(11.0),
real_('c) := 'a * 'b,
int_('d) := 'a.as[Real] ° (_.toInt + 2),
string_('e) := ('a.as[Real] ⊛ 'b.as[Real])(_ + "lalala" + _)
)
val results = commands.foldLeft(Map.empty[Symbol, Any]) { case (r, c) =>
r + (c.sym -> c.expr.eval(r))
}
results must_== Map(
'a -> Real(20.0),
'b -> Real(11.0),
'c -> Real(220.0),
'd -> 22,
'e -> "20.0lalala11.0"
)
This is what the specification would finally look like:
val abcSpec = new TableSpec {
def columns = Seq(
"Code" / 'code - string,
"Dummy" / 'dummy - string,
"Customer" / 'customer - string withAttributes (optional),
"Value" / 'value - decimal storedAs string withFormat "###.##" withAttributes (totaled),
"Date" / 'date - date storedAs string withFormat "dd.MM.yyyy"
// more fields...
)
def verifications = Seq(
'dummy =:= 'code.as[String] ° h,
'a =:= 'b + 'c
// more verifications...
)
def computations = Seq(
'tax := 'total_price - ('discount + 'freight + 'price),
'x := ('y.as[Int] ⊛ 'z.as[Int])((y, z) => f(y) + g(z))
// more computations...
)
}
This DSL could be made more natural and type-safe using macros and/or Miles Sabin’s shapeless, both of which either did not exist or were in their nascent stages at the time the DSL was written.
End.
Rahul Goma Phulore
on January 24th, 2012.
Posted in Programming, Scala
, Tagged with dsl, excel-parser
In this post we will start building a DSL for reading and manipulating structured data from Excel sheets. We talked about it while closing the previous post.
The requirements we had can be briefly summed up as follows: (Note: The actual domain requirements were far more involved. We have stripped them down to necessary minimum for pedagogical purposes.)
The sheet contains structured data in a tabular format. Each column corresponds to a different field. The fields come in many different data types and formats. The app is supposed to read the Excel file, extract fields, run certain verifications, perform row-wise computations, merge columns/rows based on some rules and heuristics, and then finally return its JSON representation. The formats are widely varying, and these also include the verifications, computations, and mergings to be performed.
It wouldn’t be feasible to write a new parser and processor for each new format. We needed to abstract over different formats, and so we wrote an embedded language to specify the document formats. The processing part of the app would accept this format specification as a parameter. As we will see in a while, the requirements were such that the column fields couldn’t have been dealt with as first class Scala values. Therefore we went down the meta-programming route.
This post will cover the part of the DSL that deals with building a format specification, constructing expressions (for verifications and computations), and evaluating them. How the verifications and computations are employed in processing is outside the scope of this post.
We begin by making a rough sketch of what the specification should look like: (Note: The following is not valid Scala code.)
val abcSpec = TableSpec {
columns = {
"Code" / code - string
"Dummy" / dummy - string
"Customer" / customer - string with attributes [optional]
"Value" / value - decimal stored as string with format "###.##" with attributes [totaled]
"Date" / date - date stored as string with format "dd.MM.yyyy"
// more fields...
}
verifications = {
dummy == h(code)
a == b + c
// more verifications...
}
computations = {
tax = total_price - (discount + freight + price)
x = f(y) + g(z)
// more computations...
}
}
When writing an internal DSL, one has to play within the syntactic limits of the host language. With Scala, we can get pretty close to the pseudo-code presented above. This is how we go about it:
- We change the outermost declaration to an anonymous class instance initialization. So TableSpec { … } will now become new TableSpec { … }.
- columns, verifications, and computations can be changed to method definitions evaluating to a sequence of respective required types. So columns = { … } will become def columns = Seq(…). Same with the other two.
- code, customer etc. are runtime values, and thus cannot be mapped to native Scala name bindings. In other words, they are pseudo-variables and as we saw in the previous post, Symbol type fits this use case well.
- To construct the expressions with these symbols, we will need to set up an abstract syntax tree. We’ll represent these constructed expressions with type Expr[A] where A represents the type of value the expression evaluates to.
- We will use Scala’s alternate method invocation syntaxes (as described in the previous post) to achieve the clean spaced-out syntax. We will have to modify our DSL syntax a little to fit Scala’s syntactic rules though.
- Since = is Scala’s assignment operator, and == is a final method on class Any, neither is available for use. Therefore we need to choose different operators. Let us go for := and =:= respectively.
This is what the refined syntax would look like: (Note: This doesn’t work either.)
val abcSpec = new TableSpec {
def columns = Seq(
"Code" / 'code - string,
"Dummy" / 'dummy - string,
"Customer" / 'customer - string withAttributes (optional),
"Value" / 'value - decimal storedAs string withFormat "###.##" withAttributes (totaled),
"Date" / 'date - date storedAs string withFormat "dd.MM.yyyy"
// more fields...
)
def verifications = Seq(
'dummy =:= /* arbitrary function invocation */,
'a =:= 'b + 'c
// more verifications...
)
def computations = Seq(
'tax := 'total_price - ('discount + 'freight + 'price),
'x := /* arbitrary function invocation */
// more computations...
)
}
I have marked two places with /* arbitrary function invocation */. As the text suggests, these involve arbitrary function invocations on the values associated with the symbols. Their handling is a little involved, and we will look at that in the next post in the series.
Let’s start looking at the implementation.
First of all, we define a trait TableSpec as follows:
trait TableSpec {
def columns: Seq[ColumnSpec[Any]]
def verifications: Seq[Expr[Boolean]]
def computations: Seq[Computation[Any]]
}
The data types ColumnSpec and Computation are defined as follows:
case class ColumnSpec[+A](
name: String,
identifier: Symbol,
/* more fields */
)
case class Computation[+A](sym: Symbol, expr: Expr[A]) {
override def toString = "%s := %s" format (sym, expr)
}
Verifications are essentially expressions evaluating to a boolean value, and thus aptly represented as Expr[Boolean].
Now, our DSL uses the following syntax for ColumnSpecs:
"Value" / 'value - decimal storedAs string withFormat "###.##" withAttributes (totaled)
As described in the previous post, Scala will see the above fluent syntax as:
"Value"./('value).-(decimal).storedAs(string).withFormat("###.##").withAttributes(totaled)
It’s pretty easy to map that to:
ColumnSpec("Value", 'value, ...) // Other fields intentionally dropped as they are not really relevant to the discussion.
All we have to do is to use the good old builder pattern, except that the explicit build call at the end can be avoided thanks to implicit conversions. Here is the code for the same:
implicit def columnDeclarationSyntax(name: String) = new {
def /(identifier: Symbol) = new {
def -(d: DataType) = new ColumnBuilder(name, identifier, d, None, None, Set.empty)
}
}
case class ColumnBuilder(
private val _name: String,
private val _identifier: Symbol,
private val _dataType: DataType,
private val _storedAs: Option[DataType],
private val _format: Option[String],
private val _attributes: Set[Attribute]
) {
def storedAs(d: DataType) = this.copy(_storedAs = Some(d))
def withFormat(f: String) = this.copy(_format = Some(f))
def withAttributes(a: Attribute*) = this.copy(_attributes = a.toSet)
def build: ColumnSpec[Any] = {
/* stuff */
}
}
implicit def columnBuild(cb: ColumnBuilder): ColumnSpec[Any] = cb.build
A couple of notes on the above code:
- The new { } syntax you see there creates what’s called a structural type, essentially an anonymous type that supports the operations defined inside. This construct uses reflection, and hence its usage should be avoided in performance-intensive code.
- Since columns expects Seq[ColumnSpec[Any]], the implicit conversion gets triggered for each item in the sequence, making the build call unnecessary.
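If the reflective calls behind new { } are a concern, the same syntax can be obtained with named helper classes. The following is a trimmed-down sketch of that idea, not the code used in the project: ColumnSyntaxSketch, NamedColumn, ColumnName, StringType, and DecimalType are names invented for the example, and ColumnSpec carries only a few illustrative fields.

```scala
import scala.language.implicitConversions

object ColumnSyntaxSketch {
  sealed trait DataType
  case object StringType extends DataType
  case object DecimalType extends DataType

  case class ColumnSpec(name: String, identifier: Symbol,
                        dataType: DataType, format: Option[String])

  case class ColumnBuilder(name: String, identifier: Symbol,
                           dataType: DataType, format: Option[String] = None) {
    def withFormat(f: String): ColumnBuilder = copy(format = Some(f))
    def build: ColumnSpec = ColumnSpec(name, identifier, dataType, format)
  }

  // Named classes replace the anonymous `new { ... }` wrappers,
  // so `/` and `-` are ordinary (non-reflective) method calls.
  class NamedColumn(name: String, identifier: Symbol) {
    def -(d: DataType): ColumnBuilder = ColumnBuilder(name, identifier, d)
  }

  implicit class ColumnName(name: String) {
    def /(identifier: Symbol): NamedColumn = new NamedColumn(name, identifier)
  }

  // Makes the trailing `.build` call unnecessary wherever a ColumnSpec is expected.
  implicit def columnBuild(cb: ColumnBuilder): ColumnSpec = cb.build
}
```

After import ColumnSyntaxSketch._, an expression such as "Code" / Symbol("code") - StringType can be assigned directly to a value of type ColumnSpec, because / binds tighter than - and the final builder is converted implicitly.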
Next, we need to define a trait to represent expressions.
trait Expr[+A] {
def eval(st: SymbolTable): A
override def toString = "<expression>"
}
SymbolTable is a type alias for Map[Symbol, Any]. st is a symbol table that is used for looking up values associated with symbols. (Observe how we avoided a global mutable map by making map an argument for the eval method.)
The DSL has to support literals, atoms, and binary operations. Let’s add cases for that.
case class Lit[+A](a: A) extends Expr[A] {
def eval(st: SymbolTable) = a
override def toString = a.toString
}
case class Atom[+A](sym: Symbol) extends Expr[A] {
def eval(st: SymbolTable): A = st(sym).asInstanceOf[A]
override def toString = sym.toString
}
case class BinOp[+A, B, C](b: Expr[B], c: Expr[C], op: (B, C) => A, opString: String) extends Expr[A] {
def eval(st: SymbolTable): A = op(b eval st, c eval st)
override def toString = "(%s %s %s)" format (b, opString, c)
}
Lit[A] represents a literal of type A. Atom[A] represents the names we will be binding values to (such as 'code, 'customer). BinOp[A, B, C] represents a binary operation that takes two expressions of type B and C respectively, and yields an expression of type A. The eval definitions are self-explanatory. Now just add the following two implicit conversions:
implicit def atom[A](s: Symbol): Expr[A] = Atom[A](s)
implicit def lit[A](a: A): Expr[A] = Lit[A](a)
An expression such as the following:
'b * 5
will now be read by the Scala compiler as:
Atom[Int]('b).*(Lit[Int](5))
Since * is an operator (which we will define in the next post) that expects two Expr[Int]s, the type inferer was able to correctly infer 'b as Atom[Int]. This is very helpful in avoiding the type annotation clutter. However this does not work in all cases, and sometimes there might not be enough context available for the type inferer to infer types correctly. In such cases, we have to provide the explicit type.
For example, the following expression:
'a := 'b + 'c
will be read by the Scala compiler as:
Atom[Nothing]('a).:=(Atom[Int]('b).+(Atom[Int]('c)))
See the problem? We want 'a to be an Atom[Int]. However, because of the lack of context, the type inferer is not able to infer the type correctly. So we need to provide the type explicitly:
('a : Expr[Int]) := 'b + 'c
To make this a little less verbose, we define a few helper methods:
// General.
// With this, you can abbreviate ('x : Expr[Type]) to 'x.as[Type].
implicit def symbolWithAs(s: Symbol) = new { def as[A]: Expr[A] = s }
// More specific ones, for even more brevity.
def double_(s: Symbol) = Atom[Double](s)
def int_(s: Symbol) = Atom[Int](s) // etc
The expression will now look like:
'a.as[Int] := 'b + 'c
// or
int_('a) := 'b + 'c
In case of our DSL, the method computations takes a sequence of Computation objects, which as we saw above, require us to supply types on left hand side of :=. After making this change, the definition of computations in the specification might look like:
def computations = Seq(
int_('tax) := 'total_price - ('discount + 'freight + 'price),
string_('x) := /* arbitrary function invocation */
// more computations...
)
Rest of the TableSpec would look the same.
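To see the pieces from this post working together, here is a self-contained sketch that builds an expression tree by hand and evaluates it against a symbol table. The Expr, Lit, Atom, BinOp, and Computation definitions mirror the ones above; the run helper and the ExprSketch wrapper are assumptions added for illustration, since how computations are actually processed is outside the scope of this post.

```scala
object ExprSketch {
  type SymbolTable = Map[Symbol, Any]

  trait Expr[+A] {
    def eval(st: SymbolTable): A
  }

  case class Lit[+A](a: A) extends Expr[A] {
    def eval(st: SymbolTable): A = a
  }

  case class Atom[+A](sym: Symbol) extends Expr[A] {
    // The symbol table is untyped, so the cast is unchecked.
    def eval(st: SymbolTable): A = st(sym).asInstanceOf[A]
  }

  case class BinOp[+A, B, C](b: Expr[B], c: Expr[C], op: (B, C) => A, opString: String)
      extends Expr[A] {
    def eval(st: SymbolTable): A = op(b.eval(st), c.eval(st))
  }

  case class Computation[+A](sym: Symbol, expr: Expr[A])

  // Hypothetical driver: runs computations in order and stores each
  // result in the symbol table under the computation's symbol.
  def run(cs: Seq[Computation[Any]], initial: SymbolTable): SymbolTable =
    cs.foldLeft(initial) { (st, c) => st + (c.sym -> c.expr.eval(st)) }
}
```

Passing the symbol table explicitly means the same expression can be evaluated against different rows without any shared mutable state.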
That’s all for now. In the next post, we will define some common operators (such as +, -, *, /, <, > etc) on expressions, and provide a way to invoke arbitrary functions on them. Till then, happy coding and happy learning!
Dhananjay Nene
on January 18th, 2012.
Posted in Build and Release Automation, Gradle, Scala
Gradle is a useful and productivity enhancing tool for setting up build and release management environments. Historically this space has been dominated by ant and later by maven. We were using Maven for our Java / Scala based applications. However, recently we started migrating all our build scripts to Gradle. While many are aware of the capabilities of gradle for maintaining java projects, it is also very easy to use with Scala. This post predominantly refers to using gradle with a Scala project, though Java developers will also find it of interest. The build scripts remain quite similar and I’ve made notes which inform what java users need to do differently. We shall use the smallest possible (almost just a hello world) project and explore various features of gradle using it.
Installation
Gradle requires JDK 1.5 or higher. The full gradle release can be downloaded from http://gradle.org. I shall not detail the installation steps here, but shall instead refer you to the Installation chapter in the gradle user guide.
Hello World
We shall start with the smallest possible scala program and the smallest possible gradle script. Create a directory for the project. Start by adding the following “hello world!” program to src/main/scala/in/vayana/blog/gradle_demo/GradleDemo.scala. By convention scala code is located at src/main/scala and src/test/scala, and java code at src/main/java and src/test/java. The initial code is as follows.
package in.vayana.blog.gradle_demo

object GradleDemo {
  def main(args: Array[String]) = {
    println("hello world!")
  }
}
This is a pretty self-explanatory scala example – the canonical “Hello World!”. Next, create the following build.gradle in the project directory.
apply plugin: 'eclipse'
apply plugin: 'scala'

scalaVersion = '2.9.1'

repositories {
    mavenCentral()
}

dependencies {
    // Libraries needed to run the scala tools
    scalaTools 'org.scala-lang:scala-compiler:' + scalaVersion,
               'org.scala-lang:scala-library:' + scalaVersion

    // Libraries needed for scala api
    compile 'org.scala-lang:scala-library:' + scalaVersion
}
In the build.gradle above, note the following:
- The file begins by applying two plugins, one for eclipse and one for scala. The first one is not relevant to this post and can be ignored. The second one enables using gradle with scala projects. For a Java project, a file containing just apply plugin: 'java' would have been sufficient; the remaining lines are required for scala.
- Since scala builds require the scala version to be supplied, we start by declaring the scalaVersion property, so that the same can be used consistently in multiple places wherever it is required.
- Scala builds require scala-compiler.jar and scala-library.jar during compilation stage. The files of the version corresponding to the scalaVersion property need to be downloaded. These files are available from the Maven Central Repository. The repositories { mavenCentral() } directive instructs gradle to download jars it needs from this repository.
- The specific scala plugin requires the two scala jars to be supplied as a scalaTools dependency. Moreover since many scala libraries and collection classes are to be found in scala-library.jar, we need to specify the dependency on that jar as a compile time dependency as well.
Run gradle as follows to build a jar
This will download the necessary dependencies (in this case the scala jars) if they are not already present in your local gradle repository (On Linux machines it is at $HOME/.gradle). The jar will be built in the build/libs subdirectory of the project directory. It will be called gradle-demo.jar. Don’t worry about the lack of version information, we shall take care of it later. In order to run the application, you may issue the following command from the console while your current directory is the project base directory.
scala -classpath build/libs/gradle-demo.jar in.vayana.blog.gradle_demo.GradleDemo
There, we just created and used our first build script.
The full source at this stage can be found here.
Note to Java users: Your build.gradle only needs to contain one line, i.e. apply plugin: 'java'. The hello world app needs to be java code at src/main/java/in/vayana/blog/gradle_demo/GradleDemo.java, and you would run the application using java instead of scala. Do go ahead and add the repositories section though, since you will need it later.
Running the application using gradle
While we ran the application from the command line, it is a tad inconvenient to type the whole line each time. We shall thus extend the build.gradle script to run the application using gradle.
Add the following to the end of the build.gradle file.
task(run, dependsOn: 'classes', type: JavaExec) {
    main = 'in.vayana.blog.gradle_demo.GradleDemo'
    classpath = sourceSets.main.runtimeClasspath
    args 'foo', 'bar'
}
Now from command line issue the following command
You should see the application run, and “hello world!” appear at the end of the console output.
What we did above was to add a new task to the gradle script called run. This task has declared a dependsOn on classes. classes is the task which compiles all the sources into .class files. So the run task will trigger the compilation of the sources, and after they are compiled, it will run the application. In order to run the app, the main property specifies the fully qualified class name of the class containing main method. The classpath property specifies the classpath to be used, which in this case we set to the runtime classpath that gradle computes automatically. Finally two arguments ‘foo’ and ‘bar’ are passed to the application (which in this case are ignored).
The full code at this stage can be reviewed here.
Logging. Adding dependencies
One of the main reasons for using maven, gradle etc. is to use them to properly track the dependencies. In this case we shall add a log statement. Since that shall require the use of logging libraries we shall need to instruct gradle to download them. The scala libraries we shall use for logging are not available in the central maven repo. Hence we shall add yet another repository to the gradle script as follows.
repositories {
    mavenCentral()
    maven { url = 'http://scala-tools.org/repo-releases' }
}
Having introduced the repositories we need to declare the dependencies as well. Modify the compile dependencies list by adding two libraries used for logging as follows.
compile 'org.scala-lang:scala-library:' + scalaVersion,
        'com.weiglewilczek.slf4s:slf4s_2.9.1:1.0.7',
        'ch.qos.logback:logback-classic:1.0.0'
Also change the application to log the “hello world!” statement instead of printing it to console.
package in.vayana.blog.gradle_demo

import com.weiglewilczek.slf4s.Logger

object GradleDemo {
  val logger = Logger("in.vayana.blog.gradle_demo")
  def main(args: Array[String]) = {
    logger.debug("hello world!")
  }
}
To download the dependencies alone, you could issue the following command
As mentioned earlier, to build the jar you could issue the command
As mentioned earlier, to run the application you could issue this command
Upon running the application you should now see the debug output.
The full source code at the end of this stage can be reviewed here.
Note to Java users: You could just add the logback dependency alone and not add the slf4s one.
Run a test case
The next step is to use gradle to compile and run the test cases. We shall use the specs2 testing library for the same. We shall first need to add the dependencies for the test cases by adding the following lines to the dependencies section
// Libraries needed for test cases
testCompile 'junit:junit:4.5',
            'org.specs2:specs2_2.9.1:1.6.1',
            'org.specs2:specs2-scalaz-core_2.9.1:6.0.1'
To allow for something to be tested, we shall first add a method as follows to the GradleDemo object
def foo() = {
  logger.debug("in method foo")
  "bar"
}
And of course we shall also add the complete test case
package in.vayana.blog.gradle_demo

import org.junit.runner.RunWith
import org.specs2.mutable.SpecificationWithJUnit
import org.specs2.runner.JUnitRunner
import com.weiglewilczek.slf4s.Logger
import org.slf4j.LoggerFactory

@RunWith(classOf[JUnitRunner])
class TestGradleDemo extends SpecificationWithJUnit {
  val logger = Logger("in.vayana.blog.gradle_demo.test")
  "Using GradleDemo" should {
    "Method foo should return value bar" in {
      println("this is a console output")
      logger.debug("this is a debug statement")
      GradleDemo.foo() must_== "bar"
    }
  }
}
Now run the following command from the console
You might be a little surprised at not seeing the log outputs appear on the console. Gradle produces very nicely formatted html pages to document the test results. It also redirects all stdout and stderr outputs and makes them available for review corresponding to each of the test case in the html output. The html output can be seen by opening the build/reports/tests/index.html file in your favourite browser. You should be able to see the debug statements appear there, and of course that the test case was completed successfully (see the last image in the series of images that follow)
Note to Java users: Just introduce a dependency on junit alone, and implement a simple junit test case.
Test Results summary for project:

Test Results summary for package:

Test Results summary for Test Case Class (Tests tab being shown):

Test Results summary for Test Case Class (Redirected outputs being shown):

The full source code at this stage can be reviewed here.
We’ve just got started. There’s lot more to cover, including versioning, creating pom files, pushing to maven repositories, multiple projects, scala version cross builds etc. And of course the power of groovy to allow you to extend gradle to suit your needs. That all follows in subsequent installments of this series.
Recently Saager Mhatre (@dexterous) from Vayana, presented a talk on Simplifying builds using gradle. The slides are shown below.
Rahul Goma Phulore
on January 17th, 2012.
Posted in Programming, Scala
, Tagged with dsl, excel-parser
In early high level languages such as Lisp and Forth, the programming style was to build meta-linguistic abstractions towards the domain, and then write programs using these abstractions. The term “meta-linguistic” implies that these abstractions would be virtually indistinguishable from the native language features. In other words, when programming in this style, you write an embedded language in your host language to write your programs in. To quote Paul Graham,
In Lisp, you don’t just write your program down toward the language, you also build the language up toward your program. As you’re writing a program you may think “I wish Lisp had such-and-such an operator.” So you go and write it. Afterward you realize that using the new operator would simplify the design of another part of the program, and so on. Language and program evolve together. Like the border between two warring states, the boundary between language and program is drawn and redrawn, until eventually it comes to rest along the mountains and rivers, the natural frontiers of your problem. In the end your program will look as if the language had been designed for it. And when language and program fit one another well, you end up with code which is clear, small, and efficient.
This technique proves to be very useful in taming complexity as programs grow larger and larger in size. Known by various names such as “language oriented programming”, “task oriented language design”, and “embedded language design”, this style of programming has recently been revived as “fluent interface” or “domain specific language (DSL) design”. Modern languages such as Ruby, Groovy, Clojure (which is a Lisp), and Scala come with features that make them very amenable to this approach.
In this series, we are going to focus on how to write embedded domain specific languages in the Scala programming language.
Show me some code!
Here is some example code written using Akka FSM DSL.
import akka.actor.{Actor, FSM}
import akka.util.duration._
sealed trait ExampleState
case object A extends ExampleState
case object B extends ExampleState
case object C extends ExampleState
case object Move
class ABC extends Actor with FSM[ExampleState, Unit] {
import FSM._
startWith(A, Unit)
when(A) {
case Ev(Move) =>
log.info(this, "Go to B and move on after 5 seconds")
goto(B) forMax (5 seconds)
}
when(B) {
case Ev(StateTimeout) =>
log.info(this, "Moving to C")
goto(C)
}
when(C) {
case Ev(Move) =>
log.info(this, "Stopping")
stop
}
initialize
}
As you can see above, the code reads very naturally. goto and when feel like language keywords – in fact, they’re regular method definitions.
Let’s take a look at some Scala features that make the design of such abstraction possible.
1. Alternative method invocation syntaxes:
Scala’s grammar allows the following alternative method invocation syntaxes that can be put to good use in DSLs.
i. obj meth param is parsed as obj.meth(param).
ii. obj meth (param1, param2, ..) is parsed as obj.meth(param1, param2, ..).
iii. obj meth is parsed as obj.meth.
Note that (iii) can clash with (i) and (ii) leading to ambiguities, and so its usage should be avoided unless the expression using that syntax is delimited in parentheses or a block.
In the Akka FSM example above, goto(B) forMax (5 seconds) is parsed as goto(B).forMax(5.seconds).
Actually that’s not the full story. It’s parsed as goto(B).forMax(someMethod(5).seconds). We will see how that works in the next section.
2. Implicit conversions and the extend-my-library pattern:
This feature lets you add methods (sort of) to existing types. When you invoke a method meth on object obj of type T, and T does not have a method with a matching signature, the compiler starts looking for an implicit conversion from type T to some type S that supports such a method. Let’s have a look at an example.
You might be familiar with this expression from Joda-Time: Period.seconds(5). With the help of implicit conversions, we can make it read more naturally as 5 seconds. The Int class does not have a method named seconds. Let us define a type, PeriodHelper, that has a seconds method, and then define an implicit conversion from Int to PeriodHelper.
class PeriodHelper(i: Int) {
def seconds: Period = Period.seconds(i)
}
implicit def intToPeriodHelper(i: Int): PeriodHelper = new PeriodHelper(i)
The expression 5 seconds is now valid, and will be parsed as intToPeriodHelper(5).seconds.
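If you are on Scala 2.10 or later, the wrapper class and the conversion can be declared in one step with an implicit class. The sketch below assumes a stand-in Period case class (with a made-up totalSeconds field) so that it runs without Joda-Time on the classpath; with the real library you would call Period.seconds(i) instead.

```scala
object PeriodSyntaxDemo {
  // Stand-in for Joda-Time's Period (assumption), so the sketch has no dependencies.
  case class Period(totalSeconds: Int)

  // An implicit class declares the wrapper and the Int => PeriodHelper
  // conversion together (available since Scala 2.10).
  implicit class PeriodHelper(val i: Int) {
    def seconds: Period = Period(i)
  }

  // Int has no `seconds` method, so the compiler rewrites this
  // to new PeriodHelper(5).seconds.
  val timeout: Period = 5.seconds
}
```

The conversion only applies where PeriodHelper is in scope, which is why DSLs usually ask you to import their helpers explicitly.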
3. Symbolic method names:
Scala does not have operators in the conventional sense. Its “operators” are just regular methods. This is possible because it allows symbols in its identifiers. (You can find the detailed rules in the language specification.) This lets us have symbolic words in our DSL, where they make sense.
A small self-explanatory example:
case class Foo(i: Int) {
def combine(that: Foo): Foo = Foo(this.i + that.i)
def <+>(that: Foo): Foo = Foo(this.i + that.i)
}
val a, b = Foo(5)
assert(a.combine(b) == Foo(10))
assert((a combine b) == Foo(10)) // parentheses needed: == binds tighter than an alphanumeric method name
assert(a.<+>(b) == Foo(10))
assert(a <+> b == Foo(10))
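One thing to keep in mind when mixing symbolic and alphanumeric names is precedence. Scala derives the precedence of an infix method from the first character of its name: names starting with a letter bind loosest, and names starting with < or > bind tighter than those starting with = or !. The following sketch reuses the shape of Foo from above (wrapped in a demo object of our own) to show the effect.

```scala
object PrecedenceDemo {
  case class Foo(i: Int) {
    def combine(that: Foo): Foo = Foo(this.i + that.i)
    def <+>(that: Foo): Foo = Foo(this.i + that.i)
  }

  val a, b = Foo(5)

  // `<` binds tighter than `=`, so this is (a <+> b) == Foo(10).
  val symbolic: Boolean = a <+> b == Foo(10)

  // Without the parentheses this would parse as a combine (b == Foo(10))
  // and fail to compile, because combine expects a Foo, not a Boolean.
  val alphanumeric: Boolean = (a combine b) == Foo(10)

  // Mixed: <+> binds tighter than combine, so this is (a <+> b) combine a.
  val mixed: Foo = a <+> b combine a
}
```

When in doubt, adding parentheses costs little and saves the reader from recalling the precedence table.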
4. Lambda expressions, multiple parameter lists, and partial functions:
These three in conjunction allow us to create some really nifty control abstractions. Let’s write one.
We will write a control structure similar to try-catch but which has a special command restart that allows us to restart the computation in case of a failure. Let’s call it attempt-fallback.
This is what it’ll look like:
attempt {
val s = Console.readLine
s.toInt
} fallback {
case ex: NumberFormatException => println("Invalid string. Try again."); restart
}
attempt is basically a method that accepts a function block (also called a lambda expression) and returns an object. This object has a method named fallback that accepts a partial function (a bunch of case statements make up a partial function). The logic for this control structure will go inside the definition of fallback. There are certain other implementation details that we won’t be going into in this post. Here is the full implementation:
import util.control.ControlThrowable
case object RestartException extends ControlThrowable
def restart: Nothing = throw RestartException
def attempt[A](f: => A): Attempt[A] = new Attempt(f)
class Attempt[A](f: => A) {
def fallback(catcher: PartialFunction[Throwable, A]): A = {
var stop = true
var result = null.asInstanceOf[A]
do {
try {
result = f
stop = true
} catch {
case t: Throwable if catcher isDefinedAt t => {
try {
result = catcher(t)
stop = true
} catch {
case RestartException => stop = false
}
}
case t: Throwable => throw t
}
} while(!stop)
result
}
}
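To watch restart in action without typing at a console, here is a self-contained variant that we can run as is. It is a sketch, not a replacement for the implementation above: the loop is written as a tail-recursive helper instead of do-while, and the inputs iterator and failures counter are made up to simulate a user who gets it right on the third try.

```scala
import scala.annotation.tailrec
import scala.util.control.ControlThrowable

object AttemptDemo {
  case object RestartException extends ControlThrowable
  def restart: Nothing = throw RestartException

  def attempt[A](f: => A): Attempt[A] = new Attempt(f)

  class Attempt[A](f: => A) {
    def fallback(catcher: PartialFunction[Throwable, A]): A = {
      @tailrec
      def loop(): A = {
        // None means "the handler asked for a restart".
        val outcome: Option[A] =
          try Some(f)
          catch {
            case t: Throwable if catcher.isDefinedAt(t) =>
              try Some(catcher(t))
              catch { case RestartException => None }
          }
        outcome match {
          case Some(result) => result
          case None         => loop()
        }
      }
      loop()
    }
  }

  // Simulated user input (assumption): two bad strings, then a good one.
  val inputs = Iterator("abc", "4x", "42")
  var failures = 0

  val parsed: Int = attempt {
    inputs.next().toInt
  } fallback {
    case _: NumberFormatException => failures += 1; restart
  }
}
```

Here the block passed to attempt runs three times: the first two runs throw NumberFormatException, the handler counts the failure and calls restart, and the third run returns 42.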
5. Symbol type:
The idea of Symbol comes from Lisp. Symbols are essentially interned strings, but do not support any string operations. Scala has special syntactic sugar for them; the expression 'mileage is equivalent to Symbol("mileage"). This sugar, together with their distinct appearance from string literals, makes them suitable for use as “pseudo-variables”.
For example, suppose you are implementing a Prolog-like DSL in Scala. You can use Symbols to represent atoms in that DSL.
object MotherChild extends Fact2
object Sibling extends Fact2
MotherChild('vidya, 'keshav)
MotherChild('vidya, 'ashwini)
MotherChild('surabhi, 'jay)
Sibling('X, 'Y) given MotherChild('Z, 'X) & MotherChild('Z, 'Y)
Here is another example some of us are more likely to relate to easily – a database agnostic schema-management DSL. Symbols could be used to represent column names.
createTable('person)(
'name - string,
'age - integer
)
Of course this example is too simplistic and a real world DSL would be far more complex, but you get the idea.
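For the curious, here is one way such a createTable could be wired up. This is a rough sketch under our own assumptions: ColumnType, Column, Table, and ColumnSyntax are hypothetical names invented for this illustration, and nothing here talks to a real database. We write Symbol("name") instead of 'name because the quote sugar is deprecated in newer Scala versions; on older versions the two are interchangeable.

```scala
object SchemaDslDemo {
  // Hypothetical column types; lowercase names so they read like keywords.
  sealed trait ColumnType
  case object string extends ColumnType
  case object integer extends ColumnType

  // Hypothetical model classes that the DSL builds up.
  case class Column(name: Symbol, tpe: ColumnType)
  case class Table(name: Symbol, columns: List[Column])

  // Adds a `-` method to Symbol, so that Symbol("age") - integer yields a Column.
  implicit class ColumnSyntax(val name: Symbol) {
    def -(tpe: ColumnType): Column = Column(name, tpe)
  }

  // Two parameter lists: the table name first, then any number of columns.
  def createTable(name: Symbol)(columns: Column*): Table =
    Table(name, columns.toList)

  val person: Table = createTable(Symbol("person"))(
    Symbol("name") - string,
    Symbol("age") - integer
  )
}
```

Notice that this combines three of the features discussed above: symbols for names, a symbolic method for pairing a name with a type, and an implicit conversion to attach that method to Symbol.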
Conclusion:
Hopefully this little post has motivated you to explore the topic of DSL design further.
What you have seen in this post is only the tip of the iceberg. There are many other features of Scala that lend themselves to the design of high level abstractions.
In the subsequent posts in the series, we will see how at Vayana we used these techniques to write a DSL for reading and manipulating structured data from Excel sheets.
Special thanks to Marc Simpson and Hanneli Tavante for reviewing the drafts.
Dhananjay Nene
on January 17th, 2012.
Posted in Miscellaneous
The Vayana Engineering team has been quietly at work for the last 2 years. A small team within the company (team of 6 to be precise), we’ve operated like a startup. And we have been busy like one too. Working on three different services can be hectic. Especially given the breadth of tasks we perform. We interact with product management to understand customer issues and find ways to define features which will best support a diversity of needs across customers and industries. We are the developers, UX designers, and testers. We are also the system administration and operations team. And we still spend a fair amount of time interacting with customers and supporting them operationally.
The last two years we have been rather silent. We’ve been busy enough to not talk much about what we do. We’ve worked on many languages, frameworks, and business problems. We intend to talk about many of the engineering problems we faced and solutions we applied.
A gentle introduction to us is at Engineering @ Vayana. We tweet at @vayana_engg. Through Vayana Engineering Blog we hope to share our experiences, solutions or just opinions. The content here will hopefully be informative. Sometimes creative too, though that’s what our management would rather not have us talk about. Occasionally even entertaining. Hopefully, our passion will be obvious. Passion to make things easy for our customers. Passion to learn and leverage technology to the extent we can. And now passion to share our learnings and experiences with you and in turn learn from you as well.
So, welcome all and stay tuned.