To give a bit of context: on my earlier project, Midas, we developed transformation functions (that is what we called them) that closely resemble the functions of MongoDB’s aggregation-projection framework, such as arithmetic operations (add, subtract, …) and string operations (concat, tolower, …). Here is an example:
db.orders.transform('totalAmount', '{ $subtract: ["$totalAmount", { $multiply: ["$totalAmount", 0.2] } ] }')
We parse the above using Scala’s Parser Combinators and convert it to domain model objects that are sub-types of Expression. Each expression is either a Literal, a Field expression or a Function. We have a fairly wide and sufficiently deep hierarchy of Functions, depicted below:
                              +----------+
                              |Expression|
                              +----------+
                      ___________/ | \______________
                     /             |                \
                 +-------+     +-------+        +--------+
                 |Literal|     | Field |        |Function|
                 +-------+     +-------+        +--------+
                         _________________________/ | \___________________
                        /                           |                     \
              +------------------+           +--------------+       +------------+
     _________|ArithmeticFunction|           |StringFunction|       |DateFunction|
    /         +------------------+           +--------------+       +------------+
   /        ______/  | \___     \_____         |   |____  \_______
  /        /         |     \          \        |        \         \
+---+ +--------+ +------+ +--------+ +---+ +-------+ +-------+ +------+
|Add| |Multiply| |Divide| |Subtract| |Mod| |ToLower| |ToUpper| |Concat|
+---+ +--------+ +------+ +--------+ +---+ +-------+ +-------+ +------+
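To make the hierarchy concrete, here is a minimal sketch of how such expressions could evaluate against a document. It is not the Midas code: a plain Map stands in for BSONObject, only Subtract and Multiply are modelled, and the names ExpressionSketch, Document, numbers and discounted are made up for illustration.

```scala
// Minimal sketch of the hierarchy; a Map stands in for BSONObject (assumption).
object ExpressionSketch {
  type Document = Map[String, Any]

  sealed trait Expression {
    def evaluate(document: Document): Literal
  }

  // A literal evaluates to itself.
  final case class Literal(value: Any) extends Expression {
    def evaluate(document: Document): Literal = this
  }

  // A field reference (written "$totalAmount" in the DSL) is looked up in the document.
  final case class Field(name: String) extends Expression {
    def evaluate(document: Document): Literal = Literal(document.getOrElse(name, null))
  }

  sealed abstract class Function(args: Expression*) extends Expression

  sealed abstract class ArithmeticFunction(args: Expression*) extends Function(args: _*) {
    // Evaluates every argument and insists on numeric results.
    protected def numbers(document: Document): Seq[Double] =
      args.map(_.evaluate(document).value match {
        case n: Number => n.doubleValue
        case other     => throw new IllegalArgumentException(s"Not a number: $other")
      })
  }

  final case class Subtract(expressions: Expression*) extends ArithmeticFunction(expressions: _*) {
    def evaluate(document: Document): Literal = Literal(numbers(document).reduce(_ - _))
  }

  final case class Multiply(expressions: Expression*) extends ArithmeticFunction(expressions: _*) {
    def evaluate(document: Document): Literal = Literal(numbers(document).product)
  }

  // Object model for: { $subtract: ["$totalAmount", { $multiply: ["$totalAmount", 0.2] }] }
  val discounted: Expression =
    Subtract(Field("totalAmount"), Multiply(Field("totalAmount"), Literal(0.2)))
}
```

Evaluating `ExpressionSketch.discounted` against `Map("totalAmount" -> 100.0)` applies the 20% discount from the transform example above.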
In our case, we need to support various arithmetic functions, string functions and other data functions, and we do not know how many more we will need in the future. Each new function shows up as a class added to the hierarchy, which in turn means modifying the factory function every time a class is added to the Function hierarchy.
The factory function looks something like this:
def fn: Parser[Expression] = fnName~":"~fnArgs ^^ {
  case "add"~":"~args      => Add(args: _*)
  case "subtract"~":"~args => Subtract(args: _*)
  case "multiply"~":"~args => Multiply(args: _*)
  case "divide"~":"~args   => Divide(args: _*)
  case "concat"~":"~args   => Concat(args: _*)
  ...
  ...
}
So, in other words, the factory method is not sealed against this kind of change. The goal is to get to a place where the case statements no longer exist. As I pointed out many years back in one of my blog posts, Achieving Explicit Closure for a Simple Factory using Annotations…, I decided to seal the factory method so that this concern is taken away from the developer.
The approach is to mark the behavior-implementing sub-types of Function, pick them up from the classpath and store them in a cache so that they can be instantiated when required. As we are on the JVM, we use a custom annotation, @FunctionExpression, to mark the Function sub-types.
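import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface FunctionExpression {
  // A default is needed because the annotation is applied without arguments.
  Class<?> value() default Object.class;
}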
Here is how it is applied to a Function sub-type:
@FunctionExpression
final case class ToLower(expression: Expression) extends StringFunction(expression) {
  def evaluate(document: BSONObject) = {
    val string = value(expression.evaluate(document))
    Literal(string.toLowerCase)
  }
}
Next, we need to scan the classpath for all classes annotated with @FunctionExpression. It turns out that using reflection to check for the annotation is not only expensive time-wise, it also loads each class into the JVM, so memory usage grows with every class inspected.
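For contrast, this is roughly what the reflection-based check looks like. It is only a sketch of the approach being avoided: the name ReflectionCheck is made up, and the JDK’s own @FunctionalInterface annotation (Java 8+) is used as a stand-in so the snippet runs without any project classes.

```scala
import java.lang.annotation.Annotation

object ReflectionCheck {
  // Class.forName loads and initializes the class before the annotation can be
  // inspected, which is exactly the cost described above.
  def hasAnnotation(className: String, annotation: Class[_ <: Annotation]): Boolean =
    Class.forName(className).isAnnotationPresent(annotation)
}
```

Calling `ReflectionCheck.hasAnnotation("java.lang.Runnable", classOf[FunctionalInterface])` answers the question correctly, but only after the JVM has loaded the class it was asked about.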
ASM, from ObjectWeb, is a bytecode manipulation and analysis library that can inspect a class file without loading the class into the JVM, and it is very fast at this kind of analysis. There are also open-source frameworks, such as Scannotation and Reflections, that scan the classpath for annotations by reading bytecode rather than loading classes. Scannotation is not under active development, and I did not need most of the features of Reflections for such a small task. So instead of using these frameworks, I wrote a custom AnnotationScanner that uses ASM’s ClassVisitor:
import java.io.File
import java.net.URI
import java.nio.file.{FileSystems, Path, Paths}
import java.util.regex.Pattern
import org.objectweb.asm.{AnnotationVisitor, ClassReader, ClassVisitor, Opcodes}
import scala.collection.JavaConverters._

class AnnotationScanner(pkg: String, annotationClass: Class[_]) {
  private val fsSlashifiedPkg = fsSlashify(pkg)
  private val slashifiedPkg = slashify(pkg)
  private val slashifiedAnnotation = slashify(annotationClass.getName)
  private val classLoader = AnnotationScanner.this.getClass.getClassLoader
  private val pkgURI = classLoader.getResource(slashifiedPkg).toURI
  val pkgURIString = pkgURI.toString

  // Inside a jar, a zip file system must be opened before Paths.get can resolve the URI.
  if (pkgURIString.startsWith("jar")) {
    val (jar, _) = pkgURIString.splitAt(pkgURIString.indexOf("!"))
    FileSystems.newFileSystem(URI.create(jar), Map[String, AnyRef]().asJava)
  }

  private val startDir: Path = Paths.get(pkgURI)
  private val dirWalker = new DirWalker(startDir, Pattern.compile(".*\\.class$"))

  private def fsSlashify(string: String) = string.replaceAllLiterally(".", File.separator)

  private def slashify(string: String) = string.replaceAllLiterally(".", "/")

  // Paths of all class files under the package, relative to the classpath root, without ".class".
  private def classesInPackage: Set[String] = {
    dirWalker.walk map { file =>
      val index =
        if (pkgURIString.startsWith("jar")) file.indexOf(slashifiedPkg)
        else file.indexOf(fsSlashifiedPkg)
      val className = file.substring(index)
      className.replaceAllLiterally(".class", "")
    }
  }

  private def hasAnnotation(annotationClass: Class[_], className: String): Boolean = {
    val slashifiedClassName = fsSlashify(className)
    var foundAnnotation = false
    val cv = new ClassVisitor(Opcodes.ASM4) {
      // Invoked when a class level annotation is encountered
      override def visitAnnotation(desc: String, visible: Boolean): AnnotationVisitor = {
        // desc is a type descriptor ("Lpkg/Name;"); strip the leading 'L' and trailing ';'
        val annotation = desc.substring(1, desc.length - 1)
        if (annotation == slashifiedAnnotation)
          foundAnnotation = true
        super.visitAnnotation(desc, visible)
      }
    }
    val in = classLoader.getResourceAsStream(slashifiedClassName + ".class")
    try {
      val classReader = new ClassReader(in)
      classReader.accept(cv, 0)
    } catch {
      case _: Throwable => // an unreadable class is treated as not annotated
    } finally {
      in.close()
    }
    foundAnnotation
  }

  private def dotify(string: String) =
    string.replaceAllLiterally("/", ".").replaceAllLiterally("\\", ".")

  // Walks the package once and returns the fully qualified names of the annotated classes.
  def scan =
    classesInPackage
      .filter(className => hasAnnotation(annotationClass, className))
      .map(dotify)
}
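The comparison inside visitAnnotation relies on how the JVM writes type names. ASM hands the visitor a type descriptor of the form `L<internal name>;`, where the internal name uses slashes instead of dots. The sketch below isolates just that string handling so it can be tried without ASM on the classpath; the object name DescriptorNames and the package com.example are placeholders, not the real Midas package.

```scala
object DescriptorNames {
  // "Lcom/example/FunctionExpression;" -> "com/example/FunctionExpression"
  def internalName(desc: String): String = desc.substring(1, desc.length - 1)

  // "com.example.FunctionExpression" -> "com/example/FunctionExpression"
  def slashify(className: String): String = className.replace('.', '/')

  // True when the descriptor reported by the visitor names the given annotation class.
  def matches(desc: String, annotationClassName: String): Boolean =
    internalName(desc) == slashify(annotationClassName)
}
```

Because only strings are compared, the annotated class itself never has to be loaded to answer the question.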
Here is the DirWalker, implemented using the Java 7 NIO.2 file API, which walks recursively down from a start directory:
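import java.io.IOException
import java.nio.file.{Files, FileVisitResult, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import java.util.regex.Pattern

// `log` is assumed to come from the project's logging support, which is not shown here.
class DirWalker(startDir: Path, collectFilesRegex: Pattern) {
  private val files = scala.collection.mutable.Set[String]()

  private val visitor = new SimpleFileVisitor[Path] {
    // Collects the absolute path of every file that matches the regex.
    override def visitFile(path: Path, mainAtts: BasicFileAttributes) = {
      val file = path.toAbsolutePath.toString
      val matcher = collectFilesRegex.matcher(file)
      if (matcher.matches) {
        files += file
      }
      FileVisitResult.CONTINUE
    }

    // A file that cannot be visited is logged and skipped rather than aborting the walk.
    override def visitFileFailed(path: Path, exc: IOException) = {
      log.info(s"Continuing Scanning though visiting File has Failed for $path, Message ${exc.getMessage}")
      FileVisitResult.CONTINUE
    }
  }

  def walk = {
    files.clear()
    Files.walkFileTree(startDir, visitor)
    files.toSet
  }
}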
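To see the walk in isolation, here is a small self-contained sketch that builds a throwaway directory tree and collects the class files in it with Files.walkFileTree. The names WalkSketch, classFilesUnder and demo, and the com/example layout, are invented for the example. It filters by file extension rather than by regex, so it is a simplification of the DirWalker above, not a replacement for it.

```scala
import java.nio.file.{Files, FileVisitResult, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable

object WalkSketch {
  // Returns the paths (relative to root) of all files ending in ".class".
  def classFilesUnder(root: Path): Set[String] = {
    val found = mutable.Set[String]()
    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      override def visitFile(file: Path, attrs: BasicFileAttributes): FileVisitResult = {
        if (file.getFileName.toString.endsWith(".class")) {
          found += root.relativize(file).toString
        }
        FileVisitResult.CONTINUE
      }
    })
    found.toSet
  }

  // Builds <tmp>/com/example/{Add.class, notes.txt} and walks it.
  def demo(): Set[String] = {
    val root = Files.createTempDirectory("walk-sketch")
    val pkg = Files.createDirectories(root.resolve("com").resolve("example"))
    Files.createFile(pkg.resolve("Add.class"))
    Files.createFile(pkg.resolve("notes.txt"))
    classFilesUnder(root)
  }
}
```

`WalkSketch.demo()` returns only the entry for Add.class; the text file next to it is ignored.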
Finally, I need a place from which to use AnnotationScanner. What better place than the companion object of Function, acting as a singleton factory for creating its sub-types?
sealed abstract class Function(expressions: Expression*) extends Expression {
  override def toString = s"""${getClass.getSimpleName}(${expressions mkString ", "})"""
}

object Function {
  // Scanned once, on first use: lower-cased simple class name -> class.
  lazy val functions = new AnnotationScanner("com.ee.midas", classOf[FunctionExpression])
    .scan
    .map { className =>
      val clazz = Class.forName(className).asInstanceOf[Class[Function]]
      clazz.getSimpleName.toLowerCase -> clazz
    }
    .toMap
    .withDefaultValue(classOf[EmptyFunction])

  def apply(fnName: String, args: Expression*): Function = {
    val fnClazz = functions(fnName.toLowerCase)
    // A varargs (Expression*) constructor compiles to a single Seq[Expression] parameter.
    val constructor = fnClazz.getConstructor(classOf[Seq[Expression]])
    log.debug(s"Instantiating Class $fnClazz...")
    constructor.newInstance(args)
  }
}
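The reflective instantiation is the least obvious step, so here is a stripped-down sketch of it that runs without the scanner. The registry is written by hand and stands in for the scan result, the types are simplified stand-ins for the real ones, and the names ReflectiveFactory and create are made up. The sketch assumes every function class has a varargs constructor, which Scala compiles to a constructor taking a single Seq, and that the classes sit directly inside a top-level object so they carry no hidden outer-instance parameter.

```scala
object ReflectiveFactory {
  trait Expression
  final case class Literal(value: Any) extends Expression
  abstract class Function(args: Expression*) extends Expression
  final case class Add(args: Expression*) extends Function(args: _*)
  final case class Concat(args: Expression*) extends Function(args: _*)

  // Hand-built registry standing in for the AnnotationScanner result (assumption).
  val functions: Map[String, Class[_ <: Function]] =
    Map[String, Class[_ <: Function]]("add" -> classOf[Add], "concat" -> classOf[Concat])

  def create(name: String, args: Expression*): Function = {
    val clazz = functions(name.toLowerCase)
    // The Expression* parameter erases to Seq, so that is the constructor to look up.
    val constructor = clazz.getConstructor(classOf[Seq[_]])
    // The whole Seq is passed as the one and only constructor argument.
    constructor.newInstance(args)
  }
}
```

`ReflectiveFactory.create("add", Literal(1), Literal(2))` builds the same value as writing `Add(Literal(1), Literal(2))` by hand. Unlike the real factory, which falls back to EmptyFunction, this sketch simply throws for an unknown name.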
Finally, the sealed parser now looks like this:
def fn: Parser[Expression] = fnName~":"~fnArgs ^^ {
  case name~":"~args => Function(name, args: _*)
}