While reviewing code during the last release of Tayra, I came across a method that takes a buffer size as a human-readable string and converts it to a long value. We needed to let the user specify the size on the command line, for example:
restore -f fileName.txt --fBuffer=4MB

This code had been copy-pasted from Log4J into MemoryMappedFileReader (an abstraction that we created), which accepted the human-readable string and allocated a memory buffer of that size. Clearly, this functionality does not belong in that class and needs a home of its own. But that is just one of the problems.

private long getFileSize(final String value) {
  if (value == null) {
    return DEFAULT_SIZE;
  }

  String s = value.trim().toUpperCase();
  long multiplier = 1;
  int index;

  if ((index = s.indexOf("KB")) != -1) {
    multiplier = ONE_KB;
    s = s.substring(0, index);
  } else if ((index = s.indexOf("MB")) != -1) {
    multiplier = ONE_KB * ONE_KB;
    s = s.substring(0, index);
  } else if ((index = s.indexOf("GB")) != -1) {
    multiplier = ONE_KB * ONE_KB * ONE_KB;
    s = s.substring(0, index);
  }
  if (s != null) {
    try {
      return Long.valueOf(s).longValue() * multiplier;
    } catch (NumberFormatException e) {
      LogLog.error("[" + s + "] is not in proper int form.");
      LogLog.error("[" + value + "] not in expected format.", e);
    }
  }
  return DEFAULT_SIZE;
}

Not only does the above code need a home of its own, it is also a perfect candidate for a Value Object: an object whose identity is defined by its attributes. So I converted it into a Value Object, DataUnit. One property of values is that, once defined, they stay the same; this is immutability. Applied to an object that represents a value, it means the object's state never changes after construction, which gives us an immutable object. The advantage of immutable objects is that they can be used in different parts of the system without side effects.

Immutability allows us to use Value Objects in multi-threaded environments without the need for synchronization. It also helps keep the code that uses them referentially transparent: there is no context associated with these objects, so I can use them in one context or another without altering either their meaning or the meaning of the context in which they are used. There is a great post that explains in depth what referential transparency means, and the Stack Overflow discussion on the topic is also a good one.

Further, operations on Value Objects return a new Value Object. For example, adding 3MB and 2MB returns a new Value Object, 5MB, and leaves both operands untouched. Also, your 2MB object is the same as my 2MB object.
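The listings in this post do not include an addition method, so here is a minimal sketch of what "operations return a new Value Object" could look like. ByteSize, ofMegabytes and plus are hypothetical names invented for this illustration; they are not part of Tayra.

```java
// Hypothetical, simplified stand-in for a size Value Object; not Tayra's actual code.
final class ByteSize {
  private static final long ONE_MB = 1024L * 1024L;

  // The only state, fixed at construction time.
  private final long bytes;

  private ByteSize(final long bytes) {
    this.bytes = bytes;
  }

  static ByteSize ofMegabytes(final long megabytes) {
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return new ByteSize(Math.multiplyExact(megabytes, ONE_MB));
  }

  long toBytes() {
    return bytes;
  }

  // Returns a new instance; neither this object nor the argument is modified.
  ByteSize plus(final ByteSize other) {
    return new ByteSize(Math.addExact(bytes, other.bytes));
  }

  @Override
  public boolean equals(final Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof ByteSize)) {
      return false;
    }
    return bytes == ((ByteSize) other).bytes;
  }

  @Override
  public int hashCode() {
    return Long.hashCode(bytes);
  }
}
```

With this sketch, `ByteSize.ofMegabytes(3).plus(ByteSize.ofMegabytes(2))` is equal to `ByteSize.ofMegabytes(5)`, and the 3MB and 2MB operands still report their original sizes afterwards.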

package com.ee.tayra.utils;

public enum ByteUnit {
  UNIT (1, null),
  B  (1, UNIT),
  KB (1024, B),
  MB (1024, KB),
  GB (1024, MB);

  private final int factor;
  private ByteUnit(final int factor, final ByteUnit byteUnit) {
    if (byteUnit == null) {
      this.factor = factor;
    } else {
      this.factor = factor * byteUnit.factor;
    }
  }

  public int toInt() {
    return factor;
  }

  public static ByteUnit from(final String unit) {
    String unitUpperCase = unit.trim().toUpperCase();
    if (unitUpperCase.contains("G")) {
      return GB;
    }
    if (unitUpperCase.contains("M")) {
      return MB;
    }
    if (unitUpperCase.contains("K")) {
      return KB;
    }
    return B;
  }
}

package com.ee.tayra.utils;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DataUnit {
  private static final String regex = "^([0-9]+)(.+)$";
  private static final Pattern numberPattern = Pattern.compile(regex);
  public static final int PRIME = 31;

  private final ByteUnit byteUnit;
  private final int value;

  private DataUnit(final int value, final ByteUnit byteUnit) {
    this.value = value;
    this.byteUnit = byteUnit;
  }

  public int value() {
    return value;
  }

  @Override
  public boolean equals(final Object other) {
    if (this == other) {
      return true;
    }
    if (other == null) {
      return false;
    }

    if (getClass() != other.getClass()) {
      return false;
    }
    DataUnit that = (DataUnit) other;

    return value == that.value && byteUnit == that.byteUnit;
  }

  @Override
  public int hashCode() {
    int result = byteUnit.hashCode();
    result = PRIME * result + value;
    return result;
  }

  public long toBytes() {
    // Widen to long before multiplying so sizes of 2GB and above do not overflow int.
    return (long) value() * byteUnit.toInt();
  }

  public static DataUnit from(final String valueWithUnit) {
    if (valueWithUnit == null || valueWithUnit.isEmpty()) {
      throw new IllegalArgumentException("Valid values are B, KB, MB, GB");
    }
    Matcher matcher = numberPattern.matcher(valueWithUnit);
    if (matcher.matches()) {
      String group = matcher.group(1);
      int value = Integer.parseInt(group);
      String unit = matcher.group(2);
      return new DataUnit(value, ByteUnit.from(unit));
    }
    throw new IllegalArgumentException("Don't know how to represent "
            + valueWithUnit);
  }
}

It would be really nice if value comparison semantics were preserved for equality, that is, if I could compare values with the == operator instead of the equals method. Since creation already goes through a factory method, a cache in the form of a map that holds the object references lets us implement this.

The new lines are the cache field and the cache lookup and insertion inside from.

package com.ee.tayra.utils;

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DataUnit {
  private static final String regex = "^([0-9]+)(.+)$";
  private static final Pattern numberPattern = Pattern.compile(regex);
  public static final int PRIME = 31;
  private static final Map<String, DataUnit> cache
            = new HashMap<String, DataUnit>();

  private final ByteUnit byteUnit;
  private final int value;

  private DataUnit(final int value, final ByteUnit byteUnit) {
    this.value = value;
    this.byteUnit = byteUnit;
  }

  public int value() {
    return value;
  }

  @Override
  public boolean equals(final Object other) {
    if (this == other) {
      return true;
    }
    if (other == null) {
      return false;
    }

    if (getClass() != other.getClass()) {
      return false;
    }
    DataUnit that = (DataUnit) other;

    return value == that.value && byteUnit == that.byteUnit;
  }

  @Override
  public int hashCode() {
    int result = byteUnit.hashCode();
    result = PRIME * result + value;
    return result;
  }

  public long toBytes() {
    // Widen to long before multiplying so sizes of 2GB and above do not overflow int.
    return (long) value() * byteUnit.toInt();
  }

  public static DataUnit from(final String valueWithUnit) {
    if (valueWithUnit == null || valueWithUnit.isEmpty()) {
      throw new IllegalArgumentException("Valid values are B, KB, MB, GB");
    }
    if (cache.containsKey(valueWithUnit)) {
      return cache.get(valueWithUnit);
    }
    Matcher matcher = numberPattern.matcher(valueWithUnit);
    if (matcher.matches()) {
      String group = matcher.group(1);
      int value = Integer.parseInt(group);
      String unit = matcher.group(2);
      final DataUnit dunit = new DataUnit(value, ByteUnit.from(unit));
      cache.put(valueWithUnit, dunit);
      return dunit;
    }
    throw new IllegalArgumentException("Don't know how to represent "
            + valueWithUnit);
  }
}

The above is the Flyweight pattern at work; it can improve performance and save space when there are a large number of Value Objects lying around.
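Two caveats apply to the cached version. The cache is keyed by the raw input string, so "4MB" and "4mb" end up as two different instances and == between them is false. Also, java.util.HashMap is not safe for concurrent use, which matters if from is called from several threads. One possible way to address both, sketched below under my own assumptions, is to normalise the input into a canonical key and use ConcurrentHashMap.computeIfAbsent. InternedDataUnit is a hypothetical name, and the stricter unit pattern is my choice for the sketch, not Tayra's behaviour.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the cached factory; a sketch, not Tayra's actual code.
final class InternedDataUnit {
  // Assumption: only B, KB, MB and GB are accepted, with optional whitespace before the unit.
  private static final Pattern FORMAT = Pattern.compile("^([0-9]+)\\s*(B|KB|MB|GB)$");
  private static final ConcurrentMap<String, InternedDataUnit> CACHE =
      new ConcurrentHashMap<>();

  private final long value;
  private final String unit;

  private InternedDataUnit(final long value, final String unit) {
    this.value = value;
    this.unit = unit;
  }

  static InternedDataUnit from(final String valueWithUnit) {
    if (valueWithUnit == null) {
      throw new IllegalArgumentException("Valid values are B, KB, MB, GB");
    }
    final String normalized = valueWithUnit.trim().toUpperCase(Locale.ROOT);
    final Matcher matcher = FORMAT.matcher(normalized);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Don't know how to represent " + valueWithUnit);
    }
    final long value = Long.parseLong(matcher.group(1));
    final String unit = matcher.group(2);
    // Canonical key: " 04 mb " and "4MB" both become "4MB".
    final String key = value + unit;
    // computeIfAbsent is atomic on ConcurrentHashMap, so each key maps to one instance.
    return CACHE.computeIfAbsent(key, k -> new InternedDataUnit(value, unit));
  }

  long toBytes() {
    switch (unit) {
      case "B":
        return value;
      case "KB":
        return Math.multiplyExact(value, 1024L);
      case "MB":
        return Math.multiplyExact(value, 1024L * 1024L);
      case "GB":
        return Math.multiplyExact(value, 1024L * 1024L * 1024L);
      default:
        throw new IllegalStateException("Unknown unit " + unit);
    }
  }
}
```

With this sketch, `InternedDataUnit.from("4MB") == InternedDataUnit.from(" 4mb ")` holds. Note that "1024B" and "1KB" are still distinct instances, because the key is the normalised text and not the byte count; whether those two should be the same value is a design decision the original code leaves open.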
