Monday, December 1, 2008

Storage Toolbox

The Storage Toolbox concept is very simple: we need a toolbox to handle data preservation. By data preservation we mean persistence (storing in a hard-copy location) and/or prevalence (storing in a soft-copy location).

Data preservation is a common issue in enterprise systems. Historically, data was preserved on hard disk, i.e. in files. Files were not transactional, and that led to the need for database management systems (DBMS). Nowadays systems interact with a DBMS by means of SQL statements. The interaction is mediated by special software known as a driver. Different technologies exist for those drivers. In Java, JDBC drivers are used. Other technologies include ODBC drivers and ADO drivers.

Interacting with a DBMS via SQL is not that simple. Each DBMS vendor incorporates special SQL syntax. These differences in syntax create different SQL dialects. This is not good for the application programmer. If, on the other hand, you use only standard SQL syntax, your program will not benefit from the DBMS's performance-boosting features.

Dependency on DBMS SQL dialects and on features other than dialects (like procedures and triggers) really results in short-lived applications. If you need to change to another DBMS product, you cannot do it without the cost of reimplementing the DBMS communication layer (supposing you have one to start with; if you do not, the cost will be even higher).

The goal of the Storage Toolbox is to provide an abstraction layer that isolates the application programmer and the application code from all these problems.

Data Storage

The Storage Toolbox does not assume the data is stored in a DBMS. Today DBMS are commodities and modern development techniques hunger for more loose-ended data repositories. Two forces drive this choice: 1) a more unit-testing-oriented style of programming demands that storage features be pluggable so they can be replaced by stubs or mocks; 2) more RAM available at a lower cost enables applications to perform better if data objects are kept in memory for longer periods of time. This new approach of transactional data repositories in RAM is called prevalence, and it is now possible and competitive with the DBMS option. At the end of the day the data still needs to be stored in hard-copy repositories, but not destroying and recreating objects from DBMS data all the time improves overall application performance.

In-memory data stores are also very useful for testing and development, as they allow the database design to be delayed until after the data model has been defined and tested. They also provide an intuitive option for an object cache.

A more modern approach to data models takes advantage of object-oriented techniques and isolates the developer from table and column searching issues, providing object search and editing techniques instead.

The MiddleHeaven Storage Toolbox uses the concept of a logical DataStorage managed by a StoreKeeper. Store keepers are mediators for the real data preservation mechanics, be it a DBMS, an XML file, a prevalent system or a common in-memory List. The MiddleHeaven Storage Toolbox provides total abstraction of the data preservation features.

Storing and Retrieving

For every database system (managed or not) there is always a set of basic operations you need. You need to be able to add new data to the storage and to retrieve that data. Further, you commonly need to update data already in storage. Finally, you need to be able to delete data in the storage (even though deletion operations are considered harmful, as they may amount to destruction of information).

Java being an object-oriented language, all data needs to be placed in objects, and we find two classes of such objects: objects that contain only plain data (like text, dates, logical values, numeric values) and objects that contain aggregations of the first kind. The first are classified as primitives (as they are not composed) and the second are named data aggregates. In Java, data aggregates are often objects with a Bean-like contract (a bunch of attributes with no associated behaviour other than reading and writing those attributes).

In an object-oriented application it is natural to use beans and primitives to represent data. Going further, it is natural to think of these beans as the data representations of our application entities. The MiddleHeaven Storage Toolbox thus chooses to manipulate these beans and primitives directly and not to expose the real storage structures (tables, maps, etc.) to the programmer.

When retrieving data from a data storage it is often possible to retrieve the same data for different purposes. Normally a query-based approach is used. DBMS use SQL statements to build those queries, but from an object-oriented and a Java perspective SQL is a poor approach. SQL is not OO and it is really used more like a protocol than as a query object (even though it is one).

The MiddleHeaven Storage Toolbox chooses to use object-oriented structures - named Criteria - to provide this query object feature. SQL or any other "protocol"-like query language (like XPath or XQuery, for XML) is not used by the programmer. Instead, those languages are used by the StoreKeeper to communicate with the underlying data storage. What this really means is that you program your application to talk to a DataStorage and that's it. If you later decide to change the keeper (i.e. the underlying storage technology), you simply can.
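To make the idea of a swappable keeper concrete, here is a minimal sketch. It is not MiddleHeaven's actual API: the names SketchStoreKeeper, InMemorySketchKeeper and SketchDataStorage and all method signatures are assumptions made for illustration, and a plain Predicate stands in for a Criteria object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical keeper contract: the only place that knows the real storage technology.
interface SketchStoreKeeper {
    <T> void save(Class<T> type, T item);
    <T> List<T> load(Class<T> type, Predicate<T> filter);
}

// Hypothetical in-memory keeper backed by a plain list; handy for tests.
class InMemorySketchKeeper implements SketchStoreKeeper {
    private final List<Object> items = new ArrayList<>();

    @Override
    public <T> void save(Class<T> type, T item) {
        items.add(item);
    }

    @Override
    public <T> List<T> load(Class<T> type, Predicate<T> filter) {
        List<T> result = new ArrayList<>();
        for (Object o : items) {
            if (type.isInstance(o)) {
                T candidate = type.cast(o);
                if (filter.test(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }
}

// Hypothetical facade used by the application; it only delegates to the keeper.
class SketchDataStorage {
    private final SketchStoreKeeper keeper;

    SketchDataStorage(SketchStoreKeeper keeper) {
        this.keeper = keeper;
    }

    <T> void store(Class<T> type, T item) {
        keeper.save(type, item);
    }

    <T> List<T> find(Class<T> type, Predicate<T> filter) {
        return keeper.load(type, filter);
    }
}

class StorageSketchDemo {
    public static void main(String[] args) {
        // Swapping the storage technology means passing a different keeper here.
        SketchDataStorage storage = new SketchDataStorage(new InMemorySketchKeeper());
        storage.store(String.class, "Jack");
        storage.store(String.class, "Jill");
        // The predicate plays the role of a query object.
        System.out.println(storage.find(String.class, name -> !name.equals("Jack")));
    }
}
```

The application code in the demo never mentions lists, tables or files; only the constructor argument decides where the data really lives.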

Something about entities

What is an entity? An entity is something whose instances have identity. Identity is an intrinsic property that is different for each instance of the entity (i.e. each instance has its own identity). Think of a person's identity. Which property would you use as the identity? The right answer is: none. None of the properties that characterize a person (height, eye color, fingerprints, DNA) is the person's identity. Some, like DNA and fingerprints, are closely related to it, but they are only related; they are not the identity of the person itself. So, identity is something abstract. Mathematics is almost solely based on the concept of identity. When you write 2 = 3 it means "the identity of 2 is the same as the identity of 3". (This is false because identity is, by definition, different for each number.)

In Java all objects have identity. Each object is different from every other object. You can assert and compare identity with the == operator (= is the assignment operator, not the identity operator). However, this JVM-enforced identity is too strong for enterprise application purposes. You need an identity that you can define and compare. This means you must allow two or more objects that are not the same object to still represent the same instance of the entity, i.e. to share a common identity.

Throughout history, several pieces of information about an entity have been used to harness or simulate identity. Names and identification numbers are the most common. However, today, the best property you can use is a property with no meaning, i.e. a "virtual" property that you create specifically to decide the equality of identities. Normally an integer number suffices, even though sometimes you need a Universally Unique Identifier (UUID).

The MiddleHeaven Storage Toolbox uses the Identity type as an abstraction for the identity property. Implementations can then choose a suitable implementation of Identity according to the entity at hand.
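The difference between JVM identity and a defined identity can be sketched as follows. PersonSketch is a hypothetical class, and using a plain long as the surrogate key is an assumption; MiddleHeaven's own Identity type is not reproduced here.

```java
// Hypothetical entity: equality is decided by a meaningless surrogate key,
// not by JVM reference identity and not by descriptive attributes.
final class PersonSketch {
    private final long id;      // surrogate identity, assumed to be assigned by the storage
    private final String name;  // descriptive attribute, not part of the identity

    PersonSketch(long id, String name) {
        this.id = id;
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PersonSketch)) {
            return false;
        }
        return id == ((PersonSketch) other).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class IdentitySketchDemo {
    public static void main(String[] args) {
        PersonSketch a = new PersonSketch(42L, "Jack");
        PersonSketch b = new PersonSketch(42L, "Jack Smith");
        System.out.println(a == b);      // false: two distinct objects for the JVM
        System.out.println(a.equals(b)); // true: both represent the same entity instance
    }
}
```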

Model

Illustration 1: Storage Toolbox main types
The MiddleHeaven Storage Toolbox implements an agnostic domain store. This means it is not limited to DBMS queries but can be used with other technologies that can be used to create databases, e.g. XML.

The main types are the DataStorage, Criteria and Query interfaces. DataStorage allows interaction with the data storage. Criteria objects are implementations of the Query Object pattern and can be used to specify complex queries. The DataStorage converts them into Query objects that represent the query results. Nothing is said about the moment at which the query is actually performed on the data storage. By design the query should only be performed when one of the Query methods is invoked.
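Deferred execution of this kind can be sketched in a few lines. LazyQuerySketch is a hypothetical stand-in for a Query, not the toolbox's interface; the Supplier represents whatever the keeper would do to fetch the data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a Query: it keeps a recipe for fetching the data
// and only runs it when a result method is invoked.
class LazyQuerySketch<T> {
    private final Supplier<List<T>> fetch;

    LazyQuerySketch(Supplier<List<T>> fetch) {
        this.fetch = fetch;
    }

    // The storage is only hit here...
    List<T> all() {
        return fetch.get();
    }

    // ...and here, never at construction time.
    long count() {
        return fetch.get().size();
    }
}

class LazyQuerySketchDemo {
    public static void main(String[] args) {
        AtomicInteger storageHits = new AtomicInteger();
        LazyQuerySketch<String> query = new LazyQuerySketch<>(() -> {
            storageHits.incrementAndGet(); // simulates the trip to the storage
            return Arrays.asList("Jill", "John");
        });
        System.out.println(storageHits.get()); // 0: nothing executed yet
        System.out.println(query.all());       // [Jill, John]
        System.out.println(storageHits.get()); // 1: executed on demand
    }
}
```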

Extra information about how the query should behave can be passed as a second parameter. These hints inform the data storage how the data will be read, which allows for optimization using patterns like Fast Lane Reader or Flyweight.

Criteria objects can be built by the programmer, but the CriteriaBuilder class provides a fluent interface for this task. Using the CriteriaBuilder you also end up with code that closely resembles an SQL query, being easy to read and change, but with the advantage of strong typing. Criteria objects are created by invoking the search method on the CriteriaBuilder. You can use a static import to further simplify the query, as shown here:

Criteria someCriteria = search(Subject.class)
    .and("name").not().eq("Jack")
    .orderBy("name").asc()
    .all();

Code 1: CriteriaBuilder example

StoreKeeper

All DataStorage operations are actually delegated to a StoreKeeper. The StoreKeeper is responsible for actually changing or retrieving the data in the real data storage. MiddleHeaven currently implements the DataBaseStorageKeeper for access to DBMS via JDBC, an XMLStorageKeeper and an InMemoryStorageKeeper. At this point these storekeepers are being explored in order to obtain a model for the keepers that is agnostic enough across different data preservation APIs.

The goal of the DataBaseStorageKeeper is to be able to communicate with any DBMS. In order to accommodate several different dialects a DatabaseDialect type was introduced. It performs all the SQL/JDBC-related operations, including the generation of SQL statements. DatabaseDialect encapsulates the creation of commands for most of the standard SQL operations, including creating tables and reading and changing the database model. The DataBaseStorageKeeper obtains the commands from the dialect and then performs the operations in a DBMS-independent way. Out of the box, at the time of writing, the MiddleHeaven Storage Toolbox supports PostgreSQL 8.3, HSQL 1.8 and SQL Server 2005 dialects. DataBaseStorageKeeper uses a javax.sql.DataSource from which to obtain a java.sql.Connection.

None of the StorageKeepers performs any transactional control. However, if the underlying storage is not transactional, it may help to support integration with the Transaction Toolbox.

Storable and StorableEntityModel

StoreKeeper handles collections of Storables. Each object passed to the DataStorage is converted to a Storable before being passed to the keeper. Storables allow control of persistence properties of the object beyond the data provided by the object. The fields and values of the Storable are abstracted by a StorableFieldModel. The set of the StorableFieldModels of all fields forms the StorableEntityModel for the entity. StorableEntityModel is an agnostic abstraction of the entity from the keeper's point of view. For now a simple implementation based on the StorableDomainModel is provided. The rationale is that the persistence model for the entity is conceptually decoupled from the entity model itself. The goal is to provide the means to implement complex multiplicity relations between entities and the underlying data structures (tables, files, etc.), thus not relying on a simple one-to-one relation.

The model also acts as a factory for entity instances. This is essential for mapping and loading new data objects from the underlying data.

Under the hood

MiddleHeaven's Storage Toolbox is an agnostic implementation of the Domain Store pattern. Some APIs, like JPA and Hibernate, already support this pattern. Conceptually it is possible to implement a StoreKeeper that uses those APIs; however, they are extremely focused on SQL and DBMS, making direct use of concepts like Table, Primary Key and Automatic Key Generation. DataStorages also have key (identity) generation, but it is totally decoupled from the DBMS. For example, when the DBMS natively supports sequences, the keeper can return an encapsulation of that in a Sequence object. When it does not, the keeper can simulate the sequence by other means.

On the other hand, MiddleHeaven's Storage Toolbox lacks many of the optimizations performed by Hibernate or JPA, like generational caching. MiddleHeaven's Storage Toolbox is designed to be able to provide the same functionality by decorating (Decorator pattern) data storages with other data storages, enabling the application to use a cached version of any underlying data storage. This is still a work in progress.

The trade-off for this toolbox was not to depend on any other API, as none of those available is agnostic enough. This implies a greater effort to implement and test the toolbox, but provides greater flexibility.

Storable is an internal type used to control persistence state. Any object passed to the DataStorage is converted into a Storable. This is achieved by means of bytecode manipulation, which allows the original object to be mutated into an object that extends the original class and implements the Storable interface. This is one of the reasons why the store method returns an object of the same type. This object is not the same object that was passed to the method; it is now a managed object. If this object is further used and changed by the application, those alterations are recorded. When the object is passed to the store method again, the data storage can identify the changes and act accordingly. If, for example, no changes were made, the method will simply return. Any object returned by the Query interface is also a managed object, for the same reasons.

Limitations

As the MiddleHeaven Storage Toolbox is based on entity objects, there is less room to work with the storage's native data elements themselves. This means you are not supposed to work with tables, columns, rows or XML directly. For that kind of interaction you will use other technologies and toolboxes (some of which are used internally by the implementation).

A second limitation is that, for now, only a domain-driven data storage is provided. Even though the structure does not limit the design to this, it is the most useful implementation nowadays. Support for an ad hoc data storage could be implemented in the future according to demand.

Wednesday, November 19, 2008

Quantity and Measure: Money Toolbox

It's not possible to develop a business application without having to handle money amounts. Ultimately "business" means "making money", right? Some businesses (banks, for instance) have money as their primary product, and there it is even more important to control it correctly.

Money is - put simply - an amount (a number) of currency. Well, this is the hook we need to integrate money into the Quantity and Measure toolbox. Money is a measure of currency amount.

Money

Money is a well-known design pattern. Everybody knows that doubles and floats do not cut it when you want to handle money. Standard Java offers the possibility of using BigDecimal as a value object for representing money, but a simpler (more strongly typed) solution exists.
So we start by considering that money is a finite amount. It is not possible to represent 1/3 with money. It simply isn't. This shows us that all operations with money are integer operations.

Because all operations with money are integer operations we can use a long to store the amount. Now you may be thinking: "what about the cents?". Well, we multiply the real amount by a power of 10 and the result is an integer value. When the amount is asked for, we divide by that same factor before returning it. Very simple.

Using a long is very clever, but we need to multiply by a factor that converts fractional amounts into integer amounts. What power of 10 is that factor? Well, for the dollar, which uses two decimals for cents, it would be 10^2 = 100. And for the yen? Well, the yen doesn't have a fractional part! Yes, not all currencies have the same number of fraction digits. Some have none, and some have more than two. In standard Java we use getDefaultFractionDigits() from Currency to discover the fraction digits for each currency.
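The scaling trick can be sketched with standard Java only. MoneySketch is a hypothetical class written for this illustration and is not MiddleHeaven's Money; it assumes an ISO currency whose getDefaultFractionDigits() is not negative.

```java
import java.math.BigDecimal;
import java.util.Currency;

// Hypothetical value object: stores the amount as a long count of minor units.
final class MoneySketch {
    private final long minorUnits;
    private final Currency currency;

    private MoneySketch(long minorUnits, Currency currency) {
        this.minorUnits = minorUnits;
        this.currency = currency;
    }

    static MoneySketch of(String amount, String isoCode) {
        Currency currency = Currency.getInstance(isoCode);
        int digits = currency.getDefaultFractionDigits(); // 2 for USD, 0 for JPY
        // movePointRight multiplies by 10^digits; longValueExact rejects leftover fractions
        long minor = new BigDecimal(amount).movePointRight(digits).longValueExact();
        return new MoneySketch(minor, currency);
    }

    MoneySketch plus(MoneySketch other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Different currencies");
        }
        return new MoneySketch(Math.addExact(minorUnits, other.minorUnits), currency);
    }

    long getMinorUnits() {
        return minorUnits;
    }

    BigDecimal getAmount() {
        // Divide by the same power of ten when the amount is asked for
        return BigDecimal.valueOf(minorUnits, currency.getDefaultFractionDigits());
    }
}

class MoneySketchDemo {
    public static void main(String[] args) {
        MoneySketch total = MoneySketch.of("12.34", "USD").plus(MoneySketch.of("0.66", "USD"));
        System.out.println(total.getMinorUnits());                        // 1300
        System.out.println(total.getAmount());                            // 13.00
        System.out.println(MoneySketch.of("500", "JPY").getMinorUnits()); // 500
    }
}
```

Because the arithmetic happens on a long, sums never pick up binary rounding noise, and an amount with more decimals than the currency allows is rejected instead of being silently rounded.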

Currency

Currency is not the unit for money amounts in the way the meter is the unit of length. It's a unit. There are, in fact, several possible units, just as there are several for length. The main difference is that for physical quantities some units are used more than others. The SI uses the meter, and anyone using the SI uses the same unit. For money this is not possible: there is no standard currency. So each country adopts its own currency, but because countries trade with each other they need to convert values between the different currencies.

Currency conversion is not, in essence, different from length unit conversion (say meter to mile), but in practice all other units have constant conversion rates; currency does not.
Information about current and historical currency rates is an asset, and so many companies sell access to this information (nowadays mostly over web services). So, another basic operation in the money toolbox is currency conversion, i.e. converting money with an amount expressed in one currency to money with an amount expressed in another.

In standard Java we use getDefaultFractionDigits() from Currency to discover the fraction digits for each currency, but MiddleHeaven's Currency is a Unit. How can we handle this? MiddleHeaven resolves this by providing Currency as an abstract class. This way we can implement other currencies. Some applications (especially finance-related ones) may establish different sets of properties for their currency objects. MiddleHeaven provides ISOCurrency, which is in fact based on the standard Java Currency class.

Currencies are further related to countries, cultures and treaties (like the euro) and change over time. A country can have more than one official currency and can have more than one currency over the years. So handling currencies means tracking all these changes and possibilities.

Model

The Money class represents money amounts and permits arithmetic operations. Some operations are special, like division, and there is no operation to multiply Money by Money (there is no Money squared). You can, however, multiply it by any real number. Money is an Amount, which is a Quantity that can perform group operations (addition and subtraction). Currency is the measurable. ISOCurrency is the default implementation of Currency.

The MoneyConverter is a simple interface to convert money between currencies that is used together with MoneyConverterService. We will cover services later on and we will be back to this toolbox for examples.

Use

This example demonstrates a unit test that performs operations with money. Money objects need a currency to be specified, and by default we can specify it by means of a simple string with the currency's ISO code.
Money a = Money.money(100, "USD");
Money b = Money.money(230, "USD");
Money t = Money.money(330, "USD");

Money c = Money.money(330, "EUR");

Money m = a.plus(b);

assertEquals(t, m);

// money values are equal only if both amount and currency are equal
assertFalse(t.equals(c));

// can only add money of the same currency
// will throw IncompatibleUnitsException
t.plus(c);

// "EU" is not an ISO code
// will throw IllegalArgumentException
Money.money(330, "EU");

// divide by a real
Real n = Real.valueOf(3);
Money y = t.over(n);
assertEquals(Money.money(110, "USD"), y);

Code 1: Example of use for Money

Tuesday, November 11, 2008

Let us know

Maybe it is still early for you to grasp the underlying concept of MiddleHeaven and why I'm so excited to be developing it; however, I feel it's time for some feedback. So feel free to leave a comment and speak your mind.

I know it may look like MiddleHeaven is copying other frameworks out there, but the truth is that MiddleHeaven development starts from scientifically based, independent, worldwide accepted concepts. It arrives at models that others have already arrived at by following the same path. The difference is that MiddleHeaven has the goal of being a universal hub of APIs that your application and development team can rely on, and not just of providing simple, standalone, isolated APIs that do not integrate with each other.

It may look like an API nightmare, and from the point of view of developing it, it is, but MiddleHeaven puts great effort into having a simple, fluent interface for the developers who use it, drastically reducing configuration and programming time, especially for those everyday activities that all business-oriented systems have, or should have (reports, charts, security, graphical interface, data models and data storage, ...). It's our hell, not yours.

So, again, feel free to leave a comment with any suggestions, expectations, critiques, etc. you have for the MiddleHeaven project.

Friday, November 7, 2008

Quantity and Measure: Time Toolbox

The Time Toolbox is an expansion of the Quantity and Measure Toolbox in order to extend the concept of quantity to time periods and incorporate time related concepts and operations. This toolbox contains a data type API and the foundations for many time related operations that other toolboxes will leverage.

MiddleHeaven understands time as a continuum, even though most operations are discrete to the millisecond. This continuum is commonly known as the time-line. The points on the line are called TimePoints.
Clocks do not measure time, they measure elapsed time, i.e. the "length" in the time-line between two time points. ElapsedTime is, thus, the fundamental quantity for the Time Toolbox. The SI unit is the second.

Reference Frame

Although time is really measured in intervals, in practice we need to refer to and distinguish individual time points. Because no time point is physically distinguishable from another, we are required to choose one specific point on the line as the reference and measure the time elapsed since then. A time point is then identified by the time elapsed since the reference point. The reference time point chosen defines an Epoch.

Java uses the Unix epoch (00:00:00 UTC of January 1st 1970) as its reference and keeps track of time in milliseconds since then. In Java it is possible to obtain the number of milliseconds elapsed since the epoch by invoking System.currentTimeMillis().

Current Time and Clocks

Obtaining the current time directly with System.currentTimeMillis() is a problem. When you need to test the application (e.g. unit test some class) you cannot wait until a certain time to run the test. You need to control the concept of "current time".

MiddleHeaven introduces the concept of Clock. A clock is an object capable of specifying the "current time".

A Clock has three main properties:

  • Current Time - The time point that is considered "now". The current time point itself.
  • TimeZone - Not all clocks in every city in the world show the same time. The differences are created by geographic distance, position relative to the apparent movement of the Sun, and also political or economic conventions like daylight saving policies. Times on different clocks can only be related if a time zone is attached to each one and the different time zones are related to each other.
  • Cadence - The rate at which time "passes". Time in a "normal" clock elapses at a rate of 1 second per second, meaning that for each second of real time elapsed the clock changes the current time it presents by the same amount. Defective clocks have different cadences and that is why they slow down or speed up. Providing clocks with different cadences makes it possible to simulate time events faster, or slower, without having to wait for the real time period to elapse. A task triggered every hour can thus be triggered every minute, second or day instead.

Illustration 1: Clocks
MiddleHeaven comes with a variety of clocks. MachineClock is the one to use if you want to mimic System.currentTimeMillis(). The time zone is the default one and the cadence is 1. StaticClock is a clock with cadence 0 (meaning the time does not change). You can set any time point and time zone for it. Very useful for tests. SpeedyClock uses the real cadence of another underlying clock (Decorator pattern) and multiplies it by a configurable cadence factor. With SpeedyClock you can make time run faster, or slower. SNTPUniversalTimeClock is an experimental clock implementation whose purpose is to always be in sync with an external time server. This can be very useful if your system is distributed. Different parts of the application can each have their own clock synchronized with a central time server so that all timestamps for the application's events are meaningful and correlated.

AlarmClock is a special type of Clock that raises an event for a registered ClockTickListener on a certain Schedule. The mechanics of the clock can be obtained from any other clock implementation (Decorator pattern); thus, you can run the AlarmClock normally by embedding a MachineClock, or faster by embedding a SpeedyClock. AlarmClock is used for work scheduling in the Work Toolbox, and this mechanism allows for testing in "not-real time" so you can test several work loads and cycles in a minimal amount of real test time.
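The relation between a static clock, a machine clock and a cadence-multiplying decorator can be sketched like this. The types below are hypothetical simplifications written for this post (time zones are left out and time is a plain millisecond count); they are not the MiddleHeaven classes of similar name.

```java
// Hypothetical clock contract: the application asks the clock, never the system.
interface ClockSketch {
    long nowMillis();
}

// Mimics System.currentTimeMillis(): cadence 1.
class MachineClockSketch implements ClockSketch {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

// Cadence 0: time does not change unless the test sets it explicitly.
class StaticClockSketch implements ClockSketch {
    private long millis;

    StaticClockSketch(long millis) {
        this.millis = millis;
    }

    void set(long millis) {
        this.millis = millis;
    }

    @Override
    public long nowMillis() {
        return millis;
    }
}

// Decorator: multiplies the time elapsed on the underlying clock by a cadence factor.
class SpeedyClockSketch implements ClockSketch {
    private final ClockSketch underlying;
    private final double cadence;
    private final long start;

    SpeedyClockSketch(ClockSketch underlying, double cadence) {
        this.underlying = underlying;
        this.cadence = cadence;
        this.start = underlying.nowMillis(); // both clocks agree at creation time
    }

    @Override
    public long nowMillis() {
        long elapsed = underlying.nowMillis() - start;
        return start + Math.round(elapsed * cadence);
    }
}

class ClockSketchDemo {
    public static void main(String[] args) {
        StaticClockSketch base = new StaticClockSketch(1_000L);
        ClockSketch speedy = new SpeedyClockSketch(base, 60.0);
        base.set(2_000L); // one "real" second passes on the underlying clock
        System.out.println(speedy.nowMillis()); // 61000: one minute passed on the speedy clock
    }
}
```

A test that needs an hourly task to fire can wrap a MachineClockSketch in a SpeedyClockSketch with a large cadence instead of waiting for the real hour to elapse.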

Chronology

Once we have a clock and viable means to define "current time" we need to further define the reference time frame. This is because different cultures choose different specific time points to start their evaluation of elapsed time and different concepts in order to group those periods. The chosen reference time points are traditionally related to cultural events and together define the order of subsequent events in the time-line.

Cultures, thus, define certain Chronologies. A Chronology is a sequence of reference events proper to a culture (or set of cultures) that permits the members of that culture to determine the relative position of events in the time-line by mapping them to an elapsed time from a specific, predefined time point.

Different cultures also devise different models to evaluate, count and refer to time points, along with measuring time. Almost every culture has the concept of day, month and year, but their definitions are not the same for everyone. Additionally, political events in the history of the cultures create gaps between subsequent models.

The Chronology object encapsulates all the calculation logic needed by the Time Toolbox. Different chronologies can be implemented according to different cultures and rules. Also, chronologies can be implemented according to different technologies and/or underlying APIs.

Calendar and Ephemeris

Different sub-cultures also define their own specific categories and classifications of days and groups of days (e.g. companies can define working days according to their work schedule). To keep track of all this information the concept of Calendar was created. A calendar is not just a grouping of days but a reminder of events for the different days, the ephemeris.

MiddleHeaven introduces the EphemerisModel to model this daily use of calendars. The EphemerisModel can be used to define the ephemeris for a given day. The EphemerisModel depends on an underlying Chronology for computations and culture related modifications.
MiddleHeaven includes the EasterBasedCalculatedEphemerisModel, which calculates several holidays related to Easter. The day of Easter itself is calculated, by an algorithm, from the Gregorian calendar year. The other holidays are related to it by a fixed number of days.
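As an illustration of such a calculation, the sketch below uses the well-known anonymous Gregorian algorithm (often credited to Meeus, Jones and Butcher). The post does not say which algorithm MiddleHeaven uses, so this is an assumption, as is the class name EasterSketch. It relies on java.time, which is newer than the Java version this post was written for.

```java
import java.time.LocalDate;

// Hypothetical helper: Easter Sunday from the Gregorian year, plus two holidays
// that sit at a fixed distance from it.
class EasterSketch {

    // Anonymous Gregorian algorithm, integer arithmetic only.
    static LocalDate easterSunday(int year) {
        int a = year % 19;
        int b = year / 100;
        int c = year % 100;
        int d = b / 4;
        int e = b % 4;
        int f = (b + 8) / 25;
        int g = (b - f + 1) / 3;
        int h = (19 * a + b - d - g + 15) % 30;
        int i = c / 4;
        int k = c % 4;
        int l = (32 + 2 * e + 2 * i - h - k) % 7;
        int m = (a + 11 * h + 22 * l) / 451;
        int month = (h + l - 7 * m + 114) / 31;
        int day = ((h + l - 7 * m + 114) % 31) + 1;
        return LocalDate.of(year, month, day);
    }

    // Good Friday falls two days before Easter Sunday.
    static LocalDate goodFriday(int year) {
        return easterSunday(year).minusDays(2);
    }

    // Corpus Christi falls sixty days after Easter Sunday.
    static LocalDate corpusChristi(int year) {
        return easterSunday(year).plusDays(60);
    }
}

class EasterSketchDemo {
    public static void main(String[] args) {
        System.out.println(EasterSketch.easterSunday(2008));  // 2008-03-23
        System.out.println(EasterSketch.goodFriday(2008));    // 2008-03-21
        System.out.println(EasterSketch.corpusChristi(2008)); // 2008-05-22
    }
}
```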

EphemerisModel can be used for any calendar-related purpose, as it can be implemented to interface with other calendar systems. Remember that ephemeris is only a fancy name for "event that occurs on a specific date". Any business or personal appointment fits this category.

Model

The next diagram shows the Time Toolbox types and the relation between them

Illustration 2: Time Toolbox
The blue area is the border for the relation with the Quantity and Measure Toolbox. TimeInterval makes the transition to the Time Toolbox data type core. It inherits the generic Interval type implementation and relies on the fact that TimePoint is a Comparable (it has a natural order given by the time-line). The clocks give meaning to the concept of "current time" and allow us to really assess the passage of time. Clocks are related to time zones, which are also related to cultures and geographic position. Chronology provides the cultural and technological mechanics to count time and correlate it between cultures and eras. The EphemerisModel keeps track of those day-by-day events like Easter, Christmas, Tree Day, or your appointment with your dentist.

A time model is only complete once an implementation has been selected for each concept. MiddleHeaven supports this by introducing the TimeContext type. A time context is formed by:

  • A Clock - to acknowledge the current time, and its rate of change
  • A TimeZoneTable - to correlate clocks around the world
  • A Chronology - to give meaning to concepts like day and month in a culture-dependent way and perform calculations between them
  • An EphemerisModel - to keep track of the events relevant in day-to-day life.
TimeContext acts as a registry for itself so you can define, for each application, which time context is relevant.

Use

Here is an extract from the JUnit test that exemplifies how to use the Time Toolbox API data types and EphemerisModel. EphemerisModel can be used for holiday/working day arithmetic. The code tests that 5 working days after 2008-5-28 is 2008-6-4. At the end it tests that 2008-6-6 is the 5th working day of June 2008.

EphemerisModel model = new EasterBasedCalculatedEphemerisModel();

DateHolder start = CalendarDate.date(2008, 5, 28);
DateHolder end = CalendarDate.date(2008, 6, 4);

assertEquals(end, model.addWorkingDays(5, start));
assertEquals(start, model.subtractWorkingDays(5, end));

start = CalendarDate.date(2008, 6, 2);
end = CalendarDate.date(2008, 6, 9);

assertEquals(end, model.addWorkingDays(5, start));
assertEquals(start, model.subtractWorkingDays(5, end));

assertEquals(CalendarDate.date(2008, 6, 6),
        model.getOrdinalWorkingDayOfMonth(Month.ofYear(2008, 6), 5));

Code 1: use example
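For readers who want to see what such working day arithmetic involves, here is a rough sketch built on java.time rather than on the toolbox. WorkingDaysSketch and its methods are hypothetical, and treating Saturday and Sunday as the only non-working days besides the supplied holidays is an assumption.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Collections;
import java.util.Set;

// Hypothetical working-day calculator: weekends and the given holidays do not count.
class WorkingDaysSketch {
    private final Set<LocalDate> holidays;

    WorkingDaysSketch(Set<LocalDate> holidays) {
        this.holidays = holidays;
    }

    boolean isWorkingDay(LocalDate date) {
        DayOfWeek day = date.getDayOfWeek();
        return day != DayOfWeek.SATURDAY
                && day != DayOfWeek.SUNDAY
                && !holidays.contains(date);
    }

    // Walks forward one day at a time, counting only working days.
    LocalDate addWorkingDays(int count, LocalDate start) {
        LocalDate date = start;
        int remaining = count;
        while (remaining > 0) {
            date = date.plusDays(1);
            if (isWorkingDay(date)) {
                remaining--;
            }
        }
        return date;
    }

    // Returns the n-th working day of the given month (1-based).
    LocalDate ordinalWorkingDayOfMonth(int year, int month, int ordinal) {
        LocalDate date = LocalDate.of(year, month, 1);
        int seen = 0;
        while (date.getMonthValue() == month) {
            if (isWorkingDay(date) && ++seen == ordinal) {
                return date;
            }
            date = date.plusDays(1);
        }
        throw new IllegalArgumentException("Month has fewer than " + ordinal + " working days");
    }
}

class WorkingDaysSketchDemo {
    public static void main(String[] args) {
        WorkingDaysSketch model = new WorkingDaysSketch(Collections.<LocalDate>emptySet());
        System.out.println(model.addWorkingDays(5, LocalDate.of(2008, 5, 28))); // 2008-06-04
        System.out.println(model.ordinalWorkingDayOfMonth(2008, 6, 5));         // 2008-06-06
    }
}
```

With no holidays in the range, the sketch reproduces the dates asserted in the test above; a real model would feed the holiday set from something like the Easter calculation.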

Under the Hood

At this point you may be asking: "Is the Time Toolbox a clone of JSR 310?" The answer is no. The MiddleHeaven Time Toolbox was developed with several goals of unification in mind, by expanding the Quantity and Measure Toolbox's concept of quantity into the realm of time. Joda Time and the Time and Money API were the main inspirations (as they were for JSR 310 as well), but an effort was made to dry the data type API and some concepts were realigned (Joda's Duration concept is named Period in MiddleHeaven, and Joda's Period is named Duration; this alteration was made for the sake of coherence with the rest of MiddleHeaven, as periods are physically measurable and durations are just conventions). MiddleHeaven embraced a larger scope from the beginning. The MiddleHeaven Time Toolbox aims to be independent of any given time API implementation, and thus can be implemented with the standard Java Date and Calendar API, the new JSR 310 API, Joda Time, or any other future API.

Also, JSR 310 and Joda Time do not provide means to model a time context or ephemeris, so an abstraction would be necessary anyway. MiddleHeaven simply opts to avoid the trade-off by not making it at all. MiddleHeaven currently only defines a Chronology based on the GregorianCalendar from standard Java; however, as soon as Java 1.7 is released a new chronology based on JSR 310 will be added.

Work scheduling can be achieved with an API such as Quartz, or with the standard Java or Java EE timers; however, with those APIs only the machine clock can be used for scheduling. This is very cumbersome for testing and development, as no concept of an "interchangeable clock" is offered. MiddleHeaven embraces the real, practical (you may say pragmatic) issue of testing and development, favouring testability over simple execution. We will come back to this issue when we discuss the Work Toolbox later on.
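
To make the clock issue concrete, here is a minimal sketch in plain Java of what an interchangeable clock buys us in a test. The names TimeSource, MachineTimeSource, FixedTimeSource and Reminder are hypothetical and exist only for this illustration; they are not the MiddleHeaven API.

```java
// Hypothetical names for illustration only; not the MiddleHeaven API.
interface TimeSource {
    long currentMillis();
}

// Production implementation: delegates to the machine clock.
final class MachineTimeSource implements TimeSource {
    public long currentMillis() {
        return System.currentTimeMillis();
    }
}

// Test implementation: time only moves when the test says so.
final class FixedTimeSource implements TimeSource {
    private long now;

    FixedTimeSource(long start) {
        this.now = start;
    }

    public long currentMillis() {
        return now;
    }

    void advance(long millis) {
        now += millis;
    }
}

// A piece of scheduled work that asks the injected clock, never the machine.
final class Reminder {
    private final TimeSource clock;
    private final long dueAt;

    Reminder(TimeSource clock, long delayMillis) {
        this.clock = clock;
        this.dueAt = clock.currentMillis() + delayMillis;
    }

    boolean isDue() {
        return clock.currentMillis() >= dueAt;
    }
}
```

A test can create a Reminder over a FixedTimeSource, advance the clock by hand and check isDue() without ever sleeping, which is exactly what a scheduler tied to the machine clock does not allow.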

The goal of EphemerisModel is to provide a simple, extensible hook that any last-mile developer can implement with the real domain rules that matter to the application. Normally these will relate to holidays and working days, as many business features depend on this information, but appointment-related software can also take advantage of this component.
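
As a rough idea of what such a hook could contain, the sketch below implements a working-day rule with weekends and a configurable set of holidays. WorkingDayRules is a hypothetical class written for this post, and it uses java.time.LocalDate (the API that JSR 310 later became) instead of the MiddleHeaven date types, so it is an assumption-laden illustration and not the real EphemerisModel contract.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Set;

// Hypothetical sketch; the real EphemerisModel contract may differ.
final class WorkingDayRules {
    private final Set<LocalDate> holidays;

    WorkingDayRules(Set<LocalDate> holidays) {
        this.holidays = holidays;
    }

    // A working day is any day that is neither a weekend day nor a holiday.
    boolean isWorkingDay(LocalDate date) {
        DayOfWeek day = date.getDayOfWeek();
        boolean weekend = day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
        return !weekend && !holidays.contains(date);
    }

    // Moves forward one calendar day at a time, counting only working days.
    LocalDate addWorkingDays(int days, LocalDate start) {
        LocalDate current = start;
        int remaining = days;
        while (remaining > 0) {
            current = current.plusDays(1);
            if (isWorkingDay(current)) {
                remaining--;
            }
        }
        return current;
    }
}
```

With no holidays configured, adding 5 working days to 2008-05-28 skips one weekend, the same scenario exercised in the use example above. Registering a holiday simply pushes the result further.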

Tuesday, November 4, 2008

Quantity and Measure Toolbox

The Quantity and Measure Toolbox is conceptually a simple toolbox, but its implementation is very complex and cross-cutting. The goal is simple: to provide a common interface for all quantities of different mathematical subtypes. Why? Because business applications deal with quantities all the time (e.g. periods of time, dates, money amounts, lengths, weights...). More sophisticated, narrower-spectrum applications may also need to work with coordinates or mathematical structures like vectors and matrices. Engineering-oriented applications will need to work with different units for the same quantity and make calculations with them.
All these quantities are expressed in units which, from an internationalization/localization point of view, vary and must be converted from one to another.

The Quantity and Measure Toolbox is an essential toolbox in MiddleHeaven, as it provides the basic data types, units and conversions. We may think of it as an upgraded version of the java.lang basic types.

Quantity, Units and Measurables

When we measure something, we compare the size of a quantity with a reference size of the same quantity. This reference is called the standard, and its size is called the unit: the standard size for the quantity defines a unit of measure. So, a measure is always a ratio between the size of the quantity you are interested in measuring and the size of the standard (the unit).

In physics, if you can measure a certain thing, then you can come up with a unit and a number that relates the size of the measurable to the size of the unit. Not all things that exist are measurable, and not all units have the same practical value. The International System of Units (abbreviated SI) compiles the essential measurables and their respective units. For example, the second is the unit of time, time is the quantity and time period is the measurable. The kelvin is the unit of (thermodynamic) temperature, temperature is the quantity and temperature interval is the measurable.
For simplicity, MiddleHeaven uses quantity for the association of a unit with the ratio to that unit, and measurable for the underlying physical property the quantity quantifies. Quantities are thus implementations of Quantity. Quantity defines only a getUnit method, which returns a unit of the same measurable as the quantity. So if the quantity is a temperature, the unit is also a unit of temperature.

Quantities can be operated upon with operations like sum or multiplication; however, some rules apply. Any operation applied to the numeric part of the quantity must also be applied to its unit part. This means that quantities cannot be summed if their units are different, and that multiplication applies to the units themselves (e.g. 2 m (meter) × 3 m = 6 m² (square meter)). Note that the unit is the same (meter) and only its dimension was incremented.
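
The two rules can be sketched in a few lines of Java. SimpleMeasure is a hypothetical class invented for this example (it is not one of the MiddleHeaven types) and, for brevity, it models a unit as a symbol plus an integer dimension and only multiplies measures that share the same base unit.

```java
// Hypothetical sketch of the rules above; not the MiddleHeaven types.
final class SimpleMeasure {
    final double amount;
    final String unit;    // base unit symbol, e.g. "m"
    final int dimension;  // exponent of the unit, e.g. 2 for square meter

    SimpleMeasure(double amount, String unit, int dimension) {
        this.amount = amount;
        this.unit = unit;
        this.dimension = dimension;
    }

    // Sum is only defined for identical units and dimensions.
    SimpleMeasure plus(SimpleMeasure other) {
        if (!unit.equals(other.unit) || dimension != other.dimension) {
            throw new IllegalArgumentException(
                "Incompatible units: " + this + " and " + other);
        }
        return new SimpleMeasure(amount + other.amount, unit, dimension);
    }

    // Multiplication multiplies the numbers and adds the unit exponents.
    // Simplification: this sketch only handles a single shared base unit.
    SimpleMeasure times(SimpleMeasure other) {
        if (!unit.equals(other.unit)) {
            throw new IllegalArgumentException(
                "This sketch supports a single base unit only");
        }
        return new SimpleMeasure(
            amount * other.amount, unit, dimension + other.dimension);
    }

    @Override
    public String toString() {
        return amount + " " + unit + (dimension == 1 ? "" : "^" + dimension);
    }
}
```

Multiplying 2 m by 3 m yields a measure with amount 6 and dimension 2, while trying to add that area to a plain length is rejected with an exception.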

Model

The Quantity and Measure Toolbox type model is quite vast. It would be confusing to show the whole model at once, so we are going to break it into smaller pieces. Each of these pieces concerns a specific subtype of quantity:

  • Structures - for working with vectors, matrices, rings, fields, and other types of mathematical structures
  • Number - for working with dimensionless quantities like integers, reals and complexes. Operations and relations are based on the real mathematical structures underlying those quantities
  • Measure - for working with measures, including error propagation operations (it leverages number operations together with unit and dimension operations)
  • Money - for working with money and currency-related amounts. Introduces Currency as a special unit and provides specific integer-based operations (rather than decimal operations) to handle money amounts without loss.
  • Time - for working with periods, calendars, days, dates, weekends and other time-related quantities. Also includes other objects and concepts like clocks, time zones and ephemeris
  • Coordinates - for working with Coordinate and Coordinate Reference Systems and conversions.
  • Utilities - several generic utility classes like Interval and Range that can be used in different contexts and are completely compatible with the other Quantity and Measure Toolbox types.

One of MiddleHeaven's goals is to embrace and implement common application design patterns and provide them "out-of-the-box". Money is one such example. Money is supported from the beginning and integrated within the larger concept of the Quantity and Measure Toolbox, instead of being a poor man's version. As business commonly implies money operations, we hope Money will simplify complex day-to-day money-based operations.
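
To illustrate why integer-based operations matter for money, here is a small sketch that keeps amounts in minor units (cents) and splits an amount without losing a single cent. CentsAmount and its methods are hypothetical names used only in this example; the real MiddleHeaven Money type may look quite different.

```java
// Hypothetical sketch; the real MiddleHeaven Money type may differ.
final class CentsAmount {
    final long cents;       // amount in minor units, so no rounding drift
    final String currency;  // currency code such as "EUR"

    CentsAmount(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    // Amounts can only be added when they share the same currency (unit).
    CentsAmount plus(CentsAmount other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Different currencies");
        }
        return new CentsAmount(Math.addExact(cents, other.cents), currency);
    }

    // Splits the amount into parts whose sum is exactly the original.
    // Leftover cents are handed out one by one to the first parts.
    CentsAmount[] split(int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        long base = Math.floorDiv(cents, (long) parts);
        long leftover = Math.floorMod(cents, (long) parts);
        CentsAmount[] result = new CentsAmount[parts];
        for (int i = 0; i < parts; i++) {
            result[i] = new CentsAmount(base + (i < leftover ? 1 : 0), currency);
        }
        return result;
    }
}
```

Splitting 100.00 into three parts gives 33.34, 33.33 and 33.33, which add up to the original amount. A naive decimal division would give 33.33 three times and silently lose one cent.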

Under the Hood

All the concepts in the Quantity and Measure Toolbox may be familiar to you if you have some knowledge of science - especially physics - or if you have crossed paths with the JScience project. Some may wonder why the Quantity and Measure Toolbox is so close to JScience and yet MiddleHeaven is not implemented on top of it. Well, this is a twofold trade-off. First, JScience is based on the use of double for the numeric type, not objects. This is a problem, as encapsulation is impossible: you can put a double inside any object, but not an object inside a double. Second, JScience has goals concerning speed and real-time computing that MiddleHeaven does not have. However, MiddleHeaven's Quantity and Measure Toolbox relies on factories (NumberFactory) to provide the real implementation of the data types, specifically the numeric types. So it is possible to conceive an engine that uses JScience under the hood for performance reasons. If you really want it, you are free to implement it and contribute it to the project. This is true for any extension you can think of for MiddleHeaven, and it is the main reason the project is open source.

As a development insight: MiddleHeaven is a one-man effort to encapsulate good third-party APIs, bind them into a single simple API and promote design patterns and implementations that favour extensibility and decoupling. It is really a titanic job, so the project is bound to be very "curved around itself". The design was carefully thought out and analysed - and sometimes even redone - to cope with these simple objectives.

Those with their eyes on Java 1.7 and JSR 310 (the Date and Time API) may wonder why MiddleHeaven has its own time API. When MiddleHeaven development started no such JSR existed; only Joda Time, Time and Money and other APIs existed. On the other hand, the Quantity and Measure Toolbox starts from a single concept, quantity, so that all quantities and units can relate to each other. Since time is a very important quantity in business systems (along with money), it was inconceivable to implement the API without support for time operations, especially because no such support exists in standard Java. So, we needed integration and abstraction.
Because JSR 310 was not designed with quantity unification in mind, it will never be directly compatible with MiddleHeaven. An abstraction on top of JSR 310 will always be necessary in order to use its functionality from MiddleHeaven. Thus the choice was made to create a simple, elegant, practical interface for time manipulation, with all calculations delegated to classes that can be changed and implemented for different underlying technologies.
The resulting façade API is very similar to JSR 310, and this is no coincidence: both the MiddleHeaven Time Toolbox and the JSR 310 models are brewed from the same sources of inspiration (like the Joda Time API and the Time and Money API) and baked with the same well-established international standards and other scientific knowledge. The result is bound to be similar. Also, if MiddleHeaven is to be used in daily routines, its syntax and types must be simple and mnemonic.
MiddleHeaven is not designed to supplant third-party APIs but to flourish upon them and fill the gaps. JSR 310 is a great example of this. Until JSR 310 and Java 1.7 arrive, MiddleHeaven can leverage the standard Java Date and Calendar classes to implement all the time calculations, but when JSR 310 comes out, MiddleHeaven can implement a new time engine based upon the new constructs. In the future this can be true for any other API not currently foreseen.
It is important to remember that reuse, API stability and independence from third-party APIs are the main goals of MiddleHeaven. We will see this concept at work in many other toolboxes as we visit them (e.g. the Report Toolbox and the Chart Toolbox). Keep in mind that MiddleHeaven is designed to resist time. Leveraging third-party APIs behind a clean, simple façade API is the simplest, most elegant, object-oriented way to do this. Encapsulation allows the underlying engines to be changed completely without altering the API your application uses. So if, some time in the future, a better API appears, your application can take advantage of it without changing any code, just by upgrading the MiddleHeaven jar and its dependencies.

A Joda Time based implementation was also possible, but because JSR 310 is so near, no effort was made to support Joda Time.

Monday, November 3, 2008

Managed File Toolbox

I/O operations are often considered boundary, less important operations; however, they are among the most frequently used, in an almost omnipresent way, in all applications and in more than one flavour. Web applications, for example, must often cope with file download/upload features in which files are copied to a local filesystem. Commonly these features are introduced with the support of a third-party API, and the real work boils down to stream plumbing.

The Managed File Toolbox is an essential MiddleHeaven toolbox. Many other toolboxes use Managed File Toolbox services to handle file-like data and streams in a transparent way. We will see the details when we talk about each one of them. Here we will look at the Managed File Toolbox's simple core concepts and types.

Java lets us manipulate the OS file system as we see fit in a very platform-independent way, especially by means of the java.io.File class. However, Jar/Zip files, which may also contain files and file groups as if they were folders, are treated very differently. If you think of a jar/zip file as a kind of virtual file, Java does not support this idea directly. This idea is neither new nor far-fetched (e.g. Windows XP and later already support zip files under this virtual file concept).

The Managed File Toolbox allows for total abstraction of the underlying file system and for the creation of complex virtual filesystems based upon any other file-like or file-handling technology, like FTP, HTTP, Zip and, of course, the OS native file system.

File vs Managed File

A Managed File is a node in the file system, similar to java.io.File. The main difference is in how an InputStream or OutputStream is obtained from the file. Java IO uses the Adapter Pattern approach to create streams. MiddleHeaven's Managed File uses a simple Factory Method that returns a ManagedFileContent.

A Managed File can be copied to another file or folder simply by invoking copyTo. This can be a boring task in standard Java, as no copy utility is provided and we end up writing it over and over. Even then, we must choose whether or not to use the NIO channels. MiddleHeaven tries to use the most efficient copy method according to the real underlying file location and implementation.
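
For reference, this is roughly the kind of copy routine one ends up hand-writing with NIO channels in plain Java. FileCopySketch is a hypothetical helper name; the calls themselves (FileInputStream.getChannel, FileChannel.size and FileChannel.transferTo) are standard java.io and java.nio APIs. The try-with-resources form assumes Java 7 or later; on older versions the streams would be closed in a finally block.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical helper illustrating a hand-written channel-based copy.
final class FileCopySketch {

    // Copies source to target, overwriting target if it already exists.
    static void copy(File source, File target) throws IOException {
        try (FileInputStream in = new FileInputStream(source);
             FileOutputStream out = new FileOutputStream(target);
             FileChannel from = in.getChannel();
             FileChannel to = out.getChannel()) {
            long position = 0;
            long size = from.size();
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                position += from.transferTo(position, size - position, to);
            }
        }
    }
}
```

This is the plumbing that a single copyTo call is meant to hide. Since Java 7 the standard library also offers java.nio.file.Files.copy for plain files.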

The root repositories for files and folders can be obtained by abstracting any supported URI or java.io.File. This task is normally done in bootstrap or configuration code, where the application "roots" are configured. We will come back to this when we visit the Bootstrap and Service toolboxes.

Model

The figure below illustrates the basic types for the Managed File Toolbox in MiddleHeaven.

Illustration 1: Managed file main types
ManagedFile is the main type. From it, it is possible to obtain child files or folders, and to access its parent file or its ManagedFileContent, which in turn provides the InputStream and OutputStream to the data itself.

Almost all operations throw ManagedIOException, a runtime exception counterpart of IOException. Exception handling best practices dictate that all exceptions thrown by operations involving a subsystem or communication with external resources (like files, connections, etc.) must be checked exceptions. Here MiddleHeaven makes a trade-off. Normally such an exception cannot be resolved by the application itself; it is common to need someone to reconfigure some port, address, firewall, permission, etc. So there is not much point in checking all exceptions where we have no means to resolve them. However, when we do have the means to solve them or work around them, we can still catch the runtime exception.
MiddleHeaven incorporates the Exception Handler pattern in order to translate common I/O exceptions into more meaningful exceptions, according to best practices (e.g. MiddleHeaven provides FileNotFoundManagedException, which receives and holds the path of the file that was not found so that it can be read afterwards for finer exception handling).
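
The translation idea can be sketched as follows. The class names UncheckedFileException, MissingFileException and StreamOpener are hypothetical stand-ins chosen for this example; they only mirror the roles described above and are not the actual MiddleHeaven classes or signatures.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical names; they only mirror the idea described above.
class UncheckedFileException extends RuntimeException {
    UncheckedFileException(String message, IOException cause) {
        super(message, cause);
    }
}

// A more meaningful exception that remembers which path was missing.
final class MissingFileException extends UncheckedFileException {
    private final String path;

    MissingFileException(String path, FileNotFoundException cause) {
        super("File not found: " + path, cause);
        this.path = path;
    }

    String getPath() {
        return path;
    }
}

final class StreamOpener {
    // Translates the checked exception into the runtime counterpart.
    static InputStream open(String path) {
        try {
            return new FileInputStream(path);
        } catch (FileNotFoundException e) {
            throw new MissingFileException(path, e);
        }
    }
}
```

Callers that cannot do anything about a missing file simply let the exception propagate, while callers that can recover catch MissingFileException and read the offending path from it.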

Every ManagedFile provides the isWatchable method, which normally returns false. By contract it can only return true if the ManagedFile implementation also implements WatchableRepository. WatchableRepository allows watch-dog capabilities to be added by registering FileChangeListeners. The watch-dog implementation is provided by the underlying ManagedFile implementation.
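
A minimal sketch of this contract, using hypothetical types (NodeChangeListener, WatchableNode, PlainNode and PollingNode) rather than the real MiddleHeaven interfaces, whose method names are not shown in this post:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes; the real interfaces may declare different methods.
interface NodeChangeListener {
    void changed(String path);
}

interface WatchableNode {
    void addListener(NodeChangeListener listener);
}

// A node is watchable only if its implementation is also a WatchableNode.
class PlainNode {
    final String path;

    PlainNode(String path) {
        this.path = path;
    }

    boolean isWatchable() {
        return this instanceof WatchableNode;
    }
}

final class PollingNode extends PlainNode implements WatchableNode {
    private final List<NodeChangeListener> listeners = new ArrayList<>();

    PollingNode(String path) {
        super(path);
    }

    public void addListener(NodeChangeListener listener) {
        listeners.add(listener);
    }

    // Called by the underlying implementation when it detects a change.
    void fireChanged() {
        for (NodeChangeListener listener : listeners) {
            listener.changed(path);
        }
    }
}
```

Client code checks isWatchable() first and only then casts and registers a listener, so file systems that cannot detect changes are never asked to.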

Although ManagedFile is the main contract for users of the toolbox, ManagedFileResolver is the most relevant contract for the implementation of the toolbox itself. Because the toolbox must be open to incorporating several different file-like systems and their supporting APIs, MiddleHeaven opts for introducing an abstraction super layer called the engine, represented by the RepositoryEngine interface. Different engines provide different ManagedFileResolvers, so any virtual file system can be incorporated.

Use

Using a Managed File is very simple. In this example we create an empty file in the current folder, locate the JUnit.jar in a local Maven repository and copy it. As said before, the programmer will normally receive a ManagedFile or ManagedRepository from elsewhere in the system, such as a service or via injection. For completeness' sake, the following example also shows how those "root" files can be obtained from a standard java.io.File:

// ManagedFileRepositories is an integration utility class that allows
// for integration with java.io.File
ManagedFile folder = ManagedFileRepositories.resolveFile(new File("."));

assertTrue(folder.exists());

// resolving is simply acquiring a location where the file should be
ManagedFile testJar = folder.resolveFile("test.jar");

// if it exists, delete it
if (testJar.exists()) {
    testJar.delete();
}

// create a new (empty) file
testJar.createFile();

// it must exist by now
assertTrue(testJar.exists());

// manually obtain the path to the repository
ManagedFile rep = ManagedFileRepositories.resolveFile(
    new File(System.getProperty("user.home") + "/.m2/repository/junit/junit/4.1"));

// resolve the file
ManagedFile junitJar = rep.resolveFile("junit-4.1.jar");

// copy it to the location previously created
junitJar.copyTo(testJar);

// this is a jar file that contains other files,
// so the listFiles() collection will not be empty
assertFalse(junitJar.listFiles().isEmpty());

Code 1: Managed file use example

This example shows how we can abstract any java.io.File into a ManagedFile. It also shows how simple it is to copy a file just by invoking copyTo. Finally, it illustrates the complete abstraction of the virtual file system of a Zip file by listing its contents as we would do with any other managed file.

Under the Hood

I had used the managed file concept in other applications and frameworks I've developed over the years. During this time I crossed paths with Apache's Commons VFS, which thrives on the same "virtual file" idea. Because MiddleHeaven's main goal is to abstract implementations and third-party libraries, I could not use Commons VFS directly. Instead I came up with the RepositoryEngine concept. A repository engine provides a root ManagedFileResolver in order to resolve the location of any file within it.

The MiddleHeaven Managed File implementation is an almost direct translation of Commons VFS, which already provides a wide variety of virtual file systems like FTP and ZIP files, to name a few. Plans exist to incorporate WebDAV and others.

However, Commons VFS does not integrate natively with Commons FileUpload, which is very frequently used in web applications to mediate file upload logic. This is where the MiddleHeaven Managed File Toolbox architecture shines, enabling the toolbox to provide the services we need based upon the code we find most useful, merging or aggregating different third-party APIs when needed. The MiddleHeaven Managed File Toolbox provides a Managed File implementation of a ManagedRepository that can be used to interact with the uploaded files in exactly the same way we would interact with a file on disk. We'll talk more about this when we visit the Web Toolbox.


Monday, October 27, 2008

Sequence Toolbox

Let's start with a simple toolbox: the Sequence Toolbox. A Sequence represents a generator of tokens with an inherent order. Normally we will use a sequence to generate numeric tokens, but it can be used to generate almost anything you need.

Properties of a Sequence

A sequence is characterized mainly by three properties:

  • Order - the sequence produces tokens in a specific order (normally in ascending order). As a consequence, the token must contain an object with a natural order.
  • Gap Resilience - the sequence may accept gaps or not. Accepting gaps means that not all token values must be used. Example: an integer sequence can run like 1, 2, 3, 6, 7, 8..., not using 4 or 5 as a token value.
  • Limit - the sequence may stop because no more tokens are possible. Although conceptually a sequence may have an infinite number of tokens, in practice no sequence is infinite. As such, all sequences in MiddleHeaven are considered "potentially infinite", i.e. they would be infinite if there were no computational/environmental limit. Thus, limited sequences are those that we know could never be infinite (e.g. a sequence based on the items of an array)

All sequences in MiddleHeaven are considered ordered, unlimited and gap-tolerant. Sequences that are not ordered must inherit from RandomSequence. This is mainly a marker interface, as it adds no methods.

Sequences that are intended to be limited must implement the LimitedSequence interface. LimitedSequence adds a hasNext method to check whether there are more tokens in the sequence.
Gap resilience is a more difficult feature to implement, as it must be defined over a transaction context. If two transactions A and B acquire tokens from the same sequence and A rolls back, the dumped tokens must be reused by B, as there can't be gaps. TransactableSequence tries to address this by locking the sequence to a specific transaction until it ends (by committing or rolling back).
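
As an illustration of a limited sequence, here is a sketch of a sequence backed by the items of an array. The interfaces SimpleSequence and SimpleLimitedSequence are simplified, hypothetical versions written for this example: they return plain values instead of token objects and are not the MiddleHeaven interfaces.

```java
import java.util.NoSuchElementException;

// Hypothetical simplification: tokens are plain values, not token objects.
interface SimpleSequence<T> {
    T next();
}

// A limited sequence can tell whether more tokens are available.
interface SimpleLimitedSequence<T> extends SimpleSequence<T> {
    boolean hasNext();
}

// A sequence that is limited by construction: it ends with the array.
final class ArraySequence<T> implements SimpleLimitedSequence<T> {
    private final T[] items;
    private int index = 0;

    ArraySequence(T[] items) {
        this.items = items.clone();
    }

    public boolean hasNext() {
        return index < items.length;
    }

    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException("Sequence exhausted");
        }
        return items[index++];
    }
}
```

Code that consumes a limited sequence loops while hasNext() returns true, whereas code that consumes an unlimited sequence just keeps calling next().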

Model

The figure below illustrates the basic set of sequence types in MiddleHeaven and how they relate to each other.

Illustration 1: Sequence basic type model
Not all sequences are shown, as some are implemented inside other toolboxes. I'll talk about them later.
Following the Separation of Concerns principle, all basic types are modelled as interfaces. All sequences are type-generic and have a next() method that returns the next token in the sequence. StateEditableSequence allows manipulation (read/write) of the sequence state. This is useful mainly for defining StatePersistableSequence, a sequence whose state can be persisted so that it never resets. This will be most useful for identity generation, as we will see when we discuss the Storage Toolbox.
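
The read/write state idea can be sketched like this. ResumableLongSequence and its getState/setState methods are assumptions made for this example; the real StateEditableSequence contract may use other names and a richer state object.

```java
// Hypothetical sketch; method names are assumptions, not the real contract.
final class ResumableLongSequence {
    private long last;

    ResumableLongSequence(long lastIssued) {
        this.last = lastIssued;
    }

    // Returns the next token in ascending order.
    synchronized long next() {
        return ++last;
    }

    // Read side of the state: what must be saved so the sequence never resets.
    synchronized long getState() {
        return last;
    }

    // Write side of the state: restores a previously saved position.
    synchronized void setState(long state) {
        this.last = state;
    }
}
```

Saving the value returned by getState() before shutdown and passing it to setState() on start-up lets a new instance continue where the old one stopped, which is what identity generation needs.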

RandomCharSequence can be used to generate random sequences of characters (which may be further converted into a String) and RandomNumberSequence can be used to generate random sequences of numbers. For both, an instance of Random can be specified for better control.
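
The benefit of passing in the Random instance is easiest to see with a seed. The class below, RandomLetterSequence, is a hypothetical stand-in and not the MiddleHeaven RandomCharSequence; only java.util.Random and its nextInt(int) method are standard API.

```java
import java.util.Random;

// Hypothetical stand-in for a random character sequence.
final class RandomLetterSequence {
    private final Random random;

    // The caller controls the randomness by supplying the Random instance.
    RandomLetterSequence(Random random) {
        this.random = random;
    }

    // Returns the next random lowercase letter.
    char next() {
        return (char) ('a' + random.nextInt(26));
    }

    // Convenience: collects the next 'count' tokens into a String.
    String take(int count) {
        StringBuilder builder = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            builder.append(next());
        }
        return builder.toString();
    }
}
```

Two sequences built over Random instances with the same seed produce the same characters, which makes tests repeatable, while production code can pass a java.security.SecureRandom (a subclass of Random) instead.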

Use

Using a sequence is very simple:

Sequence<Long> sequence = new LongSequence();
Long value = sequence.next().value();

Code 1: Example of sequence use

Sequences can be used on their own, in conjunction with other toolboxes or within them. The Storage Toolbox uses sequences to create different identification tokens for storable objects. For database-backed storage, MiddleHeaven abstracts the native sequence mechanism present in some databases as a Sequence of Long.

Friday, October 24, 2008

Starting Point

Hello, I'm Sergio Taborda. I'm the author of MiddleHeaven and this is the first post of a long and, I hope, enjoyable series of posts where I will try to explain what MiddleHeaven is. This is not easy because it can be many things at once.
I will try to explain the main concepts, ideas, goals and trade-offs I made along the way.
I'm not trying to convince you to use MiddleHeaven, as it is at a very early stage, but I think the process can be very instructive both for you and for me.

Mainly, MiddleHeaven is an open source Java framework. Some folks may ask "Another framework?" Yes, another framework. Why? Because I feel current frameworks are very focused on the technology and too little on the business. I wanted a framework that could help me build complicated, distributed, rich applications with little effort by pre-integrating other common frameworks and filling the gaps where need be.

MiddleHeaven is more than a simple framework; it is a set of mini-frameworks working together: the toolboxes.

MiddleHeaven is based on Java 1.6, with eyes on 1.7, and does not despise the other frameworks out there. Instead it tries to incorporate the best of each of them and encapsulate it under a common interface, so that in the future the underlying framework can be changed to a more modern, robust one. MiddleHeaven stands for code library independence as the JVM stands for OS-independent execution (generally speaking).

At this point MiddleHeaven is at an alpha stage of development, as it lacks UI abilities. However, many toolboxes are already available, and I will start to explain them in posts to come... Meanwhile you can dig into the code.

Thank you