The rest of Boris's comments are in reply to my points 2 and 3.


2.
...
There's RDF support via Sail kindly contributed by Ian Holsman. And there's a serious (but interrupted) attempt to support XML Schema at the storage level with XSD types being synthesized with bytecode generation.
RDF support through Sail: I will have to look at this. I had thought Sail went through to OWL 2.0, but it must depend on the implementation.
XSD types... so these must be dynamically generated, which would be better than the solution I found in Python using string interpolation and metaprogramming, assuming it is reliable. The bytecode generation solution would be directly consumed in HGDB. It all shows how tricky and complex these things are, since there have been countless Java consumers of XSD. I assume that once the XSD is represented in HGDB, any XML written against the schema can be read in. What about the strict versus non-strict validation issue? What about XML with no schema?
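To make the contrast concrete, here is roughly the shape of the Python metaprogramming approach I mean: a minimal sketch, with a made-up make_xsd_class helper and a toy Server type. A real version would parse the XSD itself (e.g. with lxml) rather than start from a hand-written dictionary.

```python
# Sketch only: dynamically build a class whose attributes mirror an XSD
# complex type. The "schema" here is a hand-written dict of element names
# and Python types; a real solution would derive it from the XSD document.
def make_xsd_class(type_name, fields):
    def __init__(self, **kwargs):
        for name, py_type in fields.items():
            value = kwargs.get(name)
            if value is not None and not isinstance(value, py_type):
                raise TypeError(f"{name} must be {py_type.__name__}")
            setattr(self, name, value)
    return type(type_name, (object,), {"__init__": __init__, "_fields": fields})

# Hypothetical complexType "Server" with two elements.
Server = make_xsd_class("Server", {"host": str, "port": int})
s = Server(host="example.org", port=8080)
print(s.host, s.port)
```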
This also leads to another thought. As so many of my read/write mechanisms exist in Python, it is natural to ask whether there are graph databases implemented in Python too.
So here is the result of that search:-
Looking at nosql-database.org, there are some.
My main comments are:
1. Mikio Hirabayashi's work (Tokyo Tyrant etc.) is awe-inspiring, and it also has a Lua API. If this proves anything, it proves that my slight eclecticism is worthwhile, and it leads me to blog next - as I wanted to anyway - about another of my blogs.
2. It is strange that Sesame is not mentioned on this site. Anyway, perhaps it does not count as NoSQL because the main way in is through SPARQL? Honestly, I have no idea. There are other omissions too, perhaps for the same reason. But I notice Coherence is there as a grid database solution. Coherence is a distributed cache, and this leads me to another thought: whichever layer this solution is placed at, it may be trying to solve similar problems. The distributed cache is there to solve the problem of multiple calls from different servers going through to one or more shared RDBMSs.
Graph databases store data differently from the relational model in the first place. What I imagine is happening is that once the data is in the distributed cache, the way it is treated is at least similar to the way a graph database treats data. But that does not mean the original data would have been better represented with a non-relational schema.
3.
...
But the interesting part is in coming up with a representation and appropriate metadata that allows you to deal with dependencies across config files. This would be a challenging problem I think.
This is certainly a very sensible reframing of my problem, and I haven't given it any thought so far. It would be far better if I could describe my set of problems as a set of metadata types, such as server URLs, ports, version names and so on. This would avoid ad-hoc code and allow the far more interesting connections that Boris suggests to be made. I am extremely grateful to him for this suggestion.
Actually, when studying the Maven 2 POM I did wonder whether there was some way to classify segments of it, but I didn't pursue the thought. I believe I assumed I would not be able to find a correspondence between an Ant build file and the Maven POM, so I didn't try. But, thinking about it, this could be done. Here is the idea.
The Maven POM does have its own sections, each with its own classification. I got this far and realised that for each section there should exist a set of rules, but I had no idea what the rules might be or how to derive them.
The problem here is that Ant works with far smaller atoms - to use the term - than Maven. Each Ant atom is pretty much stand-alone: a task that may have been defined and inserted at any point in the evolution of the build file, and is now necessary for some aspect of the build when following that route. So how do we map this into Maven? Is it a profile, a non-default directory, property filtering or some other aspect? Given that Maven prefers default configurations and also offers many built-in phases, is the Ant declaration necessary at all?
Well, this is the idea: go over these issues carefully, one by one, and classify them, e.g. default directory structure, implicit lifecycle phase, property filtering.
Notice something about this list: the first two items are meta to the POM, but the last isn't; it is a direct reference to a POM element. So it seems that, going over this very carefully, it would be possible to derive meta-descriptions against which arbitrary Ant file elements could be mapped.
My intuition is that when Ant files are written and added to, it is the particular task and the Ant build context that are held in mind, rather than any notion of the type of task being undertaken. That is, the new task should complete when called in the context of the existing file; it is entirely irrelevant that this task has been described a thousand times before in different contexts! Although Ant is infinitely extensible, the normal set of tags is limited, which suggests that classifying them should not be difficult. The above process should be two-way: common Ant tags can also be classified and then mapping rules applied that include a guarantee that the build context is not broken (how?) and a decision on whether each element should be kept, modified or deleted. One way to provide this guarantee might be to keep all elements and map them in (would that be possible, or too messy?) and then remove them as needed on successive iterations.
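To make that classification concrete, here is a very rough sketch of a first pass over an Ant build file. The category names and the TAG_CATEGORY table are my own guesses at a first cut, not an established mapping, and a real tool would need far richer context than the tag name alone.

```python
# Sketch: walk an Ant build.xml and tag each task element with a candidate
# Maven-oriented category. Uses only the standard library.
import xml.etree.ElementTree as ET

TAG_CATEGORY = {
    "mkdir":    "default-directory-structure",      # often implied by Maven's layout
    "copy":     "dependency-or-resource-handling",
    "javac":    "implicit-lifecycle-phase:compile",
    "jar":      "implicit-lifecycle-phase:package",
    "junit":    "implicit-lifecycle-phase:test",
    "property": "property-filtering-or-profile",
}

def classify_build(build_xml_path):
    """Return (target name, task tag, candidate category) for every task."""
    tree = ET.parse(build_xml_path)
    results = []
    for target in tree.getroot().iter("target"):
        for task in list(target):
            category = TAG_CATEGORY.get(task.tag, "unclassified")
            results.append((target.get("name"), task.tag, category))
    return results

if __name__ == "__main__":
    # "build.xml" is an assumed input path for the sketch.
    for target_name, tag, category in classify_build("build.xml"):
        print(f"{target_name}: <{tag}> -> {category}")
```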

I can see that there would be some difficulty in programming this consistently without the guiding hand of something like a graph database. A graph database invites the definition and use of properties and their conditional discovery. For instance, the context of an Ant task needs to be represented, and it needs to be sufficiently well described that we can make decisions about it. So although we know it is a type of task carried out a thousand times elsewhere, what is special about it in this context?
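As an illustration only, here is how a task and its context might be held as nodes and properties, with networkx standing in for a real graph database such as HyperGraphDB. The node names, properties and the discovery query are all invented for the example.

```python
# Sketch: an Ant task, its target and the directory it writes to, represented
# as nodes with properties, plus one "conditional discovery" style query.
import networkx as nx

g = nx.DiGraph()

g.add_node("copy-libs", kind="task", tag="copy", todir="${build.lib}")
g.add_node("prepare", kind="target")
g.add_node("${build.lib}", kind="directory", created_by="mkdir")

g.add_edge("copy-libs", "prepare", rel="defined-in")
g.add_edge("copy-libs", "${build.lib}", rel="writes-to")

# Which copy tasks write into directories that the build itself creates?
# Those are candidates for elimination under Maven's standard layout.
candidates = [
    n for n, data in g.nodes(data=True)
    if data.get("kind") == "task" and data.get("tag") == "copy"
    and any(g.nodes[dst].get("created_by") == "mkdir"
            for _, dst, edata in g.out_edges(n, data=True)
            if edata.get("rel") == "writes-to")
]
print(candidates)  # ['copy-libs']
```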
One of the things I have found is that Ant builds perform many copy and create-directory tasks because, unlike Maven, there is no central repository. We need a classifier that can distinguish when this is the case. What other reasons might there be for copying artifacts from one location to another? There is the special case where the artifacts must be the latest version in SVN. But this is exactly what Maven is for.
There must be something like this:-
When all artifact dependencies are known and the structure of the project (directory layout) is established, all artifact copying, other directory creation and tasks to create the build path can be eliminated.
This isn't meant to be a complete description at this point; I am just pointing out that a lot of what is in the Ant file would be reduced away in the Maven POM.
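For instance, the rule above might start out as something as blunt as this. The helper name, its arguments and the set of known dependencies are assumptions for illustration, not a worked-out design.

```python
# Sketch: decide whether an Ant copy/mkdir/path task becomes redundant once
# Maven resolves the dependencies and imposes its standard directory layout.
def can_eliminate(task_tag, artifact, known_dependencies, layout_established=True):
    if not layout_established:
        return False
    if task_tag == "mkdir":
        return True                      # Maven's standard layout creates these
    if task_tag in ("copy", "path") and artifact in known_dependencies:
        return True                      # resolved from the repository instead
    return False

known = {"junit-4.12.jar", "commons-io-2.6.jar"}
print(can_eliminate("copy", "junit-4.12.jar", known))   # True
print(can_eliminate("copy", "site-assets.zip", known))  # False: not a dependency
```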
I know that Ant build files can harbour some exotic tasks. I think it is only with experience that it would become known how much of this process can be automated.
I should point out that the purpose of a tool like this is
a. to uplift the Ant script to Maven;
b. to facilitate ongoing changes in a controlled way, ensuring that where changes are made the consequences are understood; and
c. to ensure that those changes are correctly propagated through the system.
Finally, to highlight: there are rules and there is process. Rules such as these could not be applied without a process that guarantees the new Maven build is functionally equivalent to the old Ant build.
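One hedged idea of what such a process could check mechanically is to run both builds and compare the artifacts they produce. The output paths and the byte-for-byte equivalence criterion below are assumptions; in practice jar timestamps and the like would need normalising before the digests could match.

```python
# Sketch: compare the output trees of the Ant and Maven builds by relative
# path and SHA-256 digest. The directory names are hypothetical.
import hashlib
from pathlib import Path

def artifact_digests(root):
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*") if p.is_file()
    }

def builds_equivalent(ant_output_dir, maven_output_dir):
    return artifact_digests(ant_output_dir) == artifact_digests(maven_output_dir)

print(builds_equivalent("ant-build/dist", "target"))
```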
I believe this is possible and is actually a desirable tool to have.
