Knowledge Combinotronics


By this I mean a stream of different data feeds relevant to the user, each one processed by an appropriate algorithm to increase its relevance and then offered to the user for selection. That is, the user selects some combination of the presented data, and that combination is then used in the next phase. There is really no magic going on, except that user selections are amenable to an iterative process of selection and reselection in different or unfolding contexts.
How a context is created depends on the use scenario, and there are many examples. One would be where the selections are fed back into the data set for the immediate guidance of other users.
This depends on high-throughput infrastructure, which has typically been expensive.
The phrase 'different or unfolding contexts' is the key here.
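To make the select and reselect loop a little more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the names (Item, keyword_score, rank_feed, run_round), the toy keyword-overlap scoring, and the idea of representing the context as a set of words. A real system would use a proper relevance algorithm per feed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


# All names below are hypothetical; they only illustrate the idea in the text.
@dataclass(frozen=True)
class Item:
    feed: str   # which data feed the item came from
    text: str   # the content offered to the user


def keyword_score(item: Item, context: Set[str]) -> int:
    """Toy relevance measure: how many context words occur in the item text."""
    return len(set(item.text.lower().split()) & context)


def rank_feed(items: List[Item], context: Set[str]) -> List[Item]:
    """Order one feed's items by relevance to the current context."""
    return sorted(items, key=lambda it: keyword_score(it, context), reverse=True)


def run_round(
    feeds: Dict[str, List[Item]],
    context: Set[str],
    choose: Callable[[List[Item]], List[Item]],
    per_feed: int = 2,
) -> Set[str]:
    """One phase: rank each feed, offer the top items, fold the choices into the context."""
    offered = [
        item
        for items in feeds.values()
        for item in rank_feed(items, context)[:per_feed]
    ]
    selected = choose(offered)  # stands in for the user's selection
    new_context = set(context)
    for item in selected:
        new_context |= set(item.text.lower().split())
    return new_context


feeds = {
    "news": [Item("news", "football results"), Item("news", "graph database conference")],
    "blogs": [Item("blogs", "hypergraph type system"), Item("blogs", "cooking tips")],
}
context = {"graph"}
# Simulated user: always picks the first item offered.
context = run_round(feeds, context, lambda offered: offered[:1])
```

Each call to run_round is one phase. The context returned by one round is passed into the next, which is where the 'unfolding' comes from. Feeding the same selections into a context shared with other users would be the guidance scenario mentioned above.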

posted by Kobrix Software at Kobrix Software, Official Blog - 3 weeks ago
NoSQL has picked up a lot of steam lately. HyperGraphDB being a NoSQL DB *par excellence*, we will be joining the upcoming conference organized by 10gen, the maker of MongoDB: "NoSQL Live from Boston...
These are the comments I have made and the reply from Boris.
I gather them here for any further comment I may want to make.
See after the last comment.


3 comments:

semanticC said...
As it's the 12th I guess the conference has taken place; there are links I will be following up. There are a few questions I have.
1. Can you point me towards any comparison of NeoDB and HyperGraphDB? Do they cover the same ground? How do they differ?
2. The relationship between graph databases and
2.1. OWL: how would OWL be consumed, or would it?
2.2. more generally RDF, and then XML. After all, there are XML databases that parse in the XML. How do they compare? I'm sure I have missed something(s), but what?
3. One of the problems I have encountered is in keeping various .properties files aligned. One approach is to use something like magic lenses, such as the augeas implementation. But, at the same time, I have wanted to rewrite these properties out of their ANT context into a Maven POM context. A job for HyperGraphDB? Ideas?
4. Moving on, I have noticed the fascinating post about using HyperGraphDB to create a neural net.
4.1. Would you agree that what is happening here is in line with Rickard Öberg? See http://www.qi4j.org/ for background and http://www.qi4j.org/qi4j/351.html where he discusses the relationship between algorithms and OOP. BTW, he also arrives at the need for atoms and mentions the same focus, the business case, that you emphasise in your background paper, Rapid Software Evolution.
4.2. I notice that Neo4J has an example of a spreading activation algorithm (token passing), http://wiki.github.com/tinkerpop/gremlin/pagerank. I expect this means that either db could also be used to implement Random Indexing (sparse matrices) as developed by P. Kanerva and M. Sahlgren. Some of this may be touched on in the Disko project. Again, ideas?
Sorry for such a long comment, but I am not sure how/if to email privately.
Kobrix Software said...
Hi semanticC, A good place to discuss HyperGraphDB would be the discussion forum: http://groups.google.com/group/hypergraphdb?hl=en This is a long list of topics raised indeed :) Let me try to cover them one by one, perhaps in separate responses:
1) Such a comparison should ideally be done independently and I am not aware of any. For starters, HyperGraphDB has a much more general data model than Neo. In fact, the name is maybe a bit misleading from a functionality perspective because now it's being labeled as "another graph database", which it is, but it is also an OO database, a relational database (albeit nonsql) etc. In HyperGraphDB, edges point to an arbitrary number of things, including nodes and other edges. Neo is a classical graph of nodes and directed edges between any two nodes. In addition, HGDB has a type system while Neo doesn't. So HGDB has in effect a dynamic schema that you can introspect, reason about and change.
Besides the data models, the storage models are quite different: HyperGraphDB has a general two-layered architecture where a big part of the storage layout can be customized. Neo uses linked lists to store its graph and claims that this makes for faster traversals (probably true) and that this is all you need to do with a graph, you don't need indices, pattern mining etc. (here, I disagree). HGDB relies heavily on a lot of indexing for more complicated graph-related queries & algorithms.
In sum, HyperGraphDB has pretty much the most versatile data model I know of, and subsumes Neo and others easily. Whether that sort of generality comes at the expense of performance remains to be seen. As you've probably realized from the neural net post, HGDB gives you more representational choices so performance has to be measured more globally, at an application level, through a design that makes intelligent use of what HGDB has to offer.
More on the others later... perhaps at the end I'll sum up my responses in a separate blog.
semanticC said...
Hi Boris, Thanks so much for your reply. It would be great if the other questions inspire a blog post. If anyone is interested, the NoSQL conference is previewed and will be written up here: http://radar.oreilly.com/2010/02/nosql-conference-coming-to-bos.html. It is a good discussion, and Boris contributes too!
There are still many things I cannot get my head around. I can see the 'representational choices', the ability to define functions working directly on the data using the HGDB API. I expect this is a good thing in the way that, for example, annotations are better than XML: everything is in the place where it will be used, which makes it easier to concentrate on the task. But other benefits? Here I cannot see.
Moving on again, I am reminded of the efforts of Henry Story to create a framework to import RDF, inspired by Active Record. I am very unclear about all of this. Did I read somewhere that there is a standardisation of the syntax for the import statements of RDF namespaces? Anyway, the idea would be to make the referenced ontology available in code; presumably it would already be in Sesame as the graph db backend? All of this seems relevant to HGDB. First, you have mentioned the type system, so how to model the types? I had thought that OWL was a good way of both modelling and sharing those models. But if so, what of the other aspect of HGDB, its ability to deal with semi-structured data? How to fit the two together? I am thinking about Collective Entity Resolution as perhaps one sort of solution, and simply in code, how they might interact, as another area.
Moving up towards the goal of evolutionary software, I have long thought that it must be possible to describe software using OWL. I assumed that reasoning would take the place of a lot of code when there is a well-constructed model. Of course that brings me back to what role reasoning plays in NoSQL. I know it is built into AllegroGraph.
As I say, many thoughts, but I don't really understand the ramifications of NoSQL at the moment. Perhaps I am missing the point altogether?
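
As a first further comment, here is how I picture the data model Boris describes, where an edge can point to any number of things, including other edges, and every item carries a type. This is only a sketch in plain Python under my own assumptions. It does not use the HyperGraphDB API, and the names (Atom, HyperGraph, add, incident_links, by_type) are invented for illustration, with a type name string standing in for a real type system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Atom:
    """A node or a link. Hypothetical structure, not HyperGraphDB's own."""
    value: object                   # payload carried by the atom
    type_name: str                  # a plain string standing in for a type system
    targets: Tuple[int, ...] = ()   # empty for a node; any number of atom ids for a link


class HyperGraph:
    def __init__(self) -> None:
        self._atoms: Dict[int, Atom] = {}
        self._incidence: Dict[int, List[int]] = {}  # atom id -> ids of links pointing at it
        self._next_id = 0

    def add(self, value: object, type_name: str, targets: Tuple[int, ...] = ()) -> int:
        """Store an atom and return its id; every target must already exist."""
        for target in targets:
            if target not in self._atoms:
                raise KeyError(f"unknown target atom: {target}")
        handle = self._next_id
        self._next_id += 1
        self._atoms[handle] = Atom(value, type_name, tuple(targets))
        self._incidence[handle] = []
        for target in targets:
            self._incidence[target].append(handle)
        return handle

    def get(self, handle: int) -> Atom:
        return self._atoms[handle]

    def incident_links(self, handle: int) -> List[int]:
        """Ids of all links that point at the given atom."""
        return list(self._incidence[handle])

    def by_type(self, type_name: str) -> List[int]:
        """Ids of all atoms of one type, a crude form of introspecting the schema."""
        return [h for h, atom in self._atoms.items() if atom.type_name == type_name]


g = HyperGraph()
alice = g.add("Alice", "Person")
bob = g.add("Bob", "Person")
carol = g.add("Carol", "Person")
# One link with three targets, which a classical two-node edge cannot express directly.
meeting = g.add("project meeting", "Meeting", targets=(alice, bob, carol))
# A link whose target is itself a link.
note = g.add("was productive", "Annotation", targets=(meeting,))
```

In a classical graph the meeting would need an extra node plus three separate edges, and the annotation would have to hang off that node. Here both are just atoms with targets, which is what I take 'representational choices' to mean.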