Domain Modeling
May 19, 2010
"Conjuring" as a Linked Data design pattern
Everywhere I look there are blobs lying around that nobody quite knows what to name except "database", "record", or 1, 2, 3, 4, 5... Oh sure, some of these blobs are XML and they're indexed. Some of them are even accessible via a Web API using hackable query string parameters. Still, though, nobody's quite sure what to name the blobs. If you find yourself in a similar situation, try this:
http://example.org/database/1 (303 redirect to...)
http://example.org/database/1/ (2XX returning the blob)
Got those URIs working? Now go into the code for the latter and add this:
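// "response" is assumed to be the servlet's HttpServletResponse
String accept = request.getHeader("Accept");
// contains(), not ==, which only compares object identity in Java
if (accept != null && accept.contains("application/rdf+xml")) {
    response.setContentType("application/rdf+xml");
    response.getWriter().print(
        "<rdf:RDF"
        + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns:owl=\"http://www.w3.org/2002/07/owl#\">"
        + "<owl:Thing rdf:about=\"http://example.org/database/1\" />"
        + "</rdf:RDF>");
} else {
    response.getWriter().print(blob); // the blob, exactly as before
}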
That unnamable thing you've been calling "1" now has a useful globally-unique HTTP identifier that establishes its presence on the Semantic Web as Linked Data.
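For anyone who wants to try the two URIs without a servlet container, here is a minimal sketch in plain Java. The names ConjuringResponder, Reply, and respond are made up for illustration, the "application/xml" content type for the blob is an assumption, and q-values in the Accept header are ignored.

```java
/** Sketch of the 303 redirect plus Accept-header check described above. */
final class ConjuringResponder {

    /** Minimal stand-in for an HTTP response (hypothetical, not a servlet class). */
    static final class Reply {
        final int status;
        final String location;    // set only on the 303 redirect
        final String contentType; // set only on 200 replies
        final String body;

        Reply(int status, String location, String contentType, String body) {
            this.status = status;
            this.location = location;
            this.contentType = contentType;
            this.body = body;
        }
    }

    static final String BASE = "http://example.org";

    /** RDF/XML announcing that the thing named by thingUri exists. */
    static String rdfFor(String thingUri) {
        return "<rdf:RDF\n"
             + "  xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n"
             + "  xmlns:owl=\"http://www.w3.org/2002/07/owl#\">\n"
             + "  <owl:Thing rdf:about=\"" + thingUri + "\" />\n"
             + "</rdf:RDF>\n";
    }

    /** True if the Accept header lists application/rdf+xml (q-values are ignored here). */
    static boolean wantsRdf(String accept) {
        if (accept == null) return false;
        for (String part : accept.split(",")) {
            String mediaType = part.split(";", 2)[0].trim();
            if (mediaType.equalsIgnoreCase("application/rdf+xml")) return true;
        }
        return false;
    }

    /** path is "/database/1" (the thing) or "/database/1/" (the document about it). */
    static Reply respond(String path, String accept, String blob) {
        if (!path.endsWith("/")) {
            // The thing itself: 303 See Other to the document URI.
            return new Reply(303, BASE + path + "/", null, "");
        }
        String thingUri = BASE + path.substring(0, path.length() - 1);
        if (wantsRdf(accept)) {
            return new Reply(200, null, "application/rdf+xml", rdfFor(thingUri));
        }
        // Assumption: the blob is XML; use whatever type your blob really has.
        return new Reply(200, null, "application/xml", blob);
    }

    public static void main(String[] args) {
        Reply redirect = respond("/database/1", "text/html", "<record/>");
        System.out.println(redirect.status + " -> " + redirect.location);
        System.out.print(respond("/database/1/", "application/rdf+xml", "<record/>").body);
    }
}
```

Splitting the Accept header on commas is a little more forgiving than a straight string comparison, since browsers and RDF clients usually send several media types at once.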
Now comes the conjuring trick I saw my colleague Andrew Houghton performing. Free your mind of the idea that this blob is one "thing" and write an elevator speech describing the use cases for the blobs. If your blob is XML, look at the element names to help you write the speech. Circle the important nouns in the speech and peer into the blob looking for such a thing. If you think you see one, append the noun to the latter URI with a "#" like so:
http://example.org/database/1/#noun1
http://example.org/database/1/#noun2
etc.
Now that these meaningful names have actionable HTTP identifiers, we need to announce their existence inside the <rdf:RDF> like so:
<owl:Thing rdf:about="http://example.org/database/1/#noun1" />
<owl:Thing rdf:about="http://example.org/database/1/#noun2" />
You've just conjured up meaningful things out of the blob and given them Linked Data names using semantically-rich HTTP URIs. Granted, some of the semantics implied in the URI aren't reflected in the RDF, but that is easily remedied. And if somebody says you are lying about the existence of noun7 for record 53, update your code and/or the blob to fix the bug (it's not a "lie" since it wasn't intentional). Apologize and explain to them that conjuring always has been and always will be an inexact science. ;-)
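The announcing step can be sketched as a small generator. NounConjurer and its methods are hypothetical names, the list of nouns is whatever you circled in your own elevator speech, and the sketch assumes each noun is already safe to use as a URI fragment and inside an XML attribute.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: mint hash URIs for the nouns found in a blob and announce them as RDF/XML. */
final class NounConjurer {

    /** Appends "#noun" to the document URI, e.g. http://example.org/database/1/#noun1. */
    static String hashUri(String documentUri, String noun) {
        return documentUri + "#" + noun;
    }

    /** Builds the RDF/XML: one owl:Thing for the record itself, one per conjured noun. */
    static String announce(String thingUri, String documentUri, List<String> nouns) {
        StringBuilder rdf = new StringBuilder();
        rdf.append("<rdf:RDF\n");
        rdf.append("  xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n");
        rdf.append("  xmlns:owl=\"http://www.w3.org/2002/07/owl#\">\n");
        rdf.append("  <owl:Thing rdf:about=\"").append(thingUri).append("\" />\n");
        for (String noun : nouns) {
            rdf.append("  <owl:Thing rdf:about=\"")
               .append(hashUri(documentUri, noun))
               .append("\" />\n");
        }
        rdf.append("</rdf:RDF>\n");
        return rdf.toString();
    }

    public static void main(String[] args) {
        System.out.print(announce(
            "http://example.org/database/1",
            "http://example.org/database/1/",
            Arrays.asList("noun1", "noun2")));
    }
}
```

If noun7 turns out not to exist for record 53, the fix is to pass a shorter list for that record.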
This "Conjuring" pattern is a trustworthy starting point for making the important things lurking inside your blobs discoverable and available for unexpected reuse.
March 20, 2010
"Information" is not very helpful
If you buy into the "information resource vs. non-information resource" conception of Linked Data, the answer to the question "is this identified thing an information resource?" is either "yes" (2xx), "maybe" (303 or hash URI) or "I don't know" (4xx).
The second answer changes if you buy into the "Web Document vs. Real World Object" conception of Linked Data: "is this identified thing a Web Document?" gets "yes" (2xx), "no" (303 or hash URI) or "I don't know" (4xx).
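To make the difference concrete, here is a rough sketch of the two readings as a lookup. ResourceKindGuesser and Conception are invented names, and the sketch only covers the status codes mentioned above.

```java
/** Sketch: what a client can conclude under each conception of Linked Data. */
final class ResourceKindGuesser {

    enum Conception { INFORMATION_RESOURCE, WEB_DOCUMENT }

    /**
     * Answers "is this identified thing an information resource / a Web Document?"
     * hashUri is true when the identifier carries a "#fragment". It is checked
     * first because the fragment never reaches the server, so the status code
     * only describes the document, not the thing named by the hash URI.
     */
    static String answer(Conception conception, int status, boolean hashUri) {
        if (hashUri || status == 303) {
            return conception == Conception.INFORMATION_RESOURCE ? "maybe" : "no";
        }
        if (status >= 200 && status < 300) return "yes";
        if (status >= 400 && status < 500) return "I don't know";
        throw new IllegalArgumentException("status not covered by either conception: " + status);
    }

    public static void main(String[] args) {
        for (Conception c : Conception.values()) {
            System.out.println(c + " / 303: " + answer(c, 303, false));
        }
    }
}
```

The only branch that differs between the two conceptions is the 303 or hash URI case.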
As a client looking at response codes, I can't really know which interpretation the Linked Data URI designer intended. Nevertheless, as coiners of Linked Data URIs, we need to choose one or the other (and hopefully not waffle and use both).
Which interpretation would you choose? Keep in mind that "document" can be modeled as an abstract concept and thus identified as a "Real World Object" using 303 links to a variety of "Web Document" representations.
March 19, 2010
Who invented "information"?
I've been ranting on the OpenURL listserv lately about the potential of Linked Data and occasionally making a fool of myself by referring to "information resource" when I really meant "non-information resource". What on earth was the person who originally coined the word "information" thinking? Didn't they know that Linked Data would come along someday and transform our understanding of reality based on the novelty of "non-information"? Couldn't they have invented a term signifying the negation of "information" instead so that people like me wouldn't get confused by people like me?
March 18, 2010
Life is like a box of use-cases
Finding new use cases is easy.
Understanding new use cases is hard.
Understanding new use cases is easy.
Finding names for new things is hard.
Finding names for new things is easy.
Understanding how new things fit in with old things is hard.
Understanding how new things fit in with old things is easy.
Figuring out how to move old things into new things is...
(now where did I leave that thesaurus?)
March 12, 2010
You can't argue with use cases...
You can't argue with use cases. That's a pity, so I started a list of things I like to argue with instead:
- the words (names) people use when talking about use cases
- how those named things need to be related to one another in order to satisfy a set of use cases
- how existing names and relationships can best be adapted to accommodate new use cases
- the value of extending, constraining, instantiating existing abstractions to maximize unexpected reuse
- whether or not things should be named and behave on a wire
Hmm. What am I forgetting?
March 5, 2010
Hype Cycles, Efficient Frontiers, and Linked Data
In the technology domain, "things" follow a hype cycle. I don't fully understand why, but something tells me this is unavoidable.
In the investing domain, the assumption is that every investment portfolio has a risk-return profile that can be optimized in line with an "efficient frontier". Again, I don't fully understand why, but something tells me this makes sense in spite of (or more likely especially in) these tough economic times.
Regardless of the times, it seems to me that these principles are somehow related. There must be an efficient frontier for any given hype cycle. It's not clear how anyone can measure this, but without having a handle on it we are making wild guesses about resource allocations.
OTOH, I don't want anyone to get the impression that I sit around thinking about the connection between hype and efficiency all day long. It's just an idea that popped into my head. What I sit around all day thinking about is ways of producing and consuming Linked Data efficiently. This is the hype cycle that I hope against hope is being managed efficiently.
December 2, 2009
Linked Data and Cool URI Patterns
MVC scaffolding frameworks like Grails and Ruby on Rails prove that the identity and behavior of Web resources can easily be generalized when they are based on a domain model. What Grails and presumably most other frameworks fail to account for is the fact that things named in a domain model identify real world objects. It is time to correct this oversight.
Recall that the primary things named in a domain model fall into a handful of categories: class, instance, attribute, relationship, operation, and the model itself. The Grails scaffold automatically provides HTTP URIs for Web document representations for some of these things. Here are examples using the default Grails URI mapping:
- Model Web Document
  - http://example.org/
- Class Web Document
  - http://example.org/{className}/{operationName}
- Instance Web Document
  - http://example.org/{className}/{operationName}/{instanceName}
[Beware that Grails names these path segment tokens based on analogous MVC concepts:
{className}={controller}
{instanceName}={id}
{operationName}={action}
Also beware that the default Grails URI patterns are deficient in other ways, but it is difficult to change them. As a result, the URI patterns below are reluctantly forced into the default mold.]
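As a quick illustration of the Web document patterns above, here is a sketch that fills in the path segment tokens. ScaffoldUris and expand are hypothetical names, and the values "person", "show", and "1" are only examples ("show" being one of the usual scaffold actions).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: expanding the Web document URI patterns listed above. */
final class ScaffoldUris {

    static final String MODEL_DOCUMENT = "http://example.org/";
    static final String CLASS_DOCUMENT = "http://example.org/{className}/{operationName}";
    static final String INSTANCE_DOCUMENT =
        "http://example.org/{className}/{operationName}/{instanceName}";

    /** Replaces each {token} with its value; fails loudly if a token is left unfilled. */
    static String expand(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        if (result.contains("{")) {
            throw new IllegalArgumentException("unfilled token in " + result);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("className", "person");   // {controller} in Grails terms
        values.put("operationName", "show"); // {action}
        values.put("instanceName", "1");     // {id}
        System.out.println(expand(CLASS_DOCUMENT, values));
        System.out.println(expand(INSTANCE_DOCUMENT, values));
    }
}
```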
The first enhancement for Linked Data compliance is support for real world object identifiers for everything in the model. For some domain model categories, 303 (See Other) redirect behavior is appropriate:
- Real World Model
  - http://example.org/{modelName}/rwo
- Real World Class
  - http://example.org/{className}/rwo
- Real World Instance
  - http://example.org/{className}/rwo/{instanceName}
These can be implemented by creating a special controller for the {modelName} and adding a new content-negotiable "rwo" action to it and the default scaffold controller. Real world object URIs for attributes and relationships can then piggy-back on Real World Class as hash URIs:
- Real World Attribute
  - http://example.org/{className}/rwo#{attributeName}
- Real World Relationship
  - http://example.org/{className}/rwo#{relationshipName}
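The real world object patterns above can be sketched as a handful of URI builders. RwoUris is a made-up helper, not anything Grails provides, and the choice of the "show" document as the 303 target for an instance is purely my assumption for illustration.

```java
/** Sketch: building real world object (RWO) URIs from the patterns listed above. */
final class RwoUris {

    static final String BASE = "http://example.org";

    static String realWorldModel(String modelName) {
        return BASE + "/" + modelName + "/rwo";
    }

    static String realWorldClass(String className) {
        return BASE + "/" + className + "/rwo";
    }

    static String realWorldInstance(String className, String instanceName) {
        return BASE + "/" + className + "/rwo/" + instanceName;
    }

    /** Attributes and relationships piggy-back on the Real World Class as hash URIs. */
    static String realWorldProperty(String className, String propertyName) {
        return realWorldClass(className) + "#" + propertyName;
    }

    /** Assumed 303 (See Other) target: the instance's "show" Web document. */
    static String seeOther(String className, String instanceName) {
        return BASE + "/" + className + "/show/" + instanceName;
    }

    public static void main(String[] args) {
        System.out.println(realWorldInstance("person", "1") + " 303 -> " + seeOther("person", "1"));
        System.out.println(realWorldProperty("person", "employer"));
    }
}
```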
Now that real world object identifiers are defined for everything in the domain model, the only thing lacking is an RDF representation alongside the scaffold's HTML representation. This will be examined in a subsequent post.
December 2, 2009
Domain Modeling and Linked Data
As a principle of object-oriented design, most things worth naming in a domain fall into one of several categories: class, instance, data type, attribute, relationship, or operation. Here is a Grails (domain) class example illustrating the patterns (excluding instance):
class Person {
    String name
    Organization employer
    String toString() {
        "${name}"
    }
}
Whether we realize it or not, the names assigned in this way form an ontology that uniquely identifies everything in the domain. Also note that MVC frameworks like Grails automatically inject all domain classes with machine-level create, read, update, and delete (CRUD) operations. This accounts for the naming and persistence of instances as well.
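To see that the names really are all there for the taking, here is a rough Java sketch that lists them by reflection. DomainNames is a hypothetical helper, the nested Person and Organization classes are Java stand-ins for the Grails classes, and the rule "a field whose type is not built into Java is a relationship" is my simplifying assumption. Data types and instances are left out.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: harvesting the names a domain class already assigns. */
final class DomainNames {

    // Java stand-ins for the Grails domain classes.
    static class Organization {
        String name;
    }

    static class Person {
        String name;
        Organization employer;

        @Override
        public String toString() {
            return name;
        }
    }

    /** Lists the class, attribute, relationship, and operation names, sorted alphabetically. */
    static List<String> namesIn(Class<?> domainClass) {
        String cls = domainClass.getSimpleName();
        List<String> names = new ArrayList<>();
        names.add("class " + cls);
        for (Field field : domainClass.getDeclaredFields()) {
            if (field.isSynthetic()) continue; // skip compiler-generated fields
            Class<?> type = field.getType();
            // Assumption: primitives and java.* types are attributes; anything else relates classes.
            boolean builtIn = type.isPrimitive() || type.getName().startsWith("java.");
            names.add((builtIn ? "attribute " : "relationship ") + cls + "." + field.getName());
        }
        for (Method method : domainClass.getDeclaredMethods()) {
            if (method.isSynthetic()) continue; // skip compiler-generated methods
            names.add("operation " + cls + "." + method.getName());
        }
        Collections.sort(names); // reflection order is not guaranteed, so sort for stable output
        return names;
    }

    public static void main(String[] args) {
        for (String name : namesIn(Person.class)) {
            System.out.println(name);
        }
    }
}
```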
Now that we know how to systematically name everything of interest in a domain, we need to realize that every one of those things identifies a real world object (RWO). Conversely, every RWO of interest in a domain can and should be named according to object-oriented principles.
The next trick is to take this machine-local ontology and project it onto the Web as Linked Data. Grails scaffolds provide a glimpse of how this can be automated by creating a parallel controller like so:
class PersonController {
    def scaffold = true
}
This effectively forces Grails to inject the domain class with globally-unique HTTP URI identifiers and CRUD behaviors. Unfortunately, the default Grails scaffold only provides HTTP URIs for Web documents that represent the real world objects. In contrast, Linked Data requires separate HTTP URIs for the RWOs themselves. The other problem is that it does not automatically provide an RDF representation.
Both problems can be solved by customizing the scaffold Controller with new actions to support content-negotiable RWO URIs and scaffold views that produce RDF. Details will be examined in subsequent posts.
Jeff
June 11, 2009
A Modest Proposal
Object-oriented modeling is taking over the world. The interesting thing about "objects" is that it makes sense to talk about "instances" of them. The interesting thing about "instances" of objects is that you can "classify" them. The interesting thing about "classifying" objects is that you can "subclassify" them. But where does classification/subclassification end? In Java, it ends with java.lang.Object. Similarly, Web standards effectively say that classification ends with "resource".
One of the interesting differences between these two ultimate classes is how they get instantiated: Web standards happily allow direct instantiation of "resource", whereas Java developers almost always subclass java.lang.Object first (new Object() is legal but rarely useful). Java is also nice, compared to say C++, because it restricts multiple inheritance. If the Web had a similar constraint, the Semantic Web would be a whole lot simpler because a singular rdf:type could be required instead of optional or duplicitous.
My modest proposal, then, is to change Java and Web standards ever so slightly. In the case of Java, let's forbid developers from subclassing java.lang.Object and force them to subclass java.lang.Goop or java.lang.Concept instead. My sense is that most experienced developers would proudly choose to subclass java.lang.Concept and would make a genuine effort to justify this claim.
Likewise, in the case of Web standards, let's deny developers the right to directly create HTTP URIs and force them to choose between GOOP and CONCEPT URIs instead. The protocols associated with the identifiers should be identical. Looking around the Web, I suspect we could solve the problem of world hunger by eating Web developers who confuse the two.
June 5, 2009
AtomPub, AtomPub, AtomPub
In the beginning, HTTP defined a create, read, update, and delete (CRUD) model for managing and using Web resources. The fact that these operations map to HTTP methods named POST, GET, PUT, and DELETE obfuscates things a little, but not significantly. And although it may not be obvious, anything can be identified as a resource on the Web. More will be said on the significance of this below.
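The mapping just described is small enough to write down as a lookup table. HttpCrud and Operation are names I made up for this sketch.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

/** Sketch: the CRUD-to-HTTP-method mapping described above. */
final class HttpCrud {

    enum Operation { CREATE, READ, UPDATE, DELETE }

    static final Map<Operation, String> METHODS;

    static {
        Map<Operation, String> methods = new EnumMap<>(Operation.class);
        methods.put(Operation.CREATE, "POST");
        methods.put(Operation.READ, "GET");
        methods.put(Operation.UPDATE, "PUT");
        methods.put(Operation.DELETE, "DELETE");
        METHODS = Collections.unmodifiableMap(methods);
    }

    static String methodFor(Operation operation) {
        return METHODS.get(operation);
    }

    public static void main(String[] args) {
        for (Operation operation : Operation.values()) {
            System.out.println(operation + " -> " + methodFor(operation));
        }
    }
}
```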
Since a CRUD model is integral to the HTTP specification, the need for AtomPub may seem counterintuitive. In fact, the AtomPub CRUD model does not replace the HTTP CRUD model; it just explains how to apply it to Web resources that exist in the context of "collections". This need originated in the blogosphere where developers wanted to manage Web documents known as "blog entries" in conformance with the HTTP CRUD model. Fortunately, the AtomPub specification did not couple tightly with this type of Web document and developers are slowly realizing it can be used to manage any collection of Web documents.
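Here is a toy, in-memory sketch of what "in the context of collections" means: create goes to the collection URI, and everything else goes to the member URI that the create hands back. CollectionSketch is a hypothetical class, entries are plain strings rather than Atom entries, and minting member URIs from a counter is my assumption; a real server picks its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: collection-style CRUD in the spirit of AtomPub, kept in memory. */
final class CollectionSketch {

    private final String collectionUri;
    private final Map<String, String> members = new LinkedHashMap<>();
    private int nextId = 1;

    CollectionSketch(String collectionUri) {
        this.collectionUri = collectionUri;
    }

    /** POST to the collection URI: creates a member and returns its new URI (201 + Location). */
    String post(String entry) {
        String memberUri = collectionUri + nextId++;
        members.put(memberUri, entry);
        return memberUri;
    }

    /** GET on a member URI: returns the entry, or null where a server would answer 404. */
    String get(String memberUri) {
        return members.get(memberUri);
    }

    /** PUT on a member URI: replaces an existing entry; false where a server would answer 404. */
    boolean put(String memberUri, String entry) {
        if (!members.containsKey(memberUri)) return false;
        members.put(memberUri, entry);
        return true;
    }

    /** DELETE on a member URI: removes the entry; false if there was nothing to remove. */
    boolean delete(String memberUri) {
        return members.remove(memberUri) != null;
    }

    public static void main(String[] args) {
        CollectionSketch entries = new CollectionSketch("http://example.org/entries/");
        String uri = entries.post("first draft");
        entries.put(uri, "second draft");
        System.out.println(uri + " = " + entries.get(uri));
        System.out.println("deleted: " + entries.delete(uri));
    }
}
```

Nothing in the sketch cares whether a member is a blog entry or some other kind of Web document.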
[As an aside, a create, update, and delete model also lies at the heart of SRU Update. The key difference is that AtomPub chose to build on Web standards whereas SRU Update chose to build on SOAP standards. If you are wondering if SRU Update is for you, consider the fact that the SRU Update CrUD model will end up getting tunneled over HTTP despite the competing models. Why add the overhead?]
Getting back to an earlier point, people are realizing that AtomPub is effective at managing collections of Web documents other than blog entries. What may not be obvious is that there are other types of Web resources besides the familiar Web documents we all know and love. Specifically, Linked Data tells us that Real World Objects can also be identified as Web resources. Just as AtomPub accommodated the mental migration from blog entries to Web documents, it is equally effective for managing collections of Real World Objects. If it's not clear how this can be done, start with this earlier blog entry and watch this blog space for followups.