OpenURL
August 13, 2008
Faith in HTTP
In general, OpenURL 1.0 does not assume an HTTP-based Transport. As stated in the preamble to Z39.88-2004:
The definitions of all concepts are separated from their representation and the protocol by which the representations are transported.
Unfortunately for the adoption of OpenURL, HTTP is not just a Transport mechanism. It is also a messaging system that defines a set of concepts that frequently overlap with OpenURL concepts. In practice, OpenURL concepts are tunneled through HTTP concepts with many of the latter equivalents being side-lined in favor of the former. I'm working on a mapping between the two, but it will take time.
This bifurcation of concepts was bound to cause confusion. Sometimes it is best to represent a concept in the OpenURL formulation, and sometimes it is best to represent it in the HTTP formulation. The two are not mutually exclusive, as I suggested in an earlier entry on "OpenURL and Trust". Unfortunately, Z39.88 only recognizes the abstracted Transport aspects of HTTP, so it provides no clues to help sort out the messaging overlap.
IMO, OpenURL should abandon the theoretical possibility of other types of transports and embrace HTTP concepts rather than reinvent them. This doesn't mean that OpenURL must buy into HTTP's conventional serialization of these concepts, e.g. as HTTP headers, but an alternative KEV or XML serialization of them might make OpenURL attractive to a broader audience.
OTOH, I've argued that OpenURL is founded on the intuitive principles of Q6: what, why, who, where, when, and how. I believe that OpenURL has demonstrated the efficacy of the Q6 principle and that the HTTP world would also benefit from this insight. How OpenURL can lead the way given its current transport agnosticism and competing conceptual model, though, is worrisome to me.
July 21, 2008
The Relationship Between OpenURLs and URIs
My desire to reconcile OpenURL jargon with the rest of the world led me to make an awkward assertion in an earlier blog entry.
An OpenURL identifies a set of assertions.
I just read the Uniform Resource Identifiers section of the HTTP/1.1 spec, and it occurs to me that this statement is unnecessarily warped. To quote HTTP/1.1,
As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings which identify--via name, location, or any other characteristic--a resource.
It occurs to me now that OpenURL falls into this "other characteristics" category.
I am fortunate that my colleague Stu Weibel broke me of the misconception that identity and resolution are synonymous many years ago. In my mind, I conquered this riddle by imagining that every URI, no matter how complicated, was a name and that resolution was orthogonal to this. In effect, the concept of "location" is factored out. Any common identity behind these names, meanwhile, exists only in our imaginations unless aided by some, possibly unknown, resolution mechanism(s).
If you can live with ambiguous identities, treating all URIs as names with orthogonal resolution mechanisms helps clarify some troublesome issues. And this trick works pretty well as long as you believe that URIs fall into two categories: names and locations. When you throw in the possibility of "other characteristics", though, the model where "every URI is a name" gets tortured and you say insightful (to me at least) but weird things like "an OpenURL identifies a set of assertions". Yeah, this is a true statement if you don't mind thinking in terms of absurdly complicated names, but it is so much easier when you think about OpenURL as a formatted string that identifies a resource via "other characteristics".
So the question remains, what are these "other characteristics" that OpenURL reveals as being usefully represented as URIs? The answer is Q6: who, what, where, why, when, and how.
July 17, 2008
OpenURL Conformance
In a comment to an earlier post, Karen Coyle raises the issue of "conformance for Resolvers". I want to repeat the last half of her first quote with the emphasis that is included in the spec:
Resolver behavior and usage are outside of the scope of this Standard. However, a community may use a Community Profile to define conformance for Resolvers that operate in its application domain.
A Community Profile like SAP1 can be both constrained and open-ended at the same time with uncertainty about conformance criteria being the result. As far as I can tell, a server that recognizes even the tiniest fraction of a ContextObject, even by accident, entitles someone to assert that it is an OpenURL SAP1 Resolver. Even if a Requester gets back a 404 NOT FOUND for every ContextObject that it throws at this server, it does not imply that it is not a SAP1 Resolver or does not conform. Maybe that Resolver just operates in a SAP1 domain, such as "lawn services", where the Referrer's Entities do not apply. (Have I mentioned lately that I still have hope for OpenURL extending beyond the scholarly information domain someday?)
July 14, 2008
An OpenURL Identifies a Set of Assertions
In my last post, I questioned the trustworthiness of the assertions being tunneled in OpenURLs and suggested that HTTP/1.1 provided more reliable and agent-friendly alternatives for some of them. While this is true, it misses a vital point. Many assertions related to the OpenURL domain model, which I claim can account for everything that happens on the Internet, deserve and need to have unique identifiers. These identifiers can be most conveniently formulated as absolute or relative URIs. The same can be said of a given set of assertions. Identifying a set of assertions is the essential and valuable purpose of OpenURLs. The question of the trustworthiness of these assertions is orthogonal to OpenURL. Generally speaking, the inclusion of Descriptors that are better served natively using HTTP/1.1 tends to pollute the identity of the essential assertions.
[2008-07-21] Please see this updated entry for more information.
July 14, 2008
OpenURL and Trust
Karen Coyle posted a comment on an earlier post that has me scratching my head. What follows may not answer her question, but it seems to have some relevance.
A ContextObject Representation may contain a set of ReferringEntities that assert directed reference relationships from the identified ReferringEntity resources to the specified Referent Entity. In fact, these assertions may not be true at all, and even if they are true it does not necessarily prove that the request originated from such a source.
The problem of tunneling assertions inside an OpenURL ContextObject is even more acute in relation to the optional Requester Entity, which is easy enough to spoof. If trust is important, OpenURL presumably defers to the Transport mechanisms to supply solutions, although this implication did not reveal itself on a quick glance through the spec. Since all of the existing Transports are HTTP-based and HTTP/1.1 provides native alternatives for the ReferringEntity (the Referer header) and the Requester Entity (e.g. the IP address, Authorization, From, Proxy-Authorization, and User-Agent headers), it is not obvious why a particular OpenURL community (e.g. SAP1) would encourage agents to use these relatively untrustworthy alternatives that only OpenURL-aware agents can comprehend. Just because you can map Entities from the OpenURL domain model to a URI query does not necessarily mean that you should.
The reason hinted at in the Preamble of the spec is that a community might want to leave the door open for some future non-HTTP-based Transport. If this is true, the loss of trustworthiness implied by this type of tunneling seems like a high price to pay for two birds in the bush.
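To make the overlap concrete, here is a minimal Python sketch that gathers what a server could learn about the ReferringEntity and the Requester from both places: the native HTTP/1.1 headers and the tunneled KEV keys. The function name, the "http:" and "kev:" labels, and the plain-dict representation of request headers are assumptions made for illustration; nothing here comes from the spec.

```python
from urllib.parse import parse_qs

# HTTP/1.1 request headers that overlap with OpenURL Entities, as listed above.
HEADER_TO_ENTITY = {
    "Referer": "ReferringEntity",
    "From": "Requester",
    "User-Agent": "Requester",
    "Authorization": "Requester",
    "Proxy-Authorization": "Requester",
}

# KEV keys that tunnel the same two Entities inside the query string.
KEV_TO_ENTITY = {"rfe_id": "ReferringEntity", "req_id": "Requester"}


def entity_hints(headers, query_string):
    """Collect hints per Entity from HTTP headers and from tunneled KEV keys.

    `headers` is assumed to be a plain dict with canonical header names.
    """
    hints = {"ReferringEntity": {}, "Requester": {}}
    for name, entity in HEADER_TO_ENTITY.items():
        if name in headers:
            hints[entity]["http:" + name] = headers[name]
    kev = parse_qs(query_string)  # percent-decodes the values
    for key, entity in KEV_TO_ENTITY.items():
        if key in kev:
            hints[entity]["kev:" + key] = kev[key][0]
    return hints
```

Note that the sketch only puts the two sources side by side. Deciding which one to believe when they disagree is exactly the trust question raised above.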
July 9, 2008
What is OpenURL and why should I care?
The first thing I want to say here is that OpenURL's legacy, specification, jargon, and Registry limitations conspire to obfuscate the natural domain model that lies at the heart of OpenURL 1.0. One of the missions of this (unofficial) Q6 blog is to try to rescue the truth of this domain model from the myths. Unfortunately, there are a lot of them.
In my opinion then, the key importance of OpenURL is that this domain model is capable of describing every operation that happens on the Internet, as they exist today, at a fundamental level. Something akin to this model is floating around in the back of every programmer's mind, but not in a systematic way. Since programmers don't deal with this model in a systematic way, they miss important opportunities for automation and interoperability.
For example, our claim that "All Operations are CRUD" is revealed in OpenURL by the practical (and currently exclusive) use of HTTP-based Transports which are founded on the POST, GET, PUT, and DELETE methods. The claim that "Everything is a Resource" can be inferred from OpenURL's concept of an Entity.
Deep inside this domain model, though, is a smaller and more intuitive data model that the rest of the programming world seems to overlook. In essence, every single data element in every single request that gets transported on the Internet (and beyond) can be classified according to Q6: who, what, where, why, when, and how. Unfortunately, OpenURL gave these simple concepts obfuscated labels and registered a set of ContextObject Formats to enshrine them. Fortunately, the Registry makes OpenURL extensible so the hope for simpler Formats isn't lost.
Above, I argued that the OpenURL domain model can be used to model every operation that happens today on the Internet. According to the "Trademarks, Service Marks" section of Z39.88-2004, OpenURL 1.0 is up for review in 2009. That might be a good time to bring the concepts in the OpenURL domain model into alignment with other web standards so that the benefits of OpenURL will be more apparent to others.
Jeff
BTW, Tony Hammond shared this poem with me: I Keep Six Honest Serving Men ... by Rudyard Kipling. It seems apropos.
July 3, 2008
Agent-Friendly Responses
The history of OpenURL biases us to believe that Requesters tend to be humans sitting at a web browser and that Resolvers should cater to them primarily. After all, neither OpenURL nor any of the existing Community Profiles define response criteria. Requesters unknowingly reap the benefits of well-modeled OpenURL requests, but their agents generally can't do much with the response except display it to a human.
My first thought for solving this was for Referrers to put (aka "tunnel") ServiceType Descriptors in the ContextObject that Resolvers could use to vary the response (e.g. MARCXML vs. Dublin Core XML). By using HTTP Identifier descriptors this wouldn't require any changes at all to existing OpenURL Components. The worst that should happen is that Resolvers would ignore the unrecognized Descriptors. At least then the Requesters would have some hope of using the response in an automated way.
It quickly sank in that Requesters, not Referrers, should have control of this. Furthermore, HTTP already supplies a mechanism that is OpenURL-compatible. By specifying a variety of HTTP header fields, a Requester can have a great deal of control over the response. If OpenURL Resolvers could be convinced to pay attention to this information, agents (and not just OpenURL-aware agents) could build sound automated infrastructures.
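As a rough illustration of what "paying attention" could mean, here is a minimal Python sketch of a Resolver choosing a response format from the Requester's Accept header. The function names and the SUPPORTED table are hypothetical. In particular, using application/xml to stand in for Dublin Core XML is an assumption, and the parser ignores wildcards other than */* for brevity.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # HTTP/1.1 default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((fields[0].lower(), q))
    # sorted() is stable, so ties keep the order the Requester wrote them in.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


# Hypothetical formats one Resolver might offer.
SUPPORTED = {
    "application/marcxml+xml": "MARCXML",
    "application/xml": "Dublin Core XML (assumed media type)",
    "text/html": "HTML page for humans",
}


def choose_format(accept_header, default="text/html"):
    """Pick the first supported media type in the Requester's preference order."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return default
```

With this in place, choose_format("application/marcxml+xml;q=0.9, text/html;q=0.5") selects MARCXML without a single extra Descriptor in the ContextObject.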
Jeff
July 2, 2008
The Potential of OpenURL
According to Wikipedia "The most common application of OpenURL is to provide appropriate copy resolution". This can be modeled like so:

OpenURL has effectively cornered the niche market where Referrers need to be decoupled from Resolvers. Meanwhile, the rest of the Internet hums along with this model:

If an OpenURL ContextObject Format was created to embrace this unified relationship, which is basically a constraint on the existing data model, the extended benefits of OpenURL might be more apparent to others.
I'm reluctant to give examples because you probably won't believe how simple a resolvable OpenURL could be in this scenario. It is a risk I have to take. Here are two possible examples:
http://referrer.resolver.org/rft_dat_value
http://referrer.resolver.org/rft_dat_value?req_id=mailto:foo@bar.com&rfe_id=http://referringentities.org/baz&svc_id=http://servicetypes.org/quux
Many applications don't need to deal with a confusion of Identifier, Metadata, and Private Data Descriptors. In this case another ContextObject Format could result in something like this:
http://referrer.resolver.com/what?who=mailto:foo@bar.com&where=http://referringentities.org/baz&why=http://servicetypes.org/quux
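For anyone who wants to play with this shape, here is a minimal Python sketch that builds and reads back URLs like the last example. The function names q6_url and q6_parse are hypothetical, and the who/where/why keys belong to the imagined ContextObject Format above, not to anything in the Registry.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit


def q6_url(base, what, **context):
    """Build a URL whose path names the 'what' and whose query carries the rest."""
    path = quote(what, safe="")
    # Leave ':', '/' and '@' unescaped so the result reads like the examples above.
    query = urlencode(context, safe=":/@")
    return base.rstrip("/") + "/" + path + ("?" + query if query else "")


def q6_parse(url):
    """Split such a URL back into a dict keyed by the Q6 interrogatives."""
    parts = urlsplit(url)
    q6 = {"what": unquote(parts.path.lstrip("/"))}
    for key, values in parse_qs(parts.query).items():
        q6[key] = values[0]  # keep only the first value of a repeated key
    return q6


url = q6_url(
    "http://referrer.resolver.com",
    "what",
    who="mailto:foo@bar.com",
    where="http://referringentities.org/baz",
    why="http://servicetypes.org/quux",
)
```

Here url comes out identical to the example above, and q6_parse(url) returns the four values keyed by what, who, where, and why.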
Jeff
July 2, 2008
OpenURLs are analogous to Relative URIs
Please note that I have no desire to "pick sides" here. I firmly believe that OpenURL has merits that are somehow separate from this analysis and difficult to express in a short blog entry. In part, the Q6 blog exists in general to examine the merits and demerits of OpenURL. I'm tempted to say that the differences outlined here are "structural", but that doesn't really put a finger on it.
- Relative URIs and OpenURLs both have structure. In the case of OpenURLs this structure conforms to an explicit or implicit ContextObject Format, but only OpenURL-aware agents (i.e. relatively few) will be aware of this
- Relative URIs must(?) begin with a path component from the URI, whereas the current crop of OpenURL ContextObject Formats excludes this possibility
- OpenURL ContextObjects can be related, via Community Profiles, to zero or more Resolvers. Referrers have two options when constructing an OpenURL ContextObject:
- Leave the Resolvers unspecified
- Encode one or more Resolvers in the ContextObject
Downstream, OpenURL-aware agents constructing a ReferringEntity on behalf of Requesters have these options:
- Bind the OpenURL to a Resolver listed in the ContextObject
- Bind the OpenURL to their Resolver of choice (e.g. an institutional Resolver)
- Encode the OpenURL in a COinS and thus pass the problem along to a COinS-aware agent (some listed here)
- Bind the OpenURL to a gateway service (e.g. OCLC's Resolver Registry) to outsource the ultimate binding
Note that the agent constructing the ReferringEntity may be the Referrer.
In contrast, relative URIs are generally(?) bound (e.g. in HTML) to a single base URI. The uncertainty of constructing a resolvable URI never comes up.
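The contrast can be sketched in a few lines of Python. This is only an illustration under assumptions: the example.org, example.edu, and example.net addresses are made up, and bind is a hypothetical helper that treats a KEV ContextObject as a query-only relative reference.

```python
from urllib.parse import urljoin

# A relative URI in an HTML page has exactly one base to resolve against.
page = "http://example.org/articles/index.html"
image = urljoin(page, "figures/dog.png")

# A KEV ContextObject is only a query string until someone picks a Resolver.
context_object = "rft_id=http%3A%2F%2Fexample.org%2Fthing"


def bind(kev, resolver):
    """Resolve the KEV string against a Resolver base URL.

    Any query already present on the Resolver URL is replaced, which is
    how a query-only relative reference behaves.
    """
    return urljoin(resolver, "?" + kev)


# The same ContextObject can be bound to any number of Resolvers.
candidates = [
    "http://resolver.example.edu/openurl",
    "http://gateway.example.net/registry",
]
bound = [bind(context_object, resolver) for resolver in candidates]
```

The relative URI yields one answer, while the ContextObject yields one resolvable URI per candidate Resolver, which is where the uncertainty comes from.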
June 30, 2008
The Fundamental Mysteries of OpenURL
My colleague Andrew Houghton and I have been comparing notes lately and have come to some surprising realizations. Below are three mysteries that everyone who loves and/or hates OpenURL (not to mention the unwashed masses who have never even heard of it) should ponder deeply:
Everything is a Resource
Believe it or not, everything is a resource. I mean everything. The set of resources identified by URIs is merely an infinitesimal subset. Furthermore, every single URI that gets minted implies a unique resource. Even for a single resource (e.g. an image of a dog), the set of possible URIs (and thus implied unique resources) could easily stretch out to infinity (e.g. by appending a query string with a floating point rotation value).
Since OpenURLs are ultimately URIs, each of these likewise implies a unique resource. Confronted with a potentially infinite variety of OpenURLs, most Requesters will have no hope of recognizing that a single resource (aka Referent) stands behind them all. If this doesn't bother you, it should. Stay tuned for the solution...
All Operations are CRUD
I have long wondered why OpenURL doesn't require at least one ServiceType Entity the way it requires a Referent Entity. At the very least, why not mention the option of an implied default ServiceType? After all, Requesters are going to expect some non-random task to be performed when they resolve an OpenURL.
Suddenly, this makes perfect sense to me. An OpenURL can be understood by any agent (and not just those that are OpenURL-aware) to be a protocol-based absolute or relative URI for a web resource. In HTTP terms, the common operations performed on resources are POST, GET, PUT, and DELETE. Note that these operations are commonly referred to as Create, Read, Update, and Delete (CRUD). It would be odd and unnecessary to construct an OpenURL with a create, read, update, or delete ServiceType Descriptor that gets invoked via HTTP GET. This would amount to "tunneling" operations over HTTP that HTTP can perform natively. Even a search operation boils down to an HTTP GET operation on a resource URI. After all, a search service is "something" and, as argued above, "Everything is a Resource". HTTP GET is just the right operation to make it return a useful response.
Any belief that OpenURL supports operations other than CRUD betrays an OpenURL-bias that the non-OpenURL world will never buy into. If this bothers you (and it should), try to imagine instead that it is a good thing. Stay tuned for the solution...
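A minimal Python sketch of the mapping, assuming a toy in-memory store: the ResourceStore class is hypothetical, and treating POST to a URI as "create that URI" is a simplification that follows the POST/GET/PUT/DELETE pairing above rather than everything HTTP allows.

```python
# The pairing of HTTP methods and CRUD operations described above.
CRUD_BY_METHOD = {
    "POST": "create",
    "GET": "read",
    "PUT": "update",
    "DELETE": "delete",
}


class ResourceStore:
    """Toy server: every URI is a key, every request is a CRUD operation."""

    def __init__(self):
        self._resources = {}

    def handle(self, method, uri, body=None):
        """Return an (HTTP status code, body) pair for one request."""
        operation = CRUD_BY_METHOD[method.upper()]
        if operation == "create":
            if uri in self._resources:
                return 409, None  # Conflict: the resource already exists
            self._resources[uri] = body
            return 201, body
        if uri not in self._resources:
            return 404, None
        if operation == "read":
            return 200, self._resources[uri]
        if operation == "update":
            self._resources[uri] = body
            return 200, body
        del self._resources[uri]  # the only remaining operation is delete
        return 204, None
```

Nothing in handle needs a ServiceType Descriptor to know what to do. The method already says it.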
Q6 "Binds" Resources and CRUD Operations
Since "Everything is a Resource", and "All Operations are CRUD", where does OpenURL fit in and what are its limits? The answers lie hidden in the OpenURL domain model. Stay tuned to find out who, what, where, why, when, and how...
Implications
These three principles can be applied to everything that happens on the Internet (and far beyond). For example, random agents can understand SOAP to be a set of CRUD operations on resources that are tunneled to a SOAP service resource via HTTP POST. A demonstration of this will have to wait, though. OpenURL already lies closer to the truth, so we'll start there...
Jeffrey A. Young
Andrew Houghton
June 29, 2008
OpenURL 1.0 Definitions
It is very difficult to write about OpenURL without hotlinking all the jargon. The Z39.88-2004 specification doesn't provide convenient hotlinks for its terminology definitions, so I took the liberty of doing so here. I will convert these links to PURLs as soon as I can set it up.
[2008-06-30] The URIs for these definitions have been changed to the openurl.info domain, which is preferable to the purl.org domain I originally planned to use.
The picture of this can be found here.
- (OpenURL Framework) Application
- A networked service environment for the transportation of ContextObject Representations. The core characteristics of an Application are specified in a Community Profile.
- By-Reference Metadata
- A Descriptor that details properties of an Entity by the combination of: (1) a URI reference to a Metadata Format and (2) the network location of a particular instance of metadata about the Entity, the metadata being expressed according to the indicated Metadata Format.
- By-Reference OpenURL Transport
- A Transport that uses either the HTTP or the HTTPS network protocol for conveying over a network the reference to a ContextObject Representation. This reference is contained in the value associated with a single key within a query string, which is transported using either a GET or POST method.
- By-Value Metadata
- A Descriptor that specifies properties of an Entity by the combination of: (1) a URI reference to a Metadata Format; and (2) a particular instance of metadata about the Entity, expressed according to the indicated Metadata Format.
- By-Value OpenURL Transport
- A Transport that uses either the HTTP or the HTTPS network protocol for conveying over a network ContextObject Representations. The Representation is contained in the value associated with a single key within a query string, which is transported using either a GET or POST method.
- Character Encoding
- The combination of a character repertoire and an encoding form; a core component of the OpenURL Framework.
- Community Profile
- The definition of an Application as a list of selections for all core components of the OpenURL Framework; a core component of the OpenURL Framework.
- Context
- The network environment in which a Referent is referenced and in which a service request pertaining to the Referent takes place. In the ContextObject, the Context is expressed by five Entities: the ReferringEntity, the Requester, the ServiceType, the Resolver, and the Referrer.
- ContextObject
- An information construct that binds a description of a primary Entity — the referenced resource — together with descriptions of Entities that indicate the Context.
- ContextObject Format
- A Format to represent ContextObjects; a core component of the OpenURL Framework.
- ContextObject Representation
- The Representation of a ContextObject according to a ContextObject Format.
- Constraint Definition
- A Constraint Definition specifies syntactic and semantic constraints for the representation of a given class of resources. The constraints are specified using a Constraint Language.
- Constraint Language
- A formalism used to specify syntactic and semantic restrictions on information constructs of a given class; a core component of the OpenURL Framework.
- Descriptor
- A Descriptor specifies information about an Entity using one of the following four methods: Identifier, By-Reference Metadata, By-Value Metadata, or Private Data.
- Entity
- One of the six possible constituents of a ContextObject: Referent, Requester, Referrer, Resolver, ReferringEntity, or ServiceType.
- Format
- A concrete method of expression for a class of information constructs. It is a triple comprising: (1) a Serialization, (2) a Constraint Language, and (3) a Constraint Definition expressed in that Constraint Language.
- Identifier
- A Descriptor that unambiguously specifies an Entity by means of a URI.
- Inline OpenURL Transport
- A Transport that uses either the HTTP or the HTTPS network protocol for conveying over a network the Representation of one, and only one, ContextObject. This Representation consists of multiple key/value pairs within a query string, which is transported using either a GET or POST method.
- KEV ContextObject Format
- A ContextObject Format to represent one, and only one, ContextObject as a string of ampersand-delimited pairs, each pair consisting of a key and an associated value that is URL encoded.
- KEV ContextObject (Representation)
- A Representation of a ContextObject that conforms to the KEV ContextObject Format.
- KEV Metadata Format
- A Metadata Format to represent an Entity as a string of ampersand-delimited pairs, each pair consisting of a key and an associated value that is URL encoded.
- KEV Metadata (Representation)
- A Representation of an Entity that conforms to a KEV Metadata Format.
- KEV Serialization
- A method to hold in storage, or transmit over a network, the values within an information construct as a string of ampersand-delimited pairs, each pair consisting of a key and an associated value that is URL encoded.
- Metadata Format
- A Format to create a By-Reference Metadata Descriptor or a By-Value Metadata Descriptor of an Entity; a core component of the OpenURL Framework.
- Namespace
- The set of all Uniform Resource Identifiers that comply with a specific URI scheme or a specific URN namespace; a core component of the OpenURL Framework.
- Private Data
- A Descriptor that specifies information about an Entity using a method not defined in this Standard.
- Referent
- A resource that is referenced on a network, and about which the ContextObject is created; an Entity of the ContextObject.
- Referrer
- The resource that generates the ContextObject; an Entity of the ContextObject.
- ReferringEntity
- The resource that references the Referent; an Entity of the ContextObject.
- (OpenURL Framework) Registry
- The (OpenURL Framework) Registry provides a mechanism to record and publicize details of the core components of the OpenURL Framework: Namespaces, Character Encodings, Serializations, Constraint Languages, ContextObject Formats, Metadata Formats, Transports, and Community Profiles.
- Registry Identifier
- A unique name assigned on registration to specific Namespaces, Character Encodings, Serializations, Constraint Languages, ContextObject Formats, Metadata Formats, Transports, and Community Profiles.
- Representation
- A sequence of bytes that represents a resource according to a Format.
- Requester
- The resource that requests services pertaining to the Referent; an Entity of the ContextObject.
- Resolver
- The resource at which a service request pertaining to the Referent is targeted; an Entity of the ContextObject.
- Serialization
- A method to hold in storage or transmit over a network the values within an information construct; a core component of the OpenURL Framework.
- ServiceType
- The resource that defines the type of service (pertaining to the Referent) that is requested; an Entity of the ContextObject.
- Transport
- A network protocol and the method in which it is used to convey ContextObject Representations; a core component of the OpenURL Framework.
- XML ContextObject Format
- A ContextObject Format to represent one or more ContextObjects as an XML Document.
- XML ContextObject (Representation)
- A ContextObject Representation that conforms to the XML ContextObject Format.
- XML Document
- A sequence of bytes that satisfies the validity requirements of the Extensible Markup Language (XML) 1.0 (Second Edition) W3C Recommendation.
- XML Metadata (Representation)
- A Representation of an Entity that conforms to an XML Metadata Format.
- XML Serialization
- The method of using an XML Document and XML Format to represent a ContextObject.
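To illustrate the KEV Serialization definition above ("a string of ampersand-delimited pairs, each pair consisting of a key and an associated value that is URL encoded"), here is a minimal Python sketch. The function names are hypothetical, keys are assumed to need no encoding, and the example values are made up.

```python
from urllib.parse import quote, unquote


def kev_serialize(pairs):
    """Join (key, value) pairs with '&', URL-encoding each value."""
    return "&".join(f"{key}={quote(value, safe='')}" for key, value in pairs)


def kev_parse(text):
    """Split a KEV string back into (key, decoded value) pairs, in order."""
    pairs = []
    for item in text.split("&"):
        if not item:
            continue  # tolerate empty segments such as a trailing '&'
        key, _, value = item.partition("=")
        pairs.append((key, unquote(value)))
    return pairs


pairs = [
    ("rft_id", "http://example.org/a b"),
    ("req_id", "mailto:foo@bar.com"),
]
kev = kev_serialize(pairs)
```

A list of pairs is used instead of a dict because the same key may appear more than once and the order of pairs is preserved.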
June 27, 2008
The OpenURL Domain Model
Time flies. It's been a year and a half since my last posting to the Q6 blog, but I haven't forgotten about OpenURL by a long shot. Below is a (draft) UML Domain Model for OpenURL 1.0.
[2008-6-29] I took the liberty of copying the definitions for these terms from the Z39.88-2004 spec and assigning unique URIs to them. These definitions can be found here. This diagram is derived almost exclusively from these definitions and the linkages mentioned there.

September 30, 2006
Foreign Keys Revisited
Earlier, I dismissed Foreign Keys as an illegitimate part of the OpenURL model because the essence of OpenURL (Q6) should be adequate to express any imaginable web service request. For future reference, though, I want to point out that the OpenURL spec defines other abstractions beyond the Q6 principle and the inspiration of their standardization isn't adequately appreciated yet.
I am coming to the conclusion, though, that the potential of these abstractions is currently in bondage to the limitations of the OpenURL Registry. Let's take OpenURL 0.1 as an example. According to Appendix A of the KEV guidelines, a 0.1 "sid" key maps to a 1.0 "rfr_id" key. In other words, a "sid" IS a Referrer Identifier Descriptor in the conceptual sense, but for some strange reason OpenURL 1.0 isn't able to recognize it as such. I call this "strange" because the KEV Serialization specification itself places no such restriction on which keys are allowed or how they should be interpreted and neither does the MTX Constraint Language. It seems, then, that the KEV guidelines document implicitly assumes the Key/Encoded-Value ContextObject Format, without acknowledging the possibility of other KEV formats. From what I can tell, the OpenURL 1.0 spec itself makes no such assumptions, and yet on this basis 0.1 keys are relegated to the rusty bucket called "Foreign Keys" where they are expected to be ignored by legitimate resolvers.
This pickle seems to stem from the Key/Encoded-Value ContextObject Format. Even here, though, the binding of KEV keys to specific entities is only expressed conceptually. For example, there is no clue that would allow an automated process to recognize the key/entity relationship. Since ContextObjects are a component of the registry, why can't I create a new KEV ContextObject Format that uses a different set of keys? I tried doing just that in my initial OAI-PMH v2.0 exercise, but with unsatisfactory results. Nevertheless, this is the logical place to do so despite (or as demonstrated by) my flawed attempt at a workaround.
To state this differently, OpenURL 0.1 and OAI-PMH v2.0 can be understood conceptually according to the existing abstractions, but this conception can't be expressed in the registry without a complete (and inconvenient) reformulation of their keys. Unfortunately, people's conception of OpenURL seems to bend and even stop based on the registry's limited capabilities. This is extremely unfortunate and could account for Karen Coyle's comment that the committee considered separate standards for ContextObjects and OpenURL. I think this posting demonstrates they were on the right track, but I also think a better dividing line would have been between the OpenURL abstractions and the registry instead. To be specific, I want to use all of these wonderful abstractions in OOM, but biases in the registry mechanism seem to be getting in the way.
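The key reformulation mentioned above can be sketched in Python. This is a sketch under assumptions: only the sid to rfr_id pair is taken from this post, the function name is hypothetical, and the values are passed through untouched even though a real conversion might also need to rewrite them.

```python
from urllib.parse import parse_qsl, urlencode

# Only sid -> rfr_id comes from this post; a fuller table would have to be
# filled in from Appendix A of the KEV guidelines.
LEGACY_KEY_MAP = {"sid": "rfr_id"}


def upgrade_keys(query_string, key_map=LEGACY_KEY_MAP):
    """Rename known 0.1 keys to their 1.0 equivalents; leave other keys alone."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return urlencode([(key_map.get(key, key), value) for key, value in pairs])
```

For example, upgrade_keys("sid=vendor%3Adb&issn=1234-5678") renames the first key and leaves the second where it is. The point is that the mapping is a one-line table lookup, yet the registry has no place to record it.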
September 28, 2006
OAI-PMH v2.0 Revised
This is a revision to the OAI-PMH v2.0 ContextObject Format exercise I posted about a couple days ago. Thanks to some helpful suggestions, I think I now see a much more natural way to structure this effort.
Instead of defining an OAI-PMH v2.0-specific ContextObject Format, this time I created a reusable Query String ContextObject Format. This new format would open the door to many other protocols like SRU. The problems and criticisms levied against the earlier solution are mostly replaced by a new set of issues (which I believe are less offensive.)
- To make this work with MTX, each OAI verb needs to have its own Metadata Format (e.g. ListRecords) so that the key combination constraints can be adequately specified. Admittedly, this solution would add clutter to the OpenURL Registry, but it does fit the model and there is precedent for using a Metadata Format to convey things other than traditional metadata (viz. info:ofi/fmt:kev:mtx:sch_svc.) I believe that this clutter can be blamed on MTX and if it is too offensive, a new Constraint Language could be defined to avoid it.
- The querystring format isn't very intelligent. Basically, all the keys must be assumed to be Metadata By Value and assigned to the Referent. Over on the OpenURL Object Model side, though, the Transport class selected to process the request can afford to be much more intelligent about placing these keys in appropriate Entity containers for distribution to the Services.
Keep in mind that this is just an exercise to determine the limits of OpenURL and its registry mechanism. I don't necessarily expect these components to be added to the registry for real someday. OTOH, an implementation of OAI-PMH v2.0 is well within the capabilities of OOM, so it would be nice if the registry were (at least in theory) equally accommodating.
September 25, 2006
Retrospective OpenURL: OAI-PMH 2.0
I have claimed that any web service already in existence could be represented in the OpenURL Registry. Unfortunately, certain aspects of this are subject to interpretation and would benefit from some community feedback. To start out easy, I will choose an example that shares some ancestry with OpenURL: OAI-PMH v2.0. Here, then, is an OAI-PMH v2.0 ContextObject Format to represent that protocol. Once a few details are ironed out with this, I should be able to create a Transport and Community Profile to go with it.
Here are some issues to be negotiated:
- The identifier I chose for this format is info:ofi/fmt:kev:mtx:oai2.0
- It's not obvious to me how (or even if) to specify an enumerated set of values for the "verb" key in MTX.
- It's not obvious to me how to indicate that certain keys can or must be used in combination with one another in MTX (and how this impacts the min/max occurrences columns.)
- My descriptions could probably be better
- I probably took too great a liberty by claiming that the "identifier" value was of type <id> with the "info:oai/" portion being implied
- I utilized info:oai even though it hasn't been registered in the "info" URI Registry yet.
- MTX doesn't have any place to indicate which OpenURL Entity a key is associated with
I used the MTX constraint because it got me pretty close to where I wanted to go. It may turn out, though, that MTX isn't adequate for this purpose, and another constraint language will need to be invented. Comments, criticisms, and suggestions are welcome.
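To make the "verb" enumeration and key-combination issues concrete, here is a rough Python sketch of the constraints a richer constraint language would need to express. The verbs, arguments, and error codes are those of OAI-PMH v2.0; the table layout and the validate function are my own invention for illustration and are not part of MTX or the registry.

```python
# Allowed arguments per OAI-PMH v2.0 verb (excluding resumptionToken).
HARVEST_ARGS = {"required": {"metadataPrefix"}, "optional": {"from", "until", "set"}}
OAI_VERBS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
    "ListIdentifiers": HARVEST_ARGS,
    "ListRecords": HARVEST_ARGS,
}
# Verbs that accept resumptionToken as an exclusive argument.
RESUMABLE = {"ListSets", "ListIdentifiers", "ListRecords"}


def validate(params):
    """Return a list of error strings; an empty list means the request is valid."""
    verb = params.get("verb")
    if verb not in OAI_VERBS:
        return [f"badVerb: {verb!r}"]
    args = set(params) - {"verb"}
    if "resumptionToken" in args:
        if verb not in RESUMABLE:
            return [f"badArgument: resumptionToken not allowed with {verb}"]
        if args != {"resumptionToken"}:
            return ["badArgument: resumptionToken must be the only argument"]
        return []
    rule = OAI_VERBS[verb]
    errors = [f"badArgument: missing {name}" for name in sorted(rule["required"] - args)]
    unexpected = args - rule["required"] - rule["optional"]
    errors += [f"badArgument: unexpected {name}" for name in sorted(unexpected)]
    return errors
```

The table covers the enumerated verbs and the required/optional split, but the resumptionToken rule ("this key excludes all others") needs procedural code, which is exactly the kind of combination rule I don't see how to state in MTX.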
September 23, 2006
Foreign Keys Revealed
According to the Q6 interpretation, all of the information manifested in a web service request can be mapped to one of the "wh" interrogatives: what, who, why, where, and when (in contrast, the "how" interrogative determines the structure used to convey the other five.) Since these "wh" interrogatives all map to OpenURL Entity types (well, all except "when"), we have every right to believe that a ContextObject containing such Entities would be all we need to represent the entirety of a web service request. If this is true, what's up with Foreign Keys?
IMO, foreign keys shouldn't be considered to be a legitimate part of the OpenURL model. I believe that the only reason they exist at all is because the San Antonio Community Profiles wanted to be backward compatible with OpenURL 0.1 but they felt funny cluttering up their ContextObject Formats with application-specific legacy keys. I appreciate that they wanted to create reusable ContextObject Formats, but in hindsight, there are too many problems to believe that other applications will build on them. Under the circumstances, I wish they had bitten the bullet and created a variant format that went ahead and included these application-specific keys rather than tempt and confuse the rest of us with this idea of general-purpose foreign keys.
September 23, 2006
Three Levels of Interoperability
Earlier I suggested that the OpenURL models could be applied at two levels: over the network and within an application. I would like to consider this again from a slightly different angle. This time I imagine three levels of interoperability that OpenURL facilitates. I will call these levels Transport, Descriptive, and Code interoperability.
Transport interoperability refers to the fact that service mashups are easier to create if service providers conform to standard invocation patterns. For example, the existing OpenURL Transports were designed to support any imaginable service. Even if you are skeptical of this particular list (as I am these days), everyone would agree that the fewer the patterns, the better for mashups.
As an aside, I used to believe that OpenURL 1.0 was (rather than provided a framework to define) a set of protocols. (This error was reinforced by their decision to manifest "foreign keys" into the heart of the standard to accommodate OpenURL 0.1 keys but not to define a separate Community Profile for it.) I imagined that these OpenURL protocols could be used in conjunction with the other popular protocols (e.g. OAI-PMH and SRU) to lead us to a golden age of interoperability. Not only was I wrong about OpenURL being a protocol, but I was also naive about the effectiveness of those particular Transports and ContextObject Formats (as explained here, here, and here). Fortunately for my interest in OpenURL, the current crop of Transports and ContextObject Formats aren't really as important as I had imagined.
Descriptive interoperability refers to the fact that any web service in existence can be made 100% OpenURL-compliant without changing a single line of code. This can be done by registering its components in the OpenURL Registry. In theory, these descriptions could be used by automated processes to dynamically build and connect services. In practice, though, there are some problems (e.g. with ContextObject Formats) that will limit this potential.
Code interoperability relates to the fact that the OpenURL model can also be applied at the code level according to the OpenURL Object Model (OOM). OOM (which is still in an alpha state) is an application framework that should allow developers to download components written by others (e.g. Transports, Services, and Metadata Formats) and easily integrate them into their own applications.
I believe that the confusions and concerns I expressed in the first two cases explain why OpenURL hasn't realized its full potential yet. Those problems aren't relevant to the third case, though, which is why I think it holds the greatest potential for interoperability. I have enough experience working with OOM now that I can say with confidence it could be used to efficiently implement any imaginable web service. I can appreciate the fact, though, that people might want to see some proof of this before they buy into the idea.
Can anyone think of additional issues or levels of interoperability that I've overlooked?
September 22, 2006
Authentication/Authorization
OpenURL doesn't define any abstractions for authentication/authorization. Presumably, this information would be represented as descriptors in the Requester Entity (Q6:"Who"). If we imagine an application that uses email addresses to identify users, we would expect that value to be passed as an identifier (e.g. req_id="mailto:foo@bar.com"). If the application expects a string instead, we are forced to use PrivateData (e.g. req_dat="userID:foo"). Passwords would need to be passed in as PrivateData (e.g. req_dat="passwd:87YEGN7DF"). We might also imagine passing this stuff in as "metadata by value".
The preceding analysis assumes we are using the existing KEV Serialization. We might want to consider registering a new Serialization that isn't so procrustean (e.g. ?userID=foo@bar.com&passwd=87YEGN7DF). At this point, though, the limited OpenURL categories for descriptors are a stumbling block since the assignments in this case aren't obvious (are these identifier, private data, or metadata-by-value descriptors?) My interpretation is that the registered Serialization definition gets to decide which of these procrustean categories will be used to represent each in the CommunityProfile definition.
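The two serializations can be compared side by side in a short Python sketch. The KEV keys (req_id, req_dat) and the sample values come from the examples above; the FLAT_SERIALIZATION table, its category names, and the parse_flat function are hypothetical, standing in for whatever a registered Serialization definition would actually say.

```python
from urllib.parse import parse_qs, urlencode

# Existing KEV Serialization: the key names carry the Entity and category.
kev_query = urlencode([
    ("req_id", "mailto:foo@bar.com"),    # Requester identifier
    ("req_dat", "passwd:87YEGN7DF"),     # Requester PrivateData
])

# Hypothetical flat Serialization: a lookup table (an assumption, not a
# registry entry) decides which Entity and descriptor category each key gets.
FLAT_SERIALIZATION = {
    "userID": ("requester", "identifier"),
    "passwd": ("requester", "private_data"),
}


def parse_flat(query_string):
    """Sort flat keys into Entity/category buckets using the table above."""
    context = {}
    for key, values in parse_qs(query_string).items():
        if key not in FLAT_SERIALIZATION:
            raise ValueError(f"key not defined by this serialization: {key}")
        entity, category = FLAT_SERIALIZATION[key]
        context.setdefault(entity, {}).setdefault(category, []).extend(values)
    return context


flat_context = parse_flat("userID=foo@bar.com&passwd=87YEGN7DF")
```

The client sees friendlier keys in the flat form, and the awkward question of categories moves out of the URL and into the Serialization's definition.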
So far, we've only covered authentication. Authorization data (e.g. a list of "roles") isn't something you would expect the client to deliver as part of the ContextObject Representation. In the case of the OpenURL Object Model, however, the Transport class has the option of enriching the ContextObject with additional information (e.g. from session data or cookies). We can imagine, then, that the Transport class would obtain authorization information (e.g. stored in the session data) and add it as descriptors to the Requester Entity (Q6:"Who") along with the authentication descriptors. All this data would then be available to the Service classes to use as needed.
This still doesn't address what authorization descriptors might look like. Keep in mind, though, that OOM isn't limited to the descriptor categories defined by the OpenURL spec. We can imagine, then, a descriptor class analogous to metadata-by-value that can be used to represent authorization data. The most obvious example of this might be an array of Strings representing authorized roles for the user. We might imagine other ways to model authorization data other than with "roles".
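Here is a minimal Python sketch of that idea. None of these class or method names come from OOM (which is a Java framework); they are invented for illustration, and the session is modeled as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Requester:
    """Q6 "Who": authentication identifiers plus a roles descriptor."""
    identifiers: list = field(default_factory=list)
    roles: list = field(default_factory=list)  # array of Strings, as suggested above


@dataclass
class ContextObject:
    requester: Requester = field(default_factory=Requester)
    referent: dict = field(default_factory=dict)


class SessionTransport:
    """Hypothetical Transport that enriches the request with session data."""

    def to_context_object(self, params, session):
        requester = Requester()
        if "userID" in params:
            requester.identifiers.append(params["userID"])
        # Authorization data never arrives from the client; it is added here.
        requester.roles.extend(session.get("roles", []))
        referent = {k: v for k, v in params.items() if k not in ("userID", "passwd")}
        return ContextObject(requester=requester, referent=referent)


def admin_only_service(context_object):
    """Hypothetical Service that consults the roles the Transport supplied."""
    if "admin" not in context_object.requester.roles:
        return "403 Forbidden"
    return "200 OK"


ctx = SessionTransport().to_context_object(
    {"userID": "foo@bar.com", "id": "42"}, {"roles": ["admin"]}
)
```

The Service never looks at the session or the raw request; it depends only on the Requester Entity, which is what would make it portable between applications that agree on the roles.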
If an application wants to use "roles" to model authorization data, everything should work fine. The challenge comes in with the idea that you can mix and match modular OOM services to create/expand an application. If such services require authorization, there needs to be some level of agreement about the class(es) used to represent that information (e.g. the list and meaning of various roles.) This is something I am currently struggling with. If anyone has thoughts or suggestions on this, let me know. Better still, it would be nice to see this discussed on the OpenURL listserv.
September 16, 2006
Core vs. Advanced OpenURL Features
Looking at the OpenURL spec, it is extremely difficult to tell the bells and whistles from the chassis and drive train. IMO, the Q6 interpretation reveals the heart and soul of OpenURL. Everything else serves to emit a "ding" or a "toot" and can be safely ignored by the vast majority of potential applications.
Unfortunately, it is the cacophony of dings and toots that strikes people with their first impression of OpenURL. I think the biggest offenders are the Referrer and Resolver Entities, ContextObject Formats, and, worst of all, the current crop of awkward and cryptic Transports. Sure, sometimes a judicious ding or toot can save a life, so I'm glad they are there. Nevertheless, I think people should be able to build OpenURL applications in total ignorance of these advanced features.
The good news is that I think it is possible to demonstrate this using a few simple examples. The bad news is that I will want to clutter these examples up with an analysis of how they relate to the dings and toots, lest anyone think I'm playing fast and loose with the standard.
We can dispense with the Referrer and Resolver Entities right off the bat. These are great when you want to decouple a ContextObject from any particular Resolver, but how many existing web applications (other than the San Antonio Profiles themselves) depend on a similar feature? (I'm thinking of web applications in the broadest possible sense here, not just OpenURL-based applications.) We can all imagine the magic that could happen if this feature catches on, but we shouldn't cram it down the throat of every developer that crosses our path. First, let's show them how OpenURL can make their lives easier developing the kinds of applications they've been building all along.
September 15, 2006
Transports Revisited
Karen Coyle points out an example in the OOM javadocs where I was careless with the meaning of "Transport", saying "they represent the class responsible for taking an HTTP request and transforming it into OpenURL terms". A more accurate statement would be "the Transport class is responsible for taking a request of some form and transforming it into an OpenURL ContextObject." Note, however, that this is strictly OOM's view of Transports. Clients will perceive them from the opposite direction.
Karen also indicates that she would consider HTTP to be a transport, but that isn't right either. HTTP is a network protocol for which the registry currently defines three separate transports (http:openurl-by-ref, http:openurl-by-val, and http:openurl-inline).
I think that some of this confusion originates with OpenURL's decoupling of Transport components from ContextObject Format components. I accept that this is a useful level of indirection for some applications (e.g. citation linking), but I would claim that most potential applications won't take advantage of this ability to mix and match Transports and ContextObject Formats. For most applications, these two components will be tightly bound, meaning this distinction just adds to the confusion. I'll give an example of such a binding in another post.
Rather than thinking of ContextObject Formats as a peer component of Transports, I think it is better to think of them as an "aspect" of Transports. From this POV, Transports can be thought of as an umbrella for the request while remaining true to the definition in the spec of being "a network protocol and the method in which it is used to convey ContextObject Representations". Getting back to the themes of Q6, it might be less confusing to imagine the "how" interrogative in place of "transport" as an aid to conceptualizing this: "How did the user express the request?"