The term “Semantic Web” has been around for a while; the concept or hope that the entire web’s resources, data, and applications would be able to speak to each other has been around for even longer. However, only a relative few of us are familiar with what it can be used for, and how it is being used.
First, let’s start off with some light reading. Now that you’re done with that, and with my apologies to anyone who tried to get through it, let’s dive into how the semantic web differs from, yet builds on, the web that most of us use on a regular basis. HTTP and HTML were designed with some very ambitious goals: provide a language that can be used to contact and send information to and from nearly any computer on the planet, and provide a method of representing that information so that it can be displayed on nearly any computer on the planet. More importantly, those technologies have been, and continue to be, quite successful in those (and many other) goals. Web services, REST, and a host of other modern implementations produce a rich and interactive environment in our web browsers.
However, each application, and more importantly all of its data, is still very much a silo. Certainly we can create programs or routines to allow one specific system to use information provided by another, but each time we do that we are re-solving many problems that have been solved before. This is inherent to non-semantic integrations. Semantic web technologies, however, provide a path to reuse work done before (by other people), build on it to meet our own domain-specific needs, and share the result openly with the world.
Perhaps even more importantly, the semantic web specifications allow for interoperability when two organizations or development teams have gone in different directions. A simple property of a semantic web ontology (which is a collection of modeled concepts or objects) allows two similar but separately defined concepts to be considered the same. For example, imagine that two separate institutions that have implemented InfoEd have each created UDFs for a similar reason, but of course have named them differently. In a scenario where all of this data were presented in semantic form, semantic web ontologies would allow each institution to define that similarity in its own ontology, allowing data compatibility and interoperability between the two, without requiring any change beyond declaring that “Property A” is the same as “Property B”.
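To make the “Property A is the same as Property B” idea concrete, here is a minimal sketch in plain Python. It assumes a toy in-memory set of triples rather than a real RDF store, and every name in it (the `instA:` and `instB:` prefixes, the property names, the sample dates) is hypothetical. The one real term is `owl:equivalentProperty`, which is the OWL property commonly used for this kind of declaration. A production system would more likely lean on an RDF library and a reasoner than on hand-written lookup code.

```python
# Toy triple store: (subject, predicate, object). All identifiers are hypothetical.
OWL_EQUIVALENT_PROPERTY = "owl:equivalentProperty"

triples = {
    ("instA:grant42", "instA:sponsorDeadline", "2024-06-01"),
    ("instB:award7", "instB:dueDateForSponsor", "2024-07-15"),
    # The single declaration that links the two institutions' vocabularies.
    ("instA:sponsorDeadline", OWL_EQUIVALENT_PROPERTY, "instB:dueDateForSponsor"),
}


def equivalent_properties(prop, triples):
    """Return prop plus every property declared equivalent to it.

    Equivalence is treated as symmetric and transitive.
    """
    seen = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != OWL_EQUIVALENT_PROPERTY:
                continue
            if s == current and o not in seen:
                seen.add(o)
                frontier.append(o)
            elif o == current and s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen


def values_for(prop, triples):
    """Return sorted (subject, value) pairs for prop and its equivalents."""
    props = equivalent_properties(prop, triples)
    return sorted((s, o) for s, p, o in triples if p in props)


if __name__ == "__main__":
    for subject, value in values_for("instA:sponsorDeadline", triples):
        print(subject, value)
```

Asking for either institution's property returns both records, which is the point: neither side had to rename anything, only to state the equivalence once.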
One critical need is the ability to reuse work produced by other groups. The semantic web supports this by allowing multiple ontologies to be included and used together. For example, the VIVO application utilizes approximately 20 different ontologies for its purposes. Only a handful of them are VIVO-specific, while the basic modeling of relationships between people, bibliographic metadata, and other more commonly used concepts is reused from standard ontologies produced by other organizations. These ontologies are retrieved in real time. What that means is that the folks at the Dublin Core Metadata Initiative can continue to work on and refine their ontology, and new versions can be accessed nearly immediately, with a simple adjustment of the namespace inclusion.
What it all means is that, instead of perusing endless technical documents for integration with each system, development staff and technical analysts can spend their time deciding what their data looks like, mapping it to work that is already done, and extending that work as needed for their own unique needs. All of that data is exposed to the rest of the web through SPARQL endpoints (the semantic web equivalent of a SQL database connection).
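As a rough illustration of what talking to a SPARQL endpoint looks like, the sketch below builds (but deliberately does not send) a query request using only the Python standard library. The endpoint URL is a placeholder and the query itself is just an example. The `query` parameter and the `application/sparql-results+json` media type come from the SPARQL protocol, and `foaf:` is the Friend of a Friend vocabulary.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "https://example.org/sparql"

# Example query: list up to ten people and their names.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE { ?person a foaf:Person ; foaf:name ?name . }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build (without sending) a GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
```

Against a real endpoint, passing `request` to `urllib.request.urlopen` would return the result set. The connection details amount to a URL and a query string, much as a SQL client needs a connection string and a statement.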
All of the above is great, and it sounds like it will save some time and effort in the end, but what can we now do that we couldn’t before? For one, we can perform analytics and comparisons across continents. Take one set of data in semantic web form that models, for example, health sciences faculty members’ research interests, and another set of data, housed on a different web server, that models a hospital network’s clinicians’ specialties and experiences. By comparing the two, we could draw some conclusions as to which faculty could consider working with which clinicians. In fact, the modeling work to allow this particular example is already under way.
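The faculty and clinician comparison can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two lists stand in for data fetched from two separate servers, all identifiers (`fac:`, `clin:`, `topic:`, `ex:researchInterest`, `ex:specialty`) are made up, and both datasets are assumed to already refer to topics by the same shared identifiers, which is exactly what a common ontology would provide.

```python
from collections import defaultdict

# Hypothetical data from a university's server.
faculty_triples = [
    ("fac:alice", "ex:researchInterest", "topic:oncology"),
    ("fac:alice", "ex:researchInterest", "topic:genomics"),
    ("fac:bob", "ex:researchInterest", "topic:cardiology"),
]

# Hypothetical data from a hospital network's server.
clinician_triples = [
    ("clin:carol", "ex:specialty", "topic:cardiology"),
    ("clin:dan", "ex:specialty", "topic:oncology"),
    ("clin:erin", "ex:specialty", "topic:pediatrics"),
]


def index_by_topic(triples, predicate):
    """Map each topic to the set of subjects linked to it by predicate."""
    by_topic = defaultdict(set)
    for subject, pred, topic in triples:
        if pred == predicate:
            by_topic[topic].add(subject)
    return by_topic


def suggest_pairs(faculty, clinicians):
    """Return sorted (faculty, clinician, topic) tuples that share a topic."""
    interests = index_by_topic(faculty, "ex:researchInterest")
    specialties = index_by_topic(clinicians, "ex:specialty")
    pairs = set()
    for topic in interests.keys() & specialties.keys():
        for member in interests[topic]:
            for clinician in specialties[topic]:
                pairs.add((member, clinician, topic))
    return sorted(pairs)


if __name__ == "__main__":
    for member, clinician, topic in suggest_pairs(faculty_triples, clinician_triples):
        print(member, "<->", clinician, "via", topic)
```

With the sample data, Alice is paired with Dan on oncology and Bob with Carol on cardiology, while genomics and pediatrics have no counterpart and produce no suggestion.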
Overall, semantic web technologies are an extension of the principles of HTTP and HTML: allowing meaningful communication across networks by nearly any device. If you are feeling a bit uncertain, that’s not a problem. The truth is that semantic web principles are being put to use, but only in a small fraction of the work being produced. However, we can begin to discover the capabilities if we start thinking in terms of the similarities between otherwise different concepts. Those points of similarity will be the critical factor in determining mutual benefit, collaboration, and market needs.