Linked Data uptake

Linked Data is a universal approach for naming, shaping, and giving meaning to data using open standards. It was meant to be the second big information revolution after the World Wide Web. It was supposed to complement the web of documents with the web of data so that humans and machines could use the Internet as if it were a single database while enjoying the benefits of decentralisation [1].

Today, we have 1495 linked open datasets on the web, according to the LOD cloud collection. Some of them, like Uniprot and Wikidata, are very large in volume, usage, and impact. But that number also means that today, 15 years after the advent of Linked Data, LOD datasets are less than 0.005% of all publicly known datasets. And even if we add the growing amount of structured data encoded as JSON-LD and RDFa in HTML pages, most published data is still not available in a self-descriptive format and is not linked.

That’s on the open web. Inside enterprises, we keep wasting billions attempting to integrate data and pay off the accumulated technical debt, only to find ourselves with new creditors. We bridge silos with bridges that turn into new silos, ever more expensive. The use of new technologies makes the new solutions appear different, which helps us forget that similar approaches in the past failed to bring lasting improvement. We keep developing information systems that are not open to change. Now we build digital twins, still using hyper-local identifiers, so they are more like lifeless dolls.

Linked Enterprise Data can reduce that waste and dissolve many of the problems of the mainstream (and new-stream!) approaches by simply creating self-descriptive enterprise knowledge graphs, decoupled from the applications, not relying on them to interpret the data, not having a rigid structure based on historical requirements but open to accommodate whatever comes next.

Yet, Linked Enterprise Data, just like Linked Open Data, is still marginal.

Why is that so? And what can be done about it?

I believe there are five reasons for that. I explained them in my talk at the ENDORSE conference, the recording of which you’ll find near the end of this article. I was curious how Linked Data professionals would rate them and what I had missed. So I ran a small survey. My aim wasn’t to gather a huge sample but rather to get the opinion of the qualified minority. And indeed, most respondents had over seven years of experience with Linked Data and semantic technologies. Here’s how my findings got ranked from one to five:


  • 1
    This is the balance between autonomy and cohesion – essential for any socio-technical system.

Wikipedia “Knows” more than it “Tells”

When pointing out the benefits of Linked Data, I’m usually talking about integrating data from heterogeneous sources in a way that’s independent of the local schemas and not fixed to past integration requirements. But even if we take a single data source, and a very popular one, Wikipedia, it’s easy to demonstrate what the web of data can bring that the web of documents can’t.

In fact, you can do it yourself in less than two minutes. Go to the page of Ludwig Wittgenstein. At the bottom of the infobox on the right of the page, you’ll find the sections “Influences” and “Influenced”. The first contains the list (of links to the Wikipedia pages) of people who influenced Wittgenstein, and the second, those he influenced. Expand the sections and count the people. Depending on when you do this, you might get a different number, but if you are reading this text by the end of 2017, you are likely to find that, according to Wikipedia, Wittgenstein was influenced by 18 people and influenced 32.

Now, if you look at the same data source, Wikipedia, but viewed as Linked Data, you’ll get a different result. Try it yourself by clicking here [1].

The influences number 19 and the influenced 95 at the moment of writing this post, or these numbers if you click now [2] (most likely bigger).
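For readers who prefer to see the comparison as code, here is a minimal sketch of how such counts could be requested from DBpedia. It is not the query behind the link above, which is not reproduced in this text. It assumes the public endpoint at https://dbpedia.org/sparql and the DBpedia ontology properties dbo:influencedBy and dbo:influenced; the function names are hypothetical. It also assumes one plausible reason for the larger numbers: in a graph, an influence link can be followed from either end, so a statement made on another person's page may count as well.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint, not the link used in the post.
ENDPOINT = "https://dbpedia.org/sparql"


def build_query(resource: str) -> str:
    """Build a SPARQL query counting influence links in both directions."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT (COUNT(DISTINCT ?influencer) AS ?influences)
       (COUNT(DISTINCT ?follower) AS ?influenced)
WHERE {{
  VALUES ?person {{ <http://dbpedia.org/resource/{resource}> }}
  {{ ?person dbo:influencedBy ?influencer }}
  UNION {{ ?influencer dbo:influenced ?person }}
  UNION {{ ?person dbo:influenced ?follower }}
  UNION {{ ?follower dbo:influencedBy ?person }}
}}
""".strip()


def build_url(resource: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a GET request URL asking for JSON results."""
    params = {
        "query": build_query(resource),
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urllib.parse.urlencode(params)


def parse_counts(payload: dict) -> tuple[int, int]:
    """Read (influences, influenced) from a SPARQL JSON results document."""
    row = payload["results"]["bindings"][0]
    return int(row["influences"]["value"]), int(row["influenced"]["value"])


def fetch_counts(resource: str, endpoint: str = ENDPOINT) -> tuple[int, int]:
    """Send the query over the network and return the two counts."""
    request = urllib.request.Request(
        build_url(resource, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_counts(json.load(response))
```

Calling fetch_counts("Ludwig_Wittgenstein") sends the query to the endpoint. The counts it returns depend on the current state of DBpedia and on this particular query, so they need not match the figures quoted above.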

  • 1
    If the live DBpedia endpoint is down, just delete “live” from the address bar, where the full query will appear, and the query will run against the regular DBpedia endpoint.
  • 2
    If the live DBpedia endpoint is down, just delete “live” from the address bar, where the full query will appear, and the query will run against the regular DBpedia endpoint.