The Art of Form as a Form of Art

In Brussels, at the southeastern end of the Mont des Arts garden, there are stairs leading to Rue du Musée. Climbing up one of the stairways, you see a wall on your right. A few months ago, a form of art started spreading on that wall. I don’t know if it was spontaneous or organized. And it doesn’t matter. Every organization was at some point spontaneous, and everything spontaneous is worth talking about if it has led to some organization.

When approaching it, all you see are empty frames.

Getting closer, they (actually, you) start to make sense, but the name, given by the artist, accelerates the process. The name and the image enter into a loop, the name confirming the image, and the image confirming the name.

SASSY Architecture

SASSY Architecture is a practice of combining two seemingly incompatible worldviews. The first one is based on non-contradiction and supports the vision for an ACE enterprise (Agile, Coherent, Efficient), through 3E enterprise descriptions (Expressive, Extensible, Executable), achieving “3 for the price of 1”: Enterprise Architecture, Governance, and Data Integration.

The second is based on self-reference and is a way of seeing enterprises as topologies of paradoxical decisions. Such a way of thinking helps deconstruct constraints to unleash innovation, reveal hidden dependencies in the decisions network, and avoid patterns of decisions limiting future options.

As a short overview, here are the slides from my talk at the Enterprise Architecture Summer School in Copenhagen last week.

QUTE: Enterprise Space and Time

Here’s another pair of glasses with which to look at organisations. It can be used either together with the Essential Balances or with the Productive Paradoxes, or on its own. For those new to my “glasses” metaphor, here’s a quick intro.

The Glasses Metaphor

As I’m sceptical about the usefulness of methodologies, frameworks and best practices when it comes to social species, my preference is to work with habits and, instead of using models, to use organisations directly as the best model of themselves.

The best material model of a cat is another, or preferably the same, cat.

N. Wiener, A. Rosenblueth, Philosophy of Science (1945)

What I find important in working with organisations is to break free from some old habits by replacing them with new ones. And most of all, cultivating the habit of being conscious about the dual nature of habits: that they are both enabling and constraining; that while you create them, they influence the way you create them. Along with recipes and best practices, I’m also sceptical about KPIs, evidence-based policies, and all methods claiming objectivity.

Objectivity is a subject’s delusion that observing can be done without him. Invoking objectivity is abrogating responsibility – hence its popularity.

Heinz von Foerster

Instead of “this is how things are”, my claim is that “it’s potentially useful to create certain observational habits”. Or – and here comes the metaphor – the habit of observation using different pairs of glasses. “Different” implies two things. One is that you are always wearing some pair of glasses, regardless of whether you realize it or not. And the other is that offering a new pair is less important than creating the habit of changing the glasses from time to time.

I prefer the “glasses” to the “lens” metaphor, and here’s why. Glasses indeed have lenses, and lenses are meant to improve vision or, at any rate, change it. Quite often, the glasses I offer bring surprises. Where you trust your intuition, you might see things that are counter-intuitive, and where you’d rather use logic, they might appear illogical. It’s not intentional. It just often happens to be the case.

The first reason I prefer the glasses metaphor to the lens one is that glasses have frames. That should be a constant reminder that every perspective has limitations, creates a bias, and leaves a blind spot. Using the same glasses might be problematic in some situations or in all situations if you wear them for too long. And the second reason is that glasses are made to fit; they are designed for our bodies. For example, they wouldn’t fit a mouse or even another person. This has far-reaching implications, which I’ll not go into now.

QUTE

QUTE stands for “Quantum Theory of Enterprise”.

Wikipedia “Knows” more than it “Tells”

When pointing out the benefits of Linked Data, I’m usually talking about integrating data from heterogeneous sources in a way that’s independent of the local schemas and not fixed to past integration requirements. But even if we take a single data source, and a very popular one, Wikipedia, it’s easy to demonstrate what the web of data can bring that the web of documents can’t.

In fact, you can do it yourself in less than two minutes. Go to the page of Ludwig Wittgenstein. At the bottom of the infobox on the right of the page, you’ll find the sections “Influences” and “Influenced”. The first one contains the list (of links to the Wikipedia pages) of people who influenced Wittgenstein and the second, those that he influenced. Expand the sections and count the people. Depending on when you are doing this, you might get a different number, but if you are reading this text by the end of 2017, you are likely to find out that, according to Wikipedia, Wittgenstein was influenced by 18 and influenced 32 people, respectively.

Now, if you look at the same data source, Wikipedia, but viewed as Linked Data, you’ll get a different result. Try it yourself by clicking here [1].

The influencers are 19, and the influenced are 95 at the moment of writing this post, or these numbers if you click now [2] (most likely bigger).
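To give a feel for what a Linked Data query of this kind might look like, here is a small Python sketch that only builds a request URL and does not call the network. It is not the query behind the link in this post. The endpoint address, the `format` parameter, the choice of the DBpedia properties `dbo:influencedBy` and `dbo:influenced`, and the function names are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed public endpoint; the link in the post may point elsewhere.
ENDPOINT = "https://dbpedia.org/sparql"
WITTGENSTEIN = "http://dbpedia.org/resource/Ludwig_Wittgenstein"


def influencers_query(person_uri: str) -> str:
    """Build a SPARQL query counting who influenced a person.

    The link is read in both directions: statements made about the person,
    and statements made about other people that mention the person.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT (COUNT(DISTINCT ?other) AS ?n) WHERE {{
  {{ <{person_uri}> dbo:influencedBy ?other }}
  UNION
  {{ ?other dbo:influenced <{person_uri}> }}
}}""".strip()


def request_url(query: str) -> str:
    """Encode the query as a GET request against the assumed endpoint."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urlencode(params)


query = influencers_query(WITTGENSTEIN)
url = request_url(query)
```

The UNION is the interesting part: a web of data can count statements recorded on other people’s pages, which a single infobox does not show.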

  • 1
    If the live DBpedia is down, just delete “live” from the address bar, where the full query will appear, and the query will run against the regular DBpedia endpoint.
  • 2
    If the live DBpedia is down, just delete “live” from the address bar, where the full query will appear, and the query will run against the regular DBpedia endpoint.

Productive Paradoxes in Projects

When I started this blog in 2011, I wanted it to be a place for undistracted reading. The initial theme was not much busier than this one. I didn’t go that far, but you still don’t see categories, tag clouds, and my Twitter feed. Only recently have I added sharing buttons and started putting more images, and because I am keeping it minimal, you might have been reading this blog for some time without knowing about its tagline, as it is simply not visible in the blog. But it’s been there, and when the blog appears in search results, you can see it.

The theme of paradoxes appeared only a few times, for example, in From Distinction to Value and Back and previously in Language and Meta-Language for EA. I haven’t focused on it in a post so far. It was even more difficult to start talking about it to an audience of project managers. First, claiming that projects are produced and full of paradoxes might appear a bit radical. Second, project managers are solution-oriented people, while in paradoxes, there is nothing to solve. There is a problem there, but its solution is a problem itself, the solution of which is the initial problem. Third, talking about paradoxes is one thing, but convincing others that understanding them is useful is another.

How I use Evernote, Part 3: Classification and Wishlist

This is the third and final instalment about Evernote. You may want to check out the previous ones first:

How I use Evernote, Part 1 – Note Creation,

How I use Evernote, Part 2 – Kanban Boards

What is left for this post is to go over the way I look at and use tags and notebooks and to share the top seven features I miss in Evernote.

Classification

Currently, I have over six thousand notes in Evernote. To manage them, I classify them. This means I apply certain criteria to make a note a member of a set of notes. The capabilities of Evernote supporting this are tags, notebooks and search. There are other ways to think about them, not as just being different means for classification, but I find this perspective particularly useful.

The nice thing about tags is that they can be combined. I see a note tagged #A as belonging to set {A}, and a note tagged #B as belonging to set {B}. I can find both the intersection, {A} AND {B}, and the union, {A} OR {B}, by selecting “All” or “Any”, respectively, as the search principle.
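The set view of tags can be sketched in a few lines of Python. The notes and tags below are made up for illustration, and nothing here uses Evernote’s own API; it only shows how “All” and “Any” map to intersection and union.

```python
# Hypothetical notes and their tags (not real Evernote data).
notes = {
    "meeting minutes": {"A", "B"},
    "reading list": {"A"},
    "travel plan": {"B"},
    "recipe": {"C"},
}


def tagged(tag: str) -> set:
    """Return the set of note titles carrying the given tag."""
    return {title for title, tags in notes.items() if tag in tags}


set_a, set_b = tagged("A"), tagged("B")
all_match = set_a & set_b  # "All": notes in {A} AND {B}
any_match = set_a | set_b  # "Any": notes in {A} OR {B}
```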

How I use Evernote, Part 2: Kanban Boards

This is the second part of the series on my way of using Evernote. The first one was about the creation of notes and the third will be about notebooks and tags and my overall approach to organising the content inside Evernote. In this part, I’ll describe how I use Evernote for task management.

My tool of choice for task management had been Trello. It still is for collaborative work on projects and for strategic flows, but I “migrated” my personal task management entirely to Evernote.

How? I simply use the way reminders appear on top of all notes, with the ability to rearrange them by dragging, as a Kanban board.

How I use Evernote, Part 1: Note Creation

I’ve been using Evernote a lot in the past few years. My membership started in January 2010. I had been using Zim for note-taking before that and I kept using both in parallel for a while. It was the synchronisation capability that made me move entirely to Evernote.

After using Evernote for a few years, I decided to “migrate” more content and workflows to it. One trigger was the trouble of managing information on several platforms and failed attempts to link them. Another was the inspiration from Luhmann’s Zettelkasten. I had no intention to imitate the latter and no illusion that I could solve the former, and yet there was a considerable improvement in my personal information management. Now I can quickly find what I’m looking for and what it is related to. More importantly, I can be surprised with new relations and discoveries, the important difference between using and working with.

I haven’t read any of the Evernote books, nor do I participate in the forums. Yet, it seems that the way I use the tool is worth sharing; at least this was the feedback from people who had a glimpse of my note-taking practice.

I was using Trello for managing tasks, various note-taking apps – mainly Zim and Google Keep, Adobe Cloud for PDFs, briefly also Mendeley for research, and Pocket for reading articles. For website bookmarking and highlights I used Diigo, and Dropbox and Drive – for storing and syncing documents. My main requirements were two: to quickly retrieve information from all these places and link resources within and across. That included the need to consolidate and link my notes and highlights from Kindle books and PDFs.

Evernote completely replaced Zim, Keep, Adobe Cloud, Mendeley and Diigo and partially – Trello, Dropbox and Google Drive. I miss many of the capabilities of these tools, but I don’t regret leaving them. When consulting enterprises, I always support diversity and demonstrate ways to experience a landscape of heterogeneous applications as a single system. However, a corporate approach to information management is not fully applicable for a personal one.

Most of what I do is probably common for regular Evernote users, or likely to be only related to my specific needs, but I guess there are a few tweaks that some might find useful, either to apply or as insights for a better solution.

Do We Still Worship The Knowledge Pyramid?

There are not many models that have enjoyed such a long life and wide acceptance as the Knowledge Pyramid. Also known as the DIKW pyramid, it features data as the basis and shows how information is built on it, then knowledge, and finally wisdom. Each layer refers to the lower one, using it but adding something more. It tells the story of how data is transformed into information and information into knowledge. And being a pyramid, it implies that the higher you go, the better things get, that there is more value but less quantity [1].

The model goes together with a set of conventions about the meaning of each concept and its relations. These definitions vary, but the logical sequence is rarely questioned. The most popular narrative in business is that data are individual facts that need to be processed to generate information. When data are categorised, or interpreted, or put in context or, better, all of that, they turn into information. There is a greater divergence of views on what knowledge is, but most sources seem to suggest that if what is done to data is done once again to information, you’ll get knowledge. It all sounds like a recipe for a delicious cake. What’s not to like?

Well, just about everything.

  • 1
    There are variations. In some it is not actually shown as a pyramid; in others, wisdom is skipped and in at least one popular version enlightenment is put on top of wisdom or added in another way.

The Scissors of Science

Three centuries ago the average life expectancy in Europe was between 33 and 40 years. Interestingly, 33 was also the average life expectancy in the Palaeolithic era, 2.6 million years ago. What is it that we’ve got in the last three centuries that we hadn’t got in all the time before? Well, science!

Science performed a lot of miracles. But like all things that perform miracles, it quickly turned into a religion. A God that most people in the Western world believe in today. And like all believers, when science fails, we may think it has not advanced in that area yet, but we don’t suspect there is anything wrong with science itself. Or, when the data doesn’t match, we think it’s because those scientists are not good at statistics. Or, if not that, then simply the problem is with failed control over scientific publications, as it was concluded three years ago when Begley and Ellis published a shocking study showing that they were able to reproduce only 11 per cent of the original cancer research findings.

Well, I believe that the problem with science is more fundamental than that.

The word science comes from skei, which means “to cut, to divide”. The same is the root of scissors, schizophrenia and shit. “Dividing”, when applied to science, comes in handy to explain some fundamental problems with it. It has at least the following six manifestations.

The Mind Of Enterprise

I should have shared this presentation in November 2015 but anyway, better late than never. Here it is as static slides…

… and if your browser allows, you can play the original:

There is also a video, but due to a technical problem, only the first few minutes were recorded.

Don’t buy ideas. Rent them.

Some ideas are so good. An idea can be good at first glance or after years of digging and testing, or both. When it looks instantly convincing, it just resonates with your experience. There are so many things that such an idea can help you see in a new light, or help you fill some old explanatory gaps. Or, it can sound absurd when you first hear it, and then later it gets under your skin. Either way, you buy it. You buy it on impulse or after years of testing. You invest time, emotions, and sometimes reputation. And once you buy it, you start paying the maintenance costs. It’s quite like buying a piece of land. You build a house on it, then furnish it, and then you start repairing it. Somewhere along this process, you find yourself emotionally attached. And that’s where similarities end. The house you can sell and buy a new one. With ideas, you can only buy new. But sometimes the previous investment doesn’t leave you with sufficient resources. And then you can’t just buy any new idea. It needs to fit the rest of your narrative.

Once you buy an idea, it can open a lot of new paths and opportunities. But it can also obscure your vision and preclude other opportunities. One thing you learn can sometimes be the biggest obstacle to learning something else, potentially more valuable.

Instead of buying ideas, wouldn’t it be better just to rent them? But not like renting a house, more like renting a car. With that car you can go somewhere, stay there, or go further, or elsewhere, or – if it is not good for a particular road or destination – come back and rent another car.

Do you like the idea of renting, instead of buying ideas? If yes, don’t buy it.

If you are not tired of metaphors by now, here’s another one. I often present my ideas as glasses. Not lenses but glasses. First, they should be comfortable. They should fit our head, nose and ears. Not too tight, so that we can easily take them off. Not too loose, so that we don’t drop them when shaken. Second, when we put on new glasses, we don’t just put on new lenses, but also new frames. Being aware of that is being aware of limitations, and of the fact that there are hidden choices. It would also help to realise when it’s time to try on a new pair of glasses.

From Distinction to Value and Back

I tried to explain earlier how distinction brings forth meaning and then value. Starting from distinction may seem arbitrary. Well, it is. And while it is, it is not. That wouldn’t be very difficult to show, but let’s first take a closer look at distinction. As the bigger part of that work is done already by George Spencer-Brown, I’ll first recall the basics of his calculus of indications, adding my interpretations here and there. Then I’ll quickly review some of the resonances. Last, I’ll come back to the idea of re-entry and will apply it to the emergence of values.

Recalling

If I have to summarise the calculus of indications, it will come down to these three statements:

To re-call is to call.

To re-cross is not to cross.

To re-enter is to oscillate.

In the original text, the “Laws of Form”, only the first two are treated as basic, and the third one comes as their consequence. Later on, I’ll try to show that the third one depends on the first two, just as the first two depend on the third one.

The calculus of indications starts with the first distinction, as “we cannot make an indication without drawing a distinction”. George Spencer-Brown introduces a very elegant sign to indicate the distinction:

mark

It can be seen as a shorthand of a rectangle, separating inside from outside. The sign is called “mark”, as it marks the distinction. The inside is called the “unmarked state”, and the outside is the “marked state”. The mark is also the name of the marked state.

This tiny symbol has the power to indicate several things at once:

  1. The inside (emptiness, void, nothing, the unmarked state)
  2. The outside (something, the marked state)
  3. The distinction as a sign (indication)
  4. The operation of making a distinction
  5. The invitation to cross from one side to the other
  6. The observer, the one that makes the distinction

That’s not even the full list, as we’ll see later.

Armed with this notation, we can express the three statements:

LoF

They can be written in a more common way using brackets:

()() = ()

(()) =   .

a  =  (a)

The sign “=” may seem like something we should take as a given. It’s not. It means “can be confused with”, or in other words: “there is no distinction between the value on the left side of the equation and the value on the right side”. Again, a form made out of distinction.

The first statement is called the law of calling. Here’s how it is originally formulated in the “Laws of Form”:

The value of a call made again is the value of the call.

If we see the left sign as indicating a distinction, and the right sign as the name of the distinction, then the right sign indicates something which is already indicated by the left sign.

That is a bit abstract, so maybe an example would help. If you are trying to highlight a word in a sentence and you do that by underlining it, no matter how many times you draw a line below that word, in the end, the word will be as distinguished as it was after the first line. Or if you first underline it, then make a circle around it, and then highlight it with a yellow marker and so on, as long as each of them doesn’t carry special meaning, all these ways to distinguish it together do just what each of them does separately: draw attention to the word.

Or when somebody tells you, “I like ice cream”, and then tells you that again in 10 minutes, it won’t make any difference unless you’ve forgotten it in the meantime. In other words, making the same announcement to the receiver will not change the already changed state of awareness. That has important implications for understanding information.

The second law is originally stated as follows:

The value of a crossing made again is not the value of the crossing.

One more way to interpret the mark is as an invitation to cross from the inside to the outside. As such, it serves as an operator and operand at the same time. The outer mark operates on the inner mark and turns it into the void.

If the inner mark turns its inside, which is empty, nothing, into outside, which is something, then the outer mark turns its inside, which is something, due to the operation of the inner mark, into nothing.

Picture a house with a fence that fully surrounds it. You jump over the fence and then continue walking straight until you reach the fence and then jump on the other side. As long as changing your state of being inside or outside is concerned, crossing twice is equal to not crossing at all.

The whole arithmetic and algebra of George Spencer-Brown are based on these two equations. Here is a summary of the primary algebra.
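As a rough illustration of how the two laws are enough to reduce any arrangement of brackets to either the mark or the void, here is a small Python sketch. It is not a step-by-step rewriting as in the “Laws of Form”, but a shortcut evaluation that relies on the same two laws; the function names are hypothetical.

```python
def parse(expr: str) -> list:
    """Turn a bracket string such as "(())()" into nested lists."""
    stack = [[]]
    for ch in expr:
        if ch == "(":
            stack.append([])
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            inner = stack.pop()
            stack[-1].append(inner)
        elif not ch.isspace():
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def is_marked(space: list) -> bool:
    """Evaluate a space holding zero or more marks.

    Law of calling: several marks side by side count as one (hence `any`).
    Law of crossing: a mark around a marked space gives the void (hence `not`).
    """
    return any(not is_marked(inner) for inner in space)


def simplify(expr: str) -> str:
    """Reduce a bracket expression to "()" (mark) or "" (void)."""
    return "()" if is_marked(parse(expr)) else ""
```

For instance, `simplify("()()")` gives the mark and `simplify("(())")` gives the void, matching the first two equations above.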

The third equation has a variable in it.

a = (a)

It has two possible values, mark or void. We can test what happens by trying out the two possible values on the right side of the equation.

Let a be void, then:

a = ( )

Thus, if a is void, then it is a mark.

Now, let a be mark, then:

a = (()) =   .

If a is a mark, then substituting a with a mark on the right side will bring a mark inside another, which, according to the law of crossing, will give the unmarked state, the void.

This way we have an expression of self-reference. It can be seen in numerical algebra in equations such as x = -1/x, which doesn’t have a real solution. It could only have an imaginary one. It can be traced in logic and philosophy with the Liar paradox, statements such as “This statement is false”, or Russell’s set of all sets that are not members of themselves.

However, in the calculus of indications, this form lives naturally. The way a distinction creates space, a re-entry creates time.

There is no magic about it. In fact, software programs do not just contain self-referential equations; they can’t do without them. They iterate using expressions such as n = n + 1.
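The same kind of iteration makes the oscillation of a = (a) visible. The following Python sketch reads the equation as an update rule, in the way n = n + 1 is read in a program; the names `cross` and `re_enter` are made up for this example.

```python
from itertools import islice

MARK, VOID = "()", ""


def cross(a: str) -> str:
    """Put a inside a mark and reduce: (void) = mark, (mark) = void."""
    return VOID if a == MARK else MARK


def re_enter(a: str):
    """Yield successive values of a when a = (a) is applied again and again."""
    while True:
        yield a
        a = cross(a)


# Starting from the void, the value never settles; it alternates in time.
first_six = list(islice(re_enter(VOID), 6))
```

The list alternates between void and mark, which is one way to picture the statement that a re-entry creates time.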

Recrossing

The Laws of Form resonate in religions, philosophies and science.

Chuang Tzu:

The knowledge of the ancients was perfect. How so? At first, they did not yet know there were things. That is the most perfect knowledge; nothing can be added. Next, they knew that there were things, but they did not yet make distinctions between them. Next they made distinctions, but they did not yet pass judgements on them. But when the judgements were passed, the Whole was destroyed. With the destruction of the Whole, individual bias arose.

The Tanakh (aka Old Testament) starts with:

In the beginning when God created the heavens and the earth, the earth was a formless void… Then God said, ‘Let there be light’; and there was light. …God separated the light from the darkness. God called the light Day, and the darkness he called Night.

That’s how God made the first distinction:

(void) light

And then, in Tao Te Ching:

 The nameless is the beginning of heaven and earth…

Analogies can be found in Hinduism, Buddhism and Islamic philosophy. For example, the latter distinguishes essence (Dhat) from attribute (Sifat), which are neither identical nor separate. Speaking of Islam, the occultation prompts another association. According to Shia Islam, the Twelfth Imam has been living in a temporary occultation and is to reappear one day. Occultation is also one of the identities in the primary algebra:

[Figure: the occultation identity, ((a)b)a = a]

In it, the variable b disappears from left to right and appears from right to left. This can be pictured by changing the position of an observer to the right until b is fully hidden behind a, and then, when moving back to the left, b reappears:

Occultation

Another association, which I find particularly fascinating, is with the ancient Buddhist logical system catuskoti, the four corners. Unlike Aristotelian logic, which has the principles of non-contradiction and the excluded middle, in catuskoti there are four possible values:

not being

being

both being and not being

neither being nor not being

The first three correspond quite well with the void, distinction, and re-entry, respectively. That is in line with Varela’s view that apart from the unmarked and the marked state, there should be a third one, which he calls the autonomous state.

The fourth value would represent anything which is unknown. If we set being as “true”, and not-being as “false”, then every statement about the future is neither true nor false at the moment of uttering. And we make a lot of statements about the future, so it is common to have things in the fourth corner.

The fourth value also reminds me of the Open World Assumption, which I find very useful in many cases, as I mentioned here, here, and here. It also tempts me to add a fourth statement to the initial three:

To not know is not to know there is not.

Catuskoti fits naturally into the Buddhist worldview, while being at odds with the Western one. At least until recently, when some multi-valued logics appeared.

George Spencer-Brown, Louis Kauffman, and William Bricken demonstrated that many of the other mathematical and logical theories could be generated using the calculus of indications. For example, in elementary logic and set theory, negation, disjunction, conjunction, and entailment, can be represented respectively with (A), AB, ((A)(B)), and (A)B, so that the classical syllogism ((A entails B) and (B entails C)) entails (A entails C), can be shown with the following form:

SyllogismInLoF

If that’s of interest, you can find explanations and many more examples in this paper by Louis Kauffman.
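The correspondence with elementary logic can be checked mechanically. In the sketch below, the marked state is read as “true”, crossing as negation, and writing forms side by side as “or”. That reading, and all function names, are assumptions made for this illustration rather than anything taken from the paper.

```python
from itertools import product


def cross(x: bool) -> bool:
    """(x): crossing swaps marked and unmarked."""
    return not x


def juxt(*xs: bool) -> bool:
    """x y z: forms side by side are marked if any of them is marked."""
    return any(xs)


def neg(a: bool) -> bool:
    return cross(a)  # (A)


def disj(a: bool, b: bool) -> bool:
    return juxt(a, b)  # AB


def conj(a: bool, b: bool) -> bool:
    return cross(juxt(cross(a), cross(b)))  # ((A)(B))


def entails(a: bool, b: bool) -> bool:
    return juxt(cross(a), b)  # (A)B


def syllogism(a: bool, b: bool, c: bool) -> bool:
    """((A entails B) and (B entails C)) entails (A entails C)."""
    return entails(conj(entails(a, b), entails(b, c)), entails(a, c))


values = [False, True]
syllogism_always_marked = all(
    syllogism(a, b, c) for a, b, c in product(values, repeat=3)
)
```

Under this reading, the syllogism evaluates to the marked state for every combination of values, which is what makes it a valid form.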

“Laws of Form” inspired extensions and applications in mathematics, second-order cybernetics, biology, cognitive and social sciences. It influenced prominent thinkers like Heinz von Foerster, Humberto Maturana, Francisco Varela, Louis Kauffman, William Bricken, Niklas Luhmann, and Dirk Baecker.

Reentering

Self-reference is awkward: one may find the axioms in the explanation, the brain writing its own theory, a cell computing its own computer, the observer in the observed, the snake eating its own tail in a ceaseless generative process.

F. Varela, A Calculus for Self-reference

Is re-entry fundamental or a construct? According to George Spencer-Brown, it’s a construct. Varela, on the other hand, finds it not just fundamental but actually the third value, the autonomous state. He brings up some quite convincing arguments. For Kauffman, re-entry is based on distinction, just as the distinction is based on re-entry:

the emergence of the mark itself requires self-reference, for there can be no mark without a distinction and there can be no distinction without indication (Spencer-Brown says there can be no indication without a distinction. This argument says it the other way around.). Indication is itself a distinction, and one sees that the act of distinction is necessarily circular.

That was the reason I presented three statements, and not only the first two, as a summary of the calculus of indications.

A similar kind of reasoning can be applied to sense-making. It can be seen as an interplay between autonomy and adaptivity. Autonomy makes the distinctions possible and the other way around. Making distinctions on distinctions is, in fact, sense-making, but it also changes the way distinctions are made due to adaptivity. At this new level, distinctions become normative. They have value in the sense that the autonomous system has an attitude. It has a re-action determined by (and determining) that value. The simplest attitudes are those of attraction, aversion and neutrality.

This narrative may imply that values are of a higher order. First, distinctions are made, then sense, and then values, in a sort of linear chain. But it is not linear at all.

As George Spencer-Brown points out, a distinction can only be made by an observer, and the observer has a motive to make certain distinctions and not others:

If a content is of value, a name can be taken to indicate this value.

Thus the calling of a name can be identified with the value of the content

Thus values enable distinctions and close the circle. Another re-entry. We can experience values due to the significance that our interaction with the world brings forth. This significance is based on making distinctions, and we can make distinctions because they have value for us.

But what is value and is it valuable at all? And if value is of any value, what is it that makes it such?

Ezequiel Di Paolo defines value as:

the extent to which a situation affects the viability of a self-sustaining and precarious network of processes that generates an identity

And then he adds that the “most intensely analysed such process is autopoiesis”.

In fact, the search for a calculus for autopoiesis was what attracted Varela to the mathematics of Laws of Form in the first place. It was a pursuit to explain life and cognition. Autopoiesis was also the main reason for Luhmann and Baecker’s interest, but in their case for studying social systems.

Operationally closed networks of processes in general, and autopoiesis in particular, show both re-entry and distinction enabled by this re-entry and sustaining it. For the operationally closed system, all its processes enable and are enabled by other processes within the system. The autopoietic system is the stronger case, where the components of the processes are not just enabled but actually produced by them.

Both are cases of generating identity, which is making a distinction between the autonomous system and the environment. The environment is not everything surrounding it, but only the niche which makes sense to it. This sense-making is not passive and static. It is a process enacted by the system which brings about its niche.

Identity generation makes a distinction which is also what it is not, a unity. That is how living systems get more independent from the environment, which supplies the fuel for their independence and absorbs the exhaust of practising independence. And more independence would mean more fuel, hence bigger dependence. The phenomenon of life is a “needful freedom”, as pointed out by Hans Jonas.

Zooming out, we come back to the observation of George Spencer-Brown:

the world we know is constructed in order to see itself.[…] but in any attempt to see itself, […]it must act so as to make itself distinct from, and therefore false to, itself.

Closing the circle from distinctions to sense-making through value-making to (new) distinctions, solves the previous implication of linearity, but it may now be misunderstood to imply causality. First, my intention was to point them out as enabling conditions, not treating, for now, the question if they are necessary and sufficient. Second, the circle is enabled by and enables many others, the operationally closed self-generation of identity being of central interest so far. And third, singling out these three operations is a matter of distinctions made by me as an act of sense-making, and on the basis of certain values.

Same as Magic

When I started my journey in the world of Semantic Web Technologies and Linked Data, I couldn’t quite get what all the fuss about the property owl:sameAs was. Later I was able to better understand the idea and appreciate it when actively using Linked Data. But it wasn’t until I personally created graphs from heterogeneous data stores and then applied different strategies for merging them that I realised the “magical” power of owl:sameAs.

The idea behind “same as” is simple. It says that although the two identifiers linked with it are distinct, what they represent is not.

Let’s say you want to bring together different things recorded for and by the same person, Sam E. There is information in a personnel database, and his profile and activities in Yammer, LinkedIn, Twitter and Facebook. Sam E. is also somebody doing research, so he has publications in different online libraries. He also makes highlights in Kindle and check-ins in Foursquare.

Let’s imagine that at least one of the email addresses recorded as Sam E’s personal email is used in all these data sets. Sam E. is also somehow uniquely identified in these systems, and it doesn’t matter if the identifiers use his email or not. When creating RDF graphs from each of the sources, a URI for Sam E. should be generated in each graph if one doesn’t exist already. The only other thing needed is to declare that Sam E’s personal email is the object of foaf:mbox, where the subject is the respective URI for Sam E. in each of the data sets.

The interesting thing about foaf:mbox is that it is “inverse functional”. When a property is asserted as owl:InverseFunctionalProperty, then the object uniquely identifies the subject in that statement. To get what that means, let’s first see the meaning of a functional property in OWL. If Sam E. “has birth mother” Jane, and Sam E. “has birth mother” Mary, and “has birth mother” is declared as a functional property, a DL reasoner will infer that Jane and Mary are the same person. The “inverse functional” works the same way in the opposite direction. So if Sam.E.Yammer has foaf:mbox “sam@example.com”, and Sam.E.Twitter has foaf:mbox “sam@example.com”, then Sam.E.Yammer refers to the same person as Sam.E.Twitter. That is because a new triple Sam.E.Yammer–owl:sameAs–Sam.E.Twitter is inferred as a consequence of foaf:mbox being an owl:InverseFunctionalProperty. But that single change brings a massive effect: all facts from Yammer about Sam E. are inferred for Sam E. from Twitter and vice versa. And the same applies to LinkedIn, Facebook, online libraries, Foursquare and so on.
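To make the mechanics a bit more tangible, here is a minimal sketch in plain Python of what a reasoner does with an inverse functional property. It is not how any particular triple store implements it. The identifiers, the ex: predicates and the helper names (same_as_classes, facts_about) are all made up for illustration, and triples are just tuples of strings.

```python
from collections import defaultdict

# Hypothetical triples extracted from three systems (all names are assumptions).
# FOAF models mailboxes as mailto: URIs, hence the prefix.
triples = [
    ("yammer:Sam.E", "foaf:mbox", "mailto:sam@example.com"),
    ("yammer:Sam.E", "ex:memberOf", "yammer:group/semweb"),
    ("twitter:Sam.E", "foaf:mbox", "mailto:sam@example.com"),
    ("twitter:Sam.E", "ex:tweetedAbout", "ex:topic/LinkedData"),
    ("linkedin:Jo", "foaf:mbox", "mailto:jo@example.com"),
]

# Properties declared as owl:InverseFunctionalProperty.
INVERSE_FUNCTIONAL = {"foaf:mbox"}


def same_as_classes(triples, inverse_functional):
    """Group subjects that share an object of an inverse functional property."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_subject = {}
    for s, p, o in triples:
        if p in inverse_functional:
            find(s)  # register the subject even if nothing else matches it
            key = (p, o)
            if key in first_subject:
                union(first_subject[key], s)  # same object, hence same subject
            else:
                first_subject[key] = s

    classes = defaultdict(set)
    for s in list(parent):
        classes[find(s)].add(s)
    return list(classes.values())


def facts_about(identifier, triples, classes):
    """All (predicate, object) pairs known for any alias of the identifier."""
    aliases = next((c for c in classes if identifier in c), {identifier})
    return {(p, o) for s, p, o in triples if s in aliases}


classes = same_as_classes(triples, INVERSE_FUNCTIONAL)
sam_facts = facts_about("twitter:Sam.E", triples, classes)
```

In this toy data, sam_facts contains the Yammer group membership even though it was asked for through the Twitter identifier, which is the “massive effect” in miniature.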

Now, imagine you do that not just for Sam E. but for your whole Twitter network. Then you’ll get a graph that can answer questions such as “From those that tweeted about topic X within my network, give me the names and emails of all people that work within 300 km from here”, or “Am I in the same discussion group as somebody that liked book Y?”. But wait, you don’t need to imagine it, you can easily do it. Here is, for example, one way to turn Twitter data into an RDF graph.
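As a rough illustration of such a cross-source question, here is a small Python sketch. It assumes the identifiers have already been linked, so the canonical mapping below stands in for the inferred owl:sameAs triples. All identifiers, predicates and helper names are hypothetical, and a real setup would more likely use SPARQL against a triple store.

```python
# Hypothetical result of sameAs inference: alias -> one canonical identifier.
canonical = {
    "twitter:Sam.E": "person:sam",
    "kindle:Sam.E": "person:sam",
    "twitter:Jo": "person:jo",
    "kindle:Jo": "person:jo",
}

# Hypothetical facts coming from two separate systems.
triples = [
    ("twitter:Sam.E", "ex:tweetedAbout", "ex:topicX"),
    ("kindle:Sam.E", "ex:liked", "ex:bookY"),
    ("twitter:Jo", "ex:tweetedAbout", "ex:topicX"),
    ("kindle:Jo", "ex:liked", "ex:bookZ"),
]


def merged(triples, canonical):
    """Rewrite every subject to its canonical identifier."""
    return {(canonical.get(s, s), p, o) for s, p, o in triples}


def who(graph, predicate, obj):
    """Subjects for which the statement (subject, predicate, obj) holds."""
    return {s for s, p, o in graph if p == predicate and o == obj}


graph = merged(triples, canonical)

# "Who tweeted about topic X and also liked book Y?"
answer = who(graph, "ex:tweetedAbout", "ex:topicX") & who(graph, "ex:liked", "ex:bookY")
```

Without the merge, the two conditions would never meet, because the tweet and the book are recorded under different identifiers.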

Of course, apart from persons, similar approaches can be applied to any other thing represented on the web: organisations, locations, artefacts, chemical elements, species and so on.

To better understand what’s going on, it’s worth remembering that there is no unique name assumption in OWL. The fact that two identifiers X and Y are different does not mean that they represent different things. If we know, or if it can be deduced, that they represent different things, this can be asserted or respectively inferred as a new triple X–owl:differentFrom–Y. In a similar way, a triple saying just the opposite, X–owl:sameAs–Y, can be asserted or inferred. Basically, as far as sameness is concerned, we can have three states: same, different, neither same nor different. Unfortunately, a fourth state, both same and different, is not allowed, and why that would be of value will be discussed in another post. Now, let’s get back to the merging of graphs.
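The three states can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it looks at directly stated pairs and does no further reasoning (no transitivity, for instance), and the function name sameness and the ex: identifiers are made up.

```python
def sameness(x, y, same_as, different_from):
    """Return 'same', 'different' or 'unknown' for two identifiers.

    same_as and different_from are sets of frozenset pairs standing in for
    owl:sameAs and owl:differentFrom statements. Only direct statements are
    checked; a real reasoner would infer more.
    """
    pair = frozenset((x, y))
    is_same = x == y or pair in same_as
    is_different = pair in different_from
    if is_same and is_different:
        # The "fourth state" is a contradiction in OWL, not a valid answer.
        raise ValueError(f"inconsistent: {x} and {y} are both same and different")
    if is_same:
        return "same"
    if is_different:
        return "different"
    # No unique name assumption: distinct names alone tell us nothing.
    return "unknown"


same_as = {frozenset(("ex:X", "ex:Y"))}
different_from = {frozenset(("ex:X", "ex:Z"))}

states = [
    sameness("ex:X", "ex:Y", same_as, different_from),
    sameness("ex:X", "ex:Z", same_as, different_from),
    sameness("ex:X", "ex:W", same_as, different_from),
]
```

With the hypothetical statements above, states holds one example of each of the three allowed cases, and asserting both relations for the same pair raises an error instead of producing a fourth one.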

Bringing together the RDF graphs about Sam E., created from the different systems, would link them into one graph just by using foaf:mbox. Most triple stores, such as Virtuoso, can do this kind of basic inferencing at run time. If you want to merge them in an ontology editor, you have to use a reasoner such as Pellet if you are using Protégé, or run inferencing with SPIN if you are using TopBraid Composer. Linking knowledge representations from different systems in a way that is independent of their underlying schemas can bring a lot of value, from utilising chains of relations to learning things not known before linking.

The power of “same as” has been used a lot for data integration, both in controlled environments and in the wild. But let’s not forget that in the latter, the web, “Anyone can say anything about anything”. This was in fact one of the leading design principles for RDF and OWL. So even with the best intentions, people can create a lot of almost “same as” relations that get mixed with the reliable “same as” relations. And they did and they do.

The problems with “same as” have received a lot of attention. In one of the most cited papers on the subject, Harry Halpin et al. outline four categories of problems for owl:sameAs: “Same Thing As But Different Context”, “Same Thing As But Referentially Opaque”, “Represents”, and “Very Similar To”. Others warn about problems with provenance. Still, almost all agree that the benefits of owl:sameAs for Linked Data far outweigh the risks, and that the latter can be mitigated by various means.

Whatever the risks with owl:sameAs on the web, they are insignificant or non-existent in corporate environments. And yet, most of the data stays in silos and gets integrated only partially, based on some concrete requirements. These requirements represent local historical needs and bring expensive solutions to local historical problems. Those solutions typically range from point-to-point interfaces with some ETL to the realisation of REST services. They can get quite impressive in their means for access and synchronisation, and yet they are all dependent on the local schemas in each application silo and helpless for enterprise-wide usage or any unforeseen need. What those solutions bring is a more complicated application landscape, additional IT investments with every change of requirements, and usually a lot of spending on MDM software, data warehouses and suchlike. All that can be avoided if the data from heterogeneous corporate and open data sources is brought together into an enterprise knowledge graph, with distributed linked ontologies and vocabularies to give sense to it, and elegant querying technologies that can bring answers instead of just search results. The Semantic Web stack is full of capabilities such as owl:sameAs that make this easy, and beautiful. Give it a try.