Modelling ReBAC Access Control System with Ontologies and Graph Databases
Contents
1. Introduction
2. Background
3. Implementation Process
— 3.1. Simple Approach Using Neo4j
— 3.2. Limitations and the Need for a Schema
— 3.3. Building the Ontology
— 3.4. Introducing Axioms and Rules
— 3.5. Testing the Ontology in Protege
4. Testing the Ontology with Python Unit Tests
5. Making Use of Ontology in Off-the-Shelf Graph DBs
— 5.1. Neo4j
— 5.2. StarDog
— 5.3. OntoText GraphDB
— 5.4. AllegroGraph
6. Results and Discussion
7. Conclusion and Future Work
Introduction
Recently, I became very interested in the topic of knowledge engineering — specifically, knowledge mining, representation, and usage/inference. One of the main ways to represent knowledge is through a semantic net or knowledge graph. I assumed it would be beneficial to choose a real-world problem and solve it using knowledge graphs. In one of my projects, I used OpenFGA as an access control system and was astonished by how easy it was to model the access relations between different entities in my domain using this system. OpenFGA’s store can be represented as a graph, so I decided to model and implement the functionality of the OpenFGA access control system using a graph database to learn more about both domains.
TL;DR
This task involves three main levels:
1. Creating a model or ontology with a set of rules.
2. Creating a database with entities and relationships.
3. Querying the database in a way that respects the rules from (1).
Initially, I skipped (1) and tried to implement OpenFGA functionality in Neo4j directly.
After intensive research, I managed to create an ontology but couldn’t find a ready-to-use graph database that supports my ontology out of the box. I ended up creating a simple Python program that emulates the implementation of (2) and (3) using the owlready2 framework.
The research was incredibly valuable — I progressed from a straightforward solution using Neo4j to creating an ontology with a set of rules in Protege and testing them using Python scripts. I explored several ready-to-use graph databases and evaluated their ability to not only save graph data but also make inferences using predefined rules.
This article describes my research and may be useful for those interested in knowledge representation and inference, creating ontologies, and solving real-world tasks using graph databases.
Background
In this research, we are modelling an access system. An access system is the part of a security system concerned with keeping resources safe: one way to do that is to restrict access to certain resources or assets to a limited number of actors (users, services, agents).
An access system has two main parts:
1. Access Management System: Decides which actor should have what kind of access to which resource.
2. Access Control System: Enforces the access rules made by the access management system.
The main question the access control system asks the access management system is, “Does this particular actor/user have a particular type of access to a particular resource?” How the access management system gets the answer depends on the model it uses to represent the access system.
The simplest way to represent it is by making a list of triples (similar to ACL): actor — access type — resource, and then providing it to the access control system. This method works fine when there are a small number of actors and resources, and the change rate of the list is not high. However, if the number of actors and resources increases or the change rate is high, we need more sophisticated models. Many models have been invented, as detailed in the “Access control models” section on this wiki, with the most popular being RBAC and ReBAC.
These models try to infer actor — access type — resource by uniting actors into groups and organizations, creating hierarchies that reflect real-world organizational structures. This way, access types can be assigned to groups instead of individuals. Similarly, resources are organized into hierarchies, so if an actor has access to a parent resource, they automatically acquire access to all descendants in the hierarchy. This approach simplifies access management.
I believe ReBAC (Relationship-based access control) is the most advanced model that can serve as a foundation for almost any other model. In ReBAC, a subject’s (actors, users) permission for a given access type to a given resource is defined by the presence of a specific relationship between the subject and the resource. Relationships can be defined and inferred. The definition of a relationship is expressed as a triple: subject-relationship-resource, which is exactly the same as triples in RDF, the W3C standard for directed graph descriptions.
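To make the triple representation concrete, here is a minimal sketch in plain Python (all names are illustrative) of a direct, ACL-style check over such triples; inferring anything beyond a literal lookup is exactly what the models above add:

# Illustrative sketch only: relationships stored as a set of
# (subject, relationship, resource) triples, ACL-style.
tuples = {
    ("anna", "writer", "document:id111"),
    ("john", "viewer", "document:id222"),
}

def check(subject: str, relation: str, resource: str) -> bool:
    # Answers the access control question for *direct* relationships only;
    # inferring relationships (groups, hierarchies) is what ReBAC adds on top.
    return (subject, relation, resource) in tuples

print(check("anna", "writer", "document:id111"))  # True
print(check("anna", "viewer", "document:id111"))  # False: nothing is inferred here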
OpenFGA is a ReBAC system based on the principles Google used to create their own ReBAC system, Zanzibar, as described in this paper.
In OpenFGA, both parts of the access system are implemented:
1. Access Management System:
1.1. This system can be modelled by creating a custom schema that includes the definitions of meaningful (for your domain) types of resources, actors/subjects/users, and relationships between them. Additionally, simple relationship-inference rules can be added, such as propagating access over a hierarchy of resources.
1.2. At runtime, relationships can be added to the system in the format subject — relationship — resource/object.
2. Access Control System:
2.1. At runtime, the main function of any access control system is to check whether a particular relationship exists between a subject and a resource/object. The relationship could denote an access type (a sketch of such a check follows this list).
2.2. Additionally, many useful functions are implemented, such as returning the list of subjects that have a particular relationship/access to a particular resource.
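As an illustration of 2.1, this is roughly what a check request against OpenFGA’s HTTP API looks like. It is a hedged sketch: the server URL, store ID, and identifiers are placeholders, and the official SDKs wrap the same call.

import requests

# Hypothetical values: replace with your OpenFGA server URL and store ID.
OPENFGA_URL = "http://localhost:8080"
STORE_ID = "01HXXXXXXXXXXXXXXXXXXXXXXX"

def check(user: str, relation: str, obj: str) -> bool:
    # Ask OpenFGA: does `user` have `relation` on `obj`?
    response = requests.post(
        f"{OPENFGA_URL}/stores/{STORE_ID}/check",
        json={"tuple_key": {"user": user, "relation": relation, "object": obj}},
    )
    response.raise_for_status()
    return response.json()["allowed"]

print(check("user:anna", "viewer", "document:id111"))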
I started with the most straightforward implementation of some OpenFGA examples in Neo4j, using Neo4j DB as the access management system and Cypher queries to answer the main question of the access control system: “Does a given user have a particular type of access to a particular resource?”
Implementation Process
Simple Approach Using Neo4j
At first, I took one of the OpenFGA schemas from their examples and tried to implement it directly in Neo4j by adding users, documents, and allowed relations between them.
I downloaded and installed their DB locally from here and used their Cypher query language to manipulate the data: both adding new entries and querying them.
The OpenFGA schema I used as an example:
model
  schema 1.1
type user
type document
  relations
    define owner: [user]
    define writer: [user] or owner
    define commenter: [user] or writer
    define viewer: [user] or commenter
Let’s create some entities that satisfy this schema in Neo4j. I created two documents with distinct IDs and three users with different types of access (relations) to the documents:
merge (doc:Document { id : "id111"})
merge (anna:User { id : "Anna"})
merge (john:User { id : "John"})
merge (anna)-[:WRITER]->(doc)
merge (doc1:Document {id: "id222"})
merge (john)-[:VIEWER]->(doc1)
merge (artem:User { id : "Artem"})
merge (artem)-[:OWNER]->(doc)
It looked very easy, so I took a more advanced OpenFGA schema that includes a new class or type domain:
model
  schema 1.1
type user
type document
  relations
    define owner: [user, domain#member]
    define writer: [user, domain#member] or owner
    define commenter: [user, domain#member] or writer
    define viewer: [user, domain#member] or commenter
type domain
  relations
    define member: [user]
Let’s create Neo4j nodes and relationships that satisfy the above schema:
merge (doc:Document { id : "id333"})
merge (d:Domain { id : "SalesDomain"})
merge (bill:User { id : "Bill"})
merge (bill)-[:MEMBER]->(d)
merge (d)-[:COMMENTER]->(doc) // it's actually wrong as it's not a domain that can comment but a member of a domain
merge (lis:User { id : "Lis"})
merge (lis)-[:COMMENTER]->(doc)
This is how the graph looks:
Now we can emulate the access control system and get the answer to the question: which users have commenter access to the document with id=id333?
match (doc:Document { id : "id333"})
match (u:User)-[:MEMBER*0..3]-()-[:COMMENTER*..3]->(doc)
return u
And this is the result of the query:
Both Lis and Bill have commenter access to the document. Lis has a direct relationship, and Bill has a transitive relationship as he is a member of the domain SalesDomain, which has a relationship with the document.
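To see how an access control system would consume this, here is a hedged sketch that hides the same kind of Cypher behind a yes/no check function using the official Neo4j Python driver (the connection details are placeholders):

from neo4j import GraphDatabase

# Placeholder connection details for a local Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

CHECK_COMMENTER = """
MATCH (u:User { id: $user_id })
MATCH (doc:Document { id: $doc_id })
OPTIONAL MATCH path = (u)-[:MEMBER*0..3]-()-[:COMMENTER*..3]->(doc)
RETURN count(path) > 0 AS allowed
"""

def has_commenter_access(user_id: str, doc_id: str) -> bool:
    # The access control question: does this user have commenter access to this document?
    with driver.session() as session:
        record = session.run(CHECK_COMMENTER, user_id=user_id, doc_id=doc_id).single()
        return bool(record and record["allowed"])

print(has_commenter_access("Bill", "id333"))  # True, via the SalesDomain membership
print(has_commenter_access("Lis", "id333"))   # True, direct relationship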
Lastly, I tried to implement the most advanced schema that includes hierarchical relationships between documents. This means if a user has access to a parent document, they automatically have access to all descendants of this document:
model
  schema 1.1
type user
type document
  relations
    define owner: [user, domain#member] or owner from parent
    define writer: [user, domain#member] or owner or writer from parent
    define commenter: [user, domain#member] or writer or commenter from parent
    define viewer: [user, domain#member] or commenter or viewer from parent
    define parent: [document]
type domain
  relations
    define member: [user]
I created Neo4j nodes and relationships to illustrate this schema:
merge (doc:Document { id : "id444"})
merge (docParent:Document { id : "id555"})
merge (docParent)-[:PARENT]->(doc)
merge (chris:User { id : "Chris"})
merge (chris)-[:VIEWER]->(docParent)
merge (doc1:Document { id : "id666"})
merge (chris)-[:VIEWER]->(doc1)
The graph:
Now, we can use the following query to find out which documents Chris can view:
match (chris:User { id : "Chris"})
match (chris)-[:VIEWER]-(doc1)-[:PARENT*0..]->(doc)
return doc1, doc
The result of the query:
Chris can view all three documents. With two of them, there is a direct relationship, and the third one is accessible as a child of one of the other two through the transitive Parent relationship between documents.
Limitations and the Need for Schema
All of that was rather easy, but this approach is neither scalable nor very maintainable: there is no explicit schema describing which types of objects exist and which kinds of relations are possible, and the rules live inside the queries, which makes it hard to reason about the underlying logic.
Compare that bunch of queries with the OpenFGA schema, where it's obvious at first glance what the possible types of entities and relationships are, and what the inference rules are (for example, if a user is an owner of a document, then they have write access as well).
This concise and expressive syntax allows us to add new types and relations easily in one place and reason about the overall schema without inspecting queries and database content.
Thus, I decided that I needed to extract the schema with rules and thereby split the data and the schema, making it possible to work with the schema without touching data and queries.
The following script creates a set of classes and the relationships between them in Neo4j: three classes (types in OpenFGA terminology) — user, document, and domain — and six relationships.
merge (doc:Class { label : "Document"})
merge (user:Class { label : "User"})
merge (domain:Class { label : "Domain"})
merge (doc)-[:PARENT]->(doc)
merge (user)-[:OWNER]->(doc)
merge (user)-[:WRITER]->(doc)
merge (user)-[:COMMENTER]->(doc)
merge (user)-[:VIEWER]->(doc)
merge (user)-[:MEMBER]->(domain)
The schematic view of the schema:
The schema looks really nice, but it lacks complex relationships (or rules), such as:
1. The transitive nature of the PARENT relationship, so if (A)-[:PARENT]->(B)-[:PARENT]->(C), then (A)-[:PARENT]->(C).
2. The piece of schema `or owner from parent` in the owner definition (and the same for writer, viewer, and commenter), which means that if a user is an owner of a document that has children, then the user owns all those children and their descendants.
3. The `domain#member` part in all document relation definitions.
Let's take (3): how to express the ownership relationship between a member of a domain and a document. Strictly speaking, only a user can own a document, but that user can simultaneously be a member of a domain. What we want is to be able to state a permission rule like `(domain.member)-[:COMMENTER]->(doc)` (pseudo-code, of course). This doesn't seem possible at the schema level; we would need to write specific queries for that.
So, the next phase of my research is finding a way to abstract the schema and all those complex relationships or rules from data and allow them to exist and evolve separately.
Why might we need that?
There are multiple reasons: segregation of responsibility, reusability, better testability, better maintainability, and extensibility. One simple way of thinking about that is like this: we can split the creation of the Management access system and the creation of the Access control system. One team, which creates the Management access system, will be creating a model (schema + rules) and then applying it to the database, which happens to be at the intersection of two teams. The second team will be working on the Access control system — just querying the shared database to answer the only question that interests them: “Does a given user have a given type of access to a given resource?” (or some derivatives, like a set of users who have access, or a set of resources a given user has access to, etc.). The Access control system team doesn’t have to know about all those complex rules; they should use a simple query and get simple results.
When I explored the subject, it turned out that the role of the schema in graph databases (and knowledge graphs in general) is played by an ontology. If we imagine that knowledge graphs and graph databases contain knowledge, then an ontology is a model or schema of this knowledge.
Thus, the next step is building an ontology for one of the OpenFGA schemas.
Building the Ontology
I used the tool Protege to visually create the ontology (as far as I know, it's the most popular tool for that purpose) and Turtle as the syntax and file format for the exported artifacts, as it is easily human-readable in contrast to the usual RDF/XML format.
The most widespread model of knowledge representation is RDF, a W3C standard that represents knowledge as a directed graph of triples: subject-predicate-object. It's the most general way to represent knowledge and doesn't impose many constraints; it just defines a handful of types like `class`, `property`, `predicate`, and so on.
To describe domain-specific ontologies, we need a more expressive language based on RDF. This is RDF Schema, a semantic extension to RDF. Another famous semantic extension is OWL. It’s possible to use multiple extensions when building your own ontology, as they are all based on RDF. However, there could be some overlaps we need to pay attention to.
I used both RDFS and OWL concepts to build my ontology, as it’s a common approach and is supported by the Protege tool.
Let’s again take this schema:
model
  schema 1.1
type user
type document
  relations
    define owner: [user, domain#member] or owner from parent
    define writer: [user, domain#member] or owner or writer from parent
    define commenter: [user, domain#member] or writer or commenter from parent
    define viewer: [user, domain#member] or commenter or viewer from parent
    define parent: [document]
type domain
  relations
    define member: [user]
There are three obvious classes: `User`, `Document`, `Domain`. Creating them in Protege was easy.
The next step is to create relationships (relations in OpenFGA notation). There is no such term as `Relationship` in OWL; to model them, we use properties. There are two main types of properties in OWL: Object properties and Data properties. An Object property connects a subject with an object where both are individuals (basically our relationship), and a Data property connects a subject with a literal value (what we usually call an attribute). We don't need the latter, as our schema supports only relationships between instances of classes. An Object property (relationship) has two important parameters: domain and range. Domain defines the class of entities the relationship starts from, and range defines the class of entities the relationship ends with. For instance, the relationship `OWNER` (let's use capital letters for relationships) connects User (domain) to Document (range), meaning that in our data, there will be instances of this relationship connecting specific instances of User and Document.
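For readers who prefer code to the Protege UI, here is a hedged sketch of the same classes and the `OWNER` Object property with its domain and range in owlready2 (the ontology IRI is a placeholder):

from owlready2 import get_ontology, Thing, ObjectProperty

# Placeholder IRI; in Protege the same structure is created through the UI.
onto = get_ontology("http://example.org/openfga-demo.owl")

with onto:
    class User(Thing): pass
    class Document(Thing): pass
    class Domain(Thing): pass

    class OWNER(ObjectProperty):
        domain = [User]      # the relationship starts from a User...
        range = [Document]   # ...and ends at a Document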
When I started creating relationships (Object properties), I found out that there is no easy way to implement the `define owner: [domain#member]` part, because a property's domain and range can only be classes, whereas `domain#member` refers to a relationship (being a member of a domain), not a class.
Thus, I decided to unpack this tricky relationship like this:
model
  schema 1.1
type user
type document
  relations
    define owner: [user] or owner from parent or member from owner_domain
    define writer: [user] or owner or writer from parent or member from writer_domain
    define commenter: [user] or writer or commenter from parent or member from commenter_domain
    define viewer: [user] or commenter or viewer from parent or member from viewer_domain
    define parent: [document]
    define owner_domain: [domain]
    define writer_domain: [domain]
    define commenter_domain: [domain]
    define viewer_domain: [domain]
type domain
  relations
    define member: [user]
The above schema works exactly like the initial one but is a bit more verbose. However, what is very clear here is that there are relationships between documents and domains, which wasn’t as easily understandable in the initial schema. We can use this trick to implement four additional relationships between Document and Domain classes.
Also, I ticked the `Transitive` checkbox on the `PARENT` relationship that links instances of Document, as it will help us later to implement the assumption that if a user has some type of relationship with a parent document, they should have the same relationship with all its descendants (children, grandchildren, and so on).
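For reference, the same "Transitive" characteristic corresponds to declaring the property as an owl:TransitiveProperty; in owlready2 it might look like this (a sketch under the same placeholder ontology as above):

from owlready2 import get_ontology, Thing, ObjectProperty, TransitiveProperty

onto = get_ontology("http://example.org/openfga-demo.owl")

with onto:
    class Document(Thing): pass

    # Equivalent of ticking the "Transitive" checkbox for PARENT in Protege:
    class PARENT(ObjectProperty, TransitiveProperty):
        domain = [Document]
        range = [Document]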
Finally, the graph for this particular schema looks like this:
It looks great, but unfortunately it's not enough: even though we added relationships between Domain and Document, that alone doesn't mean that if a user is a member of a domain that has, say, an `OWNER_DOMAIN` relationship with a document, this user will have an `OWNER` relationship with that document.
Introducing Axioms and Rules
This looks like a relationship chain, something like `MEMBER` + `OWNER_DOMAIN` = `OWNER`. It turns out we can express this in two ways:
- As a `propertyChainAxiom`, which is part of the OWL 2 standard.
- As an SWRL rule.
The first way, as an axiom:
:OWNER rdf:type owl:ObjectProperty ;
       rdfs:domain :User ;
       rdfs:range :Document ;
       owl:propertyChainAxiom ( :MEMBER :OWNER_DOMAIN ) .
Here, the `OWNER` relationship is added whenever there is a chain of two relationships: `MEMBER` followed by `OWNER_DOMAIN`. In the picture above, we can imagine that there are no direct links between User and Document, but once there is a chain of `User->MEMBER->Domain->OWNER_DOMAIN->Document`, a new relationship `User->OWNER->Document` is added. Of course, this new relationship should be computed at some point — it can’t just magically appear. In Protege, this computation is done by the Reasoner (Inference engine). Once we add such an axiom and some individuals (like users, documents, and domains), we should go to the menu Reasoner and then Start Reasoner or Synchronize Reasoner if it has already been started. Computed relationships will be visible with a yellow background.
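For completeness, the same chain can also be attached programmatically rather than through the Protege UI; a hedged sketch with owlready2, assuming the ontology with `MEMBER`, `OWNER_DOMAIN`, and `OWNER` is already available (the file name is a placeholder):

from owlready2 import get_ontology, PropertyChain

# Placeholder path to the exported ontology.
onto = get_ontology("file://openfga-final.owl").load()

with onto:
    # OWNER is implied by the chain: MEMBER followed by OWNER_DOMAIN.
    onto.OWNER.property_chain.append(PropertyChain([onto.MEMBER, onto.OWNER_DOMAIN]))

# As in Protege, the implied OWNER links only appear after a reasoner run.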
Like in the screenshot below, where I added some individuals, and John is a member of the Admins domain, which has a `WRITER_DOMAIN` relationship with Article-1-1. Thus, John should have a `WRITER` relationship with the same article and all its descendants (Article-1-1-1). Actually, all John's relationships are computed. Some of them will be explained later in this article.
The second option is to use SWRL rules. It's a bit harder, as it requires knowledge of the syntax and more steps in Protege, but it is easier to maintain: all rules are located in one place, so they can be reasoned about as a whole without going back and forth searching for axioms in classes. It also provides additional separation of structure and logic. The rule will look like this:
User(?u) ^ Document(?d) ^ Domain(?dom) ^ MEMBER(?u, ?dom) ^ OWNER_DOMAIN(?dom, ?d) -> OWNER(?u, ?d)
This states that if there exists a User `u`, a Document `d`, and a Domain `dom`, and `u` is a `MEMBER` of a domain that is `OWNER_DOMAIN` of `d`, then `u` is `OWNER` of `d`.
The syntax is pretty obvious and easy to grasp. Also, some logic can’t be expressed with property axioms, as I will show later on. Thus, I decided to implement all logic as SWRL rules.
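Since everything will be tested from Python later anyway, it's worth noting that the same rules can also be added programmatically; a hedged owlready2 sketch (the file name is a placeholder, the rule text is identical to the one above):

from owlready2 import get_ontology, Imp, sync_reasoner_pellet

# Placeholder path to the ontology exported from Protege in RDF/XML.
onto = get_ontology("file://openfga-final.owl").load()

with onto:
    rule = Imp()
    rule.set_as_rule(
        "User(?u) ^ Document(?d) ^ Domain(?dom) ^ "
        "MEMBER(?u, ?dom) ^ OWNER_DOMAIN(?dom, ?d) -> OWNER(?u, ?d)"
    )

# SWRL rules are applied by the Pellet reasoner bundled with owlready2 (requires Java).
sync_reasoner_pellet(infer_property_values=True)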
In Protege, to use SWRL rules, you need to activate the SWRL tab in the menu Window/Tabs, as shown in the screenshot:
Then, use the New button to add rules.
After adding rules, you need to tap the OWL+SWRL->Drools button, then the Run Drools button, and finally Synchronize Reasoner to see the new inferred relationships.
In the same way, we can implement the transitivity of all access relationships over the document hierarchy. This means that a user who owns a document should own all its descendants (user->OWNER->document1->PARENT->document2 means user->OWNER->document2).
As a `propertyChainAxiom`:
:OWNER rdf:type owl:ObjectProperty ;
       rdfs:domain :User ;
       rdfs:range :Document ;
       owl:propertyChainAxiom ( :OWNER :PARENT ) .
And as an SWRL rule:
User(?u) ^ Document(?d) ^ OWNER(?u, ?d) ^ Document(?child) ^ PARENT(?d, ?child) -> OWNER(?u, ?child)
The last thing to implement is the `OR` syntax in the OpenFGA schema, or more specific logic that defines that if a user is an `OWNER` of a document, then they should be a `WRITER` of this document as well. Similarly, if a user is a `WRITER` of a document, they should be a `COMMENTER` of this document, and finally, if they are a `COMMENTER`, then they should be a `VIEWER` too.
This is a very easy task for SWRL rules:
User(?u) ^ Document(?d) ^ OWNER(?u, ?d) -> WRITER(?u, ?d)
User(?u) ^ Document(?d) ^ WRITER(?u, ?d) -> COMMENTER(?u, ?d)
User(?u) ^ Document(?d) ^ COMMENTER(?u, ?d) -> VIEWER(?u, ?d)
I really like the expressiveness of these rules, even more than the OpenFGA notation: it's immediately clear that an owner is a writer, a writer is a commenter, and a commenter is a viewer.
Once we add these rules and synchronize the Reasoner, we can see all those new relationships inside the individuals, as shown in one of the previous screenshots with John.
There is one more tricky construct in OpenFGA that I couldn't express cleanly without SWRL rules: `user:*`, which means a document can be viewable by all users.
Let's update the OpenFGA schema by adding the `user:*` userset (a term OpenFGA inherits from Zanzibar) to the viewer relation:
model
  schema 1.1
type user
type document
  relations
    define owner: [user, domain#member] or owner from parent
    define writer: [user, domain#member] or owner or writer from parent
    define commenter: [user, domain#member] or writer or commenter from parent
    define viewer: [user, domain#member, user:*] or commenter or viewer from parent
    define parent: [document]
type domain
  relations
    define member: [user]
First, I tried to use OWL to express this relationship, but the problem is that this relationship makes sense only for a particular instance of a document. The only way to express it using a restriction is:
:User rdf:type owl:Class ;
      rdfs:subClassOf [ rdf:type owl:Restriction ;
                        owl:onProperty :VIEWER ;
                        owl:hasValue :Article-1-1
                      ] .
This can be read as: any instance of User has an inferred `VIEWER` relation with the document instance `Article-1-1`. It looks like a workaround because we mix the schema/ontology with the individuals.
To overcome this, I decided to create a new class `PublicDocument` as a subclass of the `Document` class. Then I tried to find a way to add a restriction to the `User` class so that any user would have a `VIEWER` relationship with any document of the `PublicDocument` class. However, it turned out not to be possible (at least, I failed to find a way).
Thus, I switched back to using SWRL rules and added this very simple and easy-to-understand rule:
User(?u) ^ PublicDocument(?d) -> VIEWER(?u, ?d)
Testing the Ontology in Protege
This is what I’ve got in Protege: a fully working ontology, which I tested by adding multiple individuals and checking inferred relationships.
There is a link to a GitHub repo where I saved the project in both Turtle and RDF/XML formats, so you can open them in Protege and see the same.
Individuals:
Users
1. Artem (OWNER of Article-1)
2. John (MEMBER of Admins)
3. Kira
Documents
1. Article-1 (PARENT of Article-1-1)
2. Article-1-1 (PARENT of Article-1-1-1)
3. Article-1-1-1
Domains
1. Admins (WRITER_DOMAIN of Article-1-1)
Testing the Ontology with Python Unit Tests
Once I finished the implementation, the developer inside me started thinking about tests — how can we make sure that our ontology works as expected so that once we add many individuals and relationships, the inferred relationships will be computed correctly?
Of course, there is a manual way to test it using Protege itself. I created some individuals and relationships, synchronized the reasoner, and checked that the computed/inferred relationships marked with a yellow background were correct.
However, this example is simple, and I can easily imagine an ontology with hundreds of classes, thousands of relationships, and dozens of rules, which will not be possible to test manually after each change.
Another way to test the ontology is to write some tests in Java using the OWL API or in Python using the owlready2 library. I chose Python and wrote a simple script that loads the ontology from a file, creates the same set of individuals and relationships as I created in Protege, and then runs several tests, each of which tests a certain set of inferred relationships.
The piece of code that creates the individuals:
def setUp(self):
    # Create test individuals
    self.article1 = onto.Document("Article-1")
    self.article11 = onto.Document("Article-1-1")
    self.article111 = onto.Document("Article-1-1-1")
    self.public_article = onto.PublicDocument("PublicArticle-1")
    self.admins = onto.Domain("Admins")
    self.artem = onto.User("Artem")
    self.john = onto.User("John")
    self.kira = onto.User("Kira")
    # Article-1 is parent of Article-1-1
    self.article1.PARENT.append(self.article11)
    # Article-1-1 is parent of Article-1-1-1
    self.article11.PARENT.append(self.article111)
    # Artem owns Article-1; John is a member of the Admins domain,
    # which is WRITER_DOMAIN of Article-1-1
    self.artem.OWNER.append(self.article1)
    self.john.MEMBER.append(self.admins)
    self.admins.WRITER_DOMAIN.append(self.article11)
One of the tests:
def test_admin_member_can_write_documents(self):
    self.assertIn(self.article11, self.john.WRITER)
    self.assertIn(self.article111, self.john.WRITER)  # transitively over the document hierarchy
    self.assertNotIn(self.article1, self.john.WRITER)  # no way to go up the document hierarchy
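For these assertions to see anything, the inferred relationships have to be materialized first by running the reasoner after the individuals are created. A minimal sketch of that step, assuming the RDF/XML export mentioned below (the file name is a placeholder):

from owlready2 import get_ontology, sync_reasoner_pellet

# Load the ontology converted to RDF/XML (owlready2 doesn't read Turtle);
# the file name is a placeholder.
onto = get_ontology("file://openfga-final.owl").load()

# ... create individuals and relationships (see setUp above) ...

# Pellet applies both the OWL axioms and the SWRL rules; the inferred object
# properties (WRITER, VIEWER, ...) then become visible on the individuals.
sync_reasoner_pellet(infer_property_values=True, infer_data_property_values=True)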
There is a huge output, so I copied just the last lines in case of a successful pass:
And the output for a failure (I tested that Kira should have WRITER access to the public document):
I found it rather easy to write tests and to debug the errors, at least in Python using `owlready2`.
The link to the Python file with the tests is on GitHub. First, I converted my ontology from Turtle to RDF/XML, as owlready2 doesn't support Turtle; however, you can export the ontology in RDF/XML directly from Protege and skip this step.
Making Use of Ontology in Off-the-Shelf Graph DBs
Now I have a well-tested and perfectly working ontology, and the next step is to make use of it — upload it to a real graph DB and make the reasoner (inference engine) work on any data change so I can easily query if a given user has a particular type of access to a particular resource. At least, that was the plan.
In reality, none of the graph DBs I tested support this functionality. For each graph DB, I installed it or used the free cloud version, then tried to upload the ontology, create individuals, and make a simple query to test whether the inference engine worked and added new relationships.
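For the RDF-native stores below (StarDog, GraphDB, AllegroGraph), the "simple query" boiled down to something of this shape; here is a hedged Python illustration using SPARQLWrapper (the endpoint URL and namespace are placeholders, and each product also has its own client libraries):

from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint; each DB exposes its own SPARQL URL.
sparql = SPARQLWrapper("http://localhost:7200/repositories/openfga")
sparql.setReturnFormat(JSON)

# Who has (direct or inferred) VIEWER access to Article-1-1-1?
sparql.setQuery("""
    PREFIX : <http://example.org/openfga#>
    SELECT ?user WHERE {
        ?user :VIEWER :Article-1-1-1 .
    }
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["user"]["value"])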
Neo4j
I started with Neo4j. I downloaded and installed their DB locally from here. There was a way to upload the ontology using their additional toolkit n10s, which needs to be installed separately. Once installed, I could upload the ontology using this command:
CALL n10s.onto.import.fetch("file:///Users/xxx/openfga-final.ttl", "Turtle");
Unfortunately, the uploaded ontology was not treated as a schema by Neo4j but as just another set of entries in their data graph. This means I couldn’t see any classes and relationships, just a bunch of nodes — some were classes and some object properties (not relationships in Neo4j terminology). This means that the data I add will not be validated using the ontology, and inference will not work unless I implement everything manually using their scripts and SHACL constraints language. I didn’t want to reimplement everything again as I already had a perfectly working ontology in the most popular format.
With Neo4j, I had the worst experience: uploading an ontology is not directly supported, and even after using the toolkit, there is no way to inspect the ontology as a schema and use it at least to query the data by class, not to mention inference.
StarDog
I used their free cloud solution. I created a project and uploaded my ontology from a file — it was pretty easy, and I didn’t have to use any add-ons or workarounds. All classes and relationships were recognized, and there is a way to display them as a diagram:
All my individuals were uploaded as well. The only problem was that SWRL rules are not supported by StarDog, so when I made the following query, only relationships that were defined in the file were displayed, even though I turned on reasoning.
I liked working with StarDog, and I'll probably give it another try using a different approach.
OntoText GraphDB
I downloaded GraphDB from their official website, installed it, and then uploaded my ontology using this short manual. It was easy, and after this, I was able to see my ontology, but in a very weird way:
I would prefer GraphDB to hide all the types related to RDF and OWL and leave only my types and relationships. Querying was really smooth; GraphDB even recognized my namespace, so I didn't have to add it myself as in StarDog:
Unfortunately, GraphDB doesn't support SWRL rules, just like StarDog and the other DBs. The query returns only one result, even though I turned inference on.
AllegroGraph
I used their free cloud solution, which should be enough for this research. I just imported the Turtle file with my ontology and individuals, and it worked straight away, as you can see in the screenshot:
Yes, it doesn’t support SWRL rules either, so there is only one result returned to this query (the reasoning was on).
Results and Discussion
The initial idea of this research was to take the OpenFGA schema, model it, and implement it in a graph DB so it's fully usable. It turned out that this is possible, but only in a very ad hoc way, specific to each graph DB. For instance, in Neo4j, we would need to create a set of SHACL constraints to force the data to comply with our schema (which would also have to be recreated in Neo4j), and a lot of logic would need to be implemented as special queries in its query language, Cypher. In other DBs, it is possible to upload the universal ontology created in Protege, so we wouldn't need special constraints. However, none of the DBs I tested supported SWRL rules, so we would need to create rules programmatically or use special SPARQL queries to check for inferred relationships.
I’m not saying that the ad hoc way of implementing an access control system is wrong or worse than a way with Ontology and a set of SWRL rules. I just wanted to extract the schema and related logic and make them independent of the DB used to store the real data (individuals) so that we split the design of the system and its runtime implementation.
Actually, the way I chose may not be very suitable for the task at hand. Running the reasoner on each change of the DB may be very expensive in terms of computation (running the rules for each individual) and from a storage perspective (storing inferred relationships) for big systems with thousands of users and resources. I think the key factor here would be the complexity of the ontology, including the number of rules and the change rate of the system. So, for systems with a rather simple ontology (like we explored — three classes and a dozen relationships) and a high change rate (multiple changes per second), the way with the reasoner is not suitable at all. But for systems with a very complex ontology and a low change rate, it makes perfect sense to have a universal ontology with a huge set of rules and run the reasoner rarely, even if it would take a long time to compute all the absent relationships.
One way to implement the system with ontology and rules using graph DBs that don’t support SWRL rules would be to implement middleware between the ontology and a graph DB in Java or Python (something like the CQRS pattern). This middleware would be responsible for ensuring any change in the DB complies with the ontology and running the reasoner on each change so the DB will always be in the most consistent and correct form. Meanwhile, the querying part will still be handled by the internals of the graph DB (SPARQL or Cypher) and done in the most efficient way from a response time perspective.
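A very rough sketch of what such middleware could look like in Python (all names are hypothetical; writes go through owlready2 plus the reasoner, reads go straight to the graph DB):

from owlready2 import get_ontology, sync_reasoner_pellet

class AccessControlMiddleware:
    # Hypothetical write-side middleware: it owns the ontology, validates changes,
    # runs the reasoner, and pushes the materialized graph to the query-side DB.

    def __init__(self, ontology_iri: str, graph_db_client):
        self.onto = get_ontology(ontology_iri).load()
        self.db = graph_db_client  # any client that can (re)load RDF data, hypothetical here

    def add_relationship(self, subject_name: str, relation: str, object_name: str):
        subject = self.onto.search_one(iri="*" + subject_name)
        obj = self.onto.search_one(iri="*" + object_name)
        if subject is None or obj is None or not hasattr(subject, relation):
            raise ValueError("Change does not comply with the ontology")
        getattr(subject, relation).append(obj)  # e.g. john.MEMBER.append(admins)
        self._reinfer_and_sync()

    def _reinfer_and_sync(self):
        # Recompute the inferred relationships (the expensive step discussed above).
        sync_reasoner_pellet(infer_property_values=True)
        # Hand the fully materialized graph to the graph DB, which keeps serving
        # the fast read-side queries (SPARQL/Cypher).
        self.onto.save(file="materialized.owl", format="rdfxml")
        self.db.reload("materialized.owl")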
Conclusion and Future Work
The main result of my research, as I see it, is the demonstration of a theoretical possibility to extract some part of the knowledge-related structure and logic from a common software development process into a separate ontology development process. This process involves the creation and testing of the ontology (which serves as the knowledge schema).
This approach can potentially help move some complex and bug-prone code into a medium where it’s much easier to create and test logic related to knowledge processing.
I really enjoyed working with ontologies and logic in the form of axioms and rules. Therefore, I’m keen to continue exploring how these knowledge-related concepts and tools could be used as part of the common software development process for different use cases.
If you’ve made it this far, it means you’re really interested in the topic of ontologies and knowledge engineering. In that case, I would be happy to connect with you on LinkedIn and perhaps collaborate on some interesting projects in this area.