AMZ DIGICOM

Digital Communication

Graph databases: graph-oriented databases

A graph database is a type of database that stores data in the form of nodes and edges. This approach allows complex relationships to be modeled and queried efficiently. Graph databases are therefore particularly suited to applications that deal with highly interconnected information.

What is a graph-oriented database made of and what is it used for?

A graph-oriented database (graph database) is built on a graph-based representation of data. Graphs make it possible to represent, in a readable way, information that is interconnected in complex ways, together with the relationships between the items, and to store all of it in one large, coherent data set.

The graphs are composed of nodes, which are named and uniquely identifiable entities or data objects, and edges, which represent the relationships between those objects. Visually, nodes are drawn as points and edges as lines. Each edge therefore has a start point and an end point, while each node has some number of relationships to other nodes, whether incoming, outgoing or undirected.
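As a minimal sketch of this structure, the Python snippet below keeps a set of nodes and a list of directed edges, then counts the incoming and outgoing relationships of a node. The sample names and the `degree` helper are assumptions made for illustration, not part of any particular graph database.

```python
from collections import defaultdict

# Hypothetical sample data: node identifiers and directed edges (start, end).
nodes = {"alice", "bob", "carol"}
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]

outgoing = defaultdict(list)  # node -> nodes it points to
incoming = defaultdict(list)  # node -> nodes that point to it
for start, end in edges:
    outgoing[start].append(end)
    incoming[end].append(start)


def degree(node):
    """Return (number of incoming edges, number of outgoing edges)."""
    return len(incoming[node]), len(outgoing[node])
```

Here `degree("bob")` reports one incoming and one outgoing relationship, which matches the picture of a point with one line arriving and one line leaving.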

Graph-oriented databases are used in particular to analyze user relationships in social networks or purchasing behavior in online stores. By storing the relationships, it is then possible to propose product and friend recommendations and create personalized networks of people and products.
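A friend recommendation of this kind can be sketched in a few lines of Python. The example below assumes a small, made-up friendship graph and ranks friends of friends by the number of mutual friends; real products use their own query languages and richer scoring, so this only illustrates the idea.

```python
from collections import Counter

# Hypothetical undirected friendships: user -> set of direct friends.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def recommend_friends(user):
    """Rank friends-of-friends by how many mutual friends they share with user."""
    counts = Counter()
    for friend in friends[user]:
        for candidate in friends[friend]:
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in friends[user]:
                counts[candidate] += 1
    return counts.most_common()
```

For `"alice"`, the candidate `"dave"` ranks first because he is reachable through two of her friends, while `"erin"` is reachable through only one.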

Note

Relational databases store data in tables and use SQL to query it. In contrast, graph databases are part of the NoSQL family and offer a more flexible structure to efficiently manage complex relationships between data.

Examples of graph databases

There are different concepts that describe the structure of these graph-oriented databases. The best known are the Labeled Property Graph (LPG) and the Resource Description Framework (RDF).

Labeled Property Graph

In a Labeled Property Graph, each node and edge of the graph can carry properties as well as labels. These elements store specific information about entities or relationships. Labels are used to categorize items: for example, a node can be labeled "Person" or "Business", while properties hold additional attributes such as name, age or geographic coordinates.

This structure allows for very flexible and high-performance data queries, because relationships and properties are directly stored in the database and can be retrieved by simple queries. LPGs are particularly suited to modeling complex networks, where entities and their connections are described in different contexts.
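To make the model concrete, here is a small Python sketch of a labeled property graph. The `Node` and `Edge` classes, the `WORKS_AT` and `KNOWS` labels and the sample data are all hypothetical; they only mirror the idea of labels plus key-value properties on both nodes and edges.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    start: int
    end: int
    label: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample graph: two people and one business.
nodes = [
    Node(1, {"Person"}, {"name": "Alice", "age": 34}),
    Node(2, {"Person"}, {"name": "Bob", "age": 41}),
    Node(3, {"Business"}, {"name": "Acme"}),
]
edges = [
    Edge(1, 3, "WORKS_AT", {"since": 2019}),
    Edge(2, 3, "WORKS_AT", {"since": 2021}),
    Edge(1, 2, "KNOWS"),
]


def employees_of(business_name):
    """Names of nodes linked by a WORKS_AT edge to the named Business node."""
    by_id = {n.node_id: n for n in nodes}
    return sorted(
        by_id[e.start].properties["name"]
        for e in edges
        if e.label == "WORKS_AT"
        and "Business" in by_id[e.end].labels
        and by_id[e.end].properties.get("name") == business_name
    )
```

The query filters on an edge label, a node label and a property at once, which is the kind of lookup the model is meant to make straightforward.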

Resource Description Framework

In the Resource Description Framework, information is organized in triples composed of a subject, a predicate and an object, which provides a simple structure for representing relationships between entities. Each triple constitutes an assertion, where the subject designates the resource, the predicate describes the property or relationship, and the object represents the value or another resource.

With RDF, data can be linked in a standardized way, allowing it to be combined and retrieved across different systems. This flexibility makes RDF particularly useful for applications that rely on connecting data from diverse sources, such as in the case of knowledge graphs.
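The triple structure can be illustrated with plain Python tuples. In the sketch below, the `ex:` prefix is a made-up stand-in for full resource identifiers, and `match` is a hypothetical helper that treats `None` as a wildcard; it is not the API of any RDF library.

```python
# Hypothetical triples of the form (subject, predicate, object).
triples = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:acme", "ex:name", "Acme"),
}


def match(subject=None, predicate=None, obj=None):
    """Return all triples that fit the pattern; None matches anything."""
    return sorted(
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )
```

Calling `match(subject="ex:alice")` returns every assertion made about that resource, and `match(predicate="ex:name")` returns every name assertion regardless of subject.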

When using a graph database, you can use different types of queries. This is mainly because there is no standard, uniform query language. Unlike traditional models, graph databases also implement specific algorithms to accomplish their essential mission: to facilitate and accelerate complex data queries.

Among the most important algorithms are depth-first and breadth-first traversal. A depth-first search follows one path to the next deeper node before backtracking, while a breadth-first search explores the graph one level at a time. These algorithms make it possible to find patterns (called graph patterns) as well as direct and indirect neighboring nodes. Other algorithms calculate the shortest path between two nodes and identify cliques (subsets of nodes that are all connected to one another) and hotspots (very strongly connected information). One of the strengths of a graph database is that relationships are stored in the database itself and therefore do not have to be computed at query time. This gives high performance even for complex queries.
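The two traversal orders and a shortest-path search can be sketched in Python as follows. The graph is a made-up adjacency list, and the shortest path is found with a breadth-first search, which assumes every edge counts the same; weighted graphs would need a different algorithm such as Dijkstra's.

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def depth_first(start):
    """Visit nodes by following one path as deep as possible before backtracking."""
    visited, stack = [], [start]
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.append(node)
            # Reversed so the first-listed neighbour is explored first.
            stack.extend(reversed(graph[node]))
    return visited


def breadth_first(start):
    """Visit nodes level by level, nearest neighbours first."""
    visited, queue = [start], deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
    return visited


def shortest_path(start, goal):
    """Fewest-edges path from start to goal, or None if goal is unreachable."""
    parents, queue = {start: None}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None
```

Starting from `"A"`, the depth-first version reaches `"E"` before it ever visits `"C"`, whereas the breadth-first version visits `"B"` and `"C"` before moving one level further.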

Advantages and disadvantages of graph-oriented databases

The strength of a database can mainly be measured against four criteria: integrity, performance, efficiency and scalability. The main goal of graph databases can be roughly summarized as making data queries faster and simpler. Where relational databases reach their performance limits, for example on queries that join across many relationships, the graph-based model stays agile. In this model, query cost depends mainly on the relationships that are traversed rather than on the total amount of data.

Moreover, real-world facts can be stored in a natural way with the graph-oriented database model. The structure is close to how people think about connections, which makes the links easy to grasp. However, these databases are not universal solutions. For example, they reach their limits when it comes to scalability: since they are primarily designed for a single-server architecture, growth is a challenge, because splitting a densely connected graph across several servers is a hard (mathematical) problem. Additionally, they do not yet share a standard query language.

Here is an overview of the advantages and disadvantages of graph databases:

| Advantages | Disadvantages |
| --- | --- |
| Query speed depends only on the number of concrete relationships and not on the amount of data | Poor scalability, because designed for a single-server architecture |
| Real-time results | |
| Clear and readable representation of relationships | |
| Flexible and agile structures | |

In principle, graph databases should not be seen as a systematic replacement for traditional relational databases. Relational databases remain reliable standard models, ensuring strong data integrity and stability, while allowing flexible scaling. As is often the case, it all depends on the desired goal!

Graph databases: comparison

There are several graph-oriented databases suitable for different use cases. Here are four popular products:

  • Neo4j: Neo4j is the most popular graph database, designed according to an open source model.
  • Amazon Neptune: This graph database is available on the Amazon Web Services public cloud and was released in 2018 as a high-performance database.
  • SAP HANA Graph: with SAP HANA, the software vendor SAP has created a platform based on a relational database management system, supplemented by the integrated graph-oriented model SAP HANA Graph.
  • OrientDB: this database combines document-oriented and graph-based approaches, and is regarded as one of the fastest systems of its kind.

In a direct comparison, these databases offer different features, which may be useful depending on the specific use case:

| | Neo4j | Amazon Neptune | SAP HANA Graph | OrientDB |
| --- | --- | --- | --- | --- |
| Type | Native | Managed (cloud) | Graph extension | Multi-model |
| Query languages | Cypher | SPARQL, Gremlin, openCypher | SQL-based | SQL-like, Gremlin |
| Data model(s) | Property graph | Property graph, RDF | Relational, graph model | Graphs, documents |
| Typical use cases | Social networks, fraud detection, recommendation services, network management | Knowledge graphs, identity and access management, cloud-native applications | Business analytics, IoT, financial analysis, SAP applications | Content management, complex relationships between data, distributed systems |
