Data Model

From SONIVIS:Wiki

The data model is the core concept of the SONIVIS:Tool for dealing with social network data. The persistence layer is the heart of the application, around which all other components are centered.

Goal

The idea of the data model is to represent all components of a social network as generically as possible. This means that any kind of social network should be representable in the data model, independent of its appearance and purpose. This led us to define the most generic components for the model, which can be borrowed from graph theory as well as from database modeling: node and edge, or entity and relation.

While this clearly has advantages, it also comes with some drawbacks. The more generic a model of reality is, the more that reality strikes back on other levels, e.g. in the runtime requirements when working with the data. While holding close to these principles, we tried to account for the possible drawbacks. Therefore, the model is not perfectly generic, but it is somewhat more user friendly, not just from a programmer's point of view but also from that of a human being.

So, the model moves up one level of abstraction and defines two types of entities and all possible relations between them, of which there are three. The entities represent what can be found in any social network: things that act (Actors) and things that are acted on (Content). The relations between entities of the same or different kind can be grouped as

  • knowledge relations (Actor <-> Content),
  • contextual relations (Content <-> Content), and
  • interactions (Actor <-> Actor).

These five items (2 entities and 3 relations) are still a very strong abstraction of reality and will not by themselves sufficiently cover all content of a social network. Therefore, each item has a type associated with it to provide finer granularity.
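As a minimal sketch of the grouping above, the relation group can be derived from the kinds of the two entities it connects. The enum names and the function relation_kind are illustrative assumptions, not part of the SONIVIS:Tool code base.

```python
from enum import Enum


class EntityKind(Enum):
    ACTOR = "Actor"
    CONTENT = "Content"


class RelationKind(Enum):
    KNOWLEDGE = "Knowledge"      # Actor <-> Content
    CONTEXT = "Context"          # Content <-> Content
    INTERACTION = "Interaction"  # Actor <-> Actor


def relation_kind(a: EntityKind, b: EntityKind) -> RelationKind:
    """Return the relation group implied by the kinds of the two endpoints."""
    kinds = {a, b}
    if kinds == {EntityKind.ACTOR}:
        return RelationKind.INTERACTION
    if kinds == {EntityKind.CONTENT}:
        return RelationKind.CONTEXT
    # One Actor and one Content, in either order.
    return RelationKind.KNOWLEDGE
```

Two entity kinds yield exactly three unordered pairings, which is why the model ends up with three relation groups.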

The SONIVIS:Tool is intended as an analysis tool for social networks and should be able to graphically display as many aspects of a network as one can think of. This comes down to two more requirements:

  • qualities and (their) quantities have to be represented in the model
  • a graphical abstraction must be included in the data model.

The first one is straightforward: properties can be attached to the entities as well as to the relations. The second one requires a very flexible interface between a graph representation and all the components of a social network. We guarantee this flexibility by dividing the model into a real world part and a graph part. The two parts are tied together by a central bridge between one abstract representative on each side. While the real world part was covered above, there is not very much to tell about the graph part: as expected, one will find nodes and edges. Like the entities and relations, the items of the graph part can carry properties.

A social network is almost like a living thing. As long as it exists there will be action and change. The dynamic nature of a social network must be represented in the data model. Each entity and relation can therefore be associated with a timestamp.

Components of the Model

InfoSpace

The InfoSpace represents a social network. It contains both the social network's real world part as well as its graph part. It has a uniquely identifying name and a fully qualified name (fqn).

InfoSpaceItem

All entities and all relations of an InfoSpace are InfoSpaceItems. It is their generalization. The InfoSpaceItem is the abstract representative of items of the real world part and the bridgehead into the graph part. Each InfoSpaceItem is unique within its InfoSpace. Therefore it has an ID. It also has a collection of Properties associated with the item and a collection of GraphItems that represent it.
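The following sketch illustrates how an InfoSpace might hand out IDs so that every InfoSpaceItem is unique within it. The field names, the integer counter, and the method new_item are assumptions made for illustration; the actual persistence layer may assign IDs differently.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class InfoSpaceItem:
    item_id: int
    properties: dict = field(default_factory=dict)   # Properties attached to the item
    graph_items: list = field(default_factory=list)  # GraphItems that represent it


@dataclass
class InfoSpace:
    name: str
    items: dict = field(default_factory=dict)
    # Hypothetical ID source: a per-InfoSpace counter starting at 1.
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def new_item(self) -> InfoSpaceItem:
        """Create an item whose ID is unique within this InfoSpace."""
        item = InfoSpaceItem(item_id=next(self._ids))
        self.items[item.item_id] = item
        return item
```

The graph_items list is the bridgehead mentioned above: it is the only place where the real world part refers to the graph part.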

InfoSpaceItemType

This is a helper item that allows for a finer-grained distinction between InfoSpaceItems than the five general types given. Each InfoSpaceItem is associated with an InfoSpaceItemType. It is given by a unique identifying string (and an associated ID).

ContentElement

The InfoSpaceItem ContentElement covers all social network components that carry some form of generated content. One might think of the text of an e-mail or a comment in a blog. A ContentElement has a unique ID because it is an InfoSpaceItem and, thus, belongs to an existing InfoSpace. It may keep an external ID from the source it was extracted from. It must be of a certain InfoSpaceItemType, is required to have a creation date, and may have a name (or title), which does not need to be unique. A ContentElement can have textual content but does not have to.

Actor

The InfoSpaceItem Actor is quite similar to a ContentElement. It has a unique ID, an optional external ID, belongs to an existing InfoSpace, is of a certain InfoSpaceItemType, must have a name and a registration date.
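The required and optional fields of ContentElement and Actor described above could be expressed roughly as follows. This is a sketch only: the attribute names, the use of plain strings for the InfoSpaceItemType, and the example type names are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ContentElement:
    item_id: int                       # unique within the InfoSpace
    item_type: str                     # name of the InfoSpaceItemType (required)
    created: datetime                  # creation date (required)
    external_id: Optional[str] = None  # ID in the source system, if kept
    name: Optional[str] = None         # name or title, not necessarily unique
    text: Optional[str] = None         # textual content, may be absent


@dataclass
class Actor:
    item_id: int                       # unique within the InfoSpace
    item_type: str                     # name of the InfoSpaceItemType (required)
    name: str                          # required for Actors
    registered: datetime               # registration date (required)
    external_id: Optional[str] = None  # ID in the source system, if kept


# Hypothetical example data.
mail = ContentElement(item_id=1, item_type="e-mail", created=datetime(2009, 1, 1))
alice = Actor(item_id=2, item_type="user", name="alice", registered=datetime(2009, 1, 1))
```

The main difference between the two is visible in the defaults: a name is optional for a ContentElement but mandatory for an Actor.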

Context

The InfoSpaceItem Context is meant to describe a relationship between two ContentElements. As an InfoSpaceItem it belongs to an existing InfoSpace where it has a unique ID. It has a source and a target ContentElement, although this does not generally imply that the relationship is directed. Since one can think of different forms of contextual relationships, e.g. contains, links-to, is_contained, an InfoSpaceItemType must be associated with each Context. (The type might actually tell whether the relation is directed or not.)

Interaction

The InfoSpaceItem Interaction is the abstraction of an Actor to Actor relationship in a social network. As an InfoSpaceItem it belongs to an existing InfoSpace where it has a unique ID. It has a source and a target Actor, although this does not generally imply that the relationship is directed. There is also a field to optionally hold a reference to a related ContentElement. Since one can think of different forms of interaction relationships, e.g. knows, likes, is_parent_of, an InfoSpaceItemType must be associated with each Interaction. (The type might actually tell whether the relation is directed or not.)
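One way to read the remark on direction is that the InfoSpaceItemType itself records whether relations of that type are directed. The sketch below follows that reading; the directed flag, the use of plain strings for Actors, and the method connects are assumptions and are not confirmed by the model description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InfoSpaceItemType:
    name: str
    directed: bool  # assumption: the type decides whether direction matters


@dataclass(frozen=True)
class Interaction:
    item_id: int
    source: str                        # Actor, reduced to a name for brevity
    target: str
    item_type: InfoSpaceItemType
    content_ref: Optional[int] = None  # optional related ContentElement ID

    def connects(self, a: str, b: str) -> bool:
        """True if this interaction leads from a to b under its type's direction rule."""
        if (self.source, self.target) == (a, b):
            return True
        # Undirected types also match with the endpoints swapped.
        return not self.item_type.directed and (self.source, self.target) == (b, a)


knows = InfoSpaceItemType("knows", directed=False)
is_parent_of = InfoSpaceItemType("is_parent_of", directed=True)
```

With these definitions, an Interaction of type knows between alice and bob matches in both directions, while one of type is_parent_of matches only from source to target.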

Knowledge

If an Actor is related to a ContentElement, he/she has Knowledge of it. As an InfoSpaceItem, Knowledge belongs to an existing InfoSpace where it has a unique ID. It has an Actor and a ContentElement. Since one can think of different kinds of knowledge relationships, e.g. created, read, an InfoSpaceItemType must be associated with each Knowledge.

GraphItem

The GraphItem is the counterpart of the InfoSpaceItem in the graph part of the model. Nodes, Edges, and Graphs are the more concrete GraphItems. The GraphItem provides a unique ID and a collection of Properties for its descendants. Since it is always related to exactly one InfoSpaceItem, it is also related to the corresponding InfoSpace.

Node

The GraphItem Node does not add anything to a GraphItem except the information that it is a special type of GraphItem, a node.

Edge

An Edge is a slightly more complex GraphItem than a Node. It contains the Nodes it connects and, where available, information on its direction.

Graph

A Graph is a GraphItem that is made up of a collection of Nodes and the Edges between these. Optionally it may be given an identifying name or title.
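The graph part described in this and the two preceding sections can be sketched as three small classes. The attribute names and the helper add_edge are illustrative assumptions; Properties and the link back to the InfoSpaceItem are left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    item_id: int  # a Node adds nothing beyond being a GraphItem


@dataclass(frozen=True)
class Edge:
    item_id: int
    source: Node
    target: Node
    directed: bool = False  # optional direction information


@dataclass
class Graph:
    item_id: int
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)
    name: Optional[str] = None  # optional identifying name or title

    def add_edge(self, edge: Edge) -> None:
        """Add an edge and make sure both of its endpoints are in the graph."""
        self.nodes.update((edge.source, edge.target))
        self.edges.append(edge)


graph = Graph(item_id=1, name="example")
a, b = Node(2), Node(3)
graph.add_edge(Edge(item_id=4, source=a, target=b))
```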

Property

A Property is anything that describes an item in an InfoSpace. It contains a unique ID, a non-unique textual identifier, a value field, and a fully qualified name matching the type of the value stored in that field. We believe this is a very flexible way to store different types of data associated with an item. It comes in two flavours, InfoSpaceItemProperty and GraphItemProperty.
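To illustrate the idea of storing a value together with the fully qualified name of its type, here is a minimal sketch. It uses Python type names where the SONIVIS:Tool would use its own; the class layout and the factory method of are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Property:
    prop_id: int     # unique ID
    name: str        # non-unique textual identifier
    value_type: str  # fully qualified name of the value's type
    value: object

    @classmethod
    def of(cls, prop_id: int, name: str, value: object) -> "Property":
        """Build a Property and record the fully qualified type name of value."""
        t = type(value)
        return cls(prop_id, name, f"{t.__module__}.{t.__qualname__}", value)


degree = Property.of(1, "degree", 3)
label = Property.of(2, "label", "hub")
```

Keeping the type name next to the value lets a single Property structure hold numbers, strings, or other data while still allowing a reader of the data to interpret the value correctly.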

InfoSpaceItemProperty

The InfoSpaceItemProperty is a Property related to some InfoSpaceItem.

GraphItemProperty

The GraphItemProperty is a Property related to some GraphItem.
