ProductAbbrevName: EDG; ProductFullName: Enterprise Data Governance; ModelTypes: Reference Datasets; ModelType: Reference Dataset; AorAnModel: a

Overview of Reference Datasets

Reference datasets contain standardized data or codes, which typically are used by various applications as lists or tables. In fact, they are often called "code tables." An individual code table may seem like a simple thing, but a well-managed collection of code tables and related reference data spread across an enterprise is a resource that can bring great value to that enterprise, or cause great problems if it is not well maintained. EDG lets you control your reference data so that you can put it to work for you as efficiently as possible.

EDG datasets are much more than just flat code tables. Reference data in different datasets can have relationships. For example, because currencies are associated with countries, currency codes have a relationship (connection) to country codes. Reference datasets can also model structural relationships in data, such as hierarchies of industrial categories, locations, or product types. You can also capture any additional information you need about each code. Finally, reference datasets themselves carry rich metadata, such as the source of a dataset, how it is managed, where it is being used, and the meaning of each data field.
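As a rough illustration of the two kinds of relationships described above (a connection between code tables and a hierarchy within one table), the following Python sketch models them in memory. All class names, field names, and category codes here are hypothetical and chosen for illustration only; they are not part of EDG, which manages this information as ontology-backed assets rather than Python objects.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Currency:
    code: str
    name: str


@dataclass(frozen=True)
class Country:
    code: str
    name: str
    currency_code: str  # relationship: refers to a row in the currency table


@dataclass(frozen=True)
class Category:
    code: str
    label: str
    parent_code: Optional[str] = None  # structural relationship (hierarchy)


# Two small code tables, keyed by code (sample rows only).
CURRENCIES: Dict[str, Currency] = {
    c.code: c for c in (Currency("USD", "US Dollar"), Currency("EUR", "Euro"))
}
COUNTRIES: Dict[str, Country] = {
    c.code: c
    for c in (
        Country("US", "United States", "USD"),
        Country("DE", "Germany", "EUR"),
        Country("FR", "France", "EUR"),
    )
}

# A made-up industrial category hierarchy (codes are hypothetical).
CATEGORIES: Dict[str, Category] = {
    c.code: c
    for c in (
        Category("MFG", "Manufacturing"),
        Category("MFG-FOOD", "Food manufacturing", parent_code="MFG"),
        Category("MFG-FOOD-BAK", "Bakeries", parent_code="MFG-FOOD"),
    )
}


def currency_of(country_code: str) -> Currency:
    """Follow the country -> currency relationship."""
    return CURRENCIES[COUNTRIES[country_code].currency_code]


def countries_using(currency_code: str) -> List[str]:
    """Follow the same relationship in the other direction."""
    return sorted(
        c.code for c in COUNTRIES.values() if c.currency_code == currency_code
    )


def ancestors(category_code: str) -> List[str]:
    """Return parent codes from the nearest parent up to the root."""
    chain: List[str] = []
    parent = CATEGORIES[category_code].parent_code
    while parent is not None:
        chain.append(parent)
        parent = CATEGORIES[parent].parent_code
    return chain
```

For instance, `currency_of("DE")` resolves to the EUR row, and `ancestors("MFG-FOOD-BAK")` walks up through "MFG-FOOD" to "MFG". The point of the sketch is only that a flat list of codes cannot answer such questions, whereas related and hierarchical reference data can.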

The tabular editors of EDG collections (for searching, viewing, and editing assets) require the underlying schema to be backed by SHACL. To migrate a collection's included ontologies to a SHACL basis, see: Ontology Utilities > Convert OWL Axioms to SHACL Constraints.

Reference datasets no longer allow classes without a primary key property to be used as their main entity.

For additional perspectives and details on reference data management and related topics, see these TopQuadrant whitepapers.

Reference datasets are used with ontologies, which define the data schema (classes, properties, relationships, constraints) of the reference dataset items. For example, you might define a class (or entity) called Gender in an ontology and then, in a reference dataset that uses this ontology, enter the values Male and Female as instances of this class. Ontologies thus define the data attributes for each entity and the relationships between entities.
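The division of labor between an ontology (schema) and a reference dataset (instances) can be sketched in a few lines of Python. This is only an analogy under stated assumptions: the `ClassDef` and `PropertyDef` types, the property names `code`, `label`, and `definition`, and the `validate` helper are all hypothetical, and EDG expresses such schemas as ontologies with SHACL constraints rather than as Python structures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PropertyDef:
    name: str
    required: bool = False


@dataclass(frozen=True)
class ClassDef:
    name: str
    properties: Tuple[PropertyDef, ...]


# "Ontology" side: the schema only, with no data in it.
GENDER = ClassDef(
    "Gender",
    (
        PropertyDef("code", required=True),
        PropertyDef("label", required=True),
        PropertyDef("definition"),
    ),
)

# "Reference dataset" side: instances of the class defined above.
GENDER_DATASET: List[Dict[str, str]] = [
    {"code": "M", "label": "Male"},
    {"code": "F", "label": "Female"},
]


def validate(instance: Dict[str, str], class_def: ClassDef) -> List[str]:
    """Return a list of problems; an empty list means the instance conforms."""
    problems: List[str] = []
    allowed = {p.name for p in class_def.properties}
    # Required properties are checked first, in schema order.
    for prop in class_def.properties:
        if prop.required and prop.name not in instance:
            problems.append(f"missing required property: {prop.name}")
    # Then any property the schema does not define is reported.
    for key in instance:
        if key not in allowed:
            problems.append(f"unknown property: {key}")
    return problems
```

Both sample rows conform to the hypothetical Gender schema, while a row such as `{"label": "Unknown", "color": "x"}` would be reported as missing `code` and carrying an undefined `color` property.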

TopBraid EDG makes it possible for you to:

  • Reduce independent maintenance of code tables: If different departments use the same code table, they may be maintaining individual copies of it on spreadsheets being emailed around to each other. When they all use the same copy, changes are coordinated, and they can be confident that they're using the right codes.

  • Reduce data quality problems due to coding errors: Workers who don't have access to recent, correct codes can't always enter the proper values, and improper values can lead to lost revenue.

  • Reduce the cost of designing code tables for databases: When new code tables have similarities or other relationships to other tables, these relationships can be leveraged in the design of the new tables. Well-organized, searchable metadata about which applications use which code tables also makes it easier to coordinate new and legacy tables.

  • Reduce data integration issues due to inconsistent codes: The inconsistencies caused by maintaining multiple copies of the same code tables, or by using copies that were updated at different times, can lead to problems when combining datasets that reference these tables. Consistent tables mean easier data integration.

  • Make informed decisions based on code table data: Code table entries are often cryptic abbreviations, leaving people to guess at what they mean and when each one is appropriate. Metadata such as definitions and provenance information helps people use the right codes in the right places.

The Create dialog box asks for the Reference Dataset's Label (name) and, optionally, a Description.

Create New Reference Dataset

This creates a new Reference Dataset with yourself as the manager.

If Search the EDG uses Lucene indexing (the default option), the creation form offers an option to add this collection to the index. This has the same effect as selecting the collection in the Search the EDG configuration with the default property selectors.

The ontology for the main entity (ME) class

 

Each reference dataset needs an ontology class to act as its main entity, which will be the class of the dataset's reference instances. From the existing ontologies listed for Ontology to Include, select the ontology that contains the class to be used as the new main entity.

After submitting the creation form, the ME class itself can be designated either via (1) the dataset's utilities: Settings > Metadata > Edit > Overview > main entity (class) drop-down selection, or via (2) a form prompt that appears when the dataset is first edited.

The primary key of the ME class

The main entity class must have exactly one property designated as the primary key, which could be inherited. If the class lacks such a designated property, then it (or a superclass) must be edited to define the primary key (see Ontology View or Edit: Setting a primary key for a class).
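The primary key rule (exactly one designated property, possibly inherited from a superclass) can be expressed as a small check. The sketch below is an assumption-laden illustration, not EDG's implementation: `OntologyClass`, `effective_primary_keys`, and `can_be_main_entity` are hypothetical names, and the sketch only follows a single-superclass chain. How EDG handles multiple superclasses or conflicting designations is not covered by the text above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OntologyClass:
    name: str
    # Properties designated as primary key directly on this class.
    primary_keys: Tuple[str, ...] = ()
    # Single superclass only; a simplification made for this sketch.
    superclass: Optional["OntologyClass"] = None


def effective_primary_keys(cls: OntologyClass) -> List[str]:
    """Collect primary key designations from the class and its superclasses."""
    keys: List[str] = []
    current: Optional[OntologyClass] = cls
    while current is not None:
        keys.extend(current.primary_keys)
        current = current.superclass
    return keys


def can_be_main_entity(cls: OntologyClass) -> bool:
    """True when exactly one primary key is designated, own or inherited."""
    return len(effective_primary_keys(cls)) == 1


# Hypothetical classes: CountryCode inherits its key from Code; Note has none.
code = OntologyClass("Code", primary_keys=("codeValue",))
country_code = OntologyClass("CountryCode", superclass=code)
note = OntologyClass("Note")
```

Under these assumptions, `can_be_main_entity(country_code)` holds because the key is inherited from `Code`, while `can_be_main_entity(note)` does not, which corresponds to the case where the class or a superclass must first be edited to define a primary key.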