feat: refresh catalog docs

allan
2025-07-15 14:45:39 +01:00
parent 4c45779312
commit 2ec7d35342
8 changed files with 87 additions and 99 deletions


@@ -1,7 +1,25 @@
# Data Controller for SAS: Data Catalog
Data Controller collects information about the size and shape of libraries, tables, columns, catalogs, and objects. The Catalog does not contain information about the data content (values).
---
layout: article
title: DC Data Catalog
description: Catalog the Libraries, Tables, Columns, SAS Catalogs, and associated Objects in your SAS estate
og_title: DC Data Catalog Documentation
og_image: /img/catalog.png
---
The catalog is based primarily on the existing SAS dictionary tables, augmented with attributes such as primary key fields, filesize / libsize, and number of observations (e.g. for database tables).
# DC Data Catalog
In any SAS estate, it's unlikely the size & shape of data will remain static. By running a regular Catalog Scan, you can track changes such as:
- Library Properties (size, schema, path, number of tables)
- Table Properties (size, number of columns, primary keys)
- Variable Properties (presence in a primary key, constraints, position in the dataset)
- SAS Catalog Properties (number of entries, created / modified datetimes)
- SAS Catalog Object properties (entry name, type, description, created / modified datetimes)
The data is stored using SCD2 (Slowly Changing Dimension Type 2), so you can **track changes to your model over time**! Curious when that new column appeared? Just check the history in [MPE_DATACATALOG_TABS](/tables/mpe_datacatalog_tabs).
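To illustrate how SCD2 history answers a "when did this change?" question, here is a minimal Python sketch. It is not taken from Data Controller: the row layout, the column names (`var_count`, `valid_from`, `valid_to`), the sample values, and the high-date marker are all assumptions made for the example, so check the table definitions for the real columns.

```python
from datetime import datetime

# Hypothetical marker for "this version is still current".
HIGH_DATE = datetime(9999, 12, 31)

# Hypothetical SCD2 rows for a single table. Every change closes the old
# version (valid_to) and opens a new one (valid_from).
history = [
    {"libref": "FINANCE", "dsn": "LEDGER", "var_count": 10,
     "valid_from": datetime(2025, 1, 6), "valid_to": datetime(2025, 3, 3)},
    {"libref": "FINANCE", "dsn": "LEDGER", "var_count": 11,
     "valid_from": datetime(2025, 3, 3), "valid_to": HIGH_DATE},
]


def as_at(rows, when):
    """Return the version that was valid at `when` (half-open interval)."""
    for row in rows:
        if row["valid_from"] <= when < row["valid_to"]:
            return row
    return None


def first_change(rows, field):
    """Return the start time of the first version where `field` changed."""
    ordered = sorted(rows, key=lambda r: r["valid_from"])
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[field] != cur[field]:
            return cur["valid_from"]
    return None


print(as_at(history, datetime(2025, 2, 1))["var_count"])
print(first_change(history, "var_count"))
```

Because each version carries its own validity window, the same rows support both point-in-time lookups and change detection without a separate audit log.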
The Catalog does **not** contain information about the data content (values). It is based primarily on the existing SAS dictionary tables, augmented with attributes such as primary key fields, filesize / libsize, and number of observations (e.g. for database tables).
Frequently changing attributes (such as the number of observations and size) are stored in the MPE_DATASTATUS_XXX tables. The rest is stored in the MPE_DATACATALOG_XXX tables.
@@ -35,3 +53,34 @@ The following assumptions are made:
If you have duplicate librefs, specific table security setups, or sensitive models - contact us.
## Refreshing the Data Catalog
The update process for INDIVIDUAL libraries can be run by any user, and is performed in the VIEW menu by expanding a library definition and clicking the refresh icon next to the library name.
![](./img/catalogrefresh.png)
Members of the admin group may run the refresh process for ALL libraries by clicking the REFRESH button on the System page.
When doing a full scan, the following LIBREFs are ignored:
* 'CASUSER'
* 'MAPSGFK'
* 'SASUSER'
* 'SASWORK'
* 'STPSAMP'
* 'TEMP'
* 'WORK'
Additional LIBREFs can be excluded by adding them to the `DCXXXX.MPE_CONFIG` table (where `var_scope='DC_CATALOG' and var_name='DC_IGNORELIBS'`). Use a pipe (`|`) symbol to separate them. This can be useful where there are connection issues for a particular library.
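The exclusion logic can be pictured with a short Python sketch. This is an illustration only, not the code Data Controller runs: the function names and sample LIBREFs are hypothetical, and the upper-casing and whitespace trimming are assumptions rather than documented behaviour.

```python
# LIBREFs skipped by a full scan, as listed above.
DEFAULT_IGNORED = {
    "CASUSER", "MAPSGFK", "SASUSER", "SASWORK", "STPSAMP", "TEMP", "WORK",
}


def parse_ignorelibs(var_value):
    """Split a pipe-delimited DC_IGNORELIBS value into upper-cased LIBREFs."""
    # Empty segments (e.g. from a trailing pipe) are dropped.
    return {part.strip().upper() for part in var_value.split("|") if part.strip()}


def libs_to_scan(all_librefs, var_value=""):
    """Return the LIBREFs left after applying both ignore lists."""
    ignored = DEFAULT_IGNORED | parse_ignorelibs(var_value)
    return [lib for lib in all_librefs if lib.upper() not in ignored]


# ORALIB and STAGING stand in for libraries with connection issues.
print(libs_to_scan(["SASHELP", "WORK", "ORALIB", "FINANCE"], "ORALIB|STAGING"))
# prints ['SASHELP', 'FINANCE']
```

Here `WORK` is dropped by the built-in list and `ORALIB` by the configured value, leaving only the libraries that would be scanned.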
Be aware that the scan process can take a long time if you have a lot of tables!
Output tables (all SCD2):
* [MPE_DATACATALOG_CATS](/tables/mpe_datacatalog_cats) - SAS Catalog list
* [MPE_DATACATALOG_LIBS](/tables/mpe_datacatalog_libs) - Library attributes
* [MPE_DATACATALOG_TABS](/tables/mpe_datacatalog_tabs) - Table attributes
* [MPE_DATACATALOG_VARS](/tables/mpe_datacatalog_vars) - Column attributes
* [MPE_DATASTATUS_LIBS](/tables/mpe_datastatus_libs) - Frequently changing library attributes (such as size & number of tables)
* [MPE_DATASTATUS_TABS](/tables/mpe_datastatus_tabs) - Frequently changing table attributes (such as size & number of rows)