Download all tutorial files

Introduction

xMart supports foreign key relationships between tables. A foreign key relationship enforces data integrity by limiting the values in a column in one table (the “fact” or “data” table) to the values in another table (the “foreign key”, “reference”, “master” or “lookup” table).

A foreign key relationship is created like in the picture below:

In the picture, you are seeing a part of the model file that you’ll use later in the example of this tutorial.

You can see that in the SHANOLA_CASE_COUNTS table, we are adding a field called COUNTRY_FK which references the REF_COUNTRY table. We provide the code of the fact table (SHANOLA_CASE_COUNTS), the code of the field in the fact table (COUNTRY_FK), the data type of the field in the fact table (FOREIGN_KEY) and finally the code of the foreign key table (REF_COUNTRY).

Note that the field in the foreign key table is not mentioned. This is because, in xMart, the field in the foreign key table is configurable. To define the field in the foreign key table, set the “Is Row Title” property to true on the field in the foreign key table.

PUT PICTURE HERE OF IS_ROW_TITLE IN FK TABLE

Internally, xMart creates the true, physical foreign key relationship using a system ID field (Sys_ID). In other words, the Sys_ID field from the foreign key table is stored in the fact table.

The field in the foreign key table that is defined as the “row title” is what will be displayed when viewing the fact table (the internal Sys_ID values would not be shown). This is also what is exported and also, when importing, what is used to match values in the foreign key table.

We will go through some examples to make it more clear.

Examples of Foreign Key

General Info

You are provided with a dataset (Shanola Cases), a pipeline, the model and the countries table which populates REF_COUNTRY_T7.

In the folder T7_ForeignKeys (you may download this by clicking the link at the top of this page) there will be all the files you need. The files in the main folder are the starting files, and for each of the subfolders you will receive instructions on what to do in each of the cases.

The purposes of this example are:

A) To demonstrate how to create a Foreign Key, how to perform the lookup in a pipeline and how to re-populate the REF tables if needed, after you’ve received a missing value error.

B) To explain the different ways to perform a lookup in a pipeline according to the different scenarios based on which column is set as SysRowTitle, which is the deciding factor for the approach we are going to use.

In general, there are two ways,the first, is to ask the system to produce automatically the lookup, which will be explained in case 2, but has some limitations which we will show how the user can write manually the lookup which overcomes them.

Note: We strongly suggest to follow the following examples in the provided sequence, in order for the provided files to work without any adjustments needed from your end.

Model

Below, is the base model we are going to use for all the cases we will go through. Of course, there will be adaptations which will be explained in each section.

In order to create a Foreign Key in xMart, you need to process your mart’s model file. In which you have to add a field in the source table, in this case SHANOLA_CASE_COUNTS which will reference the REF table, in this case REF_COUNTRY.

Afterwards you need to set the Field Data Type to FOREIGN_KEY and the Lookup Table to REF_COUNTRY_T7.

Note: Due to the naming convention the Foreign Key field, has to follow the following format ABCD_FK, where ABCD is the name you desire to use, followed by _FK, all in capital letters e.g. COUNTRY_FK.

Provided Dataset

Here is the base dataset we will use in this example. If there are any changes for each of the cases below, we will highlight them.

As you may notice, row 8, is in red font, because ISO3 XYZ doesn’t exist in the REF_COUNTRY for the purposes of the example.

Note: Each case will be a build up upon the previous one.

Case 1 - Add Missing Values

In this case we will go through how to add a missing value in a REF table. First you need to upload the model, populate the REF_COUNTRY_T7 using the coultries.xlsx and transfer the pipeline 1, place in the subfolder “Case 1” in your mart’s pipelines.

Model and Dataset

We will use the base versions for this case, stored inside T7_ForeignKeys.

Pipeline

This is the pipeline for this example. We are going to refer to this pipeline as LOAD_CASE1, so you can use this name as the code when you’ll create it in the mart.

<XmartPipeline IsStructure="false">
  <Extract>
    <GetExcel TableName="Cases" FindStartingRow="iso3, suspected, confirmed" OutputTableName="data" />
  </Extract>
  <Transform>
    <CleanTable RemoveBlankRows="true" TrimValues="true" RemoveBlankColumns="false" />
      <!-- The Renaming Commands will be explained in Case 2-->
    <RenameColumn From="ISO3" To="COUNTRY_FK" /> 
	<RenameColumn From="Date" To="REPORT_DATE" /> 
  </Transform>
  <Load>
    <LoadTable SourceTable="data" TargetTable="SHANOLA_CASE_COUNTS" LoadStrategy="MERGE">
        <!-- The <ColumnMappings Auto="true"> will be explained in Case 2-->
		<ColumnMappings Auto="true">
        </ColumnMappings>
    </LoadTable>
  </Load>
</XmartPipeline>

Result

After loading the Shanola Cases.xlsx using the LOAD_CASE1 you’ll see the following

As you will notice the Issues tab is clickable, so there are issues you need to go through, before committing.

In this case there is a missing value in the reference table, therefore the system will produce an error after the staging part is concluded. Furthermore, in the error page, you’ll be given the option to add the missing value and afterwards to re-process the file.

In this example, as it was pointed out earlier, in the cases file there is one row for the country with XYZ as ISO3 code. This country doesn’t exist in the countries file and therefore in the REF_COUNTRY. So, when you’ll try to run the pipeline above you’ll receive a missing value error for this row.

As you can notice, under the Resolution Column, you have the option to add XYZ to the REF table, where by clicking the plus symbol, you are prompted to add the required information.

After you’ve successfully added XYZ to the REF table, then you may proceed to reprocess the file by clicking Cancel and reprocess.

Case 2 - Successful Automated Lookup

One of the easiest ways in order to load a source file using an origin in xMart, is to have the source column names the same as the target columns in the way they are in the model. Obviously, it is not efficient to change the actual names in the source, but we can use the RenameColumn command to achieve this, either by transforming the incoming data Globally or locally, as we’ve seen in tutorial 5 and 6.

Model and Dataset

You will not need to make any changes everything remains the same.

Pipeline

In Case 1, we went through the pipeline LOAD_CASE1, where we can notice the RenameColumn and ColumnMappings commands. We will now go through this pipeline in greater details.

With the RenameColumn we are renaming the columns ISO3 and Date like they are in the source file (Shanola Cases) to COUNTRY_FK and REPORT_DATE like the target columns are in the model. If you compare the model to the Shanola Cases file, you’ll notice that all the other columns have the same name as the equivelant in the model, so no further action is needed.

The ColumnMappings which is set to Auto=true (ColumnMappings Auto=”true”), is used to dictate to the system to perform an automated mapping, which means that all the data for between the source and target columns that share the same name will be loaded in the target table.

The above functionality is quite useful to perform a lookup to a REF table. In this case, the source ISO3 columns corresponds to the COUNTRY_FK which is a foreign key to the SHANOLA_CASE_COUNTS table. By renaming the ISO3 to COUNTRY_FK and use the automated column mappings, the system will produce automatically the necessary code to perform a System ID Lookup (SysIDLookup) which checks if the incoming data (ISO3) are valid according to the REF table (REF_COUNTRY_T7). That’s why in Case 1 the system was able to detect that country XYZ didn’t exist, which informs us that either this isn’t a valid name so we need to correct our source data, or we need to add the missing value in the REF table.

One key thing, for this approach to work is the RowTitle. If you take a look at the model, you will notice that we have set as RowTitle the ISO3 column in the REF_COUNTRY_T7. This means that the system checks if the actual data in the source ISO3 (Shanola Cases) exist in the ISO3 column of the REF table. If the data are the same, then no errors will be produced. But in case that the data can’t be found errors will be produced like we saw in Case 1. These errors is possible to be fixed like we saw in Case 1, where there was a missing value, but there might be a need to perform a manual lookup for reasons we will explain in Case 3 and Case 4.

Load (Debug)

A useful tool, especially when we are using for the first time a pipeline, is the Load Debug tool.

This is the pipeline we are using, LOAD_CASE1

At the next picture, you’ll see the code that is actually executed by the system because we took advantage of what we mentioned before.

From line 20 through line 22 the system automatically produced the code to perform the Lookup, and from line 24 to 31, the system automatically mapped all the source columns to their targets. So, as you may realized, the user/ creator of the pipeline may save lot’s of time by following this solution.

For more info on the Load Debug please click here

Case 3 - Failed Automated Lookup

In this case we will go through a failed automated lookup, which will be triggered because the RowTitle will be set to a different column in the Ref table than the one before. This is a common real world scenario when for example when the incoming data are ISO3 codes like “ABC” and what we want to end up showing in our target table is for example the Title of the country “ABC Country”.

Before we may start, please open the subfolder Case 3 under T7_ForeignKeys and upload the model_2, populate the REF_COUNTRY_T7b by using the coultries_2.xlsx and the pipeline2, which we will call LOAD_CASE3 from this point onwards.

Model

Below is the model_2. You may notice that except the name change in the tables, which happens in order not to replace the existing ones, in the REF table, we’ve set the RowTitle to the TITLE column, which contains the full name of each country.

Pipeline

Below is the LOAD_CASE3 pipeline. It’s almost the same as LOAD_CASE1, but adapted to the new target table.

<XmartPipeline IsStructure="false">
  <Extract>
    <GetExcel TableName="Cases" FindStartingRow="iso3, suspected, confirmed" OutputTableName="data" />
  </Extract>
  <Transform>
    <CleanTable RemoveBlankRows="true" TrimValues="true" RemoveBlankColumns="false" />
    <RenameColumn From="ISO3" To="COUNTRY_FK" /> 
	<RenameColumn From="Date" To="REPORT_DATE" /> 
  </Transform>
  <Load>
    <LoadTable SourceTable="data" TargetTable="SHANOLA_CASE_COUNTS2" LoadStrategy="MERGE">
		<ColumnMappings Auto="true">
        </ColumnMappings>
    </LoadTable>
  </Load>
</XmartPipeline>

Result

In the result page we are seeing something interesting. There aren’t any rows about to be loaded, and the Issues tab is clickable.

In the Issues tab we are receiving a similar error to the one of Case 1, when country XYZ was missing. But in this case all the country ISO3 codes are in the REF_COUNTRY_T7b table, which indicates that the solution is not to add again the “missing “values in there. The system due to the automated lookup it tries to match the data from the ISO3 column in the source file to the column that is set as RowTitle. In case 3, as mentioned, the RowTitle is set to the TITLE column in the REF table, which contains the full name of each country and not the ISO3 code, which leads to these errors.

This situation may be resolved by performing a manual SysIDLookup similar to the one we saw the system automatically producing in case 2. We will go into details in Case 4.

Case 4 - SysIDLookup

In this case we will resolve the issues in Case 3 by performing a manual lookup, SysIDLookup and we are going to explain how it works and how you may use it.

Model and Dataset

You will not need to make any changes everything remains the same.

Pipeline

Below is the pipeline we are going to use for this case, which will be called LOAD_CASE4.

You may notice, that we have removed the RenameColumn command for the source ISO3 and added the following:

  1. LookupIDs within the LoadTable
  2. A ColumnMapping command within ColumnMappings.
<XmartPipeline IsStructure="false">
  <Extract>
    <GetExcel TableName="Cases" FindStartingRow="iso3, suspected, confirmed" OutputTableName="data" />
  </Extract>
  <Transform>
    <CleanTable RemoveBlankRows="true" TrimValues="true" RemoveBlankColumns="false" />
    <RenameColumn From="Date" To="REPORT_DATE" /> 
  </Transform>
  <Load>
    <LoadTable SourceTable="data" TargetTable="SHANOLA_CASE_COUNTS2" LoadStrategy="MERGE">
     <LookupIDs>
        <SysIDLookup SourceResultColumn="COUNTRY_FK__Sys_ID" SourceColumns="ISO3" LookupTable="REF_COUNTRY_T7b" LookupColumns="ISO3" />
      </LookupIDs>
		<ColumnMappings Auto="true">
         <ColumnMapping Source="COUNTRY_FK__Sys_ID" Target="COUNTRY_FK" /> 
        </ColumnMappings>
    </LoadTable>
  </Load>
</XmartPipeline>

Performing the Lookup

LookupIDs:

  • LookupTable: REF_COUNTRY_T7 (The reference table)
  • LookupColumns: ISO3 (The column which contains the values in the REF table)
  • SourceColumns: ISO3 (The column in the source file, Cases)
  • SourceResultColumn: COUNTRY_FK__Sys_ID (It holds temporary the result of the lookup)

ColumnMapping

The next step, is to target the result to the field of our target table. So inside the column mappings we map the COUNTRY_FK__Sys_ID to the COUNTRY_FK, which is the name we previously used in the model.

Important note: For the name of the SourceResultColumn we use the name of the Foreign Key, followed by the __Sys_ID

Result

In the Preview Results page if we re upload the Shanola counts file using LOAD_CASE4, we will see that all the data are successfully loaded. On top of that, under the COUNTRY_FK column, we will also notice a difference from Case 2. Now we see the full name of each country and not the ISO3 code, this is due to the fact that the RowTitle is set to the TITLE column in the REF_COUNTRY_T7b.