What JOIN clause do I need?
Here are the various JOIN types to return the data you need from 2 tables.
Why is my LEFT JOIN only returning data from both tables?
All of the tables used are available in REFMART
A fairly common issue is that a LEFT JOIN will only return data that appears in both tables. This is due to the WHERE clause having a condition which appears in the LEFT JOINed table
Imagine a screnario where we want to extract a full list of country names and only fill in the currency if it is Euro
Country_ISO3 | Country_Name | Currency_ISO3 | Currency_Name |
---|---|---|---|
AGO | Angola | NULL | NULL |
AUS | Australia | NULL | NULL |
AND | Andorra | EUR | Euro |
AZE | Azerbaijan | NULL | NULL |
AUT | Austria | EUR | Euro |
ARM | Armenia | NULL | NULL |
ALB | Albania | NULL | NULL |
ATF | French Southern Territories | EUR | Euro |
ATG | Antigua and Barbuda | NULL | NULL |
ABW | Aruba | NULL | NULL |
AIA | Anguilla | NULL | NULL |
AFG | Afghanistan | NULL | NULL |
ASM | American Samoa | NULL | NULL |
ARG | Argentina | NULL | NULL |
ALA | Åland Islands | EUR | Euro |
ARE | United Arab Emirates | NULL | NULL |
If the SQL looked like this
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
WHERE
cntry.CODE_ISO_3 LIKE 'A%'
AND
curr.Title = 'Euro';
it would only return a list of countries with Euro as their currency.
Country_ISO3 | Country_Name | Currency_ISO3 | Currency_Name |
---|---|---|---|
ALA | Åland Islands | EUR | Euro |
AND | Andorra | EUR | Euro |
AUT | Austria | EUR | Euro |
ATF | French Southern Territories | EUR | Euro |
This is because the WHERE clause contains
curr.Title = 'Euro'
This means that curr.Title cannot be NULL which is what would be returned if the currency is not Euro. To fix this, the curr.Title condition should be added to the JOIN clause instead like this
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
AND
curr.Title = 'Euro'
WHERE
cntry.CODE_ISO_3 LIKE 'A%';
Now it returns the correct data.
What is a CTE?
All of the tables used are available in REFMART
For these examples, we are going to reuse the same code by with different filtering and options. In order not to keep modifying the main query, we can put it in a CTE (Common Table Expression)
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT
Country_ISO3,
Country_Name,
Currency_ISO3,
Currency_Name,
Sys_PK
FROM
CTE_Country_Currencies;
This is a useful tool if you would like to join the query above to itself or if you use the same SQL in multiple places in your query.
However there are some disadvantages.
For example this data is not as easy to get
Country_ISO3 | Country_Name | Currency_ISO3 | Currency_Name |
---|---|---|---|
AGO | Angola | NULL | NULL |
AUS | Australia | NULL | NULL |
AND | Andorra | EUR | Euro |
AZE | Azerbaijan | NULL | NULL |
AUT | Austria | EUR | Euro |
ARM | Armenia | NULL | NULL |
ALB | Albania | NULL | NULL |
ATF | French Southern Territories | EUR | Euro |
ATG | Antigua and Barbuda | NULL | NULL |
ABW | Aruba | NULL | NULL |
AIA | Anguilla | NULL | NULL |
AFG | Afghanistan | NULL | NULL |
ASM | American Samoa | NULL | NULL |
ARG | Argentina | NULL | NULL |
ALA | Åland Islands | EUR | Euro |
ARE | United Arab Emirates | NULL | NULL |
That is because you can’t add the filter in the LEFT JOIN as you has before so this query so if you run this
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT
Country_ISO3,
Country_Name,
Currency_ISO3,
Currency_Name,
Sys_PK
FROM
CTE_Country_Currencies
WHERE
Country_ISO3 LIKE 'A%'
AND
Currency_Name = 'Euro';
it will only return a list of countries with Euro as their currency because the Currency Name can’t be NULL which is what we want.
Country_ISO3 | Country_Name | Currency_ISO3 | Currency_Name |
---|---|---|---|
ALA | Åland Islands | EUR | Euro |
AND | Andorra | EUR | Euro |
AUT | Austria | EUR | Euro |
ATF | French Southern Territories | EUR | Euro |
In order to get the data we want, we need to add another CTE and change the SELECT query at the bottom
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
), CTE_Currency_Filter AS
(
SELECT
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name
FROM
training_cf.REF_CURRENCY curr WITH (NOLOCK)
WHERE
curr.TITLE = 'Euro'
)
SELECT
cc.Country_ISO3,
cc.Country_Name,
cf.Currency_ISO3,
cf.Currency_Name,
ccy.Sys_PK
FROM
CTE_Country_Currencies cc
LEFT JOIN
CTE_Currency_Filter cf
ON
cf.Currency_ISO3 = cc.Currency_ISO3
AND
cf.Currency_Name = 'Euro'
WHERE
cc.Country_ISO3 LIKE 'A%';
So whereas CTEs are useful, they are not always the answer.
How do I create a CSV from a long table?
All of the tables used are available in REFMART
If we take the query and instead of restricting it to Euro, we find some countries with many currencies
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT
Country_ISO3,
Country_Name,
Currency_ISO3,
Currency_Name
FROM
CTE_Country_Currencies
WHERE
cntry.CODE_ISO_3 IN ('CHE', 'URY', 'CUB');
This returns the following data
Country_ISO3 | Country_Name | Currency_ISO3 | Currency_Name |
---|---|---|---|
CHE | Switzerland | CHF | Swiss Franc |
CHE | Switzerland | CHE | WIR Euro |
CHE | Switzerland | CHW | WIR Franc |
CUB | Cuba | CUP | Cuban Peso |
CUB | Cuba | CUC | Peso Convertible |
URY | Uruguay | UYU | Peso Uruguayo |
URY | Uruguay | UYI | Uruguay Peso en Unidades Indexadas (UI) |
URY | Uruguay | UYW | Unidad Previsional |
In this scenario we are trying to get a list of Currency Codes for the Countries
Country_ISO3 | Currency_ISO3_List |
---|---|
CHE | CHF,CHE,CHW |
CUB | CUP,CUC |
URY | UYU,UYI,UYW |
Here is the answer
As you can see, the CTE is very useful in this case
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT DISTINCT
Country_ISO3,
STUFF(( SELECT
',' + Currency_ISO3
FROM
CTE_Country_Currencies clist
WHERE
clist.Country_ISO3 = cc.Country_ISO3
FOR XML PATH ('')), 1, 1, '') AS Currency_ISO3_List
FROM
CTE_Country_Currencies cc
WHERE
Country_ISO3 IN ('CHE', 'URY', 'CUB');
How does it work?
If you run this SQL
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT
',' + Currency_ISO3
FROM
CTE_Country_Currencies clist
WHERE
Country_ISO3 IN ('CHE')
FOR XML PATH('');
You get this
XML_F52E2B61-18A1-11d1-B105-00805F49916B |
---|
,CHF,CHE,CHW |
Notice that there is a comma at the beginning which we want to get rid of
If we run this
SELECT STUFF(',CHF,CHE,CHW', 1, 1, '') AS CurrencyList;
We get this
CurrencyList |
---|
CHF,CHE,CHW |
So the comma has been removed from the beginning.
The STUFF function replaces a set of characters in a string.
STUFF ( character_expression , start , length , replace_with_expression )
In this case “,CHF,CHE,CHW”, it goes to the first character, and replaces it with an empty string.
If you combine the two and join to the base table, you end up with the CSV that you wanted.
How do I create a long list from a table with a CSV column?
All of the tables used are available in REFMART
To generate the source data, we take the SQL used to create the long to wide and put it into a temporary table
DROP TABLE IF EXISTS #CurrencyTemp;
WITH CTE_Country_Currencies AS
(
SELECT
cntry.CODE_ISO_3 AS Country_ISO3,
cntry.NAME_SHORT_EN AS Country_Name,
curr.CODE_ISO_3 AS Currency_ISO3,
curr.TITLE AS Currency_Name,
cntry.Sys_PK
FROM
training_cf.REF_COUNTRY cntry WITH (NOLOCK)
LEFT JOIN
training_cf.REF_COUNTRY_CURRENCY rcc WITH (NOLOCK)
ON
rcc.COUNTRY_FK = cntry.Sys_RowTitle
LEFT JOIN
training_cf.REF_CURRENCY curr WITH (NOLOCK)
ON
curr.Sys_RowTitle = rcc.CURRENCY_FK
)
SELECT DISTINCT
Country_ISO3,
STUFF(( SELECT
',' + Currency_ISO3
FROM
CTE_Country_Currencies clist
WHERE clist.Country_ISO3 = cc.Country_ISO3
FOR XML PATH ('')), 1, 1, '') AS Currency_ISO3_List
INTO
#CurrencyTemp
FROM
CTE_Country_Currencies cc
WHERE
Country_ISO3 IN ('CHE', 'URY', 'CUB');
So #CurrencyTemp now contains this
Country_ISO3 | Currency_ISO3_List |
---|---|
CHE | CHF,CHE,CHW |
CUB | CUP,CUC |
URY | UYU,UYI,UYW |
Now, we do a CROSS APPLY to STRING_SPLIT
SELECT
ct.Country_ISO3,
c.value AS Currency_ISO3
FROM
#CurrencyTemp ct
CROSS APPLY
STRING_SPLIT(ct.Currency_ISO3_List, ',') c;
This is the output
Country_ISO3 | Currency_ISO3 |
---|---|
CHE | CHE |
CHE | CHW |
CHE | CHF |
CUB | CUP |
CUB | CUC |
URY | UYI |
URY | UYW |
URY | UYU |