About

Any xMart table or view that is visible in the UI also has a URL which returns data in a format that remote software can consume. This is a data API (Application Programming Interface); it facilitates extraction or use of data stored in xMart by another computer system.

A table or view’s API “endpoint” (the url) can be accessed by pressing the “API” button on the data view page.

image-20210128075130826

The API returns JSON formatted data by default but a csv file or stream can also be returned.

image-20210128075711730

The API complies with the OData standard . OData is essentially SQL over http; it allows you select, filter and order your data simply by modifying URL parameters.

The xMart OData implementation is read-only.

An API endpoint is either “private” or “public”. “Private” means only authorized users or applications can access the endpoint (default). “Public” means the endpoint is available to any user or application on the internet without logging in.

The URLs of private and public endpoints are different. Pressing the API button will open either the private or the public endpoint.

Public OData API

Public endpoints are easy to setup and provide excellent performance when a URL is repeatedly requested. Internally, the public API is an on-the-fly file generator. Each request for data becomes a physical file which is cached until data is changed by the next loaded batch.

How to make a table or view public

To make a table or view public via API, authorize the anonymous user with DATA_VIEW permission to the table or view.

image-20210128091618014

Note that making a table or view public only applies to the API; adding anonymous user does not allow the general public to login to the xMart web application. Users must always login to the xMart web application.

Public API Caching

Each unique public OData URL returns a data string (json or csv) or an error condition (like “404 not found”) which is cached for faster subsequent retrievals. If a requested URL is not in the cache (a “cache miss”) the database is queried, which can be fast or slow depending upon several factors ranging from the performance quality of a custom SQL view written by an end-user, to how much load is currently on the database. See Tips to improve query performance.

All cached URLs for a public table or view are purged when either the data or structure of a table changes. When a batch is committed, the caches of every modified target table and also all dependent custom SQL views, are purged.

The only exception to the above is when the definition of a custom SQL view is modified. Because the system is currently not aware of changes to custom SQL views, the cache needs to be manually purged or wait until data is loaded into one of the tables that the custom SQL view uses. To manually purge the cache, navigate to Admin/Mart, select the object(s) which need to be purged, and click “Request Purge”.

manually purge public cache

Regardless of how the cache is purged, there can be a delay up to 60 minutes before new data or new structure is returned via the API.

Private OData API

Private end points are not cached and always hit the database directly (use with care).

How to make a private endpoint - for a person

Private endpoints are the default; if the user has access to a table or view in the UI then the user also has acces to its private API endpoint. The user must have DATA_VIEW rights on the table or view and the table or view should not allow the ANONYMOUS_PUBLIC user.

The user with DATA_VIEW rights should be able to click the “API” button on the data page in the UI and see the API data.

How to make an private endpoint - for a client application

The process to connect an external client application, such as an R or pythong scrip,t to the API is more involved.

The client application is treated like an end-user. Like an end-user, who must have a username and password to be identified, a client application must have a client ID and client secret to be identified. In other words, every remote system that is to consume the private OData API must be identifed by being registered. For the details of this process, see How to register a new client application

API output formats

In a nutshell, the API returns data either as csv or JSON conforming to the OData standard.

The format is controlled by the $format parameter with OData json as the default if no $format parameter is supplied.

Some of the format options support “streaming”. Streaming means that data is retuned in multiple chunks rather than all at once. Streaming is useful when the data to be returned is very large.

A summary of formats plus support of streaming and caching are provided below. They are a little different for private vs public endpoints.

When “No limit” is indicated for the Max Records returned, it means there is no configured limit to how many rows can be returned, but there could be other factors such as slow queries and timeouts that affect the maximum number of returned records.

Public API

Value of $format Purpose Returns Max Records Per Request Streamed Caching
(none) General purpose use. JSON No limit Yes (8Mb chunks) Yes
$format=csv Returns csv instead of JSON, which is 1/3 the size of json. Triggers browser to do a file download. CSV No limit Yes (8Mb chunks) Yes
$format=csv_inline Just like $format=csv but is displayed in the browser without triggering a file download. CSV No limit Yes Yes

Private API

As of version 4.21, private API requests are never cached (always result in a DB hit).

Value of $format Purpose Returns Max Records Per Request Streamed Caching
(none) General purpose use; not for very large queries. JSON 120,000 No No
$format=csv Returns csv in an efficient manner suitable for very large queries. Just like $format=streaming but returns csv instead of JSON. CSV No limit Yes No
$format=streaming For large to huge data sizes. Data is always streamed. JSON No limit Yes No
$format=csv_inline Just like $format=csv but is displayed in the browser without triggering a file download. CSV No limit Yes No

OData Syntax

The OData Syntax is a standard with plenty of internet resources available describing that.

Here is a handy “cheat sheet” for sql users but his is the official reference guide.

Supported operators & limitations

The xMarT OData API only supports data retrieval (no insert, update or delete). Most, but not all, OData retrieval features are supported. See examples below.

  • $select
  • $filter
  • $top
  • $count=true
  • $orderBy
  • $skip
  • /$count (returns row count)
  • $expand (FK already displayed as rowTitle)
  • $apply (only GROUPBY and filter)
  • $apply(all but GROUPBY and filter)
  • $filter + contains (like SQL IN operator)
  • startsWith
  • endsWith

Note on URL encoding

Sometimes special characters, like spaces and single quotes, are encoded in URLs. A space character may be replaced by “%20”. This can make a URL look more confusing that it really is, but encoded URLs work more reliably in different systems. If a URL is not encoded and pasted in an email, the link may break where the space is. If it’s encoded, it can usually be pasted and used.

For example, these 2 urls are equivalent, but look different:

Encoded URL:

https://frontdoor-l4uikgap6gz3m.azurefd.net/refmart/ref_country?$filter=CODE_ISO_3%20eq%20’ZWE’

Decoded URL:

https://frontdoor-l4uikgap6gz3m.azurefd.net/refmart/ref_country?$filter=CODE_ISO_3 eq ‘ZWE’

Online URL encoder/decoder sites exist to help convert between them, like urldecoder.org.

OData Examples

The examples below use {mart_code}, {object_code} and {root_url} placeholders.

Replace {mart_code} by the code of the mart. Replace {object_code} by the code of the table or view being retrieved. Replace {root_url} by the root URL of the API.

  • The {root_url} for WHO public production is “frontdoor-l4uikgap6gz3m.azurefd.net”.
  • The {root_url} for WHO private production is “extranet.who.int/xmart-api/odata”.

List mart objects and metadata

List all tables and views in a mart. Available in both the private and public API.

List each field of each table and view, field data types and whether fields are foreign keys.

Count of rows in a table or view. Only the row count is returned as a simple number (no json or csv).

Count of rows returned along with a data retrieval, as the “@odata.count” property inside the returned json.

Unfiltered data in JSON or CSV

Return all data of an object in JSON format.

Return all data of an object as a csv file download.

Returning pages ($top and $skip)

$top indicates the page size.

Return top 20 rows all data of an object in the default format (json).

$skip indicates how many records to skip. For the private API, it is important that $top does not exceed the “Max Records Per Request” (see table above).

Return “page 3” of data, ie rows 41 to 60

Specifying fields

Return only specified fields.

Filtering

Use $filter to filter data.

Tip: To generate an OData Url for a filtered slice of data, use the filters on the data view page of the user interface (UI) and then press the API button. The API URL will reflect the filters selected in the UI.

Return rows where CODE_ISO_3 equals ZWE

Return rows where CODE_ISO_3 field is on of ZWE, VIR or LBN.

Return rows where the NAME_SHORT_EN field contains ‘republic’

Group by

“Group by” is only partially supported. It can be used to filter and remove duplicate rows but as of version 4.21 it can not be used to do aggregations such as SUM, AVG, etc.

  • https://{root_url}/{mart_code}/{object_code}?$apply=filter(a eq 'b')/groupby((c,d))

For WHO territories with WHO_LEGAL_STATUS equal to ‘M’, list the unique combinations of GRP_WHO_REGION and GRP_WB_INCOME

Returning large objects

When objects have many hundreds of thousands or many millions of rows, special care is needed to download the entire object. Some tips:

  • Use csv instead of json format. csv is 1/3 the size of json.

  • Use $select to return only the fields that are needed.

It also depends upon whether the private or public API is used.

Public API

The public API automatically streams data in 8 Mb chunks. For modest-sized objects, nothing special is normally required on the client side. As the size of the file increases, however:

  • The very first time a large object is downloaded the request hits the database because it is not yet in the cache and the download may time-out. Normally, the download continues to happen in the background even after the time-out message is sent to the client. Waiting a few minutes and re-trying the download may later result in success.

Private API

There are 3 ways to return very large objects via the private API.

  • Use $format=csv. This has a double benefit because the data is streamed and because csv is 1/3 the size of json.

  • Use $format=streaming. This returns OData json but the data will be streamed.

  • Use “@odata.nextLink”. When the number of rows exceeds the page limit (see “Max Records per Request” above), the OData nextLink property is returned in the OData json formatted response, The value of the property is an OData URL pointing to the next page of data. Clients can iteratively read he value of nextlink until it is no longer provided in the response

Examples of how to use it are plentiful on the internet.

image-20210128144021103

Here is an R code snippet showing how to make multiple requests using nextLink to download a large table or view. It is recursively calling the “xmart4_get” function.

xmart4_get <- function(url, t_q, token, full_table) {
  resp <- httr::GET(httr::modify_url(url),
                    token_header(token),
                    ua,
                    query = t_q)
  assert_status_code(resp)
  assert_json(resp)
  parsed <- httr::content(resp,
                          as = "parsed",
                          type = "application/json")
  assert_content(parsed)
  df <- parsed_to_df(parsed)
  next_link <- parsed[["@odata.nextLink"]]
  if (full_table & !is.null(next_link)) {
    params <- unlist(stringr::str_match_all(next_link, "(.+?)\\?(\\$.+)"))
    df <- dplyr::bind_rows(df, xmart4_get(params[2], params[3], token, full_table))
  }
  df
}

Query using POST

Since xMart v4.12, the API also supports POST /$query requests. Instead of putting OData parameters in the URL, they are put in the http request body. This is necessary when the size of the URL becomes too long (maximum query string length is 2048 characters).

To send a convert a regular GET request to POST, move the query string (the part after the ?) to the body and the request, and append /$query to the URL path (remove the ?).

The request HTTP header “Content-Type” must set with value “text/plain” (even if body is empty).

Example:

GET request: 
  url: https://extranet.who.int/xmart-api/odata/{mart_code}/{table_code}?$top=20&$filter=code%20in%20('AFG','PAK')

Equivalent POST request:
  url: https://extranet.who.int/xmart-api/odata/{mart_code}/{table_code}/$query
  body: $top=20&$filter=code in ('AFG','PAK')
  header: Content-Type=text/plain

Technical Details

Simplified diagram of xMart caching architecture.

image-20210128091221810

HTTP Response Headers

The following HTTP Response headers are returned by the xMart OData API.

Private/Public Header Content
Both Date Date of the request processed by the API
Public x-cache Added by Azure frontdoor. TCP_HIT = frontdoor cache hit, TCP_MISS = frontdoor cache miss.
Both odata-version Version of OData standard