SQL queries in SEASTERS#
Generalities#
You may have noticed it during installation: seastersdb is essentially based on DuckDB, which provides a friendly SQL interface to store and retrieve data in and from a relational database. With seastersdb, we only provide a tool for the “data retrieval” part. SQL (Structured Query Language) is a quite intuitive language allowing quick search and filtering, querying efficiently the database’s tables of rows (records) and columns (fields). This documentation does not include a course on SQL queries; we only provide our database-specific macros, together with a number of examples. If you need further information on the syntax, we recommend having a look at the “Query Syntax” section of this documentation.
SEASTERS macros & functions#
Beyond leveraging DuckDB, seastersdb provides SQL macros and functions to enhance the user experience while querying the SEASTERS database.
Macros#
Although there are a few exceptions, SEASTERS macros are built as follows (replace
<network> by one among bsrn, ghcnd, ghcnh and gsdr):
<network>()points to the actual data records;<network>_var()points to the variable metadata;<network>_inv()points to the inventory (available time interval by station);<network>_stations()points to the station metadata.
Exceptions are:
BSRN data records are actually divided in disctinct datasets with specific inventories and variables, such that
bsrn()actually takes adatasetargument among'radiosonde','radiation_10m','radiation'and'SYNOP'.gsdr_var()does not exist, because this network only includes precipitation data in millimeter, with no more variable metadata.
The table below lists all resulting macros with their specific descriptions:
Name |
Description |
|---|---|
|
Access BSRN observational data for a given dataset. Allowed values
for |
|
Access station metadata for the BSRN network. |
|
Access inventory for the BSRN network. |
|
Access variable metadata for the BSRN network. |
|
Access observational data for the GHCNd network. |
|
Access station metadata for the GHCNd network. |
|
Access inventory for the GHCNd network. |
|
Access variable metadata for the GHCNd network. |
|
Access observational data for the GHCNh network. |
|
Access station metadata for the GHCNh network. |
|
Access inventory for the GHCNh network. |
|
Access variable metadata for the GHCNh network. |
|
Access observational data for the GSDR network. |
|
Access station metadata for the GSDR network. |
|
Access inventory for the GSDR network. |
SEASTERS relational schemas#
Tables behind SEASTERS macros are linked to each other following the relational schemas shown below. They can be joint using the primary keys emphasized in bold.
Functions#
Name |
Description |
|---|---|
|
Return |
Examples#
Applying SEASTERS macros with simple SQL syntax, please find below a series of query examples to access various information inside the SEASTERS database.
examples/ghcnd.py#SELECT r.timestamp, r.station_id, r.TAVG
FROM ghcnd() AS r
JOIN ghcnd_stations() AS s USING (station_id)
WHERE
s.lat BETWEEN 0 AND 20
AND r.timestamp BETWEEN TIMESTAMP '2010-01-01' AND TIMESTAMP '2015-02-15'
AND r.TAVG IS NOT NULL
ORDER BY r.station_id, r.timestamp
examples/ghcnh.py#SELECT r.timestamp, r.station_id, r.precipitation
FROM ghcnh() AS r
JOIN ghcnh_stations() AS s USING (station_id)
WHERE
s.lat BETWEEN 0 AND 10
AND r.timestamp BETWEEN TIMESTAMP '2010-01-01' AND TIMESTAMP '2015-02-15'
AND r.precipitation IS NOT NULL
ORDER BY r.station_id, r.timestamp
examples/gsdr.py#SELECT r.timestamp, r.station_id, r.Precipitation
FROM gsdr() AS r
JOIN gsdr_stations() AS s USING (station_id)
WHERE
s.lat BETWEEN 0 AND 20
AND s.lon BETWEEN 100 AND 120
AND r.timestamp BETWEEN TIMESTAMP '2000-01-01' AND TIMESTAMP '2015-02-15'
AND r.Precipitation IS NOT NULL
ORDER BY r.station_id, r.timestamp
examples/bsrn.py#SELECT r.timestamp, r.station_id, r.DIF
FROM bsrn('radiation') AS r
JOIN bsrn_stations() AS s USING (station_id)
WHERE
r.timestamp BETWEEN TIMESTAMP '2000-01-01' AND TIMESTAMP '2015-02-15'
AND r.DIF IS NOT NULL
ORDER BY r.station_id, r.timestamp