InfluxDB API#
This guide explains how to query InfluxDB databases directly using the InfluxDB API.
In most cases, the EFD client is the easiest and recommended way to query data from the EFD and other databases.
If the client isn’t available in your environment, or if you prefer a more lightweight method, you can use the InfluxDB API as an alternative.
InfluxDB connection information#
The InfluxDB API uses simple authentication with a username and password. Token-based authentication is not currently supported.
To query an InfluxDB database using the API, you will need the following connection details:
InfluxDB API URL
Database name
Username and password
If you are inside the RSP notebook aspect, you can find this information using the lsst.rsp package.
For example, to retrieve InfluxDB connection information for the usdf_efd database:
from lsst.rsp import get_influxdb_credentials
get_influxdb_credentials("usdf_efd")
If you are outside the RSP, the recommended way to retrieve this information is using the Repertoire client. See the Getting InfluxDB connection information guide.
Alternatively, you can also retrieve the InfluxDB connection information directly from the Repertoire API.
For example, to retrieve InfluxDB connection information for the usdf_efd database, you can send a GET request to the following URL:
https://<repertoire-url>/discovery/influxdb/usdf_efd
where <repertoire-url> is the base URL for Repertoire for your local environment.
For Phalanx applications, the base URL for Repertoire can be obtained from global.repertoireUrl.
Since the returned information includes username and password credentials, this endpoint is protected and requires authentication using an access token.
This may be a user token created through the token UI with "read:sasquatch" scope.
See the RSP documentation for more information.
import requests
repertoire_url = "..." # the base URL for Repertoire, e.g. obtained from global.repertoireUrl in Phalanx
token = "..." # obtained from somewhere else
headers = {"content-type": "application/json", "Authorization": f"Bearer {token}"}
response = requests.get(
f"{repertoire_url}/discovery/influxdb/usdf_efd", headers=headers
)
response.raise_for_status() # check for HTTP errors
info = response.json() # Parse the JSON response into a Python dictionary
See Repertoire API documentation for more information about the returned information.
The InfluxDB API query endpoint#
Use the InfluxDB API /query endpoint to query data using the InfluxQL query language.
See the InfluxQL documentation for more information on the supported query syntax and capabilities.
The following example shows how to query the InfluxDB API:
import requests
influxdb_url = info.get("url") # the URL of the InfluxDB API
database = info.get("database") # the name of the InfluxDB database to query
auth = (
info.get("username"),
info.get("password"),
) # the username and password credentials for authentication
query = """
SELECT vacuum
FROM "lsst.sal.ATCamera.vacuum"
WHERE time > now() - 1h
/* source: my-application-name */
""" # the InfluxQL query to send to the InfluxDB API
response = requests.get(
f"{influxdb_url}/query", params={"db": database, "q": query}, auth=auth
)
The InfluxDB API response format#
The above query retrieves the vacuum measurements for the ATCamera in the last hour.
The InfluxDB API response is a JSON object with the following structure:
{
"results": [
{
"statement_id": 0,
"series": [
{
"columns": ["time", "vacuum"],
"name": "lsst.sal.ATCamera.vacuum",
"values": [
["2024-05-30T16:49:40.119558Z", 3.08e-7],
["2024-05-30T16:49:50.120548Z", 3.17e-7],
["2024-05-30T16:50:00.12093Z", 3.52e-7]
]
}
]
}
]
}
The /query endpoint supports sending multiple queries in a single request by separating them with a semicolon.
If you send a single query like above the result will have a single statement_id.
You can query multiple topics in a single request as well.
If you query a single topic like above the result will have a single series.
Converting the InfluxDB API response to a Pandas DataFrame#
To convert the InfluxDB API response to a Pandas DataFrame, you can use the following code, assuming you are sending a single query and querying a single topic at a time like in the example above.
The result is equivalent to the Pandas DataFrame you would get when using the EFD client.
import pandas as pd
def to_dataframe(response: dict) -> pd.DataFrame:
"""Convert an InfluxDB response to a Pandas dataframe.
Parameters
----------
response : dict
The JSON response from the InfluxDB API.
"""
# One query submitted at a time
statement = response["results"][0]
# One topic queried at a time
series = statement["series"][0]
result = pd.DataFrame(series.get("values", []), columns=series["columns"])
if "time" not in result.columns:
return result
result = result.set_index(pd.to_datetime(result["time"])).drop("time", axis=1)
if result.index.tzinfo is None:
result.index = result.index.tz_localize("UTC")
if "tags" in series:
for k, v in series["tags"].items():
result[k] = v
if "name" in series:
result.name = series["name"]
return result
to_dataframe(response.json())
Best practices for querying InfluxDB databases#
When querying InfluxDB databases, keep the following best practices in mind:
Use the Repertoire client or API to retrieve InfluxDB connection information. This ensures your application can adapt to changes in the environment.
Whenever possible, use the InfluxDB databases at USDF even if your application is running at the Summit (for example,
usdf_efdinstead ofsummit_efd). These databases are intended for user queries and help reduce the impact on Summit operations.Avoid querying production InfluxDB databases during development. From a development environment (such as
usdf-rsp-dev), use Repertoire to discover connection details for a development database (for example,usdfdev_efd).Write efficient InfluxQL queries. Use appropriate time ranges, filters, and functions to limit the amount of data returned and reduce load on the database. You can use the
EXPLAIN ANALYZEcommand in InfluxQL to help diagnose query performance.Add comments to your InfluxQL statements to help identify the origin of queries in InfluxDB logs. For example:
SELECT vacuum
FROM "lsst.sal.ATCamera.vacuum"
WHERE time > now() - 1h
/* source: my-application-name */