httpx installation and sample application
Installation
The httpx library is useful for communicating with REST APIs. With Spack you can provide httpx in your kernel:
$ spack env activate python-311
$ spack install py-httpx
Alternatively, you can install httpx with other package managers.
Example OSM Nominatim API
In this example we get our data from the OpenStreetMap Nominatim API. The API is available at https://nominatim.openstreetmap.org/search? followed by the query parameters. For example, to receive information about Alexanderplatz in Berlin in JSON format, request the URL https://nominatim.openstreetmap.org/search.php?q=Alexanderplatz+Berlin&format=json and, if you want to display the corresponding map section instead, simply leave out &format=json.
Next we define the search URL and the parameters. Nominatim expects at least the following two parameters:
Key | Value |
---|---|
q | Address query that allows the following specifications: street, city, county, state, country and postalcode. |
format | Format in which the data is returned. Possible values are html, xml, json, jsonv2, geojson and geocodejson. |
The query can then be made with:
import httpx
search_url = "https://nominatim.openstreetmap.org/search?"
params = {
    "q": "Alexanderplatz, Berlin",
    "format": "json",
}
r = httpx.get(search_url, params=params)
r.json()
[{'place_id': 128497332, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'way', 'osm_id': 783052052, 'lat': '52.5219814', 'lon': '13.413635717448294', 'class': 'place', 'type': 'square', 'place_rank': 25, 'importance': 0.5136915868107359, 'addresstype': 'square', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland', 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}, {'place_id': 128243381, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'node', 'osm_id': 3908141014, 'lat': '52.5215661', 'lon': '13.4112804', 'class': 'railway', 'type': 'station', 'place_rank': 30, 'importance': 0.43609907778808027, 'addresstype': 'railway', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Dircksenstraße, Mitte, Berlin, 10179, Deutschland', 'boundingbox': ['52.5165661', '52.5265661', '13.4062804', '13.4162804']}, {'place_id': 128416772, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'way', 'osm_id': 346206374, 'lat': '52.5216214', 'lon': '13.4131913', 'class': 'highway', 'type': 'pedestrian', 'place_rank': 26, 'importance': 0.10000999999999993, 'addresstype': 'road', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland', 'boundingbox': ['52.5216214', '52.5216661', '13.4131913', '13.4131914']}]
Three different locations are found: the square, a railway station and a pedestrian road. To narrow the results, we can use the limit parameter to return only the most important location:
params = {"q": "Alexanderplatz, Berlin", "format": "json", "limit": "1"}
r = httpx.get(search_url, params=params)
r.json()
[{'place_id': 128497332, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'way', 'osm_id': 783052052, 'lat': '52.5219814', 'lon': '13.413635717448294', 'class': 'place', 'type': 'square', 'place_rank': 25, 'importance': 0.5136915868107359, 'addresstype': 'square', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland', 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
Clean Code
Now that we know the code works, let’s turn everything into a clean and flexible function.
To ensure that the request was successful, we use the raise_for_status method of the httpx response, which raises an exception if the HTTP status code does not indicate success (that is, if it is not a 2xx code).
Since we don’t want to exceed the usage limits of the Nominatim API, we delay our requests with the time.sleep function:
from time import sleep
sleep(1)
r.json()
[{'place_id': 128497332, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'way', 'osm_id': 783052052, 'lat': '52.5219814', 'lon': '13.413635717448294', 'class': 'place', 'type': 'square', 'place_rank': 25, 'importance': 0.5136915868107359, 'addresstype': 'square', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland', 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
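A fixed sleep(1) always waits the full second, even if the surrounding code has already taken longer than that. As an alternative, the following sketch waits only for the time still missing since the previous call. The Throttle class and its min_interval parameter are hypothetical helpers for illustration and are not part of httpx or Nominatim; the short interval in the demonstration is chosen only to keep the example fast.

```python
import time


class Throttle:
    """Ensure that at least min_interval seconds lie between two calls to wait()."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last_call = None  # time of the previous wait(), if any

    def wait(self):
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


# Demonstration with a short interval
throttle = Throttle(min_interval=0.05)
start = time.monotonic()
throttle.wait()  # first call returns immediately
throttle.wait()  # second call sleeps for the remaining interval
elapsed = time.monotonic() - start
print(f"Two calls took {elapsed:.3f} s")
```

For Nominatim, one would create the object once with min_interval=1.0 and call wait() before each request.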
Next we declare the function itself. As arguments we need the address, the format, the limit on the number of objects to be returned (with the default value 1) and further kwargs (keyword arguments) that are passed on as request parameters:
def nominatim_search(address, format="json", limit=1, **kwargs):
    """Thin wrapper around the Nominatim search API.

    For the list of parameters see
    https://nominatim.org/release-docs/develop/api/Search/#parameters
    """
    search_url = "https://nominatim.openstreetmap.org/search?"
    params = {"q": address, "format": format, "limit": limit, **kwargs}
    r = httpx.get(search_url, params=params)
    # Raise an exception if the status is unsuccessful
    r.raise_for_status()

    sleep(1)
    return r.json()
Now we can try out the function, for example with:
nominatim_search("Alexanderplatz, Berlin")
[{'place_id': 128497332, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'way', 'osm_id': 783052052, 'lat': '52.5219814', 'lon': '13.413635717448294', 'class': 'place', 'type': 'square', 'place_rank': 25, 'importance': 0.5136915868107359, 'addresstype': 'square', 'name': 'Alexanderplatz', 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland', 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
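In the returned list, lat and lon are strings. The following sketch shows one way the coordinates could be extracted as floats; first_coordinates is a hypothetical helper name, and the sample data is a shortened copy of the result above so that the example runs without network access.

```python
def first_coordinates(results):
    """Return (lat, lon) of the first result as floats, or None if the list is empty."""
    if not results:
        return None
    first = results[0]
    return float(first["lat"]), float(first["lon"])


# Shortened copy of the Nominatim result shown above
sample = [
    {
        "name": "Alexanderplatz",
        "lat": "52.5219814",
        "lon": "13.413635717448294",
        "display_name": "Alexanderplatz, Mitte, Berlin, 10178, Deutschland",
    }
]

print(first_coordinates(sample))
print(first_coordinates([]))
```

Returning None for an empty list makes the case of an address without any match explicit for the caller.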
Caching
If the same queries are made over and over again within a session, it makes sense to retrieve this data only once and reuse it. In Python we can use lru_cache from the functools module of the standard library. lru_cache saves the results of the last N distinct calls (LRU stands for Least Recently Used), and as soon as the limit is exceeded, the least recently used entries are discarded. To use this for the nominatim_search function, all you need is an import and a decorator:
from functools import lru_cache
@lru_cache(maxsize=1000)
def nominatim_search(address, format="json", limit=1, **kwargs):
    """…"""
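The effect of the decorator can be observed without calling the API. In the following sketch, fake_search is a hypothetical stand-in for nominatim_search that only counts how often its body actually runs; the small maxsize is chosen for illustration.

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=2)
def fake_search(address, limit=1):
    """Stand-in for nominatim_search that only counts real calls."""
    global call_count
    call_count += 1
    return f"result for {address} (limit={limit})"


fake_search("Alexanderplatz, Berlin")  # miss: the function body runs
fake_search("Alexanderplatz, Berlin")  # hit: the result comes from the cache
fake_search("Alexanderplatz, Berlin", limit=2)  # miss: different arguments

info = fake_search.cache_info()
print(call_count, info.hits, info.misses)
```

Note that lru_cache requires all arguments to be hashable, so values such as lists or dictionaries cannot be passed to a cached function. A cache hit also returns the same object as before, so a cached list should not be modified in place.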
However, lru_cache only keeps the results in memory for the duration of a session. If a script terminates because of a timeout or an exception, the results are lost. If the data is to be stored more permanently, tools such as joblib or python-diskcache can be used.