This plugin can be used to index geo_shape objects in Elasticsearch, then aggregate and/or script-simplify them.
This is an Ingest, Search and Script plugin.
The currently supported version is Elasticsearch 7.x (7.17.28). You can find past releases here.
The first three components of the plugin version are the corresponding Elasticsearch version. The last component is used for plugin versioning.
To install it, run this command from the Elasticsearch directory, replacing the URL with the correct link for your Elasticsearch version (see table):
bin/elasticsearch-plugin install https://github.com/opendatasoft/elasticsearch-plugin-geoshape/releases/download/v7.17.28.0/elasticsearch-plugin-geoshape-7.17.28.0.zip
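As a small illustration of the versioning scheme, the sketch below splits a plugin version into its Elasticsearch part and its plugin revision, then rebuilds the download link. The URL pattern is inferred from the single install command above and the function names are hypothetical, so check the releases page for the actual asset name.

```python
# URL pattern inferred from the install command in this README (an assumption).
RELEASE_URL = (
    "https://github.com/opendatasoft/elasticsearch-plugin-geoshape/"
    "releases/download/v{version}/elasticsearch-plugin-geoshape-{version}.zip"
)


def split_plugin_version(version):
    """Return (elasticsearch_version, plugin_revision) for e.g. '7.17.28.0'."""
    parts = version.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError("expected four numeric components, got %r" % version)
    return ".".join(parts[:3]), parts[3]


def release_url(version):
    """Build the release archive URL for a given plugin version."""
    split_plugin_version(version)  # validates the version format
    return RELEASE_URL.format(version=version)


if __name__ == "__main__":
    print(split_plugin_version("7.17.28.0"))
    print(release_url("7.17.28.0"))
```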
Built with Java 17 and Gradle 8.10.2 (but you should use the packaged gradlew included in this repo anyway).
A new processor, geo_extension, adds custom fields to the desired geo_shape data object at ingest time.
Processor name: geo_extension.
Name | Required | Default | Description |
---|---|---|---|
field | yes | - | The geo shape field to use. This parameter accepts wildcard to match multiple geo_shape fields |
path | no | - | The field that contains the field to expand. When using wildcard in field, matching will be done under this path only |
keep_original_shape | no | true | Keep the original unfixed shape in a shape field |
shape_field | no | shape | Name of sub shape field |
fix_shape | no | true | Fix invalid shape. For the moment it only fixes duplicate consecutive coordinates in polygon (elastic/elasticsearch#14014) |
fixed_field | no | fixed_shape | Name of sub fixed_shape field |
wkb | no | true | Compute wkb from shape field |
wkb_field | no | wkb | Name of wkb subfield |
type | no | true | Compute geo shape type (Polygon, point, LineString, ...) |
type_field | no | type | Name of type subfield |
area | no | true | Compute area of shape |
area_field | no | area | Name of area subfield |
bbox | no | true | Compute geo_point array containing topLeft and bottomRight points of shape envelope |
bbox_field | no | bbox | Name of bbox subfield |
centroid | no | true | Compute geo_point representing shape centroid |
centroid_field | no | centroid | Name of centroid subfield |
hash | no | true | Compute shape digest to perform exact request on shape (in other words: used as a primary key. we may want to use the wkt in the future?) |
hash_field | no | hash | Name of hash subfield |
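The options in the table can be combined in a single processor definition. The following sketch builds a pipeline body that overrides a few defaults. The helper name and the chosen option values are illustrative only; the option names are taken from the table.

```python
import json

# Option names copied from the table; the helper itself is hypothetical.
KNOWN_OPTIONS = {
    "path", "keep_original_shape", "shape_field", "fix_shape", "fixed_field",
    "wkb", "wkb_field", "type", "type_field", "area", "area_field",
    "bbox", "bbox_field", "centroid", "centroid_field", "hash", "hash_field",
}


def geo_extension_pipeline(field, **options):
    """Build a request body for PUT _ingest/pipeline/<name>."""
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError("unknown geo_extension options: %s" % sorted(unknown))
    processor = {"field": field}
    processor.update(options)
    return {
        "description": "Add extra geo fields to geo_shape objects.",
        "processors": [{"geo_extension": processor}],
    }


if __name__ == "__main__":
    # Skip the original shape and the area computation, keep everything else.
    body = geo_extension_pipeline(
        "geoshape_*", keep_original_shape=False, area=False
    )
    print(json.dumps(body, indent=2))
```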
PUT _ingest/pipeline/geo_extension
{
"description": "Add extra geo fields to geo_shape objects.",
"processors": [
{
"geo_extension": {
"field": "geoshape_*"
}
}
]
}
PUT main
{
"mappings": {
"dynamic_templates": [
{
"geoshapes": {
"match": "geoshape_*",
"mapping": {
"properties": {
"geoshape": {"type": "geo_shape"},
"hash": {"type": "keyword"},
"wkb": {"type": "binary", "doc_values": true},
"type": {"type": "keyword"},
"area": {"type": "half_float"},
"bbox": {"type": "geo_point"},
"centroid": {"type": "geo_point"}
}
}
}
}
]
}
}
GET main/_mapping
Result:
{
"main": {
"mappings": {
"_doc": {
"dynamic_templates": [
{
"geoshapes": {
"match": "geoshape_*",
"mapping": {
"properties": {
"geoshape": {
"type": "geo_shape"
},
"hash": {
"type": "keyword"
},
"wkb": {
"type": "binary",
"doc_values": true
},
"type": {
"type": "keyword"
},
"area": {
"type": "half_float"
},
"bbox": {
"type": "geo_point"
},
"centroid": {
"type": "geo_point"
}
}
}
}
}
]
}
}
}
}
Document indexing with shape fixing:
POST main/_doc?pipeline=geo_extension
{
"geoshape_0": {
"type": "Polygon",
"coordinates": [
[
[
1.6809082031249998,
49.05227025601607
],
[
2.021484375,
48.596592251456705
],
[
2.021484375,
48.596592251456705
],
[
3.262939453125,
48.922499263758255
],
[
2.779541015625,
49.196064000723794
],
[
2.0654296875,
49.23194729854559
],
[
1.6809082031249998,
49.05227025601607
]
]
]
}
}
GET main/_search
Result:
"hits": [
{
"_source": {
"geoshape_0": {
"area": 0.594432056845634,
"centroid": {
"lat": 48.95553463671871,
"lon": 2.3829210191713015
},
"bbox": [
{
"lat": 48.596592251456705,
"lon": 1.6809082031249998
},
{
"lat": 49.23194729854559,
"lon": 3.262939453125
}
],
"type": "Polygon",
"geoshape": {
"coordinates": [
[
[
1.6809082031249998,
49.05227025601607
],
[
2.021484375,
48.596592251456705
],
[
3.262939453125,
48.922499263758255
],
[
2.779541015625,
49.196064000723794
],
[
2.0654296875,
49.23194729854559
],
[
1.6809082031249998,
49.05227025601607
]
]
],
"type": "Polygon"
},
"hash": "-5012816342630707936",
"wkb": "AAAAAAMAAAABAAAABkAALAAAAAAAQEhMXSKIhttAChqAAAAAAEBIdhR0tDaAQAY8gAAAAABASJkYoAuEDEAAhgAAAAAAQEidsHL20w4/+uT//////0BIhrDKsBJAQAAsAAAAAABASExdIoiG2w=="
}
}
}
Note that the duplicated point has been deduplicated.
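The fix applied here can be pictured with a few lines of Python. This is only a sketch of the idea (dropping a coordinate that repeats the one just before it), not the plugin's Java implementation; the function name is hypothetical.

```python
def drop_consecutive_duplicates(ring):
    """Return the ring without points that repeat their direct predecessor."""
    cleaned = []
    for point in ring:
        if not cleaned or point != cleaned[-1]:
            cleaned.append(point)
    return cleaned


# Outer ring of the polygon indexed above (7 points, one consecutive duplicate).
RING = [
    [1.6809082031249998, 49.05227025601607],
    [2.021484375, 48.596592251456705],
    [2.021484375, 48.596592251456705],
    [3.262939453125, 48.922499263758255],
    [2.779541015625, 49.196064000723794],
    [2.0654296875, 49.23194729854559],
    [1.6809082031249998, 49.05227025601607],
]

if __name__ == "__main__":
    print(len(RING), "->", len(drop_consecutive_duplicates(RING)))
```

The closing point of the ring is kept because it repeats the first point, not the one directly before it.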
This aggregation creates a bucket for each input shape (based on the hash of its WKB representation) and computes a simplified version of the shape in the bucket.
The simplification part is similar to what is done with the simplify script.
The size parameter allows you to retain only the N biggest (longest) shapes.
Moreover, compared to regular search results, results of an aggregation can be cached by Elasticsearch.
- field (mandatory): the field used for aggregating. Must be of wkb type. E.g.: "geoshape_0.wkb".
- output_format: the output_format in [geojson, wkt, wkb]. Default to geojson.
- simplify:
  - zoom: the zoom level in range [0, 20]. 0 is the most simplified and 20 is the least. Default to 0.
  - algorithm: simplify algorithm in [DOUGLAS_PEUCKER, TOPOLOGY_PRESERVING]. Default to DOUGLAS_PEUCKER.
- size: can be set to define how many buckets should be returned. See Elasticsearch official terms aggregation documentation for more explanation. Buckets are ordered by the length (perimeter for polygons) of their shape, longer shapes first.
- shard_size: can be used to minimize the extra work that comes with bigger requested size. See Elasticsearch official terms aggregation documentation for more explanation.
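To make the parameter constraints concrete, here is a hedged sketch of a client-side helper that validates the documented ranges before building the request body. The helper name and the aggregation name geo_preview are illustrative; the checks only mirror the list above and say nothing about how the plugin validates input itself.

```python
OUTPUT_FORMATS = {"geojson", "wkt", "wkb"}
ALGORITHMS = {"DOUGLAS_PEUCKER", "TOPOLOGY_PRESERVING"}


def geoshape_aggregation(field, output_format="geojson", zoom=0,
                         algorithm="DOUGLAS_PEUCKER", size=None,
                         shard_size=None):
    """Build a search body containing one geoshape aggregation."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError("output_format must be one of %s" % sorted(OUTPUT_FORMATS))
    if not 0 <= zoom <= 20:
        raise ValueError("zoom must be in range [0, 20]")
    if algorithm.upper() not in ALGORITHMS:
        raise ValueError("algorithm must be one of %s" % sorted(ALGORITHMS))
    agg = {
        "field": field,
        "output_format": output_format,
        "simplify": {"zoom": zoom, "algorithm": algorithm},
    }
    # size and shard_size are optional, so only send them when set.
    if size is not None:
        agg["size"] = size
    if shard_size is not None:
        agg["shard_size"] = shard_size
    return {"aggs": {"geo_preview": {"geoshape": agg}}}


if __name__ == "__main__":
    print(geoshape_aggregation("geoshape_0.wkb", output_format="wkb",
                               zoom=8, size=10, shard_size=10))
```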
GET main/_search?size=0
{
"aggs": {
"geo_preview": {
"geoshape": {
"field": "geoshape_0.wkb",
"output_format": "wkb",
"simplify": {
"zoom": 8,
"algorithm": "douglas_peucker"
},
"size": 10,
"shard_size": 10
}
}
}
}
Result:
"aggregations": {
"geo_preview": {
"buckets": [
{
"key": "AAAAAAMAAAABAAAABkAALAAAAAAAQEhMXSKIhts/+uT//////0BIhrDKsBJAQACGAAAAAABASJ2wcvbTDkAGPIAAAAAAQEiZGKALhAxAChqAAAAAAEBIdhR0tDaAQAAsAAAAAABASExdIoiG2w==",
"digest": "-5012816342630707936",
"type": "Polygon",
"doc_count": 1
}
]
}
}
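With output_format set to wkb, the bucket key is a base64-encoded WKB geometry. A minimal sketch for decoding such a key on the client side, assuming a plain 2D WKB Polygon (geometry type 3) and using only the Python standard library:

```python
import base64
import struct


def decode_wkb_polygon(wkb_base64):
    """Decode a base64 WKB Polygon into a list of rings of (x, y) tuples."""
    data = base64.b64decode(wkb_base64)
    order = ">" if data[0] == 0 else "<"  # byte 0: 0 = big endian, 1 = little
    (geom_type,) = struct.unpack_from(order + "I", data, 1)
    if geom_type != 3:
        raise ValueError("not a 2D WKB Polygon (type %d)" % geom_type)
    (ring_count,) = struct.unpack_from(order + "I", data, 5)
    offset = 9
    rings = []
    for _ in range(ring_count):
        (point_count,) = struct.unpack_from(order + "I", data, offset)
        offset += 4
        flat = struct.unpack_from(order + "%dd" % (2 * point_count), data, offset)
        offset += 16 * point_count  # two 8-byte doubles per point
        rings.append(list(zip(flat[0::2], flat[1::2])))
    return rings


# Bucket key copied from the aggregation result above.
BUCKET_KEY = (
    "AAAAAAMAAAABAAAABkAALAAAAAAAQEhMXSKIhts/+uT//////0BIhrDKsBJAQACGAAAAAABA"
    "SJ2wcvbTDkAGPIAAAAAAQEiZGKALhAxAChqAAAAAAEBIdhR0tDaAQAAsAAAAAABASExdIoiG2w=="
)

if __name__ == "__main__":
    for x, y in decode_wkb_polygon(BUCKET_KEY)[0]:
        print(x, y)
```

Other geometry types (points, lines, multi-geometries) use different layouts and are not handled by this sketch.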
Search script for simplifying shapes dynamically.
- field: the field to apply the script to.
- zoom: the zoom level in range [0, 20]. 0 is the most simplified and 20 is the least. Default to 0.
- algorithm: simplify algorithm in [DOUGLAS_PEUCKER, TOPOLOGY_PRESERVING]. Default to DOUGLAS_PEUCKER.
- output_format: the output_format in [geojson, wkt, wkb]. Default to geojson.
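For readers unfamiliar with the DOUGLAS_PEUCKER option, the sketch below shows the textbook form of that algorithm on a list of (x, y) points. It is not the plugin's code, and the tolerance argument is a hypothetical stand-in: how the plugin derives a tolerance from zoom is not described in this README.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed ring whose ends coincide.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points farther than tolerance from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = point_segment_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    print(douglas_peucker([(0, 0), (1, 0.01), (2, 0)], 0.1))
    print(douglas_peucker([(0, 0), (1, 1), (2, 0)], 0.1))
```

A larger tolerance removes more points, which matches the intuition that a lower zoom level yields a more simplified shape.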
GET main/_search
{
"script_fields": {
"simplified_shape": {
"script": {
"lang": "geo_extension_scripts",
"source": "geo_simplify",
"params": {
"field": "geoshape_0",
"zoom": 8,
"output_format": "wkt"
}
}
}
}
}
Result:
"hits": [
{
"fields": {
"simplified_shape": [
{
"real_type": "Polygon",
"geom": "POLYGON ((2.021484375 48.596592251456705, 1.6809082031249998 49.05227025601607, 2.0654296875 49.23194729854559, 2.779541015625 49.196064000723794, 3.262939453125 48.922499263758255, 2.021484375 48.596592251456705))",
"type": "Polygon"
}
]
}
}
Built with Java 17 and Gradle 8.10.2.
Build the plugin using Gradle:
./gradlew build
or
./gradlew assemble # (to avoid the test suite)
Then you can find the current version of the plugin at elasticsearch-plugin-geoshape-7.17.z.d.zip
In case you have to upgrade Gradle, you can do it with ./gradlew wrapper --gradle-version x.y.z.
Then the following command will start a dockerized ES and will install the previously built plugin:
docker compose up
Check the Elasticsearch instance at localhost:9200 and the plugin version with localhost:9200/_cat/plugins.
Please be careful during development: you'll need to manually rebuild the .zip using ./gradlew build on each code change before running docker compose up again.
NOTE: In docker-compose.yml you can uncomment the debug env and attach a remote JVM debugger on *:5005 to debug the plugin.
Note also that this plugin depends on some types and classes from the legacy-geo Elasticsearch module. In our Gradle script, you will find:
compileOnly files('libs/legacy-geo-7.17.28.jar')
or a different version number depending on the current supported Elasticsearch version.
If you're going to update this plugin to a new version, you also need to update this JAR file. To do so, it's necessary to build it from source: git clone the Elasticsearch source code, git checkout vX.Y.Z for the wanted version, and run a build for this specific module with:
./gradlew :modules:legacy-geo:assemble
using the same Java version. You will then find a JAR file in modules/legacy-geo/build/distributions/, e.g. legacy-geo-7.17.28-SNAPSHOT.jar. Just copy it into ./libs/legacy-geo-7.17.28.jar and run the compilation of this plugin with Gradle.
Also take a look at this potential deprecation of legacy-geo: elastic/elasticsearch#96097
This software is under The MIT License (MIT).