@@ -6,9 +6,10 @@ running queries against Elasticsearch. It is built on top of the official
low-level client (`elasticsearch-py <https://github.com/elastic/elasticsearch-py>`_).

It provides a more convenient and idiomatic way to write and manipulate
- queries. It stays close to the Elasticsearch JSON DSL, mirroring its
- terminology and structure. It exposes the whole range of the DSL from Python
- either directly using defined classes or a queryset-like expressions.
+ queries using synchronous or asynchronous Python. It stays close to the
+ Elasticsearch JSON DSL, mirroring its terminology and structure. It exposes the
+ whole range of the DSL from Python either directly using defined classes or
+ queryset-like expressions.

It also provides an optional wrapper for working with documents as Python
objects: defining mappings, retrieving and saving documents, wrapping the

@@ -73,256 +74,6 @@ The recommended way to set your requirements in your `setup.py` or

The development is happening on ``main``, older branches only get bugfix releases

- Search Example
- --------------
-
- Let's have a typical search request written directly as a ``dict``:
-
- .. code:: python
-
-     from elasticsearch import Elasticsearch
-     client = Elasticsearch("https://localhost:9200")
-
-     response = client.search(
-         index="my-index",
-         body={
-             "query": {
-                 "bool": {
-                     "must": [{"match": {"title": "python"}}],
-                     "must_not": [{"match": {"description": "beta"}}],
-                     "filter": [{"term": {"category": "search"}}]
-                 }
-             },
-             "aggs": {
-                 "per_tag": {
-                     "terms": {"field": "tags"},
-                     "aggs": {
-                         "max_lines": {"max": {"field": "lines"}}
-                     }
-                 }
-             }
-         }
-     )
-
-     for hit in response['hits']['hits']:
-         print(hit['_score'], hit['_source']['title'])
-
-     for tag in response['aggregations']['per_tag']['buckets']:
-         print(tag['key'], tag['max_lines']['value'])
-
- The problem with this approach is that it is very verbose, prone to syntax
- mistakes like incorrect nesting, hard to modify (eg. adding another filter) and
- definitely not fun to write.
-
- Let's rewrite the example using the Python DSL:
-
- .. code:: python
-
-     from elasticsearch import Elasticsearch
-     from elasticsearch_dsl import Search
-
-     client = Elasticsearch("https://localhost:9200")
-
-     s = Search(using=client, index="my-index") \
-         .filter("term", category="search") \
-         .query("match", title="python") \
-         .exclude("match", description="beta")
-
-     s.aggs.bucket('per_tag', 'terms', field='tags') \
-         .metric('max_lines', 'max', field='lines')
-
-     response = s.execute()
-
-     for hit in response:
-         print(hit.meta.score, hit.title)
-
-     for tag in response.aggregations.per_tag.buckets:
-         print(tag.key, tag.max_lines.value)
-
- As you see, the library took care of:
-
- - creating appropriate ``Query`` objects by name (eg. "match")
- - composing queries into a compound ``bool`` query
- - putting the ``term`` query in a filter context of the ``bool`` query
- - providing convenient access to response data
- - no curly or square brackets everywhere
-
-
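The composition steps in the removed bullet list can be sketched in plain Python. This is a toy stand-in written for this note, not ``elasticsearch_dsl``'s implementation; the ``ToySearch`` name and its internals are made up:

```python
# Toy sketch of how fluent calls compose into one compound ``bool`` query.
# NOT the elasticsearch_dsl implementation -- an illustration only.

class ToySearch:
    def __init__(self):
        # the three clause lists of a compound ``bool`` query
        self._must = []
        self._must_not = []
        self._filter = []

    def query(self, name, **field):
        # build the query dict by name, e.g. {"match": {"title": "python"}}
        self._must.append({name: field})
        return self  # returning self enables chaining

    def exclude(self, name, **field):
        self._must_not.append({name: field})
        return self

    def filter(self, name, **field):
        # filter context: matched or not, with no effect on scoring
        self._filter.append({name: field})
        return self

    def to_dict(self):
        return {"query": {"bool": {
            "must": self._must,
            "must_not": self._must_not,
            "filter": self._filter,
        }}}

s = (ToySearch()
     .filter("term", category="search")
     .query("match", title="python")
     .exclude("match", description="beta"))
assert s.to_dict() == {"query": {"bool": {
    "must": [{"match": {"title": "python"}}],
    "must_not": [{"match": {"description": "beta"}}],
    "filter": [{"term": {"category": "search"}}],
}}}
```

Returning ``self`` is what makes the chaining work here; the real ``Search`` returns a modified copy from each call instead, so intermediate objects can be reused safely.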
- Persistence Example
- -------------------
-
- Let's have a simple Python class representing an article in a blogging system:
-
- .. code:: python
-
-     from datetime import datetime
-     from elasticsearch_dsl import Document, Date, Integer, Keyword, Text, connections
-
-     # Define a default Elasticsearch client
-     connections.create_connection(hosts="https://localhost:9200")
-
-     class Article(Document):
-         title = Text(analyzer='snowball', fields={'raw': Keyword()})
-         body = Text(analyzer='snowball')
-         tags = Keyword()
-         published_from = Date()
-         lines = Integer()
-
-         class Index:
-             name = 'blog'
-             settings = {
-                 "number_of_shards": 2,
-             }
-
-         def save(self, **kwargs):
-             self.lines = len(self.body.split())
-             return super(Article, self).save(**kwargs)
-
-         def is_published(self):
-             return datetime.now() > self.published_from
-
-     # create the mappings in elasticsearch
-     Article.init()
-
-     # create and save an article
-     article = Article(meta={'id': 42}, title='Hello world!', tags=['test'])
-     article.body = ''' looong text '''
-     article.published_from = datetime.now()
-     article.save()
-
-     article = Article.get(id=42)
-     print(article.is_published())
-
-     # Display cluster health
-     print(connections.get_connection().cluster.health())
-
- In this example you can see:
-
- - providing a default connection
- - defining fields with mapping configuration
- - setting index name
- - defining custom methods
- - overriding the built-in ``.save()`` method to hook into the persistence
-   life cycle
- - retrieving and saving the object into Elasticsearch
- - accessing the underlying client for other APIs
-
- You can see more in the :ref:`persistence` chapter.
-
-
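Stripped of Elasticsearch entirely, the ``save()`` override pattern in the removed example (derive a field, then delegate to the base persistence logic) looks like this. Both classes here are stand-ins invented for illustration, not ``elasticsearch_dsl`` classes:

```python
# Toy illustration of hooking a derived field into a save() life cycle.
# ``Document`` and ``Article`` are made-up stand-ins, not library classes.

class Document:
    _store = {}  # pretend persistence layer: an in-memory dict

    def save(self):
        # "persist" a snapshot of the object's fields
        Document._store[id(self)] = self.__dict__.copy()
        return True

class Article(Document):
    def __init__(self, body):
        self.body = body
        self.lines = 0

    def save(self):
        # derive ``lines`` before delegating, exactly the hook pattern
        # the removed example uses around Elasticsearch persistence
        self.lines = len(self.body.split())
        return super().save()

a = Article("looong text here")
a.save()
assert a.lines == 3
```

The point of the pattern is that callers keep calling plain ``save()``; the derived field can never go stale because it is recomputed on every write.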
- Pre-built Faceted Search
- ------------------------
-
- If you have your ``Document``\ s defined you can very easily create a faceted
- search class to simplify searching and filtering.
-
- .. note::
-
-     This feature is experimental and may be subject to change.
-
- .. code:: python
-
-     from elasticsearch_dsl import FacetedSearch, TermsFacet, DateHistogramFacet
-
-     class BlogSearch(FacetedSearch):
-         doc_types = [Article, ]
-         # fields that should be searched
-         fields = ['tags', 'title', 'body']
-
-         facets = {
-             # use bucket aggregations to define facets
-             'tags': TermsFacet(field='tags'),
-             'publishing_frequency': DateHistogramFacet(field='published_from', interval='month')
-         }
-
-     # empty search
-     bs = BlogSearch()
-     response = bs.execute()
-
-     for hit in response:
-         print(hit.meta.score, hit.title)
-
-     for (tag, count, selected) in response.facets.tags:
-         print(tag, ' (SELECTED):' if selected else ':', count)
-
-     for (month, count, selected) in response.facets.publishing_frequency:
-         print(month.strftime('%B %Y'), ' (SELECTED):' if selected else ':', count)
-
- You can find more details in the :ref:`faceted_search` chapter.
-
-
- Update By Query Example
- ------------------------
-
- Let's resume the simple example of articles on a blog, and let's assume that each article has a number of likes.
- For this example, imagine we want to increment the number of likes by 1 for all articles that match a certain tag and do not match a certain description.
- Writing this as a ``dict``, we would have the following code:
-
- .. code:: python
-
-     from elasticsearch import Elasticsearch
-     client = Elasticsearch()
-
-     response = client.update_by_query(
-         index="my-index",
-         body={
-             "query": {
-                 "bool": {
-                     "must": [{"match": {"tag": "python"}}],
-                     "must_not": [{"match": {"description": "beta"}}]
-                 }
-             },
-             "script": {
-                 "source": "ctx._source.likes++",
-                 "lang": "painless"
-             }
-         },
-     )
-
- Using the DSL, we can now express this query as such:
-
- .. code:: python
-
-     from elasticsearch import Elasticsearch
-     from elasticsearch_dsl import Search, UpdateByQuery
-
-     client = Elasticsearch()
-     ubq = UpdateByQuery(using=client, index="my-index") \
-         .query("match", tag="python") \
-         .exclude("match", description="beta") \
-         .script(source="ctx._source.likes++", lang="painless")
-
-     response = ubq.execute()
-
- As you can see, the ``Update By Query`` object provides many of the savings offered
- by the ``Search`` object, and additionally allows one to update the results of the search
- based on a script assigned in the same manner.
-
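To make the intent of the removed example concrete: an update-by-query selects documents and mutates each match server-side. The same logic, run locally over an in-memory list, looks like this (the data is invented, and the substring check is only a crude stand-in for a real ``match`` query):

```python
# In-memory analogue of the update-by-query above: must match the tag,
# must_not match "beta" in the description, then increment likes.
docs = [
    {"tag": "python", "description": "stable", "likes": 1},
    {"tag": "python", "description": "beta build", "likes": 5},
    {"tag": "ruby", "description": "stable", "likes": 2},
]

for doc in docs:
    # crude stand-ins for the must / must_not match clauses
    if doc["tag"] == "python" and "beta" not in doc["description"]:
        doc["likes"] += 1  # the ctx._source.likes++ of the painless script

assert [d["likes"] for d in docs] == [2, 5, 2]
```

The difference in the real API is that the query and the script travel to Elasticsearch in one request, so no documents are fetched or round-tripped through the client.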
- Migration from ``elasticsearch-py``
- -----------------------------------
-
- You don't have to port your entire application to get the benefits of the
- Python DSL, you can start gradually by creating a ``Search`` object from your
- existing ``dict``, modifying it using the API and serializing it back to a
- ``dict``:
-
- .. code:: python
-
-     body = {...}  # insert complicated query here
-
-     # Convert to Search object
-     s = Search.from_dict(body)
-
-     # Add some filters, aggregations, queries, ...
-     s = s.filter("term", tags="python")
-
-     # Convert back to dict to plug back into existing code
-     body = s.to_dict()
-
-
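The gradual-migration workflow in the removed section (dict in, modify, dict out) amounts to splicing clauses into an existing query body. A hand-rolled sketch of that manipulation, using a made-up helper rather than any elasticsearch_dsl API:

```python
# Hand-rolled sketch of gradually modifying an existing query body.
# ``add_term_filter`` is a made-up helper, NOT an elasticsearch_dsl API;
# it shows the kind of edit Search.from_dict()/to_dict() wraps for you.
# It assumes the body either is empty or already uses a ``bool`` query.

def add_term_filter(body, field, value):
    # ensure a bool query with a filter list exists, then append to it
    query = body.setdefault("query", {})
    bool_q = query.setdefault("bool", {})
    bool_q.setdefault("filter", []).append({"term": {field: value}})
    return body

body = {"query": {"bool": {"must": [{"match": {"title": "python"}}]}}}
body = add_term_filter(body, "tags", "python")
assert body == {"query": {"bool": {
    "must": [{"match": {"title": "python"}}],
    "filter": [{"term": {"tags": "python"}}],
}}}
```

Doing this by hand is exactly the nesting-sensitive work the prose warns about, which is why round-tripping through a ``Search`` object is the recommended path.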
License
-------

@@ -346,13 +97,10 @@ Contents

.. toctree::
   :maxdepth: 2

+ self
configuration
- search_dsl
- persistence
- faceted_search
- update_by_query
- asyncio
- api
- async_api
+ tutorials
+ howtos
+ reference

CONTRIBUTING
Changelog