A Simple Restaurant Search with Django + Elasticsearch

林育銘
13 min read · Apr 7, 2021

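Elasticsearch is a widely used distributed full-text search engine and a type of NoSQL datastore. Its data is stored and retrieved as JSON, and its replica and shard mechanisms provide high availability and fast lookups. For more background on Elasticsearch's basic concepts, see my GitHub.

This article shows how to connect Django to Elasticsearch and build a restaurant search API.

First, let's get familiar with Elasticsearch's filter query syntax, using the Search API of the official elasticsearch-py client.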

from elasticsearch import Elasticsearch

client = Elasticsearch()

response = client.search(
    index="my-index",
    body={
        "query": {
            "bool": {
                "must": [{"match": {"title": "python"}}],
                "must_not": [{"match": {"description": "beta"}}],
                "filter": [{"term": {"category": "search"}}]
            }
        }
    }
)

for hit in response['hits']['hits']:
    print(hit['_score'], hit['_source']['title'])

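However, Elasticsearch's native query syntax is quite verbose. A single misplaced brace ({}) is enough to make the program fail, and the deep nesting makes it harder for developers to maintain the code or to add and change filter conditions later.

To address these problems, elasticsearch-py was wrapped in a higher-level layer, which became the Elasticsearch DSL package. The query above can be rewritten as follows: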

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()

s = Search(using=client, index="my-index") \
    .filter("term", category="search") \
    .query("match", title="python") \
    .exclude("match", description="beta")

print(s.to_dict())  # inspect the raw Elasticsearch query

response = s.execute()

for hit in response:
    print(hit.meta.score, hit.title)

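As the rewritten example shows, the originally complex query becomes much more concise. The chainable filter(), query(), and exclude() methods built into elasticsearch_dsl combine into a complete bool query, and s.to_dict() converts the search back into raw Elasticsearch syntax.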
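To make the correspondence concrete, here is a minimal, dependency-free sketch of the bool body that the three chained calls logically map to. The helper name build_bool_query is hypothetical and not part of any library, and the dictionary that s.to_dict() actually returns may nest the clauses differently.

```python
def build_bool_query(must=None, must_not=None, filters=None):
    """Assemble a raw Elasticsearch bool query body from lists of clauses."""
    bool_clauses = {}
    if must:
        bool_clauses["must"] = list(must)
    if must_not:
        bool_clauses["must_not"] = list(must_not)
    if filters:
        bool_clauses["filter"] = list(filters)
    return {"query": {"bool": bool_clauses}}


body = build_bool_query(
    must=[{"match": {"title": "python"}}],          # .query("match", title="python")
    must_not=[{"match": {"description": "beta"}}],  # .exclude("match", description="beta")
    filters=[{"term": {"category": "search"}}],     # .filter("term", category="search")
)
print(body)
```

Each keyword argument lines up with one chained method, which is the part of the raw syntax that is easiest to get wrong by hand.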

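The Django Elasticsearch DSL package, in turn, is a further high-level wrapper around Elasticsearch DSL that makes it easy to embed in the Django framework.

Setting up the Django environment

First, install Django, Django REST framework, and Django Elasticsearch DSL.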

(env)$ pip3 install django
(env)$ pip3 install djangorestframework
(env)$ pip3 install django-elasticsearch-dsl

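Note that the django-elasticsearch-dsl version must correspond to the actual Elasticsearch version. If Elasticsearch is 6.x, be sure to install a 6.x release of the package; otherwise the program will raise errors after it starts.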
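The version rule above can be sketched as a small check that compares major versions. The function names here are hypothetical, and the version strings are sample values rather than output from a real cluster.

```python
def major_version(version):
    """Return the major component of a version given as '6.8.23' or (6, 8, 23)."""
    if isinstance(version, str):
        return int(version.split(".")[0])
    return int(version[0])


def is_compatible(server_version, package_version):
    """True when the server and the client package share the same major version."""
    return major_version(server_version) == major_version(package_version)


# Sample values only. In a real project the server version could be read from
# client.info()["version"]["number"], and the package version from
# importlib.metadata.version("django-elasticsearch-dsl").
print(is_compatible("6.8.23", "6.4.2"))
print(is_compatible("7.10.2", "6.4.2"))
```

Running a check like this at startup is one way to fail early with a clear message instead of a less obvious error later.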

Next, create the django_elasticsearch project, then create the app application for it.

$ django-admin startproject django_elasticsearch
$ cd django_elasticsearch
$ django-admin startapp app

Directory structure

demo/
├── db.sqlite3
├── django_elasticsearch/
│ ├── __init__.py
│ ├── settings.py
│ ├── urls.py
│ └── wsgi.py
├── app/
│ ├── __init__.py
│ ├── admin.py
│ ├── apps.py
│ ├── documents.py
│ ├── models.py
│ ├── serializers.py
│ ├── tests.py
│ ├── urls.py
│ └── views.py
└── manage.py

First, add django-elasticsearch-dsl to INSTALLED_APPS:

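# django_elasticsearch/settings.py
INSTALLED_APPS = [
    # ... Django's default apps ...
    "rest_framework",            # needed by the API views and serializer
    "django_elasticsearch_dsl",
    "app",                       # so that app/documents.py is discovered
]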

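Also create the ELASTICSEARCH_DSL setting to configure the address of Elasticsearch. When Django starts, it automatically calls the connections.configure method from the elasticsearch-dsl package to establish the connection to Elasticsearch.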

# django_elasticsearch/settings.py
ELASTICSEARCH_DSL = {
    "default": {
        "hosts": "127.0.0.1:9200"
    },
}

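Beyond the configuration above, you can also add other Elasticsearch-related global settings; see the django-elasticsearch-dsl settings documentation.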
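The host does not have to be hard-coded. As a minimal sketch, the setting could read it from an environment variable and fall back to the local address used in this article. ELASTICSEARCH_HOST is a hypothetical variable name chosen for this example, not something the package defines.

```python
import os

# ELASTICSEARCH_HOST is a hypothetical environment variable name (an assumption).
# When it is unset, the local address from this article is used.
ELASTICSEARCH_DSL = {
    "default": {
        "hosts": os.environ.get("ELASTICSEARCH_HOST", "127.0.0.1:9200"),
    },
}

print(ELASTICSEARCH_DSL["default"]["hosts"])
```

This keeps the same settings.py usable for local development and for a deployed environment.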

Add the app's URLs to the project

# django_elasticsearch/urls.py
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('app.urls')),
]

Define the Django model whose data will be stored in Elasticsearch.

# app/models.py
from django.db import models


class Restaurant(models.Model):
    restaurant_id = models.TextField()
    restaurant = models.CharField(max_length=255)
    ratings = models.IntegerField()
    price = models.IntegerField()
    types = models.CharField(max_length=10)
    areas = models.CharField(max_length=10)
    spots = models.CharField(max_length=10)
    descriptions = models.TextField()
    created_time = models.DateTimeField()

    def __str__(self):
        return self.restaurant_id

Map the custom model to the data in Elasticsearch.

# app/documents.py
from datetime import datetime

from django_elasticsearch_dsl import Document
from django_elasticsearch_dsl.registries import registry

from app.models import Restaurant


@registry.register_document
class RestaurantDocument(Document):
    class Index:
        # Name of the index in Elasticsearch
        name = 'restaurant'
        settings = {
            'number_of_shards': 1,    # number of primary shards
            'number_of_replicas': 1   # number of replica copies
        }

    class Django:
        # The model associated with this Document
        model = Restaurant

        # The fields of the model you want to be indexed in Elasticsearch
        fields = [
            'restaurant_id',
            'restaurant',
            'ratings',
            'price',
            'types',
            'areas',
            'created_time',
        ]

    def save(self, **kwargs):
        # Stamp the document with the time it is saved
        self.created_time = datetime.now()
        return super().save(**kwargs)

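First, create the RestaurantDocument class, which must inherit from the Document class of django_elasticsearch_dsl.

Next, inside RestaurantDocument, create an Index class. Its name attribute points to the actual index name in Elasticsearch, and settings sets the number of shards and replicas.

Finally, to register RestaurantDocument with the Django application, add the registry.register_document decorator to the class.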
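As a small worked example of the Index settings described above: each primary shard gets number_of_replicas copies, so the total number of shard copies is primaries × (1 + replicas). The helper name below is hypothetical.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Primary shards plus their replicas: every primary has number_of_replicas copies."""
    return number_of_shards * (1 + number_of_replicas)


# The settings used by RestaurantDocument in this article
index_settings = {"number_of_shards": 1, "number_of_replicas": 1}
print(total_shard_copies(**index_settings))  # 1 primary + 1 replica = 2
```

A replica is never allocated on the same node as its primary, so on a single-node development cluster the replica stays unassigned and the index reports yellow health. Setting number_of_replicas to 0 is a common choice for local testing.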

Create the index

Running the commands below automatically creates the custom index and its fields in Elasticsearch.

(env)$ python manage.py makemigrations
(env)$ python manage.py migrate
(env)$ python manage.py search_index --rebuild

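For more commands for managing Elasticsearch, see the django-elasticsearch-dsl management commands documentation.

Putting it to use

Create the serializer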

# app/serializers.py
from rest_framework import serializers

from app.models import Restaurant


class RestaurantSerializer(serializers.ModelSerializer):
    class Meta:
        model = Restaurant
        fields = [
            "restaurant",
            "ratings",
            "price",
            "types",
            "areas",
        ]

Now let's build the views for storing and searching data.

# app/views.py
from django.http.response import JsonResponse
from rest_framework.views import APIView

from app.documents import RestaurantDocument
from app.serializers import RestaurantSerializer


class upload(APIView):
    def post(self, request, format=None):
        """Store a restaurant record."""
        data = request.data
        # Use the restaurant name as the Elasticsearch document ID
        RestaurantDocument(meta={'id': data['restaurant']}, **data).save()
        return JsonResponse({'status': True})


class restaurants(APIView):
    def get(self, request, format=None):
        """Search restaurants by name."""
        # GET parameters arrive in the query string (?restaurant=...), not in the body
        keywords = request.query_params.get('restaurant')
        restaurants = []
        if keywords:
            # Note: a term filter compares against the exact indexed tokens
            query = (
                RestaurantDocument.search()
                .filter('term', restaurant=keywords)
                .source(excludes=['created_time'])
            )
            for restaurant in query.scan():
                restaurants.append({
                    'restaurant': restaurant.restaurant,
                    'ratings': restaurant.ratings,
                    'types': restaurant.types,
                    'areas': restaurant.areas,
                    'price': restaurant.price,
                })
        restaurants = RestaurantSerializer(restaurants, many=True).data
        res = {'status': True, 'data': restaurants}
        return JsonResponse(res, safe=False)

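The view above filters only by restaurant name and uses the scan() method to retrieve the matching restaurants. You can also build more complex filters on price, area, rating, and so on.
Note that the Search API of Django Elasticsearch DSL is inherited from Elasticsearch DSL. For more advanced search usage, see its documentation or my GitHub project.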
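As one possible way to extend the name search with price and rating conditions, the sketch below turns optional parameters into (query type, parameters) pairs for the chainable filter() method. The function name and its arguments are hypothetical. Only the numeric price and ratings fields are covered here, and the commented usage assumes the RestaurantDocument from this article.

```python
def restaurant_filters(min_price=None, max_price=None, min_rating=None):
    """Return (query_type, params) pairs, one per requested condition."""
    filters = []

    # Combine both price bounds into a single range clause
    price_range = {}
    if min_price is not None:
        price_range["gte"] = min_price
    if max_price is not None:
        price_range["lte"] = max_price
    if price_range:
        filters.append(("range", {"price": price_range}))

    if min_rating is not None:
        filters.append(("range", {"ratings": {"gte": min_rating}}))
    return filters


# Applying the pairs inside the Django project from this article:
#     search = RestaurantDocument.search()
#     for query_type, params in restaurant_filters(max_price=500, min_rating=4):
#         search = search.filter(query_type, **params)
print(restaurant_filters(max_price=500, min_rating=4))
```

Keeping the conditions as plain data makes it straightforward to add or drop a filter depending on which query parameters the request actually carries.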

Define the view routes

# app/urls.py
from django.urls import path

from app import views

urlpatterns = [
    path('restaurant/', views.restaurants.as_view()),
    path('update/restaurant/', views.upload.as_view()),
]

Finally, once the Django server is running, you can query and store data through the following two APIs:
(1) http://127.0.0.1:8000/api/restaurant/?restaurant=漢來港餐廳
(2) http://127.0.0.1:8000/api/update/restaurant/
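When calling these two endpoints from a script, the restaurant name has to be percent-encoded in the query string, and the upload body is sent as JSON. The sketch below only builds the URL and the payload with the standard library and does not send any request. The helper name and the payload values are made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://127.0.0.1:8000/api"


def search_url(restaurant_name):
    """Build the search URL with the restaurant name percent-encoded."""
    return f"{BASE_URL}/restaurant/?{urlencode({'restaurant': restaurant_name})}"


url = search_url("漢來港餐廳")
print(url)

# Decoding the query string gives back the original name
print(parse_qs(urlsplit(url).query)["restaurant"][0])

# Hypothetical sample record for POST {BASE_URL}/update/restaurant/
payload = json.dumps(
    {
        "restaurant_id": "r-001",
        "restaurant": "漢來港餐廳",
        "ratings": 4,
        "price": 800,
        "types": "港式",
        "areas": "高雄",
    },
    ensure_ascii=False,
)
print(payload)
```

The resulting URL and payload can then be passed to whichever HTTP client you prefer.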

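That wraps up this introduction to using Django with Elasticsearch. If you found the article helpful, feel free to share it and give it a clap to show your support and encouragement.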

Written by 林育銘

Graduated from NTU, and worked as backend engineer. Github:https://github.com/LinYuMingBejing
