Symfony ElasticSearch – model data layer


Hi, and welcome to the fifth article in the series “How to work with ElasticSearch using the Symfony PHP framework”. The previous article (Part 4: Symfony ElasticSearch – builder pattern and DTO search criteria object) is located here. In this article we are going to investigate the heart of our microservice: the model data layer. As a reminder, here is our architecture scheme:

Search microservice architecture

As you can guess, our search has to be integrated with the ElasticSearch engine. For such an integration it is better to look for a ready-made solution, e.g. a Symfony bundle that will speed up our work. Looking at what is available, I found two popular solutions: ONGR ElasticsearchBundle and FOSElasticaBundle. I chose ONGR ElasticsearchBundle because its coding philosophy is closer to mine and I already have commercial experience with that package. You can easily apply all the principles from this article with FOSElasticaBundle as well, so if you prefer a Friends of Symfony solution, or have already used it before, it is a good choice too. So, let’s go to our code editor and open the ONGR bundle configuration which I am going to use.

# config/packages/ongr_elasticsearch.yaml

ongr_elasticsearch:
  analysis:
    filter:
      email_filter:
        type: "pattern_capture"
        preserve_original: true
        patterns:
          - "([^@]+)"
          - "(\\p{L}+)"
          - "(\\d+)"
          - "@(.+)"
          - "([^-@]+)"
      whitespace_to_minus:
        type: "pattern_replace"
        pattern: " "
        replacement: "-"
    analyzer:
      email:
        type: custom
        tokenizer: uax_url_email
        filter:
          - email_filter
          - lowercase
          - unique
      lowercased_string:
        tokenizer: keyword
        filter:
          - lowercase
  indexes:
    App\Document\Hotels:
      hosts: ['%env(ELASTIC_SEARCH_URL)%']
      settings:
        number_of_replicas: 2
        number_of_shards: 1
        refresh_interval: -1

Two main parts can be distinguished here:

  • analysis section – here we define the custom email and lowercase analyzers. If you are not familiar with analyzers in Elasticsearch, I suggest getting acquainted with them in my Udemy course or in the official ElasticSearch documentation. The topic is too big to describe in a few words; I will try to write one or two separate articles about it later. It is not essential to understand analyzers to go further with the current tutorial, though.
  • indexes section – here we define our model for the ElasticSearch index, the connection string to ElasticSearch and some additional essential settings: number_of_replicas, number_of_shards and refresh_interval. I will describe what those parameters mean in later articles.
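
To give a feeling for what the email analyzer from the configuration does, here is a rough plain-PHP emulation of its filter chain (email_filter, lowercase, unique). It is only a sketch: the function name emulateEmailAnalyzer is hypothetical, it assumes the uax_url_email tokenizer hands over the whole address as a single token, and it ignores token positions, which the real analyzer also tracks.

```php
<?php

// Rough emulation of the `email` analyzer for one e-mail address.
// Assumption: the uax_url_email tokenizer keeps the address as a single token.
function emulateEmailAnalyzer(string $email): array
{
    // The same patterns as in email_filter (pattern_capture) from the YAML config.
    $patterns = ['([^@]+)', '(\p{L}+)', '(\d+)', '@(.+)', '([^-@]+)'];

    // preserve_original: true keeps the untouched token as well.
    $tokens = [$email];

    foreach ($patterns as $pattern) {
        // Every match of every pattern contributes its capture group as a token.
        preg_match_all('/' . $pattern . '/u', $email, $matches);
        foreach ($matches[1] as $token) {
            $tokens[] = $token;
        }
    }

    // `lowercase` filter (ASCII-only stand-in), then `unique` filter.
    $tokens = array_map('strtolower', $tokens);

    return array_values(array_unique($tokens));
}

print_r(emulateEmailAnalyzer('John-Smith_123@foo-bar.com'));
```

For this address the sketch yields eleven distinct tokens, among them john, smith, 123, smith_123, foo-bar.com and bar.com, so a search by any of those fragments can match the hotel e-mail.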

Here is the code for App\Document\Hotels class:

<?php

namespace App\Document;

use ONGR\ElasticsearchBundle\Annotation as ES;

/**
 * @ES\Index()
 */
class Hotels
{
    /**
     * @var int
     * @ES\Id()
     */
    public $id;

    /**
     * @var int
     * @ES\Routing()
     */
    public $routing;

    /**
     * @var Hotel
     * @ES\Embedded(class="App\Document\Hotel")
     */
    public $hotel;

    /**
     * @var Booking
     * @ES\Embedded(class="App\Document\Booking")
     */
    public $booking;

    /**
     * @var string
     * @ES\Property(
     *  type="join",
     *  name="hotel_bookings_join_field",
     *  settings={
     *     "relations"={"hotel_parent": "booking_child"}
     *  }
     * )
     */
    public $hotelBookingsJoinField;

    public function __construct()
    {
        $this->hotel = [];
        $this->booking = [];
    }
}

Here we define the Hotel and Booking models, which are connected via a parent-child relationship. Below is the code for the corresponding classes:

<?php

namespace App\Document;

use Doctrine\Common\Collections\ArrayCollection;
use ONGR\ElasticsearchBundle\Annotation as ES;

/**
 * @ES\ObjectType()
 */
class Hotel
{
    /**
     * @var int
     * @ES\Property(name="hotel_id", type="integer")
     */
    public $hotelId;

    /**
     * @var string
     * @ES\Property(name="email", type="text", analyzer="email")
     */
    public $email;

    /**
     * @var string
     * @ES\Property(
     *  type="text",
     *  name="name",
     *  fields={
     *     "keyword"={"type"="keyword"},
     *     "lowercased"={"type"="text", "analyzer"="lowercased_string"}
     *  }
     * )
     */
    public $name;

    /**
     * @var string
     * @ES\Property(name="city_name_en", type="text")
     */
    public $cityNameEn;

    /**
     * @var geo_point
     * @ES\Property(name="location", type="geo_point")
     */
    public $location;

    /**
     * @var int
     * @ES\Property(name="age", type="integer")
     */
    public $age;

    /**
     * @var int
     * @ES\Property(name="stars", type="integer")
     */
    public $stars;

    /**
     * @var float
     * @ES\Property(name="rating", type="float")
     */
    public $rating;

    /**
     * @var bool
     * @ES\Property(name="free_places_at_now", type="boolean")
     */
    public $freePlacesAtNow;

    /**
     * @var ArrayCollection|Comment[]
     * @ES\Embedded(class="App\Document\Comment")
     */
    public $comments;

    public function __construct()
    {
        $this->comments = new ArrayCollection();
    }

    public function addComment(Comment $comment)
    {
        $this->comments[] = $comment;

        return $this;
    }
}

<?php

namespace App\Document;

use ONGR\ElasticsearchBundle\Annotation as ES;

/**
 * @ES\ObjectType()
 */
class Booking
{
    /**
     * @var float
     * @ES\Property(name="price", type="float")
     */
    public $price;

    /**
     * @var \DateTime
     * @ES\Property(
     *  type="date",
     *  name="date",
     *  settings={
     *     "format":"yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
     *  }
     * )
     */
    public $date;
}
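
The parent-child relationship declared by the join field in Hotels can be illustrated with the shape of the documents that end up in the index. The sketch below is an assumption-based illustration, not code from the project: the helper names buildParentHotel and buildChildBooking and all ids and values are made up, and the field names hotel and booking assume that the bundle uses the property names of the embedded objects.

```php
<?php

// Parent document: a hotel. Its join field only names the parent relation.
function buildParentHotel(int $hotelId, string $name): array
{
    return [
        'id' => $hotelId,
        'routing' => $hotelId,
        'body' => [
            'hotel' => ['hotel_id' => $hotelId, 'name' => $name],
            'hotel_bookings_join_field' => ['name' => 'hotel_parent'],
        ],
    ];
}

// Child document: a booking that points at its parent hotel.
function buildChildBooking(int $bookingId, int $hotelId, float $price, string $date): array
{
    return [
        'id' => $bookingId,
        // A child has to be routed to the same shard as its parent.
        'routing' => $hotelId,
        'body' => [
            'booking' => ['price' => $price, 'date' => $date],
            'hotel_bookings_join_field' => ['name' => 'booking_child', 'parent' => $hotelId],
        ],
    ];
}

$parent = buildParentHotel(1, 'Grand Hotel');
$child = buildChildBooking(1001, 1, 120.5, '2024/05/01 14:00:00');

echo json_encode([$parent, $child], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

This is why the Hotels class carries a routing property next to the id: parent and child are separate documents, and the routing value is what keeps a booking on the same shard as its hotel.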

And here is the Comment class, which is a nested object for Hotel:

<?php

namespace App\Document;

use ONGR\ElasticsearchBundle\Annotation as ES;

/**
 * @ES\NestedType()
 */
class Comment
{
    /**
     * @var int
     * @ES\Property(name="hotel_id", type="integer")
     */
    public $hotelId;

    /**
     * @var string
     * @ES\Property(name="content", type="text")
     */
    public $content;

    /**
     * @var int
     * @ES\Property(name="stars", type="integer")
     */
    public $stars;

    /**
     * @var \DateTime
     * @ES\Property(
     *  type="date",
     *  name="created_at",
     *  settings={
     *     "format":"yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
     *  }
     * )
     */
    public $createdAt;
}

Using an object-oriented approach, we represent the data that will be used by ElasticSearch via the corresponding classes. Please pay attention to the class properties: every property has an annotation. The ONGR bundle uses those annotations to create the proper index mapping. Here I am not explaining why exactly this architecture was chosen (parent-child relationship, nested objects) or why exactly this type of mapping is used for each field. If you are interested in that, please refer to my course on Udemy, where I cover mapping and architecture aspects in detail.

So now that we have described our model layer using classes, we can create the index. It can be done with the following command:

docker exec -it udemy_phpes_php bin/console ongr:es:index:create -i hotels

The ONGR bundle will do all the low-level routine work for us: it will read the annotations from our classes, translate them into a JSON mapping structure and send an index creation request to the ElasticSearch cluster.
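
To make that translation more tangible, below is a hand-written approximation of the index body that the annotations and settings describe. It is not the bundle's literal output: the function name sketchHotelsIndexBody is hypothetical, only a few fields are included, the analysis section is left out for brevity, and the exact structure the bundle generates may differ between versions.

```php
<?php

// Approximate index body for a subset of the annotated fields.
// Assumption: embedded ObjectType classes become "object" fields and
// NestedType classes become "nested" fields named after the property.
function sketchHotelsIndexBody(): array
{
    $dateFormat = 'yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis';

    return [
        'settings' => [
            'number_of_replicas' => 2,
            'number_of_shards' => 1,
            'refresh_interval' => -1,
        ],
        'mappings' => [
            'properties' => [
                'hotel' => [
                    'type' => 'object',
                    'properties' => [
                        'hotel_id' => ['type' => 'integer'],
                        'email' => ['type' => 'text', 'analyzer' => 'email'],
                        'name' => [
                            'type' => 'text',
                            'fields' => [
                                'keyword' => ['type' => 'keyword'],
                                'lowercased' => ['type' => 'text', 'analyzer' => 'lowercased_string'],
                            ],
                        ],
                        'location' => ['type' => 'geo_point'],
                        'comments' => [
                            'type' => 'nested',
                            'properties' => [
                                'content' => ['type' => 'text'],
                                'created_at' => ['type' => 'date', 'format' => $dateFormat],
                            ],
                        ],
                    ],
                ],
                'booking' => [
                    'type' => 'object',
                    'properties' => [
                        'price' => ['type' => 'float'],
                        'date' => ['type' => 'date', 'format' => $dateFormat],
                    ],
                ],
                'hotel_bookings_join_field' => [
                    'type' => 'join',
                    'relations' => ['hotel_parent' => 'booking_child'],
                ],
            ],
        ],
    ];
}

echo json_encode(sketchHotelsIndexBody(), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Comparing this structure with the annotations shows the correspondence: each @ES\Property becomes one entry under properties, while its name, type, analyzer, fields and settings attributes supply the keys of that entry.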

In the next article (Part 6: Symfony ElasticSearch – indexer command) we will go further with creating our search microservice. If you would like to go through all the material faster, I suggest viewing my online course on Udemy, where you will also find the full project skeleton. Below is the link to the course. As a reader of this blog you also get the possibility to use a coupon for the lowest possible price. Otherwise, please wait for the next articles. Thank you for your attention.

