Symfony ElasticSearch – indexer command


Hi, and welcome to the sixth article in the series “How to work with ElasticSearch using the Symfony PHP framework”. The previous article (Part 5: Symfony ElasticSearch – model data layer) is located here. In this article we are going to create a Symfony command that indexes some initial test data into an ElasticSearch index. The command lives in the src/Command folder, where the test data is also stored as JSON files (the code reads them from a Data subfolder).

Symfony indexer command

Here is the code of the command:

<?php

namespace App\Command;

use App\Document\Booking;
use App\Document\Comment;
use App\Document\Hotel;
use App\Document\Hotels;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

class HotelsIndexerCommand extends Command
{

    const COMMAND = 'udemy_phpes:hotels-indexer';

    const OPTION_OFFSET = 'offset';
    const OPTION_LIMIT = 'limit';
    const ARGUMENT_DEBUG = 'debug';

    private $container;

    /**
     * @var bool
     */
    private $debug = false;

    /**
     * @var OutputInterface
     */
    private $output;

    public function __construct(
        ContainerInterface $container, 
        string $name = null
    ) {
        $this->container = $container;
        parent::__construct($name);
    }

    /**
     * Configuration.
     */
    protected function configure()
    {
        $this
            ->setName(self::COMMAND)
            ->setDescription('Index hotel documents')
            ->addOption(
                self::OPTION_LIMIT,
                'l',
                InputOption::VALUE_REQUIRED,
                'Limit',
                3
            )
            ->addOption(
                self::OPTION_OFFSET,
                'o',
                InputOption::VALUE_REQUIRED,
                'Offset',
                0
            )
            ->addArgument(
                self::ARGUMENT_DEBUG, 
                InputArgument::OPTIONAL, 
                'Enable debug', 
                false
            );
    }

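    /**
     * @param InputInterface $input
     * @param OutputInterface $output
     *
     * @return int
     */
    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $this->output = $output;
        // Console arguments and options arrive as strings, so cast them explicitly.
        $this->debug = (bool) $input->getArgument(self::ARGUMENT_DEBUG);
        $limit = (int) $input->getOption(self::OPTION_LIMIT);
        $offset = (int) $input->getOption(self::OPTION_OFFSET);
        $this->index($limit, $offset);

        // Symfony expects execute() to return an integer exit code (0 = success).
        return 0;
    }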

    /**
     * @param int $limit
     * @param int $offset
     */
    private function index(int $limit, int $offset)
    {
        $this->verboseWriteln('Perform indexing logic here.');
        $this->verboseWriteln(sprintf(
            'Indexing %d hotels with offset %d...',
            $limit,
            $offset
        ));
        $this->verboseWriteln(sprintf(
            'Memory peak: %d MB',
            ceil(memory_get_peak_usage() / 1024 / 1024)
        ));

        /*
         * Example of adding parent documents + nested objects
         */
        try {
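            // Fixture files are numbered from 1; apply the offset so the option is not ignored.
            for ($i = $offset + 1; $i <= $offset + $limit; $i++) {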
                $fileName = __DIR__.'/Data/'.$i.'.json';
                $data = json_decode(file_get_contents($fileName));

                $document = new Hotels();
                $hotel = new Hotel();

                $hotel->hotelId = $data->hotel->hotel_id;
                $hotel->email = $data->hotel->email;
                $hotel->cityNameEn = $data->hotel->city_name_en;
                $hotel->name = $data->hotel->name;
                $hotel->stars = $data->hotel->stars;
                $hotel->rating = $data->hotel->rating;
                $hotel->age = $data->hotel->age;
                $hotel->freePlacesAtNow = $data->hotel->free_places_at_now;
                $hotel->location = $data->hotel->location;

                foreach ($data->hotel->comments as $item) {
                    $comment = new Comment();
                    $comment->hotelId = $item->hotel_id;
                    $comment->content = $item->content;
                    $comment->stars = $item->stars;
                    $comment->createdAt = $item->created_at;
                    $hotel->addComment($comment);
                }

                $document->id = $data->hotel->hotel_id;
                $document->routing = 1;
                $document->hotel = $hotel;
                $document->hotelBookingsJoinField = ['name' => 'hotel_parent'];

                $indexManager = $this->container->get(Hotels::class);
                $indexManager->persist($document);
                $indexManager->commit();
            }
        } catch (\Throwable $t) {
            $this->verboseWriteln($t->getMessage());
        }

        /*
         * Example of adding child document
         */
        try {
            $booking = new Booking();
            $booking->date = "2021/08/01";
            $booking->price = 100;

            $document = new Hotels();
            $document->id = '1-4';
            $document->routing = 1;
            $document->booking = $booking;
            $document->hotelBookingsJoinField = [
                'name' => 'booking_child', 
                'parent' => 1
            ];
            $indexManager = $this->container->get(Hotels::class);
            $indexManager->persist($document);
            $indexManager->commit();
        } catch (\Throwable $t) {
            $this->verboseWriteln($t->getMessage());
        }

        $this->verboseWriteln('END.');
    }
    
    private function verboseWriteln($str)
    {
        if ($this->debug) {
            $this->output->writeln($str);
        }
    }
}

The command defines a few input parameters: a debug argument that turns on verbose output, plus the limit and offset options. The execution part then reads the hotels' data from JSON files in a loop and indexes each hotel as a parent document together with its nested comments. After the main loop, I also add one child booking document directly in the command as an example. Below is an example of a test data fixture in JSON format.

{
  "hotel": {
    "hotel_id": 1,
    "email": "hotel-1@trvb.com",
    "city_name_en": "Warsaw",
    "name": "Golden star hotel",
    "stars": 5,
    "rating": 4.85,
    "age": 7,
    "free_places_at_now": true,
    "location": {
      "lat": "52.21",
      "lon": "21.01"
    },
    "comments": [
      {
        "hotel_id": 1,
        "content": "Some comment",
        "stars": 5,
        "created_at": "2021/08/01"
      },
      {
        "hotel_id": 1,
        "content": "Some comment",
        "created_at": "2021/08/01",
        "stars": 5
      }
    ]
  }
}
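
The command reads each fixture with json_decode(file_get_contents(...)) and no checks, so a missing or malformed file may only show up later as empty properties. Below is a minimal sketch of a stricter loader. It assumes the fixtures are named 1.json, 2.json and so on, as in the command; fixturePaths() and loadFixture() are hypothetical helper names and not part of the project.

```php
<?php

declare(strict_types=1);

/**
 * Build the list of fixture paths for a given limit and offset.
 * Assumes files are numbered from 1 (1.json, 2.json, ...).
 *
 * @return string[]
 */
function fixturePaths(string $dir, int $limit, int $offset = 0): array
{
    $paths = [];
    for ($i = $offset + 1; $i <= $offset + $limit; $i++) {
        $paths[] = rtrim($dir, '/') . '/' . $i . '.json';
    }

    return $paths;
}

/**
 * Read one fixture and fail loudly instead of silently returning null.
 */
function loadFixture(string $path): stdClass
{
    if (!is_readable($path)) {
        throw new RuntimeException(sprintf('Fixture not readable: %s', $path));
    }

    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException(sprintf('Could not read fixture: %s', $path));
    }

    // JSON_THROW_ON_ERROR (PHP 7.3+) turns syntax errors into a JsonException.
    $data = json_decode($raw, false, 512, JSON_THROW_ON_ERROR);

    if (!$data instanceof stdClass || !isset($data->hotel->hotel_id)) {
        throw new RuntimeException(sprintf('Fixture has no hotel.hotel_id: %s', $path));
    }

    return $data;
}
```

With helpers like these, the loop in the command could iterate over fixturePaths(__DIR__.'/Data', $limit, $offset) and let the existing catch block report any broken file.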

Of course, this is only a simplified example. But the code already covers the key scenarios, e.g. a parent-child relationship and nested objects. In real life you will have to create commands for initial indexing and reindexing with much more complicated logic. In practice you will probably have to connect to a relational database or to external microservices to fetch the data. Moreover, you will need a worker that tracks the changes made by customers and administrators and applies them to the ElasticSearch data. That is a little beyond the scope of this article, but the principles I am using here stay the same. So this command is a good starting point.
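
The command above calls persist() and commit() once per document, which is fine for three fixtures. For a larger initial indexing run, one common approach is to commit once per batch instead. The following is only a sketch under that assumption: inBatches() and indexAll() are hypothetical helpers, and the $persist and $commit callables stand in for the index manager's persist() and commit() calls.

```php
<?php

declare(strict_types=1);

/**
 * Split any iterable (e.g. rows streamed from a database) into arrays of
 * at most $size items, without loading the whole source into memory.
 */
function inBatches(iterable $rows, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }

    if ($batch !== []) {
        yield $batch; // trailing partial batch
    }
}

/**
 * Persist every row and commit once per batch.
 * Returns the number of rows that were persisted.
 */
function indexAll(iterable $rows, int $batchSize, callable $persist, callable $commit): int
{
    $total = 0;
    foreach (inBatches($rows, $batchSize) as $batch) {
        foreach ($batch as $row) {
            $persist($row);
            $total++;
        }
        $commit(); // one commit per batch instead of one per document
    }

    return $total;
}
```

The batch size is a tuning parameter: larger batches mean fewer round trips but more memory held per commit.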

In the next article (Part 7: Symfony ElasticSearch – search service and query builder) we will finish creating our search microservice. Together we will create a search service and a query builder, and implement the filter design pattern. If you don't want to wait, you can view all of this material in my online course on Udemy, where you will also find the full project skeleton. Below is the link to the course. As a reader of this blog you can also use a coupon for the lowest available price. Otherwise, please wait for the next articles. Thank you for your attention. I hope you liked this part.

