Elasticsearch integration with Symfony framework


In our previous articles, we presented Symfony, one of the most popular web application frameworks written in PHP. We talked about why one should use it and what the key benefits of using it are. One of its key features is the use of bundles, which increase the flexibility, efficiency, and robustness of your web application. Performance is always important, and when it comes to speed, the combination of Symfony and Elasticsearch gives great results. Today we are presenting the most important bundle, one that we integrate into every new project, and explaining why you should consider using it. It is called FOSElasticaBundle.

First of all, before we start talking about the bundle itself, what is Elasticsearch?

Elasticsearch is an open-source, standalone database server developed in Java. It achieves fast search responses because it searches an index rather than the text itself. It is used for full-text search and analysis. It takes in unstructured data from various sources and stores it as JSON documents, indexed in a format that is highly optimized for language-based searches. On the same hardware, a query that takes more than 10 seconds in SQL can return results in under 10 milliseconds in Elasticsearch, although the actual difference depends on the data and the query.
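
To make the idea of "searching an index instead of the text" more concrete, here is a minimal sketch of a toy inverted index in plain PHP. It is only an illustration of the lookup principle, not how Elasticsearch is implemented internally; the function names buildInvertedIndex and searchIndex and the sample documents are made up for this example.

```php
<?php
// Toy inverted index: maps each lowercase token to the IDs of the documents
// that contain it. Real Elasticsearch analysis, scoring, and storage are far
// more involved; this only illustrates why a lookup beats a full text scan.

/**
 * @param array<int, string> $documents document ID => text
 * @return array<string, int[]> token => IDs of the documents containing it
 */
function buildInvertedIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $id => $text) {
        // Split on anything that is not a lowercase letter or a digit.
        $tokens = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($tokens) as $token) {
            $index[$token][] = $id;
        }
    }

    return $index;
}

/** Looks a single term up in the index instead of scanning every document. */
function searchIndex(array $index, string $term): array
{
    return $index[strtolower($term)] ?? [];
}

// Hypothetical sample data.
$documents = [
    1 => 'Symfony is a PHP framework',
    2 => 'Elasticsearch is a search engine',
    3 => 'Search with Symfony and Elasticsearch',
];

$index = buildInvertedIndex($documents);
print_r(searchIndex($index, 'symfony')); // IDs of documents 1 and 3
```

The cost of a search here is a single array lookup, no matter how long the documents are; the work of reading the text is paid once, when the index is built.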

FOSElasticaBundle is a bundle developed and maintained by a group called FriendsOfSymfony. It is essentially a wrapper around Elastica, a PHP client for querying the Elasticsearch database engine. With this bundle you can query Elasticsearch, easily convert between PHP objects and Elasticsearch data with JMS Serializer or the Symfony Serializer, configure Elasticsearch indexes, and take advantage of Doctrine event listeners for automatic indexing.

Assuming that you have created a Symfony 4 project and installed and started an Elasticsearch server (instructions can be found here), let's dive into a basic example. Consider a scenario in which we have two entities, User and UserCategory, with the properties shown in the image below:

[Image: database diagram of the User and UserCategory entities]

First of all, we need to install FOSElasticaBundle. Thanks to Symfony's bundle concept and the Composer package manager, this is done by running a single command:

$ composer require friendsofsymfony/elastica-bundle

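To enable the bundle, add this line to the bundles array inside the AppKernel::registerBundles() method (needed only on projects that still register bundles in AppKernel rather than in config/bundles.php):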

new FOS\ElasticaBundle\FOSElasticaBundle()
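
On a Symfony 4 project that uses Symfony Flex, bundles are listed in config/bundles.php, and Flex normally adds the entry for you when the package is installed. If the entry is missing, it would look roughly like the sketch below; the FrameworkBundle line only stands in for whatever bundles your project already registers.

```php
<?php
// config/bundles.php

return [
    // Placeholder for the entries that already exist in your project.
    Symfony\Bundle\FrameworkBundle\FrameworkBundle::class => ['all' => true],
    // Enables FOSElasticaBundle in every environment.
    FOS\ElasticaBundle\FOSElasticaBundle::class => ['all' => true],
];
```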

Each Elastica index is available as a service with the key fos_elastica.index.<index name>, for example fos_elastica.index.tj_user. Inside the config/packages/fos_elastica.yaml file, we will create our index configuration as follows:

fos_elastica:
  clients:
    default: { host: localhost, port: 9200 }

  indexes:
    tj_user_category:
      types:
        user_category:
          properties:
            id:
              type: integer
            name:
              # On Elasticsearch 5 and later, use text or keyword instead of string
              type: string

          persistence:
            driver: orm
            model: App\Entity\UserCategory
            repository: App\Repository\Elasticsearch\UserCategoryRepository
            finder: ~
            provider: ~

    tj_user:
      types:
        user:
          properties:
            id:
              type: integer
            first_name:
              type: string
            last_name:
              type: string
            email:
              type: string
            user_category:
              type: object
              properties:
                id:
                  type: integer

          persistence:
            driver: orm
            model: App\Entity\User
            repository: App\Repository\Elasticsearch\UserRepository
            finder: ~
            provider: ~

Now that our indexes are configured and our entity classes, User and UserCategory, are in place, we will create model classes. Their properties are a subset of the entity properties, limited to the fields entered in the index configuration, plus additional fields for pagination and full-text search. Objects of these classes hold the information for the query that needs to be executed:

class BaseModel {
  protected $page;
  protected $perPage;
  protected $searchTerm;

  /** getters and setters **/
}

class UserCategoryModel extends BaseModel{
  protected $name;

  /** getters and setters **/
}

class UserModel extends BaseModel{
  protected $first_name;
  protected $last_name;
  protected $email;
  protected $user_category;

  /** getters and setters **/
}

You may have noticed the repository classes in the index configuration. This is where you build and execute your queries:

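namespace App\Repository\Elasticsearch;

use App\Model\UserCategoryModel; // assumed namespace for the model class
use Elastica\Query;
use Elastica\Query\BoolQuery;
use FOS\ElasticaBundle\Repository;

class UserCategoryRepository extends Repository {

  public function search(UserCategoryModel $model) {
    $boolQuery = new BoolQuery();
    $boolTermQuery = new BoolQuery();

    // Match categories whose name contains the search term.
    $termName = new Query\Wildcard();
    $termName->setParams(['name' => '*'.$model->getSearchTerm().'*']);
    $boolTermQuery->addShould($termName);

    // Attach the term clauses before the query is executed.
    $boolQuery->addMust($boolTermQuery);

    $query = new Query();
    $query->setQuery($boolQuery);

    // getOffset() is a helper of our own that is not shown in this article.
    $adapter = $this->finder->createPaginatorAdapter($query);
    $result = $adapter->getResults($this->getOffset($model->getPage(), $model->getPerPage()), $model->getPerPage())->toArray();
    $count = $adapter->getTotalHits();

    return [
      'total' => $count,
      'result' => $result,
      'page' => $model->getPage(),
      'perPage' => $model->getPerPage()
    ];
  }
}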

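namespace App\Repository\Elasticsearch;

use App\Model\UserModel; // assumed namespace for the model class
use Elastica\Query;
use Elastica\Query\BoolQuery;
use Elastica\Query\Match; // renamed to MatchQuery in newer Elastica releases
use FOS\ElasticaBundle\Repository;

class UserRepository extends Repository {

  public function search(UserModel $model) {
    $boolQuery = new BoolQuery();
    $boolTermQuery = new BoolQuery();

    // The search term may match any of the following three fields (OR).
    $termFirstName = new Query\Wildcard();
    $termFirstName->setParams(['first_name' => '*'.$model->getSearchTerm().'*']);
    $boolTermQuery->addShould($termFirstName);

    $termLastName = new Query\Wildcard();
    $termLastName->setParams(['last_name' => '*'.$model->getSearchTerm().'*']);
    $boolTermQuery->addShould($termLastName);

    $termEmail = new Query\Wildcard();
    $termEmail->setParams(['email' => '*'.$model->getSearchTerm().'*']);
    $boolTermQuery->addShould($termEmail);

    $boolQuery->addMust($boolTermQuery);

    // Restrict results to the requested category. The clause goes on the outer
    // query: a Must inside $boolTermQuery would make its Should clauses optional.
    // Assumes the elided getter on UserModel is named getUserCategory().
    if ($model->getUserCategory() !== null) {
      $matchUserCategory = new Match();
      $matchUserCategory->setFieldQuery('user_category.id', $model->getUserCategory());
      $boolQuery->addMust($matchUserCategory);
    }

    $query = new Query();
    $query->setQuery($boolQuery);

    // getOffset() is a helper of our own that is not shown in this article.
    $adapter = $this->finder->createPaginatorAdapter($query);
    $result = $adapter->getResults($this->getOffset($model->getPage(), $model->getPerPage()), $model->getPerPage())->toArray();
    $count = $adapter->getTotalHits();

    return [
      'total' => $count,
      'result' => $result,
      'page' => $model->getPage(),
      'perPage' => $model->getPerPage()
    ];
  }
}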
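
Both repositories call a getOffset() helper that is not listed in this article. A minimal sketch of what such a helper might look like follows, written as a standalone function so it can be run on its own; in the repositories it would be a method on a shared base class. The 1-based page numbering and the default of 10 results per page are assumptions, not something the bundle prescribes.

```php
<?php
// Hypothetical helper: converts a 1-based page number and a page size into
// the zero-based offset expected by the paginator adapter's getResults().
function getOffset(?int $page, ?int $perPage): int
{
    // Fall back to sane values when the model was not fully populated.
    $page = max(1, $page ?? 1);
    $perPage = max(1, $perPage ?? 10);

    return ($page - 1) * $perPage;
}

var_dump(getOffset(1, 5)); // int(0): the first page starts at the first hit
var_dump(getOffset(3, 5)); // int(10): the third page skips two pages of five
```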

As you can see, queries are easy to build and execute. For search-term queries we use Wildcard, which, as you might expect, is roughly the equivalent of the SQL LIKE operator. All queries are wrapped inside a BoolQuery using Should, Filter, Must, and MustNot clauses. A Must clause must match in every returned document, so it mimics the boolean “AND”. A Should clause corresponds to the boolean “OR”. If Must or Filter clauses are present in the same BoolQuery, the Should clauses become optional and only influence the score. For pagination we use the PaginatorAdapter: give it an offset and a perPage value and you get the desired subset of the search results.
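
To see how Must and Should combine, it can help to look at the shape of the request body that a search like the user search is meant to produce: at least one wildcard must match, and the category must match as well. The sketch below builds that body as a plain PHP array without using Elastica; the function name buildUserSearchBody is hypothetical, and the exact JSON that Elastica generates may differ in detail.

```php
<?php
// Sketch of an Elasticsearch request body for "search term in any of three
// fields AND (optionally) a given user category". Built as a plain array so
// the nesting of the bool clauses is easy to inspect.
function buildUserSearchBody(string $searchTerm, ?int $userCategoryId): array
{
    $pattern = '*' . $searchTerm . '*';

    $bool = [
        'must' => [
            [
                // Inner bool: only Should clauses, so at least one has to match.
                'bool' => [
                    'should' => [
                        ['wildcard' => ['first_name' => $pattern]],
                        ['wildcard' => ['last_name' => $pattern]],
                        ['wildcard' => ['email' => $pattern]],
                    ],
                ],
            ],
        ],
    ];

    // The category restriction is a second Must clause on the outer bool.
    if ($userCategoryId !== null) {
        $bool['must'][] = ['match' => ['user_category.id' => $userCategoryId]];
    }

    return ['query' => ['bool' => $bool]];
}

echo json_encode(buildUserSearchBody('john', 2), JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping the wildcards in their own inner bool is what preserves the "OR" behaviour: if the category clause were added as a Must next to them, the Should clauses would only affect scoring.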

With everything set up, to try it out we just need to get the repository inside a Handler, create a search model, and call the search method with that model as a parameter. It should look like this:

$model = new UserModel();
$model->setPage(1);
$model->setPerPage(5);
$model->setSearchTerm('something');

$this->getElasticaManager()->getRepository(User::class)->search($model);

As we said, FOSElasticaBundle indexes any new or modified objects automatically. In cases where the database is modified externally, the Elasticsearch index must be updated manually. This can be done by running the following console command:

$ php bin/console fos:elastica:populate

With this simple example, you won’t see the advantages of Elasticsearch in terms of the time taken to get results. However, when you build enterprise software that queries tables with millions of entries, Elasticsearch becomes your best friend. Even when you are building a small webshop application, we advise using Elasticsearch from the very beginning, because every client wishes for and expects millions of users or products for sale. This might not be the case with every project, but when it happens and the whole world starts using your software, you will be ready to serve every request fast. It is always easier to integrate it at the beginning of the development process than after you realize your queries are getting slow. Some of the companies that use Elasticsearch are Netflix, Tinder, and Cisco. In this article we have only scratched the surface of Elasticsearch’s power and use cases, and we didn’t cover the whole ELK stack (Elasticsearch, Logstash, Kibana), which we are leaving for another time.