How Magento uses EAV in Database Architecture – Part 2

Storage Architecture

Storage is the most important part of the Magento database.

Magento has different entity_types defined, Now the question arises that what is an entity_type? Explaining by the same example, as I considered Marker as an entity, the entity type a be defined as a product, i.e., marker.

Magento already has these entities predefined:

Each of the above is an entity type.

(Note: These are all EAV entity types, and according to the level of customization, you can create a new entity type too.)

As already mentioned earlier, every database attribute is stored according to the data type in a separate table.

Say we have a text type attribute, it will be stored in catalog_product_entity_text, or if it is an image, it will come under catalog_product_entity_varchar, or, if we have a new attribute for the price, it will come under catalog_product_entity_decimal.

(bulb) Imagining how I came to conclude the table names?

You see all the values are distributed among different tables for a single product. Now, the following questions may arise, how can so many values be stored when any "save" method is called or anything is saved? How does the save function detects where to store them and how to store them since the attributes are distributed and segregated?

For this let me take you back to the entity types. As I mentioned "catalog_product", this entity type is made for products EAV.

Magento has defined the table structure according to the entity types, for example, entity type product, entity_code = "catalog_product", has a table defined in the eav_entity_types table in the database, which contains an alias of the Magento table. Nonetheless, the alias does not provide the table without going in the code.

Still wondering about where the tables are? Let’s take entity_type = catalog_product. Now, the base table is always appended with ‘_entity’, hence, the basic table for products is by default named ‘catalog_product_entity’. This is the first table which is modified when you insert or update. It contains the basic attributes of products, i.e., static attributes like SKU, entity_id(product id), typeid( the type of product) etc.

So, what about other EAV attributes like name, URL, status, visibility, custom_attribute etc.?

These all attributes are in the eav_attribute table where all the attributes are stored according to the entity_type with several other properties such as data type, backend type, frontend type etc. Each of the attributes has an attribute id which is again mapped to different tables based on the data type.

These tables will be according to the earlier mentioned data types, i.e.,

for int = > catalog_product_entity_int
for text = > catalog_product_entity_text
for decimal = > catalog_product_entity_decimal
for varchar = > catalog_product_entity_varchar
for datetime => catalog_product_entity_datetime.

All the EAV attributes value are stored in these tables and all the tables have the same structure having columns value_id, attribute_id, entity_id, store_id, and the value of EAV.

Let us take an example of storing an attribute:
Let’s assume that there is a product with id 1 and we have to store the value of the attribute whose id is 42 and is of type varchar. The value for this attribute for store 1 is Permanent Marker Luxor. So, the query would be -

insert into catalog_product_entity_varchar (attribute_id,entity_id,store_id,value) values ("42","1","1","Petrmanenrt Marker Luxor");

This type of query will run for each of the attribute being saved, hence, there are a lot of queries running to save a product. Obviously, magento does not run the above query, it is the Magento ORM that makes a query to be saved in a single instance.

To illustrate more about the architecture we have made a simple diagram involving all the tables for EAV.

Still wondering about the drop-down or multi-select attributes. Keeping the reusability in mind for the options, another table is there where the options are stored with respect to the attribute. It is not assigned to any product yet, and for assigning any dropdown value to the product, the catalog_product_entity_int table is used with the value column containing the option id.

For reference, you can see options stored in tables eav_attribute_option adn eav_attribute_option_value

This is how the storage architecture works for Magento. One might think that a lot of queries are running just for an attribute or a product, so how is sped up and optimized? The answer to this question lies in the next topic.

How Magento Optimized the database?

Since a lot of queries are running, one might think that the system will get slow, but it’s not true. Think of a category listing page where a lot of products are shown if there are 4 attributes and 12 products on a page so the total attributes selected will be 4X12, i.e., 48 attributes to be fetched, and yet it is a speedy process, HOW?

For the optimization and reducing the load on the database, Magento makes the flat table for products, categories, order, etc. Also, an index table is introduced where the values are indexed according to the product_id, attribute_id, and option_id. So, instead of hitting a lot of Magento EAV tables, Magento only uses the index and flat tables. This is the simplest form of optimization that has been introduced. Magento arranges all the filterable attributes separately by listing them under separate Magento index and flat tables. This is why indexing takes a lot of time because it identifies the attribute and its usability from its properties and accordingly indexes them. Magento makes various index tables for different purposes to optimize the database performance. Though it's not the best way to optimize the database, it is the only way if we want to maintain the scalability of the database.

EAV VS Flat

If you randomly search the above topic on web, a lot of pages will come illustrating the drawbacks of EAV. Yes, flat tables are good to use and as mentioned above, Magento uses it too. But in my opinion, EAV is good to use for a scalable system, the question is How? Say you have a new attribute of a product ‘_eatable’ or anything for that matter in EAV, I have to just add an attribute in Magento backend and let Magento take care of it to use as flat. But in flat tables, we have to modify the table structure for that attribute, unlike Magento where it is automatically done. In my personal opinion, magento EAV model is the best as it does not need any modifications to be done by the programmer at the code level. You can simply create an attribute and the magic method ‘_set’ will do the magic of setting the data.

No need to modify the query or generate the query.

Conclusion

Magento is driving e-commerce of almost 27% companies and has recently come up with its new version Magento 2.X focusing on the business and performance. For e-commerce markets performance is the key to increased sales and revenue, growth, and success. Since Magento has already done so much for optimization in their first version, in Magento 2.X they have involved advanced optimization by dropping some tables and some columns and adding different columns. Magento has been driving e-commerce for last 10 years and successfully adding value to e-commerce market with its great technology stack.

Prakhar Bhardwaj

Codilar Technologies