Switching databases: Migrating from MariaDB or MySQL to Postgres
eZ Platform has official support for MariaDB, MySQL and PostgreSQL databases. But how about switching after installing? Learn from and our experiences and where we stand today.
Relational databases remain at the heart of many different types of applications. They were also one of the first widely deployed complex FOSS (Free Open Source Software) software packages widely available and deployed on the web. Most notably there are three common Open Source databases in use today: MariaDB, MySQL and PostgreSQL.
Historically the MySQL database is the most common and most well-known of the trio. MariaDB is its close relative, being a fork of the MySQL project initiated after Oracle (a vendor of proprietary database software) gained control MySQL through its acquisition of Sun Microsystems in 2009. While the MariaDB codebase has diverged from MySQL, the two still share the protocol to communicate to software. This means that software using MySQL is largely compatible with MariaDB, excluding the use of novel product specific features.
The third widespread Open Source RDBMS (Relational Database Management System), PostgreSQL, on the other hand is a separate project. Development of the database was started in 1996 (MySQL kicked off in 1995), and it has a completely different foundation. Unlike MariaDB Postgres (short for PostgreSQL) also uses its own communication protocol which means software needs to be written to support it as an alternative database.
From the very beginning, PostgreSQL has aimed to provide a very robust and feature complete SQL compatible storage. MySQL on the other hand made some compromises in return for performance its early years. However, in the last decade both PostgreSQL and MariaDB/ MySQL have matured and are now very similar in capabilities, including NoSQL storage using JSON field types. Their unique SQL dialects can be mitigated by using a DBAL.
eZ Platform and databases in a nutshell
eZ Platform and its predecessors have used relational databases to store data. Earlier iterations had could be used with databases from vendors such as Microsoft and Oracle, but eZ Platform 1.0 shipped with support only for MariaDB and MySQL. This changed in eZ Platform version 2.5.0 where we introduced official support for PostgreSQL. To enable this, we created doctrine-dbal-schema for defining schemas in a backend agnostic YAML format.
Once you have set up your project, working with any database when using eZ Platform is very similar. You will use the standard tools provided with your database such as psql, pg_dump, mysql and mysqldump CLI applications. Day-to-day development with eZ Platform does not change based on what database storage system you are using.
When you want to switch between databases there are some additional hoops to jump through. It should be noted that switching databases is not something you should embark on a whim. Changing your database is an architectural decision that will affect your developers, DBAs and hosting in varying degrees. It can be justified for example to:
- Consolidate all your projects to a database platform to streamline maintenance
- Gain access to some specific feature that is unique to a specific database product
- Improve performance with a DB that has better characteristics for your workloads
While Doctrine DBAL will take care of minute differences in SQL dialects for the eZ Platform Storage Engine, transferring data between different databases is not that straightforward. The most common method of moving data from one environment continue to be database dumps, which are files (text or binary) containing all the SQL statements needed to create all the table definitions and data from one environment to another.
As of 2019 there is no universally compatible format you can import and export between MySQL / MariaDB / PostgreSQL. In the following section you can learn about my experiences in converting data between these three popular Open Source RBDMSs.
Migrating from MariaDB/MySQL to PostgreSQL
Migrating data from one database system to another is not as common as moving data from one installation to another, but it is not unheard of. As discussed above data portability is not trivial even when moving from one open source system to another, but there are tools that help you with database migrations. Don't try to go at it in a vacuum.
The PostgreSQL project wiki keeps a track of tools for migrating data to it. In our case, the most interesting section is the one listing of tools to convert MySQL databases to PostgreSQL. After reviewing the available options and doing some search, I concluded that pgloader and nmig seemed to be the most promising options for me to try.
Both tools promise to covert both the schema and the data from a MariaDB or MariaDB database to PostgreSQL. I didn't want to take their word for it and using a simple schema. I wanted to try a more complex use case. For this, I installed a copy of ezplatform-ee-demo, a distribution showcasing the capabilities of eZ Platform Enterprise Edition.
The initial installation was done to a MariaDB database I already had for local development. For the migration target I installed PostgreSQL from Homebrew, since I had not previously used Postgres on this laptop. Once the installation was done, I went ahead and created a fresh database to act as our target. Next, I ran a full migration (schema + data) using pgloader by supplying in the necessary source and the target connection strings:
pgloader mysql://root@localhost:3306/ezplatform_ee_demo postgresql://janit@localhost/ezplatform_ee_demo
This went through swiftly and after configuring eZ Platform to use PostgreSQL I happy to find that the application was running as expected. After logging into the admin and verifying that the installation was indeed running PostgreSQL, I proceeded to do some basic testing. After creating a new blog post, I was hit with an exception boiling down to this:
An exception occurred while executing 'SELECT t.* FROM ezcontentobject_tree t INNER JOIN ezcontentobject c ON (t.contentobject_id = c.id) AND ((c.language_mask & 31) > 0) WHERE t.node_id = ?' with params [2]: SQLSTATE[42883]: Undefined function: 7 ERROR: operator does not exist: numeric & integer LINE 1: ...(t.contentobject_id = c.id) AND ((c.language_mask & 31) > 0)... ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
After some web searches I found a few issues like this and concluded that there must be something wrong with the automatically converted schema. By examining the table in question, I could see that there were some significant differences between the schemas created by the eZ Platform installer and the schema converted from MySQL by pgloader:
I'm no database specialist, but I could see chasing down and fixing differences would be a trip down a rabbit hole. The data had migrated well, but I concluded that tracking down schema differences by hand was not feasible. Data integrity is hard, and databases are very good at enforcing it. I wanted to have no doubt in my mind that the schema was correct.
I decided to give pgloader a break at this point and see how nmig would do for our conversion. After installing and configuring nmig, I was again ready to give it a go. The conversion ran through without errors, but an error surfaced when editing content:
An exception occurred while executing 'UPDATE "ezcontentobject" SET "name" = :placeholder1, "modified" = :placeholder2, "language_mask" = :placeholder3, "is_hidden" = :placeholder4 WHERE "id" = :placeholder5' with params ["Greetings from PostgreSQL", 1571320335, 3, false, 250]: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "f"
Another quick comparison of the schemas showed that there was a small difference again:
By this time, I had lost confidence in either of the tools' ability to automatically convert the eZ Platform legacy storage engine database schema reliably. All the errors could have been fixed, but this was an method that wasn't feasible for reliable database migrations en masse. I decided to migrate the schema manually and then import the data on top of that.
I installed another copy of eZ Platform EE demo on PostgreSQL and exported the schema:
pg_dump -s ezplatform_ee_demo_fresh > schema_only.sql
After this I imported the blank schema to a new database:
psql ezplatform_ee_demo_blank < schema_only.sql
From reading the nmig configuration, I remembered that it allowed for a configuration option that would import data only, taking the guesswork of schema conversion off the table. I made the necessary changes to import the data on to my schema-only DB.
I executed the nmig migration command, which passed with flying colors. Since PostgreSQL was handling data validation against the predefined schema, I was fairly confident that this round would be better. I proceeded with a quick round of superficial smoke testing in the admin interface by creating, editing and deleting content. To my delight everything I tried, worked.
Conclusion (we're not there yet)
The previous section should've been wrapped in screaming orange tape with ABSOLUTELY NO WARRANTY written all over it. I can't do that kind of CSS wizardry, I'll just state it here: eZ Systems provides no official support for migrating from a MySQL or MariaDB database to PostgreSQL. No identification with actual migration projects (failed or successful), databases, Apache Camels and/or configurations is intended or should be inferred.
The eZ Platform database migration (the one you never read about) is a very crude test case of showing that migrating an eZ Platform installation from one relational database system to another is feasible. And even at this low complexity you would need to do an exhaustive verification to make sure this installation works as is intended. This boils down to the fact that we are building and supporting a product that needs to be tried and tested.
Out there in the real world your database will have edge cases for data conversions, a ton of custom tables and whatnot. Having just tipped my toe into the world of database conversions, I can say it is a lot more complex than what I would have imagined. I did have some earlier experience with simple SQLite to MySQL migrations, and they went fine. Armed with the knowledge of schema conversion trouble, I am less confident in my skills than I was prior to writing this article. The more I know, the more I know what I don't know.
Now that doesn't mean that you shouldn't migrate your eZ Platform project from one database platform to another. We currently don't offer it as an off-the-shelf upgrade, but you can always get help from our professional services team. And of course, you're free to do your own experiments, keeping in mind that your mileage may vary.
As for when we will have official support for migrating from one database platform to another, I don't know. But what I can promise is that our database connoisseurs are aware of this is missing and we'll be sure to keep you posted on any progress on this front.
If you are attempting to migrate your database to another and would like some insight and advice? Why not contact our Professional Services team?
NOTE: Elephants Of Kenya by Rick Bergstrom licensed under Attribution-NoDerivs 2.0
eZ Platform is now Ibexa DXP
Ibexa DXP was announced in October 2020. It replaces the eZ Platform brand name, but behind the scenes it is an evolution of the technology. Read the Ibexa DXP v3.2 announcement blog post to learn all about our new product family: Ibexa Content, Ibexa Experience and Ibexa Commerce