Interfacing with Relational Database using MySQL.jl and PyMySQL
Prior to the advent of computing, relational database can be thought of log books typically used for inventory and visitor’s time-in time-out. These books contain rows that define the item/transaction, and columns describing the features of each row. Indeed, these are the core attributes of any relational database. Unlike spreadsheets, which are used for handling small datasets, databases are mostly used for storing huge transactional data for later use. They run on a server and often at the backend of any user (client) interface such as websites and mobile applications. These applications communicate with database via processing servers (e.g. Flask and Django). The figure below illustrates the request and response communcations between client and servers. As mentioned earlier, databases are meant to store data for later use — in the sense that we can use it as a response to client’s requests, such as viewing or data extraction for insights. In this article, we are interested in data extraction from the database. In particular, the objective is to illustrate how to send request to MySQL server, and how to process response both from Julia and Python.
MySQL Server Setup
To start with, we need to setup MySQL server in our machine. Click the following link to download and install the application.
Note that I recommend you to download the latest version of MySQL since the setup above is using the old version.
Query: Creating Database
In order to appreciate what we are aiming in this article, we need to go through some basic SQL queries to understand what type of request to send to MySQL server. I’m using macOS, but the following should work on Windows as well. For macOS users, open the MySQL Server Shell by running
mysql -u root -p (hit return or enter , and type in your MySQL root password you specified during the installation setup from the previous section) in the terminal. For windows, try to look for it in the Start Menu.
From here, we are going to check the available databases in MySQL server. To do this, run the following:
Indeed, there are four out-of-the-box defined databases already, and we don’t want to touch that. Instead, we are going to create our own database, let’s call it
tutorial. To do this, run the following codes:
The best thing about SQL syntax is that, everything is self-explanatory, except maybe for line 19, which simply confirmed that we are using
tutorial as our database.
Query: Creating Table
Next is to create a table for our database, we are going to use the 2019 Philippine Election results with columns: Last Name, First Name, Party, Votes. Further, for purpose of illustration, we are going to use the top 5 senators only.
Query: Inserting Values
The following codes will insert the top five senators from the 2019 Philippine election results.
Query: Show Data
To view the encoded data, we simply select all (
*) the columns from the table.
MySQL Clients on Julia and Python
For this exercise, our goal is to save the NYC Flights (2013) data into the database and query it from Julia and Python.
Downloading NYC Flights Data
I have a copy of the dataset on Github, and so the following code will download the said data:
Connect to MySQL Server
In order for the client to send request to MySQL server, the user/client needs to connect to it using the credentials set in the installation.
Note that you need to have a strong password, and this configuration should not be exposed to the public. The above snippets are meant for illustration.
To test the connection, let’s send our first request — to show the tables in the database:
In Julia, the response is recieved as a MySQL.Query object and can be viewed using DataFrame. For Python, however, you will get a tuple object.
Create NYC Flights Table
At this point, we can now create the table for our dataset. To do this, run the following:
As shown in the previous section, sending request to the server both in Julia and in Python is done by simply using a string of SQL queries as input to MySQL.jl and PyMySQL APIs. Hence, the
query object (in line 3 of Julia code and line 4 of Python code) above, simply automates the concatenation of SQL query for creating a table. Having said, you can of course write the query manually. To check if we have indeed created the table, run the following codes:
As you can see, we’ve created it already, but with no entry yet.
Populating the Table
Finally, we are going to populate the table in the database by inserting the values row by row.
From the above Julia code, the result of the
stmt is an SQL
INSERT query with placeholder values indicated by
?. The timed (
@time in Julia code) loop in line 9 above maps the values of the vector, one-to-one, to the elements (
?) of the tuple in
stmt. Having said,
MySQL.Stmt has no equivalent in PyMySQL. Further, one major difference between these libraries is that, PyMySQL will not populate the table even after executing all sorts of SQL queries unless you commit it (
con.commit), as shown above. This is contrary to MySQL.jl which automatically commits every execution of the SQL queries. I do like the idea of having
con.commit in PyMySQL, since this avoids accidental deletion or modification in the database, thus adding a layer of security. To check if we have indeed populated the table, run the following:
To disconnect from the server, run
MySQL.disconnect(con) (Julia) or
For the benchmark, I added a timelapse recorder in populating and reading the table in the previous section. The figure below summarizes the results.
The figure was plotted using Gadfly.jl. Install this package using
Pkg as described above (see the first code block under MySQL Clients on Julia and Python section), along with Cario.jl and Fontconfig.jl. The latter two packages are used to save the plot in PNG format. See the code below to reproduce:
The aim of this article was simply to illustrate the usage of MySQL.jl APIs in comparison to the PyMySQL; and I would say both libraries have similarities in APIs (as expected) and are stable for the tasks. I should emphasize though that, I do like the
con.commit of PyMySQL since this adds a level of security, and I think this is a good addition to MySQL.jl in the future.
If you are impatient, here are the complete codes excluding the MySQL codes and the plots. These should work after installing the required libraries shown above:
References and Resources
- MySQL.jl Github Repo: https://github.com/JuliaDatabases/MySQL.jl
- PyMySQL Github Repo: https://github.com/PyMySQL/PyMySQL
- Flaticon: https://www.flaticon.com/