Check out "Do you speak JavaScript?" - my latest video course on advanced JavaScript.
Language APIs, Popular Concepts, Design Patterns, Advanced Techniques In the Browser

MySQL vs JSON file data storing

I'm not exactly a performance perfectionist, but I'm currently working on a tool which will be used in a large system. It will deliver texts to a multi-language environment, and several sites/apps will perform a big number of requests against it. So this is the right moment to decide how to speed up my code and the work of the tool overall.

The texts are stored in a MySQL database. The first option is to fetch the data directly from there and serve it whenever it is needed. There are two problems with that: there will be a lot of records in the database, and there will be a lot of requests every second. Naturally, the tool offers an export of all the information in one single JSON file, so I decided to get the data from there instead of making MySQL queries all the time. I was wondering which of these two approaches is faster and fits my needs. I created a dummy database and filled it with more than 20000 records. The following script did the job:

function addUsers($numOfUsers) {
  mysql_connect("localhost", "root", "");
  mysql_select_db("krasimir_benchmark");
  // pools of dummy values to pick from
  $firstNames = array("Curabitur", "Fusce", "Aliquam", "Proin", "Mathew", "Morbi", "Nullam");
  $lastNames = array("Proin", "Nunc", "Fusce", "Cras", "Donec", "Morbi", "Nam");
  $job = array("back-end developer", "administrator", "front-end developer", "PM");
  $city = array("New York", "Paris", "London", "Sofia");
  $notes = "... some very long text here ...";
  $date = date("Y-m-d H:i:s");
  for ($i = 0; $i < $numOfUsers; $i++) {
    // insert one row with randomly picked values
    mysql_query(
      "INSERT INTO users (firstName, lastName, job, city, notes, registerDate) VALUES ('".
      $firstNames[rand(0, count($firstNames) - 1)]."', '".
      $lastNames[rand(0, count($lastNames) - 1)]."', '".
      $job[rand(0, count($job) - 1)]."', '".
      $city[rand(0, count($city) - 1)]."', '".
      $notes."', '".
      $date."')"
    );
  }
}
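
Filling the table is then a matter of a single call (the exact number below is illustrative - I used a bit more than 20000):

addUsers(20000);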

A simple function exports the data to a file in JSON format:

function exportToJson() {
  mysql_connect("localhost", "root", "");
  mysql_select_db("krasimir_benchmark");
  // fetch every record and collect it into an array
  $res = mysql_query("SELECT * FROM users ORDER BY id");
  $records = array();
  while ($obj = mysql_fetch_object($res)) {
    $records[] = $obj;
  }
  // dump everything into one JSON file
  file_put_contents("data.json", json_encode($records));
}
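
Running the export and checking the result (the filesize() check is just a quick sanity test I'd add):

exportToJson();
// report the size of the generated file in megabytes
echo round(filesize("data.json") / 1024 / 1024, 2)."MB";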

The file became really big - around 40MB. I'm sure that I'll never reach 40MB with the tool, but it is good enough for testing purposes. Here are the two methods which test the fetching of the data from the MySQL database and from the JSON file:

function testMySQL() {
  mysql_connect("localhost", "root", "");
  mysql_select_db("krasimir_benchmark");
  $res = mysql_query("SELECT * FROM users ORDER BY id");
  echo "All records: ".mysql_num_rows($res);
  while ($obj = mysql_fetch_object($res)) {
    printRecord($obj);
  }
}

function testJSON() {
  $records = json_decode(file_get_contents("data.json"));
  $numOfRecords = count($records);
  echo "All records: ".$numOfRecords;
  foreach($records as $record) {
    printRecord($record);
  }
}
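
Both tests rely on printRecord, a function which just outputs the data. Its exact body doesn't matter for the benchmark; a minimal sketch could be:

function printRecord($record) {
  // plain output of the fields of a single row
  echo $record->firstName." ".$record->lastName." (".$record->job.", ".$record->city.")<br />";
}

Note that json_decode() is used without its second parameter, so it returns stdClass objects just like mysql_fetch_object() does - the same printRecord works in both tests.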

The results of the tests are:

MySQL reading - 2.97s
JSON file reading - 3.16s

However, these results are based on only one request from one user, so I decided to use the Apache benchmark tool to simulate multiple concurrent requests. It looks like the MySQL approach still wins: json.php takes less time, but some of the requests fail because PHP runs out of memory.

// MySQL
ab -n 30 -c 30 http://localhost/mysql.php

Concurrency Level:      30
Time taken for tests:   30.518 seconds
Complete requests:      30
Failed requests:        0
Write errors:           0
Total transferred:      40092780 bytes
HTML transferred:       40087800 bytes
Requests per second:    0.98 [#/sec] (mean)
Time per request:       30517.578 [ms] (mean)
Time per request:       1017.253 [ms] (mean, across all concurrent requests)
Transfer rate:          1282.97 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.5      0       2
Processing: 24383 28657 1727.5  29306   30510
Waiting:     2110 15625 6618.0  18188   23729
Total:      24383 28657 1727.3  29307   30510

// JSON
ab -n 30 -c 30 http://localhost/json.php

Concurrency Level:      30
Time taken for tests:   3.384 seconds
Complete requests:      30
Failed requests:        11
Write errors:           0
Total transferred:      2683460 bytes
HTML transferred:       2677892 bytes
Requests per second:    8.87 [#/sec] (mean)
Time per request:       3383.789 [ms] (mean)
Time per request:       112.793 [ms] (mean, across all concurrent requests)
Transfer rate:          774.45 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0       1
Processing:    52  857 767.5    877    3379
Waiting:       52  821 652.2    877    2850
Total:         53  857 767.5    877    3379
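
By the way, the out-of-memory failures are somewhat expected - json_decode() has to build all 20000+ records in memory for every single request. Raising PHP's memory limit at the top of json.php gets the requests through (a workaround rather than a solution; the 256M value here is just a guess that comfortably fits the ~40MB file):

// give the script enough room to decode the big JSON file
ini_set("memory_limit", "256M");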

What do you think? What are your tips and tricks for this kind of situation?
