Adventures in fixing broken Korean text in MySQL DB

The website has been running on the Apache - PHP - MySQL (APM) solution for the past 13 years. Each component has been constantly upgraded over the years, and recently I decided to update MySQL from 5.7 to 8.0. Once I managed to migrate the database to the new version, I discovered that all the Korean texts on the website came out broken. This was a sign of mismatched character set, so I looked for the exact cause.

First, I rolled back to 5.7 and checked what character sets were being used, using the following SQL query:
show variables like 'char%';

Sure enough, "character_set_database" and "character_set_server" were set to "latin1". Upon checking the database and the tables that contain the website data, their character sets were all set to "latin1_swedish_ci", the default choice. All the Korean texts were being saved to the database in Latin1 format from the very beginning. It got converted into UTF-8 as it was passed to the output, so it looked normal when viewed through a web page. But if you looked directly into the database, it came out broken. MySQL 8.0 apparently decided to output the text in its saved form, unlike 5.7, hence the problem.

The solution proposed by many of the Korean blogs that had a similar problem was to alter the character set of the affected databases and tables in the following manner:
ALTER DATABASE data_database CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;


ALTER TABLE data_table MODIFY COLUMN title VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

And also, add the following lines under [mysqld] in the MySQL configuration file (my.cnf):
collation-server = utf8_unicode_ci
character-set-server = utf8

Sadly, all this did not help one bit. Upon further analysis, I arrived at the conclusion that the underlying data was still in "broken" form even if the settings had the character set changed. The data itself has to be rewritten in UTF-8, so I needed to dump the database and reload it in the correct character set. First, the dump:
mysqldump -u root -p --default-character-set=latin1 data_database > dump.sql

The "default-character-set" flag was set to "latin1" to ensure that the data is dumped in its originally saved character set. In the dumped file, I changed all the "latin1" strings into "utf8mb4".

Now I simply had to restore it back, but the "Specified key was too long; max key length is 1000 bytes" error prevented me from restoring some of the tables. I tracked the problem down to the limitation of the MyISAM database type. Because the "VARCHAR" data type for a column needs 3 times more space for UTF-8 than Latin1, the character set change caused the key length to exceed the 1000-byte limit. With InnoDB database type, it was 3072 bytes by default since MySQL 5.7.7.

Because of this, I changed the database type mentioned in the file from MyISAM to InnoDB. So why was it set to MyISAM in the first place? It was because Full-text index was not available for InnoDB at the time of the database creation. It was enabled in 2011 with MySQL 5.6.

With both the database type and character set changed in the dump file, I restored the database like this:
mysql -u root -p --default-character-set=utf8mb4 data_database < dump.sql

I could now see that all the Korean text appeared correctly in the database. It would also look right on the website if I kept the changes to the my.cnf file mentioned earlier. Finally, I migrated the database to MySQL 8.0 again, and ran the "mysql_upgrade" command. Everything was working as intended, and I no longer needed the changes to the my.cnf, so those were removed.

Long story short, initial database settings from 13 years ago almost held me back from upgrading to the newest MySQL version, but all of them are now fixed.
Defined tags for this entry: , ,

Server now on iMac and macOS High Sierra

New iMac 21.5" 2017, freshly booted and ready to replace Mac mini 2012

The venerable Mac mini 2012, which took over the job of the server from the iMac 2008 in February 2013, showed signs of its age two months ago, refusing to boot due to corrupted Fusion Drive. I was able to remedy the problem, but I thought it may be a good time to move over to a new system. Seeing that Apple has not updated Mac mini in three years (and frankly, the 2014 edition was not an upgrade many had hoped for) I decided to return to using an iMac.

The iMac 21.5" 2017 was able to smoothly take over the Mac mini last month, but for some reason the system came equipped with macOS Sierra (10.12) instead of High Sierra (10.13) which was already a month old at the time. So I applied an extra caution and checked carefully that the apps I ran were compatible before manually upgrading. Finally, I made the switch to High Sierra today. It seems everything is functioning as expected.
Defined tags for this entry: , , , ,

Server upgraded to macOS Sierra

With major tasks at the workplace wrapped up, I decided that the National Foundation Day holiday would be a good time to upgrade the Mac mini server from El Capitan (OS X 10.11) to Sierra (OS X macOS 10.12). For a warm-up, I upgraded the MySQL Server installation from 5.6 to 5.7 before that, but I ran into some weird issues and took about an hour to resolve. After getting MySQL to work again, I made a full system backup and installed Sierra. With the new OS in place, I restored the server configuration and now you see that the website is back in action. iPhone 7 review will resume shortly.
Defined tags for this entry: , , , , ,

The slow updating of Google's Mobile Usability Report

Pages with mobile usability issues as Google sees it, as of May 14, 2016

It's been more than a month since I revamped the site to be mobile friendly. Yet it takes quite a bit of time for Google to re-crawl all those pages and realize that they're have been changes. As you can see here, Google Mobile Usability Report still thinks roughly 15% of the indexed pages still aren't mobile friendly despite the fact that the template update affected the entire site at once. These pages would still show up in the search results without the "Mobile Friendly" tag and be ranked lower. I guess I'll have to wait a few more weeks.
Defined tags for this entry: , , ,

Optimizing the site further for mobile devices

Navigation is now titled and language selection is moved inside the menu

For the past few days, you may have noticed that the website had received further updates in addition to some changes done earlier. Most notably, I modified some underlying code so that the menu bar is properly multilingual - it'll show in the selected language only. In addition, the tagging function got Korean localization. But I didn't stop there because there had been some requests from mobile users that I couldn't ignore.

With a responsive web design, the sidebar that used to be always present next to the main page gets moved to the bottom when the screen isn't wide enough, most notably on mobile devices. Because of this, functions available on the sidebar would become hard to find when viewed on a smartphone.

To rectify this problem, I decided to make further modifications to introduce some of the elements in the sidebar into the top area of the website as long as overall design could be preserved. First thing to try was the language selection box. After much experimenting, it was placed into the menu bar as the first item. This would also be nicely shown on a mobile version's navigation, as you can see here.

Unfortunately, the navigation itself didn't let the user know that they could change the website's language from there when it's collapsed and hiding everything within. So I gave it a label, so even a first time visitor would know now.

Next was the search box in the banner. The new theme uses the one that triggers the internal "quicksearch" function. While this gives you a nice list of posts that you're looking for, it's very slow in reality and doesn't work with the tagging plugin. So I decided to replace this with the Google custom search engine, which was already in the sidebar.

With the integration done, the now-redundant sidebar elements were removed. This is it for now, but if you have more suggestions, feel free to comment.
Defined tags for this entry: , ,

Copyright (C) 1996-2018 Wesley Woo-Duk Hwang-Chung. All rights reserved.