Entries tagged as repair
battery A1533 charger iPhone 5S Celine Chung remote controlled helicopter replacement Syma S107G A1524 belt case iPhone 6 Plus BEXCO Busan exposition house laser pointer pen porcelains travel Apple Watch watch water resistance barcode camera Nintendo Wii Oband T2 remote control running screen swimming watchOS 2 adapter cable iOS lightning Micro-USB USB Geekbench iPhone 3GS iPhone 4 iPhone 4S iPhone 5 iPhone 6S Plus PeaceKeeper Bluetooth Christmas tree decoration balls lights sled snow Hayun Chung screwdriver table cookie milk Oreo Homeplus pizza chocolate chip cookies couch FaceTime New Year's Day rice cake Skype Sung-Eun Kim tteokguk computer swing Toys R Us trampoline Mega Kids Box Megabox movie theatre Wesley Woo-Duk Hwang-Chung tortilla wrap toothbrush salad Disney Frozen Danish butter cookies animation car Larva shopping cart SR626SW band A1522 A1549 A1586 AT&T Canada Hong Kong iPhone 6 Japan Korea LTE T-Mobile USA Verizon iPhone 5C SK Telecom A1530 KT LGU+ VoIP VoLTE barometer Focus Pixels iOS 8 iPad iPad mini Reachability screen protector Apple Store Screen Glue Ferris wheel game Goat Simulator iPhone 6+ Touch ID COEX Fukuoka Fukuoka Airport Incheon International Airport subway leather case Vega LTE-A 3DMark Kraken Octane performance SunSpider 3D Touch A1687 A1779 earphone iPhone 7 Series 2 IoT refrigerator Samsung iPhone SE CGV Gwangju hot dog popcorn snack Galaxy Fold Galaxy S20 Ultra Galaxy Z Flip glass iPhone X cat Concierge EyeFly3D Galaxy S4 Zoom MicroSD iPhone 11 Pro fingerprint sensor box astronomy Moon Sanyo VPC-C4 satellite Minolta Dimage X20 binoculars Chuseok telescope eclipse Kodak Z1085 IS star Sun slow motion sound electric fan LED light trail M&M's OIS Cortex Camera cleaning window drain laundry washing machine bathroom fan tile Bolt EV electric vacuum cleaner kitchen stainless steel straw coronavirus infrared thermal imaging KPX office Bitgaram City Naju LG Electronics bus double decker London strike Underground fog Tayo toy train airport school BRT government complex Osong Station Sejong building construction boat Hakata hydrofoil lake park seaport Dazaifu museum temple udon cathedral flower garden SRT Gangneung hockey Lego The Lego Movie dinosaur InoQ Tyrannosaurus 3D printing hot-end Makibox squirrel bunny HT-Mini v2 balloon Jenga 123D Design Autodesk coin Electric Love Marathon heart Christmas Octonauts play set present apple tree Financial Museum Kwangju Bank vegetable Pop Up Pirate Anna Elsa princess Jeonnam Science Festival s9y server search iMac Mac mini Mavericks OS X network tag Twitter MySQL Yosemite macOS Sierra High Sierra Big Sur apple book electric vehicle Burger King About Time premium ticket Creamy Pepper Burger Gravity IMAX Interstellar movie Lotteria restaurant chicken donut chicken tender spicy chicken fries hamburger Shrimp Steak Burger Captain Spicy clock desk ice cream cheese chicken nugget Whopper fries shrimp bread breakfast crab New Orleans wasabi diavola Monster X BBQ beef Bitcoin British Museum money fish chair assembly kit Education Center paint bed wire LAN printer planting blueberry yoghurt hotel mask sanitizer melodica coaxial cable EyeTV television cooler El Capitan iOS 9 AirPod Mac Studio monitor card reader MacBook Pro Comic Life expressway traffic jam commute car wash engine oil gas station blizzard Tamiya wheel headlights lamp Sonata 2 KIXX PAO KEPCO smart meter smart plug solar panel BIXPO Daepoong DE101 drone ESS car inspection BuyBeam BL-628 power switch Starbucks can coffee Daejeon department store Shinsegae shopping water engine cover Hyundai hub MacBook Air memory storage HDD Western Digital external case Seagate iOS 12 iOS 14 iPad Pro 9.7 AirTag Apple TV 4K Seoul A1633 A1634 A1688 iPhone 6S bicycle car-free street festival flea market Taekwondo Yeongtong BM Works Slim3 Zip smartphone mount playground flashlight air pump bell foot pump bottle holder Coriolis National Science virtual reality road epoxy umbrella case chocolate M&M’s keyboard Xbox blue screen of death CHKDSK error 0x00000024 laptop Ozobot programming cellphone pager telephone SSD Windows 10 USB-C HDMI power plug 850X MSI RevuAhn U200X M7 pedometer accelerometer compass gyroscope ISS Han-gang river passcode Canon SX50 HS flare Iridium Alhena constellation Jupiter Orion iOS 16 iPhone 13 Pro
Today’s “The Toon-Box”
Posted by Wesley onToday’s “The Toon-Box”
Posted by Wesley onToday’s “The Toon-Box”
Posted by Wesley onAdventures in fixing broken Korean text in MySQL DB
Posted by Wesley on
The Tool-Box.info website has been running on the Apache - PHP - MySQL (APM) solution for the past 13 years. Each component has been constantly upgraded over the years, and recently I decided to update MySQL from 5.7 to 8.0. Once I managed to migrate the database to the new version, I discovered that all the Korean texts on the website came out broken. This was a sign of mismatched character set, so I looked for the exact cause.
First, I rolled back to 5.7 and checked what character sets were being used, using the following SQL query:
Sure enough, "character_set_database" and "character_set_server" were set to "latin1". Upon checking the database and the tables that contain the website data, their character sets were all set to "latin1_swedish_ci", the default choice. All the Korean texts were being saved to the database in Latin1 format from the very beginning. It got converted into UTF-8 as it was passed to the output, so it looked normal when viewed through a web page. But if you looked directly into the database, it came out broken. MySQL 8.0 apparently decided to output the text in its saved form, unlike 5.7, hence the problem.
The solution proposed by many of the Korean blogs that had a similar problem was to alter the character set of the affected databases and tables in the following manner:
And also, add the following lines under [mysqld] in the MySQL configuration file (my.cnf):
Sadly, all this did not help one bit. Upon further analysis, I arrived at the conclusion that the underlying data was still in "broken" form even if the settings had the character set changed. The data itself has to be rewritten in UTF-8, so I needed to dump the database and reload it in the correct character set. First, the dump:
The "default-character-set" flag was set to "latin1" to ensure that the data is dumped in its originally saved character set. In the dumped file, I changed all the "latin1" strings into "utf8mb4".
Now I simply had to restore it back, but the "Specified key was too long; max key length is 1000 bytes" error prevented me from restoring some of the tables. I tracked the problem down to the limitation of the MyISAM database type. Because the "VARCHAR" data type for a column needs 3 times more space for UTF-8 than Latin1, the character set change caused the key length to exceed the 1000-byte limit. With InnoDB database type, it was 3072 bytes by default since MySQL 5.7.7.
Because of this, I changed the database type mentioned in the file from MyISAM to InnoDB. So why was it set to MyISAM in the first place? It was because Full-text index was not available for InnoDB at the time of the database creation. It was enabled in 2011 with MySQL 5.6.
With both the database type and character set changed in the dump file, I restored the database like this:
I could now see that all the Korean text appeared correctly in the database. It would also look right on the website if I kept the changes to the my.cnf file mentioned earlier. Finally, I migrated the database to MySQL 8.0 again, and ran the "mysql_upgrade" command. Everything was working as intended, and I no longer needed the changes to the my.cnf, so those were removed.
Long story short, initial database settings from 13 years ago almost held me back from upgrading to the newest MySQL version, but all of them are now fixed.
First, I rolled back to 5.7 and checked what character sets were being used, using the following SQL query:
show variables like 'char%';
Sure enough, "character_set_database" and "character_set_server" were set to "latin1". Upon checking the database and the tables that contain the website data, their character sets were all set to "latin1_swedish_ci", the default choice. All the Korean texts were being saved to the database in Latin1 format from the very beginning. It got converted into UTF-8 as it was passed to the output, so it looked normal when viewed through a web page. But if you looked directly into the database, it came out broken. MySQL 8.0 apparently decided to output the text in its saved form, unlike 5.7, hence the problem.
The solution proposed by many of the Korean blogs that had a similar problem was to alter the character set of the affected databases and tables in the following manner:
ALTER DATABASE data_database CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
ALTER TABLE data_table DEFAULT CHARSET=utf8mb4;
ALTER TABLE data_table MODIFY COLUMN title VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
And also, add the following lines under [mysqld] in the MySQL configuration file (my.cnf):
collation-server = utf8_unicode_ci
character-set-server = utf8
Sadly, all this did not help one bit. Upon further analysis, I arrived at the conclusion that the underlying data was still in "broken" form even if the settings had the character set changed. The data itself has to be rewritten in UTF-8, so I needed to dump the database and reload it in the correct character set. First, the dump:
mysqldump -u root -p --default-character-set=latin1 data_database > dump.sql
The "default-character-set" flag was set to "latin1" to ensure that the data is dumped in its originally saved character set. In the dumped file, I changed all the "latin1" strings into "utf8mb4".
Now I simply had to restore it back, but the "Specified key was too long; max key length is 1000 bytes" error prevented me from restoring some of the tables. I tracked the problem down to the limitation of the MyISAM database type. Because the "VARCHAR" data type for a column needs 3 times more space for UTF-8 than Latin1, the character set change caused the key length to exceed the 1000-byte limit. With InnoDB database type, it was 3072 bytes by default since MySQL 5.7.7.
Because of this, I changed the database type mentioned in the file from MyISAM to InnoDB. So why was it set to MyISAM in the first place? It was because Full-text index was not available for InnoDB at the time of the database creation. It was enabled in 2011 with MySQL 5.6.
With both the database type and character set changed in the dump file, I restored the database like this:
mysql -u root -p --default-character-set=utf8mb4 data_database < dump.sql
I could now see that all the Korean text appeared correctly in the database. It would also look right on the website if I kept the changes to the my.cnf file mentioned earlier. Finally, I migrated the database to MySQL 8.0 again, and ran the "mysql_upgrade" command. Everything was working as intended, and I no longer needed the changes to the my.cnf, so those were removed.
Long story short, initial database settings from 13 years ago almost held me back from upgrading to the newest MySQL version, but all of them are now fixed.