<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>phoenixheart - portfolio &#38; more &#187; mysql</title>
	<atom:link href="http://www.phoenixheart.net/tag/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phoenixheart.net</link>
	<description>phoenixheart - portfolio &#38; more</description>
	<lastBuildDate>Wed, 23 Mar 2011 09:47:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
	<item>
		<title>Convert a database from latin1 to utf8 charset (aka. Thica.net redesigned)</title>
		<link>http://www.phoenixheart.net/2008/11/convert-a-database-from-latin1-to-utf8-charset-aka-thicanet-redesigned/</link>
		<comments>http://www.phoenixheart.net/2008/11/convert-a-database-from-latin1-to-utf8-charset-aka-thicanet-redesigned/#comments</comments>
		<pubDate>Fri, 28 Nov 2008 07:22:39 +0000</pubDate>
		<dc:creator>An</dc:creator>
				<category><![CDATA[Featured]]></category>
		<category><![CDATA[Server stuffs]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://www.phoenixheart.net/?p=153</guid>
		<description><![CDATA[If it&#8217;s been a while since my last update on this blog, then I&#8217;d say sorry, I was working on a re-design of my favorite thica.net site. Ok, I&#8217;ll be honest here: I didn&#8217;t invent anything, instead I only took a copy of the great Notepad Chaos theme from Smashing Magazine and spent some time [...]]]></description>
			<content:encoded><![CDATA[<p>If it&#8217;s been a while since my last update on this blog, I&#8217;m sorry: I was working on a redesign of my favorite <a href="http://www.thica.net">thica.net</a> site. OK, I&#8217;ll be honest here: I didn&#8217;t invent anything. I only took a copy of the great <a href="http://www.smashingmagazine.com/2008/08/20/notepad-chaos-a-free-wordpress-theme/">Notepad Chaos theme</a> from Smashing Magazine and spent some time tweaking it to my needs. It may sound simple, but it was NOT simple at all&#8230;</p>
<p>Especially when I ran into this serious problem: converting the charset of MySQL data.</p>
<p>Let me explain more clearly. I&#8217;m Vietnamese, and I like Vietnamese poems (at least the good ones), so Thica.net (which means Poetry.net in English) is naturally written in Vietnamese, and is nothing more than a normal WordPress installation. The thing is, when I first installed Thica.net, the charset of the database (and everything inside it) was set to &#8220;latin1&#8221; by default. While that is not a big problem for English speakers, for multilingual sites it is prone to big troubles, like:</p>
<ul>
<li>Incorrect search results</li>
<li>Wrong sort order: in the Vietnamese alphabet, the letter &#8220;Đ&#8221; comes right after &#8220;D&#8221;, but on my site it came no sooner than &#8220;Y&#8221;, while &#8220;Ý&#8221; was placed just before &#8220;A&#8221; &#8211; whoops.</li>
</ul>
<p>So along with the redesign, I decided that Thica.net&#8217;s database needed to be revamped. Easier said than done.<span id="more-153"></span></p>
<p>First, MySQL internally supports conversion between charsets. Like this:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">ALTER</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`table_name`</span> CONVERT <span style="color: #993333; font-weight: bold;">TO</span> CHARACTER <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #ff0000;">'utf8'</span>;</pre></div></div>

<p>As you may have guessed, this query doesn&#8217;t help much. While the table&#8217;s charset was indeed converted to utf8, the data came out as a bunch of meaningless characters like &#8220;Tráº§n Dáº§n&#8221;, &#8220;Pháº¡m Tiáº¿n Duáº­t&#8221;, &#8220;Nguyá»…n Má»¹&#8221; and so on.</p>
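<p>To see where those characters come from, here is a minimal sketch. It assumes (as the symptoms suggest, though I did not verify it on the server) that the bytes sitting in the latin1 columns were really UTF-8, so the conversion re-encoded every single byte as if it were a latin1 character. The function name is made up for illustration, and the mbstring extension is assumed:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
# Hypothetical helper: treats every stored byte as one Windows-1252
# character (MySQL's latin1 is cp1252) and re-encodes it as UTF-8,
# which is what a latin1-to-utf8 conversion does to the column data.
function simulate_convert_to_utf8($stored_bytes)
{
	return mb_convert_encoding($stored_bytes, 'UTF-8', 'Windows-1252');
}

# "\xE1\xBA\xA7" is the UTF-8 byte sequence of the Vietnamese letter "ầ"
echo simulate_convert_to_utf8("Tr\xE1\xBA\xA7n D\xE1\xBA\xA7n"), "\n";
# prints: Tráº§n Dáº§n</pre></div></div>

<p>Each of the three bytes of &#8220;ầ&#8221; becomes a separate character, which matches the first garbled name above.</p>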
<p>Then I read somewhere that you can dump the entire database into a text file, use Notepad++ to convert the encoding to UTF-8, then restore the dump. Or you can use the iconv library. Or the multibyte functions in PHP. None of these worked for me.</p>
<p>After some research in vain, I came to this question: why does querying the latin1 database and displaying the retrieved data on a page (in UTF-8) always work correctly? It turned out that, like an ox, MySQL somehow takes on the hard and heavy part, like this:</p>
<ol>
<li>First, the client machine requests data using a query</li>
<li>The server quietly hands back the stored (latin1-labelled) data, which arrives as proper UTF-8</li>
<li>The client machine displays the properly formatted data in the browser</li>
</ol>
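<p>A quick way to see what actually comes back in step 2 is to test whether the fetched bytes are already well-formed UTF-8. This is only a sketch: the function name is made up, the sample strings are hand-written bytes rather than real query results, and the mbstring extension is assumed:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
# Hypothetical check: true when the given bytes form valid UTF-8
function looks_like_utf8($bytes)
{
	return mb_check_encoding($bytes, 'UTF-8');
}

# UTF-8 bytes of a Vietnamese name, as a latin1 column might hold them
var_dump(looks_like_utf8("Tr\xE1\xBA\xA7n D\xE1\xBA\xA7n")); # bool(true)

# a genuine latin1 string ("caf" followed by byte E9) is not valid UTF-8
var_dump(looks_like_utf8("caf\xE9")); # bool(false)</pre></div></div>

<p>Plain ASCII passes this check too, so it only tells you something for rows that contain accented letters.</p>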
<p>So it flashed through my mind: I&#8217;d step in at the final step. Instead of displaying the good data, however, I would save it somewhere, like in a dump, to be accessible later. And like a real dump, it should contain queries to drop and create the tables too. Here is the code I wrote:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># if your database is big, do some preparations
</span><span style="color: #666666; font-style: italic;"># ini_set('memory_limit', '256M');
</span><span style="color: #666666; font-style: italic;"># ini_set('max_execution_time', 120);
</span>
<span style="color: #666666; font-style: italic;"># some config data
</span><span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'HOST'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'localhost'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'USER'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'root'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'PASS'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">''</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'DB'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'test'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'FILE'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'utf8data-dump.sql'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #990000;">mysql_connect</span><span style="color: #009900;">&#40;</span>HOST<span style="color: #339933;">,</span> USER<span style="color: #339933;">,</span> PASS<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">mysql_select_db</span><span style="color: #009900;">&#40;</span>DB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># retrieves a list of table names from the database
</span><span style="color: #000088;">$rs</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_query</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'SHOW TABLES FROM '</span> <span style="color: #339933;">.</span> DB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$content</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$row</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_fetch_row</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$rs</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;"># for each table, get its structure
</span>	<span style="color: #000088;">$table_name</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$row</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>-- TABLE STRUCTURE OF <span style="color: #006699; font-weight: bold;">$table_name</span>--<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$table_struct</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_query</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;SHOW CREATE TABLE &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$table_name</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$table_struct</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_fetch_array</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$table_struct</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;"># add a DROP IF EXISTS query
</span>	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;DROP TABLE IF EXISTS <span style="color: #006699; font-weight: bold;">$table_name</span>; <span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;"># add a CREATE TABLE query
</span>	<span style="color: #666666; font-style: italic;"># remember, we must replace latin1 charset with utf8
</span>	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #990000;">str_replace</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'latin1'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'utf8'</span><span style="color: #339933;">,</span> <span style="color: #000088;">$table_struct</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;; <span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;"># now, the data
</span>	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>-- DATA OF <span style="color: #006699; font-weight: bold;">$table_name</span>--<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$table_data</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_query</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;SELECT * FROM <span style="color: #006699; font-weight: bold;">$table_name</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;"># if the table is empty, hell with it
</span>	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">mysql_num_rows</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$table_data</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">continue</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;INSERT INTO <span style="color: #006699; font-weight: bold;">$table_name</span> VALUES &quot;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;"># populate the data
</span>	<span style="color: #000088;">$str</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$data_row</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_fetch_row</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$table_data</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
	<span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$str</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">'('</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">foreach</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$data_row</span> <span style="color: #b1b100;">as</span> <span style="color: #000088;">$field</span><span style="color: #009900;">&#41;</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$str</span> <span style="color: #339933;">.=</span> <span style="color: #990000;">sprintf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;'<span style="color: #009933; font-weight: bold;">%s</span>',&quot;</span><span style="color: #339933;">,</span> <span style="color: #990000;">mysql_real_escape_string</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$field</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$str</span> <span style="color: #339933;">=</span> <span style="color: #990000;">rtrim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">','</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$str</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">'),'</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #000088;">$str</span> <span style="color: #339933;">=</span> <span style="color: #990000;">rtrim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">','</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$content</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;<span style="color: #006699; font-weight: bold;">$str</span>; <span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># write the (formatted) data into the dump file.
</span><span style="color: #000088;">$handle</span> <span style="color: #339933;">=</span> <span style="color: #990000;">fopen</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">FILE</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'wb'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">fwrite</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$handle</span><span style="color: #339933;">,</span> <span style="color: #000088;">$content</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">fclose</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$handle</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>This script worked for me (this post would make no sense otherwise). After running it in my browser, I got a dump that can be used to restore a perfect utf8 database (well, not really perfect, as I&#8217;ll explain below, but acceptable).</p>
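<p>One detail worth knowing: the script wraps every field in quotes, so a NULL column is restored as an empty string. A possible variation, sketched below, keeps NULLs and joins the values with implode() instead of trimming commas. The helper name and its $escape parameter are made up for illustration; addslashes is only a stand-in default so the demo runs without a database connection, and on a live connection you would pass the connection&#8217;s own escaping function instead:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
# Hypothetical helper: turns one fetched row into "(v1,v2,...)"
function row_to_tuple(array $row, $escape = 'addslashes')
{
	$values = array();
	foreach ($row as $field)
	{
		# a NULL column must stay NULL, not become an empty string
		$values[] = is_null($field) ? 'NULL' : "'" . $escape($field) . "'";
	}
	return '(' . implode(',', $values) . ')';
}

echo row_to_tuple(array('1', "O'Brien", null)), "\n";
# prints: ('1','O\'Brien',NULL)</pre></div></div>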
<p>The limitation of this script is that it doesn&#8217;t handle collations. A table created with the utf8 charset gets utf8_general_ci as its default collation, which is less preferable than utf8_unicode_ci (compare the charts <a href="http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html">here</a> and <a href="http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html">here</a>). In my case, the categories were not sorted correctly until I manually set the column `wp_terms`.`name` to utf8_unicode_ci.</p>
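<p>If you would rather have the dump carry the collation itself, one option is to rewrite the table-level default in the CREATE TABLE text instead of replacing every &#8220;latin1&#8221;. This is a sketch under assumptions: the helper name is made up, the sample statement is hand-written rather than real SHOW CREATE TABLE output, and columns that declare their own CHARACTER SET are not touched:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
# Hypothetical helper: swaps the table default charset and adds an
# explicit collation to a CREATE TABLE statement
function to_utf8_unicode_ci($create_sql)
{
	return preg_replace(
		'/DEFAULT CHARSET=latin1\b/',
		'DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci',
		$create_sql
	);
}

# hand-written sample, shaped like a typical CREATE TABLE statement
$create = "CREATE TABLE `wp_terms` (\n"
	. "  `name` varchar(200) NOT NULL default ''\n"
	. ") ENGINE=MyISAM DEFAULT CHARSET=latin1";

echo to_utf8_unicode_ci($create), "\n";
# the last line printed is:
# ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci</pre></div></div>

<p>Because only the table default is matched, a column or value that merely contains the word latin1 is left alone.</p>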
<p>One last thing: have you seen it yet? <img src='http://www.phoenixheart.net/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  My new Thica.net is <a href="http://www.thica.net">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phoenixheart.net/2008/11/convert-a-database-from-latin1-to-utf8-charset-aka-thicanet-redesigned/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

