注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

天边云E家 shaoruisky

IT博文共享,知识的海洋

 
 
 

日志

 
 

mysqlimport and MySQL load data  

2009-05-12 14:26:27|  分类: mysql |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

无意中在 emule 的安装目录下看到了个 ip-to-country.csv 文件。打开后,发现是世界各国及IP段对照的文件。格式如下:


真是个好东东!正好一个项目要用到,就准备把数据导入到 MySQL 数据库中。首先在 MySQL 数据库中建立表结构:


我准备用 MySQL 导入工具:mysqlimport 来完成导入任务。首先把数据文件 ip-to-country.csv copy 到 d:\,为了使其和 MySQL 中表名匹配,重命名为 ip_to_country.csv。然后根据数据文件格式,编写并执行下面的 mysqlimport 脚本:


注意:上面的 mysqlimport 参数应写成一行,为了便于阅读我有意分成多行。执行上面的 mysqlimport 命令后,发现有 Warnings:


数据已经导入,不管那么多了, 先进 MySQL 数据库看下效果。首先设置 character_set_results=gb2312;然后查询 10 条记录出来看看,步骤如下:


结果发现国家的中文名称都是乱码。奇怪,已经把 mysqlimport 的 default-character-set 参数设为:gb2312,为什么会有乱码?最后不得以,只好在 MySQL 数据库中,把表 ip_to_country 的字符集改为 gb2312。


然后重新执行导入命令 mysqlimport,这时候发现 MySQL 乱码问题已解决,中文国家名字可以正常显示:


留下一个问题:mysqlimport 到底能不能把文本文件中的 gb2312 字符转换成 utf8 导入到 MySQL 数据库中?

虽然问题看起来已经解决了,但我还想试下 MySQL load data 命令。 mysqlimport 虽然把数据导入数据库了,但还有 15 Warnings 在闹心。我本想利用 mysqlimport 自身的功能来查看这些 Warnings 到底是怎么回事,但翻翻手册,仍无计可施。MySQL 中有个 show warnings 给我一线希望。我这样想:先在 MySQL 中执行 Load data,然后 show warnings 不就可以找到问题所在了吗?

load data infile "d:/ip_to_country.csv"
          replace into table ip_to_country
          character set gb2312
          fields terminated by "," enclosed by ""
          lines terminated by "\r\n";

ERROR 1262 (01000): Row 6737 was truncated; it contained more data than there were input columns

晕,又出现个拦路虎:ERROR 1262 (01000): Row 6737 was truncated; it contained more data than there were input columns. 最后发现问题是 sql_mode 的问题。


把 strict_trans_tables 从 sql_mode 中去掉,再次执行 MySQL Load data

load data infile "d:/ip_to_country.csv"
               replace into table ip_to_country
               character set gb2312
               fields terminated by "," enclosed by ""
               lines terminated by "\r\n";

Query OK, 65290 rows affected, 15 warnings (0.63 sec)
Records: 65290  Deleted: 0  Skipped: 0  Warnings: 15

接下来,用 MySQL show warnings 命令,来找警告的详细描述:

data than there were input columns  |
| Warning | 1262 | Row 6817 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 6914 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 6916 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 6918 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 6988 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 7028 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 7226 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 7569 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 7791 was truncated; it contained more data than there were input columns  |
| Warning | 1262 | Row 47856 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 47885 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 49331 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 49539 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 49547 was truncated; it contained more data than there were input columns |
+---------+------+-------------------------------------------------------------------------------+

根据行号 Row 6737 到 ip_to_country.csv 中查看,发现果然有问题,国家中文名中间多了个逗号 “,”


看来,需要在表 ip_to_country 中再增加一列,来存放多出的内容。于是修改表结构:


再次执行 mysql load data,数据顺利导入。这时仍有警告,这些警告是因为文件中的大部分数据行只有 5 列,而表中有 6 列,因此 MySQL 才 Warning。

把表 ip_to_country 的字符集改为 utf8,看有没有乱码:


再次,执行 MySQL Load data 命令:

load data infile "d:/ip_to_country.csv"
            replace into table ip_to_country
            character set gb2312
            fields terminated by "," enclosed by ""
            lines terminated by "\r\n";

Query OK, 65290 rows affected, 65275 warnings (0.64 sec)
Records: 65290  Deleted: 0  Skipped: 0  Warnings: 65275

怀着激动的心情,select:


可见,MySQL load data infile 指令,可以实现不同字符集之间的转换。

  评论这张
 
阅读(1109)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2016