﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客 - 我爱那只饼 - Post category - Python</title><link>http://www.cppblog.com/mildforest/category/15867.html</link><description>Moving into game development from now on</description><language>zh-cn</language><lastBuildDate>Tue, 11 Jan 2011 10:56:34 GMT</lastBuildDate><pubDate>Tue, 11 Jan 2011 10:56:34 GMT</pubDate><ttl>60</ttl><item><title>Fixing the UnicodeDecodeError raised when using GAE with Python</title><link>http://www.cppblog.com/mildforest/archive/2011/01/11/138350.html</link><dc:creator>Jokey Pretty</dc:creator><author>Jokey Pretty</author><pubDate>Tue, 11 Jan 2011 10:53:00 GMT</pubDate><guid>http://www.cppblog.com/mildforest/archive/2011/01/11/138350.html</guid><wfw:comment>http://www.cppblog.com/mildforest/comments/138350.html</wfw:comment><comments>http://www.cppblog.com/mildforest/archive/2011/01/11/138350.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/mildforest/comments/commentRss/138350.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/mildforest/services/trackbacks/138350.html</trackback:ping><description><![CDATA[Google App Engine SDK: 1.4.0<br />Python: 2.6.6<br /><br />An HTML form submits parameters to GAE, and one of them contains Chinese text. Saving the request to the Datastore raised a UnicodeDecodeError, shown below:<br /><div style="background-color: rgb(238, 238, 238); font-size: 13px; border: 1px solid rgb(204, 204, 204); padding: 4px 5px 4px 4px; width: 98%;"><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">ascii</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);"> codec can</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: 
rgb(0, 0, 0);">t decode byte 0xe6 in position 0: ordinal not in range(128)<br />Traceback (most recent call last):<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 517, in __call__<br />    handler.post(*groups)<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 895, in put<br />    return datastore.Put(self._entity, config=config)<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 404, in Put<br />    return _GetConnection().async_put(config, entities, extra_hook).get_result()<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1130, in async_put<br />    for pbs in pbsgen:<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 993, in __generate_pb_lists<br />    pb = value_to_pb(value)<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 202, in entity_to_pb<br />    return entity._ToPb()<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 762, in _ToPb<br />    properties = datastore_types.ToPropertyPb(name, values)<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1530, in ToPropertyPb<br />    pbvalue = pack_prop(name, v, pb.mutable_value())<br />  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1353, in PackString<br />    pbvalue.set_stringvalue(unicode(value).encode(</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">utf</span><span style="color: rgb(0, 0, 0);">-</span><span style="color: rgb(0, 0, 0);">8</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">))<br />UnicodeDecodeError: </span><span style="color: rgb(0, 0, 0);">'</span><span style="color: 
rgb(0, 0, 0);">ascii</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);"> codec can</span><span style="color: rgb(0, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">t decode byte </span><span style="color: rgb(0, 0, 0);">0xe6</span><span style="color: rgb(0, 0, 0);"> in position </span><span style="color: rgb(0, 0, 0);">0</span><span style="color: rgb(0, 0, 0);">:</span><span style="color: rgb(0, 0, 0);"> ordinal not in </span><span style="color: rgb(0, 128, 128);">range</span><span style="color: rgb(0, 0, 0);">(</span><span style="color: rgb(0, 0, 0);">128</span><span style="color: rgb(0, 0, 0);">)</span></div><br />After reading the GAE log closely, I found the cause. When the GAE SDK saves a string, it first converts it to the unicode type, as the last frame of the traceback shows:<br /><div style="background-color: rgb(238, 238, 238); font-size: 13px; border: 1px solid rgb(204, 204, 204); padding: 4px 5px 4px 4px; width: 98%;"><span style="color: rgb(0, 0, 0);"> pbvalue.set_stringvalue(unicode(value).encode(</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">utf-8</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">))</span></div>In Python 2, unicode(value) decodes a byte string with the default codec, which is ASCII, so it fails on the first non-ASCII byte. Here that is 0xe6, the lead byte of a UTF-8 encoded Chinese character.<br /><br />The fix is to convert the parameter string to unicode yourself before saving it, which takes one line:<br /><div style="background-color: rgb(238, 238, 238); font-size: 13px; border: 1px solid rgb(204, 204, 204); padding: 4px 5px 4px 4px; width: 98%;"><span style="color: rgb(0, 0, 0);">content </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> unicode(content, </span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">utf-8</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">)</span></div>With this change the exception no longer occurs.<br /><br 
/>The complete code that parses the POST parameters and saves the result:<br /><div style="background-color: rgb(238, 238, 238); font-size: 13px; border: 1px solid rgb(204, 204, 204); padding: 4px 5px 4px 4px; width: 98%;"><span style="color: rgb(0, 0, 255);">import</span><span style="color: rgb(0, 0, 0);"> urllib<br /><br />post_data </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> get_post_data()<br /></span><span style="color: rgb(0, 128, 0);">#</span><span style="color: rgb(0, 128, 0);"> Split post data of the form a=1&amp;b=2&amp;c=3 into a dict and take the status field</span><span style="color: rgb(0, 128, 0);"><br /></span><span style="color: rgb(0, 0, 0);">quoted </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> dict([x.split(</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">=</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">) </span><span style="color: rgb(0, 0, 255);">for</span><span style="color: rgb(0, 0, 0);"> x </span><span style="color: rgb(0, 0, 255);">in</span><span style="color: rgb(0, 0, 0);"> post_data.split(</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">&amp;</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">)]).get(</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">status</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">, </span><span style="color: rgb(128, 0, 0);">''</span><span style="color: rgb(0, 0, 0);">)<br /></span><span style="color: rgb(0, 128, 0);">#</span><span style="color: rgb(0, 128, 0);"> Undo the URL encoding applied by the browser</span><span style="color: rgb(0, 128, 0);"><br /></span><span style="color: rgb(0, 0, 0);">content </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> urllib.unquote_plus(quoted)<br /></span><span style="color: rgb(0, 128, 0);">#</span><span 
style="color: rgb(0, 128, 0);"> Decode the UTF-8 bytes into unicode</span><span style="color: rgb(0, 128, 0);"><br /></span><span style="color: rgb(0, 0, 0);">content </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> unicode(content, </span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(128, 0, 0);">utf-8</span><span style="color: rgb(128, 0, 0);">'</span><span style="color: rgb(0, 0, 0);">)<br /></span><span style="color: rgb(0, 128, 0);">#</span><span style="color: rgb(0, 128, 0);"> Save to the GAE datastore</span><span style="color: rgb(0, 128, 0);"><br /></span><span style="color: rgb(0, 0, 0);">data </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> Data()<br />data.content </span><span style="color: rgb(0, 0, 0);">=</span><span style="color: rgb(0, 0, 0);"> content<br />data.put()<br /></span></div><br /><br /><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/mildforest/" target="_blank">Jokey Pretty</a> 2011-01-11 18:53 <a href="http://www.cppblog.com/mildforest/archive/2011/01/11/138350.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>