A helper class for parsing XML with boost.property_tree

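boost.property_tree can parse XML and JSON files. I mainly use it for XML. Internally it wraps RapidXml, which is billed as the fastest XML parser, so parsing performance is quite good. In practice, though, I ran into a number of rough edges, which I summarize as follows: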

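Getting a node that does not exist throws an exception.

When getting attribute values you have to skip the attribute and comment nodes; if you overlook this, an exception is thrown and the cause is baffling.

The memory model is a bit odd.

Chinese text is not supported by default; parsing Chinese produces garbled output.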

Getting child nodes from a ptree

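The interface for getting a child node is get_child(node_path). node_path is the full path starting from the current node, with parent and child segments joined by ".", for example "root.sub.child". Note that get_child returns only the first matching child node. To get the list of child nodes, use the path "root.sub" instead and iterate over its children, which gives you the list of child elements. If the path does not exist, an exception is thrown. If you do not want an exception, use the get_xxx_optional interfaces, which return an optional and leave it to the caller to check whether a result was obtained.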

    // ptree's optional interface
    auto item = root.get_child_optional("Root.Scenes");

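This interface returns an optional, so the caller still has to check whether the node exists. An optional is tested for validity through its bool conversion operator, and its actual content is accessed through the dereference operator "*". I recommend using the optional interfaces to access XML nodes.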

    // ptree's optional interface
    auto item = root.get_child_optional("Root.Scenes");
    if (item)
        cout << "the node exists" << endl;

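The ptree memory model

A ptree maintains a list of child nodes stored as pairs: first is the node's tag name, and second is the actual ptree node. Keep this meaning of the iterator in mind when traversing ptree children.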

    for (auto& data : root)
    {
        for (auto& item : data.second) // list elements are pairs, so keep iterating through second
        {
            cout << item.first << endl; // tag name of the child
        }
    }

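Note that the first member of a child pair may be the attribute node ("<xmlattr>") or a comment node ("<xmlcomment>"); only non-comment nodes can be used with the usual interfaces for getting attribute values, child nodes, and so on.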

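Getting attribute values from a ptree

An attribute value is obtained with get<T>(attr_name). To get an attribute as an integer, use get<int>("Id"), which returns an int; this short form only works when it is called on the "<xmlattr>" node itself. One caveat: when the pair's first is "<xmlcomment>", there are no attribute values, and the comment text can be obtained through data(). When first is neither of these, that is, the pair is an ordinary element, the attribute name must be prefixed with "<xmlattr>.", so get<int>("<xmlattr>.Id") is what correctly fetches the attribute. Getting attribute values is thus fairly tedious; the helper class introduced later simplifies it. To get a node's own value, use get_value<T>(): for a node whose text content is 2, it yields the value "2".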

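The problem with parsing Chinese

ptree can only parse narrow-character XML files; if the XML contains Unicode text such as Chinese characters, the parsed result is garbled. To parse Unicode, use wptree, whose interfaces all take wide characters and otherwise mirror those of ptree. wptree alone is not enough to support Chinese, though: you also need a converter between wide and narrow characters. There are many implementations of such conversion functions, but C++11 offers a simpler, uniform way to do it.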

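Wide/narrow character conversion in C++11:

    // "CHS" is a Windows locale name; constructing std::codecvt from a locale name relies on MSVC behavior
    std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> conv
        (new std::codecvt<wchar_t, char, std::mbstate_t>("CHS"));
    // wide to narrow
    string str = conv.to_bytes(L"你好");
    // narrow to wide
    wstring wstr = conv.from_bytes(str);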
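
For comparison, here is a sketch of the same wstring_convert idea using std::codecvt_utf8 instead of a named locale. This is a swapped-in variant, not the conversion shown above: it assumes the narrow strings are UTF-8 rather than the "CHS" code page, and the function names are made up for the example. std::wstring_convert and <codecvt> are deprecated since C++17, so newer compilers may warn.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Wide string -> UTF-8 encoded narrow string.
std::string wide_to_utf8(const std::wstring& text)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(text);
}

// UTF-8 encoded narrow string -> wide string.
// from_bytes throws std::range_error on malformed input.
std::wstring utf8_to_wide(const std::string& bytes)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(bytes);
}
```

Because the facet is default-constructed, this version needs no locale name and does not depend on which locales are installed on the machine.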

When boost.property_tree parses an XML file that contains Chinese, the file stream has to be converted first.

The Boost solution:

    #include "boost/program_options/detail/utf8_codecvt_facet.hpp"
    void ParseChn()
    {
        std::wifstream f(fileName);
        std::locale utf8Locale(std::locale(), new boost::program_options::detail::utf8_codecvt_facet());
        f.imbue(utf8Locale); // convert first: decode the stream as UTF-8
        // parse with wptree
        property_tree::wptree ptree;
        property_tree::read_xml(f, ptree);
    }

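The drawback of this approach is that it pulls in Boost's libboost_program_options library, which is over twenty megabytes, just to solve one Chinese-text problem; that is a lot of trouble for little gain. Fortunately C++11 provides a simpler way: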

    void Init(const wstring& fileName, wptree& ptree)
    {
        std::wifstream f(fileName);
        std::locale utf8Locale(std::locale(), new std::codecvt_utf8<wchar_t>); // needs <codecvt>
        f.imbue(utf8Locale); // convert first: decode the stream as UTF-8
        // parse with wptree
        property_tree::read_xml(f, ptree);
    }

With C++11 there is no need to bring in Boost's libboost_program_options library any more, which keeps things simple.

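The property_tree helper class

The helper class addresses the problems listed earlier:

It solves Chinese parsing with C++11.

It simplifies getting attributes.

It adds some operations, such as lookup interfaces.

It avoids throwing exceptions; everything returns an optional.

It hides the tedious low-level interfaces behind a uniform, concise high-level interface that is more convenient to use.

Here is how the helper class is implemented: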

    #include <boost/property_tree/ptree.hpp>
    #include <boost/property_tree/xml_parser.hpp>
    using namespace boost;
    using namespace boost::property_tree;
    #include <codecvt>
    #include <fstream>
    #include <functional>
    #include <iostream>
    #include <locale>
    #include <map>
    #include <string>
    #include <vector>
    using namespace std;

    // Special keys property_tree uses for attribute and comment nodes
    const wstring XMLATTR = L"<xmlattr>";
    const wstring XMLCOMMENT = L"<xmlcomment>";
    const wstring XMLATTR_DOT = L"<xmlattr>.";
    const wstring XMLCOMMENT_DOT = L"<xmlcomment>.";

    class ConfigParser
    {
    public:
        // "CHS" is a Windows locale name; building the facet from a name relies on MSVC behavior
        ConfigParser() : m_conv(new code_type("CHS"))
        {
        }

        ~ConfigParser()
        {
        }

        void Init(const wstring& fileName, wptree& ptree)
        {
            std::wifstream f(fileName);
            std::locale utf8Locale(std::locale(), new std::codecvt_utf8<wchar_t>);
            f.imbue(utf8Locale); // convert first: decode the stream as UTF-8
            wcout.imbue(std::locale("chs")); // set wcout up for Chinese output
            // parse with wptree
            property_tree::read_xml(f, ptree);
        }

        // convert a narrow string (in the "CHS" locale encoding) to wstring
        std::wstring to_wstr(const std::string& str)
        {
            return m_conv.from_bytes(str);
        }

        // convert wstring to a narrow string (in the "CHS" locale encoding)
        std::string to_str(const std::wstring& str)
        {
            return m_conv.to_bytes(str);
        }

        // get the list of child nodes
        auto Descendants(const wptree& root, const wstring& key)->decltype(root.get_child_optional(key))
        {
            return root.get_child_optional(key);
        }

        // get the child nodes with a given tag whose attribute has a given value
        template<typename T>
        vector<wptree> GetChildsByAttr(const wptree& parant, const wstring& tagName, const wstring& attrName, const T& attrVal)
        {
            vector<wptree> v;
            for (auto& child : parant)
            {
                if (child.first != tagName)
                    continue;
                auto attr = Attribute<T>(child, attrName);
                if (attr && *attr == attrVal)
                    v.push_back(child.second);
            }
            return v;
        }

        // get an attribute value of a node
        template<typename R>
        boost::optional<R> Attribute(const wptree& node, const wstring& attrName)
        {
            return node.get_optional<R>(XMLATTR_DOT + attrName);
        }

        // get an attribute value of a node; defaults to wstring
        boost::optional<wstring> Attribute(const wptree& node, const wstring& attrName)
        {
            return Attribute<wstring>(node, attrName);
        }

        // get an attribute value from a value_type (key/subtree pair)
        template<typename R>
        boost::optional<R> Attribute(const wptree::value_type& pair, const wstring& attrName)
        {
            if (pair.first == XMLATTR)
                return pair.second.get_optional<R>(attrName); // already the attribute node
            else if (pair.first == XMLCOMMENT)
                return boost::optional<R>();                  // comments have no attributes
            else
                return pair.second.get_optional<R>(XMLATTR_DOT + attrName);
        }

        // get an attribute value from a value_type; defaults to wstring
        boost::optional<wstring> Attribute(const wptree::value_type& pair, const wstring& attrName)
        {
            return Attribute<wstring>(pair, attrName);
        }

        // build a multimap of <attribute value, node> keyed by a given attribute
        template<typename F = std::function<bool(const wstring&)>>
        multimap<wstring, wptree> MakeMapByAttr(const wptree& root, const wstring& key, const wstring& attrName, F predict = [](const wstring&){ return true; })
        {
            multimap<wstring, wptree> resultMap;
            auto list = Descendants(root, key);
            if (!list)
                return resultMap;
            for (auto& item : *list)
            {
                auto attr = Attribute(item, attrName);
                if (attr && predict(*attr))
                    resultMap.insert(std::make_pair(*attr, item.second));
            }
            return resultMap;
        }

    private:
        using code_type = std::codecvt<wchar_t, char, std::mbstate_t>;
        std::wstring_convert<code_type> m_conv;
    };


The test file test.xml and the test code:

    void Test()
    {
        wptree pt;
        ConfigParser parser;
        parser.Init(L"test1.xml", pt); // to handle Chinese, the file is decoded to Unicode for parsing
        auto scenes = parser.Descendants(pt, L"Root.Scenes"); // returns an optional
        if (!scenes)
            return;
        for (auto& scene : *scenes)
        {
            auto s = parser.Attribute(scene, L"Name"); // get the Name attribute; returns an optional
            if (s)
            {
                wcout << *s << endl;
            }
            auto dataList = parser.Descendants(scene.second, L"DataSource"); // get the first child node
            if (!dataList)
                continue;
            for (auto& data : *dataList)
            {
                for (auto& item : data.second)
                {
                    auto id = parser.Attribute(item, L"Id");
                    auto fileName = parser.Attribute(item, L"FileName");
                    if (id && fileName) // check both before dereferencing
                    {
                        wcout << *id << L" " << *fileName << endl; // print id and filename
                    }
                }
            }
        }
    }

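As the test shows, with the helper class you can access and manipulate nodes conveniently without touching the native interfaces. Users need not care about the internal details; the uniform, concise interface is enough to work with XML files.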

As an aside, building on this helper class and combining it with LINQ to Objects makes it easy to implement LINQ to XML:

    // find the SubNode children whose ID attribute equals 0x10000D and print each one's Type attribute
    from(node.Descendants("Root.SubNode")).where([](XNode& node)
    {
        auto s = node.Attribute("ID");
        return s && *s == "0x10000D";
    }).for_each([](XNode& node)
    {
        auto s = node.Attribute("Type");
        if (s)
            cout << *s << endl;
    });

Posted on 2014-03-11 16:27 by HAOSOLA
