今天分析数据的时候需要检查一个图是否有向无环图,于是用脚本写了个小的检查工具。
基本原理:如果能够完全拓扑排序,则是无环图。
输入文件格式:每行包含两个数据,用空白字符分开,表示一个<child-id, parent-id>对。
Ruby代码:
1 # graph's node type
2 class Node
3 # node id
4 attr_accessor :id
5 # indegree
6 attr_accessor :in_count
7 # children list
8 attr_accessor :out_list
9 # for topologic sorting
10 attr_accessor :discarded
11
12 def initialize(id)
13 @id = id
14 @in_count = 0
15 @out_list = []
16 @discarded = false
17 end
18
19 def to_s
20 @id + " " + @in_count.to_s
21 end
22 end
23
24
25 if __FILE__ == $0 then
26 if (ARGV.size != 1) then
27 print "Usage: ruby ", $0, " <inputfile>\n"
28 exit
29 end
30 f = File.open(ARGV[0], 'r')
31
32 # construct graph
33 map = {}
34 f.each_line do |line|
35 a = line.split
36 child_id = a[0]
37 parent_id = a[1]
38 # increase child's in_count
39 child_node = map[child_id]
40 if (child_node == nil) then
41 child_node = Node.new(child_id)
42 map[child_id] = child_node
43 end
44 child_node.in_count += 1
45
46 # add child to parent's out_list
47 parent_node = map[parent_id]
48 if (parent_node == nil) then
49 parent_node = Node.new(parent_id)
50 map[parent_id] = parent_node
51 end
52 if ( not parent_node.out_list.include?(child_id) ) then
53 parent_node.out_list << child_id
54 end
55 end
56 f.close
57 print "Node count: ", map.size, "\n"
58
59 # max indegree and max outdegree
60 max_in_degree = 0
61 max_out_degree = 0
62 map.each_value do |node|
63 if (node.in_count > max_in_degree) then
64 max_in_degree = node.in_count
65 end
66 if (node.out_list.size > max_out_degree) then
67 max_out_degree = node.out_list.size
68 end
69 end
70 print "Max indegree: ", max_in_degree, "\n"
71 print "Max outdegree: ", max_out_degree, "\n"
72
73 # check if it is cyclic or not using theory of topological sorting
74 cur_node_count = map.size
75 have_do_discard = true
76 while (cur_node_count > 0 and have_do_discard)
77 have_do_discard = false
78 map.each_value do |node|
79 if ( not node.discarded and node.in_count == 0 ) then
80 node.discarded = true
81 node.out_list.each do |out_id|
82 map[out_id].in_count -= 1
83 end
84 cur_node_count -= 1
85 have_do_discard = true
86 end
87 end
88 end
89
90 if (cur_node_count == 0) then
91 print "Graph type: Acyclic", "\n"
92 else
93 print "Graph type: Cyclic", "\n"
94 end
95 end
96