Edge

August 25, 2005

Weighted graph in SQL.

Filed under: Programming

Have you ever tried to store hierarchies in SQL. That’s easy but, have you ever tried to do some calculation in that hierarchy. That’s bit harder.

Let’s take example situation that I have some hierarchy and some cost that should be allocated in that hierarchy. Let assume that you have an HierarchyItem that could have another item many other HierarchyItems as child of it and many parents also. Then you could have some cost allocated to that hierarchy let’s say DirectCost. You share this according to weight between the HierarchyItem connection say ConnectionShare.

Here is the class diagram of this.
Class diagram

So HierarchyItem cost is sum of direct costs and cost of items child according to share ratios.

Example if have file server and it costs are 1000 euros. Application A file usage is 80GB and Application B file usage is 120GB:

Application A has costs from file server (80 / (80 + 120)) * 1000 euros = 400 euros
Application B has costs from file server (120 / (80 + 120)) * 1000 euros = 600 euros

In this easily solved with recursion as flows (Detailed code in C#):

public float GetCost(string costType)
{
	float retval = 0;
	
	//Calculate first all direct costs
	foreach(DirectCost directCost in DirectCosts)
	{
		if(directCost.CostType.Equals(costType))
			retval += directCost.Value;
	}
	//Add all child costs to this
	foreach(ConnectionShare child in ChildConnectionShares)
	{
		retval += child.GetChildCost(costType);
	}
	return retval;
}

Problem converting this to SQL is that you can’t really do recursion in SQL. So the solution in my case was to simulate the cost in time to time. Simulation is a good solution if you don’t data bee accurate all the time.

So if you think about how you would calculate this by hand. You just would start calculating this from the bottom. First calculate those items that do not have any child’s. After that you would calculate items which child’s has been calculated all ready. And the just iterate the process.

So SQL solution comes from the same idea. First calculate all those items that have no childs and after that calculate those items which child’s has been calculated. Or if I be more precise you can calculate those items where your child count is the same as calculated child count.

So if I would have an tables HierarchyItem, ConnectionShare, DirectCost and HierarchyItemCosts.

HierarchyItem you would all the hierarchy items.

ConnectionShare would include all connections between HierarchyItems

DirectCost Table would include all direct cost related to HierarchyItem.

HierarchyItemCosts would store all the simulated costs.

Let’s assume that we already have some data in there. And go step by step…

1. Find all those items that can be calculated - those items where your child count is the same as calculated child count.

SELECT 	Parent.ID
FROM 	HierarchyItem Parent LEFT JOIN ConnectionShare Child
			ON Parent.ID = Child.ParentID
		LEFT JOIN HierarchyItemCost CalulcatedItems
			ON Child.ChildID = CalulcatedItems.HierarchyItemID
GROUP BY Parent.ID
HAVING COUNT(Child.ChildID) = COUNT(CalulcatedItems.HierarchyItemID)

2. Now Select all the items that should be calculated

SELECT 	HI.ID
FROM	HierarchyItem HI
WHERE 	HI.ID IN (Query from the step 1)

3. Add all CostTypes to calculation

SELECT 	HI.ID,
	TmpCostType. CostType
FROM	HierarchyItem HI CROSS JOIN
	(SELECT DISTINCT CostType FROM DirectCost) TmpCostType
WHERE 	HI.ID IN (Query from the step 1)

4. Add the direct cost calculation

SELECT 	HI.ID,
	TmpCostType. CostType,
	(SELECT 	COALESCE(SUM(Value),0)
	 FROM 	DirectCost
	WHERE 	HierarchyItemID = HI.ID
		AND CostType = TmpCostType.CostType) 
FROM	HierarchyItem HI CROSS JOIN
	(SELECT DISTINCT CostType FROM DirectCost) TmpCostType
WHERE 	HI.ID IN (Query from the step 1)

5. Add the child costs to calculation and helpper view.


CREATE VIEW HierarchyItemTotalShare AS
SELECT 	HI.ID,COALESCE(SUM(Share),0)
FROM 	HierarchyItem HI LEFT JOIN ConnectionShare CS
		ON HI.ID = CS.ChildID
GROUP BY HI.ID 
	
SELECT COALESCE(SUM(Child.Value * (Relation.Share/TotalShare.TotalShare) ),0)
FROM 	HierarchyItemCost Child JOIN ConnectionShare Relation
			ON Child.HierarchyItemID = Relation.ChildID
		JOIN HierarchyItemTotalShare TotalShare
			ON Child.HierarchyItemID = TotalShare.ID
 WHERE	Relation.ParentID = HI.ID
		AND Child.CostType = TmpCostType.CostType
		AND TotalShare.TotalShare <> 0

6. The final insert statement that is run until all row’s have been instered.


INSERT INTO HierarchyItemCost (HierarchyItemID,CostType,Value)
SELECT 	HI.ID,
	TmpCostType.CostType,
	--Items cost is sum of direct cost +
	(SELECT COALESCE(SUM(Value),0) FROM DirectCost WHERE HierarchyItemID = HI.ID AND CostType = TmpCostType.CostType)
		+
	--Sum of all child cost acording to share
		(SELECT COALESCE(SUM(Child.Value * (Relation.Share/TotalShare.TotalShare) ),0)
		 FROM 	HierarchyItemCost Child JOIN ConnectionShare Relation
				ON Child.HierarchyItemID = Relation.ChildID
			JOIN HierarchyItemTotalShare TotalShare
				ON Child.HierarchyItemID = TotalShare.ID
		 WHERE	Relation.ParentID = HI.ID AND Child.CostType = TmpCostType.CostType AND TotalShare.TotalShare <> 0)
	
FROM	HierarchyItem HI CROSS JOIN (SELECT DISTINCT CostType FROM DirectCost) TmpCostType
WHERE 	HI.ID IN
	–All those HierarchyItems witch childs has been calculated
	(SELECT Parent.ID
	 FROM 	HierarchyItem Parent LEFT JOIN ConnectionShare Child
			ON Parent.ID = Child.ParentID
		LEFT JOIN HierarchyItemCost CalulcatedItems
			ON Child.ChildID = CalulcatedItems.HierarchyItemID
	 GROUP BY Parent.ID
	 HAVING COUNT(Child.ChildID) = COUNT(CalulcatedItems.HierarchyItemID))
	 –Only calculate every item once
	 AND HI.ID NOT IN (SELECT HierarchyItemID FROM HierarchyItemCost)

Voilá!

If you have SqlServer you can test the solution with this script.

Comments »

The URI to TrackBack this entry is: http://edge.blogsome.com/2005/08/25/weighted-graph-in-sql/trackback/

No comments yet.

RSS feed for comments on this post.

Leave a comment

Line and paragraph breaks automatic, e-mail address never displayed, HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>























Get free blog up and running in minutes with Blogsome | Theme designs available here