GVG - Visualise GDML data in SVG

Xinjun Zhang
Graduate Student

San Jose State University
Department of Computer Science

One Washington Square
San Jose
USA
95192

Biography

Xinjun is a graduate student at San Jose State University. He is interested in web services, information security and football. He is a big fan of Manchester United.


Abstract


GVG (Government data visualisation in SVG) is an application that runs in most popular web browsers. It aims to visualise government data by mapping statistical data for each administrative region over time, using SVG for display and interaction. It helps users, especially governors, to review statistical government data and to anticipate trends in the economy, employment, the spread of disease and so on.

This paper introduces a new XML format called GDML (Government Data Markup Language) for storing statistical government data such as the unemployment rate, the inflation rate, the spread of disease and so on. Users can upload a validated GDML file to this browser-based application, in which SVG renders a map of the country consisting of all its administrative regions. In this map, each administrative region is fully or partially gradient filled to represent the statistical data. At the bottom, a slider bar marked with a time period (e.g. from Jan. 2009 to Dec. 2009) helps users inspect how the data changes over time. A click on an administrative region activates a detail view that shows detailed information about that region from the GDML data, which helps users analyse the statistical data better.


Table of Contents


1. Overview
     1.1 About GVG
     1.2 What is GDML
2. Prior Work
3. Architecture
4. Implementation
     4.1 Pre-rendering - building GDML data file
     4.2 Rendering SVG via Javascript
     4.3 Javascript Event Handling
5. Future Enhancement
6. Summary
Bibliography

1. Overview

About GVG

The unemployment rate is rising, H1N1 is spreading, inflation is approaching. We hear this kind of bad news on TV and radio all the time, but the numbers in these reports mean little to us, because we cannot picture how bad the situation is or where it is heading. We need to link each figure to the related data in time and location; without comparison, the data makes no sense. The difficulties are that it is hard to access multiple government data sets at the same time, and hard to understand a data set by region alone or by time alone. Even worse, because of data overload, users cannot perceive and predict the precise trend in a community by reading and analysing textual data. We need graphics to make sense of these dry but important statistics, for proper understanding and prediction. GVG is a solution to this problem.

GVG is a bet on the power of citizens' visual intelligence to find patterns for prediction using SVG. Its goal is to enable a new, social kind of government data analysis that helps users, especially governors, to understand the community well and make precise predictions from raw data. It is that magical moment: an unwieldy, unyielding and mysterious government data set is transformed into easily understandable graphics on the screen, and suddenly the user can perceive an unexpected pattern and predict the trend before deciding what to do next. SVG is a catalyst for discussion and collective insight about raw data. The main part of GVG is a map in which every administrative region is fully or partially gradient filled according to the GDML data. Supporting information surrounds the map. Below the map, a time line with a slider dynamically updates the filled regions to visualise the statistical data as the slider moves. To the right of the map, a detail-view box follows the mouse movement over the map and shows all the descriptive information about each region in the GDML data, helping users understand the data and make precise predictions.

What is GDML

GDML is a language for describing statistical government data in XML. For use in GVG, GDML can describe the rate of unemployment, inflation (in terms of gas, housing or other prices), pay and benefits and so on, each of which is displayed in GVG according to its properties. Since GDML is an XML-based language, its structure is defined by a DTD (Document Type Definition), and a GDML file is a productive tool for describing, storing and transmitting statistical government data that can easily be created and modified. Moreover, as with any XML-based format, a GDML document consists entirely of characters from the Unicode repertoire. In other words, nearly any character defined by Unicode may appear within the content of a GDML document, which is important for governments from different countries that use different languages, such as English, French and Chinese.

The Document Type Definition for GDML:

<!DOCTYPE GDML [
<!ELEMENT GDML (country*)>
<!ELEMENT country (state*)>
<!ELEMENT state (description?, data*)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT data (year*)>
<!ELEMENT year (month*)>
<!ELEMENT month ANY>

<!ATTLIST country value CDATA #REQUIRED>
<!ATTLIST state value CDATA #REQUIRED>
<!ATTLIST data type CDATA #REQUIRED>
<!ATTLIST year value CDATA #REQUIRED>
<!ATTLIST month value CDATA #REQUIRED>
]>
		

2. Prior Work

[Google-Public] makes large data sets easy to explore, visualise and communicate. As the charts and maps animate over time, changes in the world become easier to understand. However, it is not very user-friendly, since it can only visualise built-in data. In other words, users cannot look for patterns in raw but interesting data that Google has not accepted. [IBM-Many] is another data visualisation tool, produced by IBM. It enables a variety of visualisations: users can add interesting data sets to its database, from which the developer fetches the data to produce the corresponding charts or graphics. However, it is of limited help for analysis and prediction, because the data that users may upload is severely restricted. A data set must contain only data values in a table format, so no supporting information, which may be critical for analysis and prediction, can be included. In addition, both tools can visualise only one type of data set in one graphic.

3. Architecture

./Architecture.png

Figure 1: The Architecture of GVG

4. Implementation

Pre-rendering - building GDML data file

The data for a GDML file can easily be obtained from government websites, for instance that of the United States Department of Labor at http://www.dol.gov . The data can be pulled from the site's database and saved into files (in txt or xls format). From these data, a GDML file can be created that includes the key information and all statistical data for each state: country name, state name, data type, data time and data value. A valid GDML file, that is, one that conforms to the GDML DTD, is ready to be uploaded and displayed in the GVG application.

An example of a GDML file follows:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE GDML SYSTEM "C:\Users\Kenny\My Documents\Aptana RadRails Workspace\SVG\GDML.dtd">
<GDML>
<country value="US">
<state value="CA">
<description>California is the most populous state in the United States, and the third largest by land area.
</description>
	<data type="Unemployment Rate">
		<year value="2009">
			<month value="01">9.7%</month>
			<month value="02">10.2%</month>
			           .
					   .
					   .
			<month value="12">12.3%</month>
		</year>
		<year value="2010">
			<month value="01">12.5%</month>
			<month value="02">12.5%</month>
		</year>
	</data>
</state>
</country>
</GDML>
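
A file like the one above could be generated from already-parsed tabular data with a short script. The following is only a sketch under stated assumptions: the function names buildGDML and escapeXml and the shape of the input object are hypothetical and not part of GVG, and the DOCTYPE line is omitted because its path depends on where the DTD is stored.

```javascript
// Hypothetical helper: escape the characters that are special in XML
// text content and attribute values.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a GDML document string for one country.
// `states` is an assumed input shape, for example:
//   [{ code: "CA", description: "...",
//      data: [{ type: "Unemployment Rate",
//               years: [{ year: "2009", months: [["01", "9.7%"], ["02", "10.2%"]] }] }] }]
function buildGDML(countryCode, states) {
  var out = ['<?xml version="1.0" encoding="utf-8"?>', "<GDML>",
             '<country value="' + escapeXml(countryCode) + '">'];
  states.forEach(function (st) {
    out.push('<state value="' + escapeXml(st.code) + '">');
    // The DTD requires <description>, if present, to come before any <data>.
    if (st.description) {
      out.push("<description>" + escapeXml(st.description) + "</description>");
    }
    (st.data || []).forEach(function (d) {
      out.push('<data type="' + escapeXml(d.type) + '">');
      d.years.forEach(function (y) {
        out.push('<year value="' + escapeXml(y.year) + '">');
        y.months.forEach(function (m) {
          out.push('<month value="' + escapeXml(m[0]) + '">' +
                   escapeXml(m[1]) + "</month>");
        });
        out.push("</year>");
      });
      out.push("</data>");
    });
    out.push("</state>");
  });
  out.push("</country>", "</GDML>");
  return out.join("\n");
}
```

The element order produced here follows the content model of the DTD (description first, then data, year and month), so the output should validate once the appropriate DOCTYPE declaration is added.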
			
			

Rendering SVG via Javascript

GVG includes a map of the country in SVG format, consisting of all administrative regions, each identified by its abbreviation. Such a map can be produced with Inkscape, which can import a map in PDF format and save it as SVG. During the onload event of this SVG map, the application loads the GDML data file into a JavaScript XML DOM object. It then walks the DOM object and creates an associative object for each administrative region in the country, in which each statistical data set is stored in its own array, as a member of the object, according to the category of the data. Each object is named after the abbreviation of its administrative region (e.g. CA, TX, etc.). A function called render() is then fired to render every region in the map according to the objects created above. Taking the United States of America as an example: first, render() locates each state in the map using the SVG DOM function getElementById(); then, according to the data type selected for display, it fetches the data value from the corresponding object; finally, based on those values, a gradient fill is applied to each state, so that states are rendered differently and clearly show their varied situations.

An example of the constructor of the "State" object

var State = function (name){
	this.name = name;		// the name of the object as string
	this.UER = new Array();		// UER = unemployment rate; one array per state, not shared via the prototype
};
State.prototype.getName = function(){ 	 //return the name of state
		return this.name;
};
	.
	.
	.
		
			

The data is constructed as follows:

Use XPath to retrieve all <state> nodes in the GDML file and create an associated state object for each

Iterate through all <data> nodes in each state

Retrieve the data type by selecting the attribute @type

Iterate through all <month> nodes under each <year> node

Assign the value to the proper data array in the state object using a set function (e.g. setUER())

Push the state object created above to the global variable states (an array of all states)

The following JavaScript function performs the procedure above, here for a single state and data type:

function init(evt){	
		...
	gdml = loadXMLDoc("./GDML.xml");
	path = "/GDML/country[@value='US']/state[@value='CA']/data[@type='Unemployment Rate']/year[@value='2009']/month/text()";
	// code for IE
	if (window.ActiveXObject)
	{
		...
	}
	// code for Mozilla, Firefox, Opera, etc.
	else if (document.implementation && document.implementation.createDocument)
	{
		// request an ordered iterator so that the months arrive in document order
		var nodes = gdml.evaluate(path, gdml, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
		var result = nodes.iterateNext();
		var i = 1;		// month index, starting at 1
		while (result)
		{
			// ca is the State object for California (assumed to be created in the elided code above)
			ca.setUER(i++, result.nodeValue);
			result = nodes.iterateNext();
		}
	}
	states = new Array();
	states.push(ca); //push the State object into an array
}
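
The function above fills one state and one data type. A more general version of the same steps might look like the following sketch. It assumes that the <month> nodes have already been flattened into plain records (one per node, in document order), so it does not depend on how the XPath query is issued; the names StateData and buildStates and the record shape are hypothetical.

```javascript
// Hypothetical variant of the State object: one array of values per data type,
// keyed by the type string instead of a fixed member such as UER.
function StateData(name) {
  this.name = name;
  this.series = {};   // e.g. { "Unemployment Rate": [ , "9.7%", "10.2%", ... ] }
}
StateData.prototype.getName = function () { return this.name; };
StateData.prototype.set = function (type, index, value) {
  if (!this.series[type]) { this.series[type] = []; }
  this.series[type][index] = value;
};
StateData.prototype.get = function (type, index) {
  var s = this.series[type];
  return s ? s[index] : undefined;
};

// records: assumed flat rows, one per <month> node, in document order, e.g.
//   { state: "CA", type: "Unemployment Rate", year: "2009", month: "01", value: "9.7%" }
// Values are indexed from 1 per (state, type), like the i++ counter in init().
function buildStates(records) {
  var byName = Object.create(null);    // state abbreviation -> StateData
  var counters = Object.create(null);  // "state|type" -> last index used
  var states = [];
  records.forEach(function (r) {
    var st = byName[r.state];
    if (!st) {
      st = byName[r.state] = new StateData(r.state);
      states.push(st);                 // keep states in order of first appearance
    }
    var key = r.state + "|" + r.type;
    counters[key] = (counters[key] || 0) + 1;
    st.set(r.type, counters[key], r.value);
  });
  return states;
}
```

With this layout, adding a new data type to a GDML file needs no new setter such as setUER(); the type attribute itself selects the array.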
		
			

./USA.png

Figure 2: Rendering the map of USA

Javascript Event Handling

Now all states are identified and rendered. Dragging or clicking on the time slider bar below the map changes the gradient fills to reflect the data values for the selected time period. When the mouse is released after the slider has moved, the render() function is called again to re-render each state with the value for the newly selected period. As the slider moves and the gradient fills change, the trend becomes clearly visible.
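
How a slider position is translated into the month index passed to render() is not shown in this paper. One possible mapping is sketched below, assuming that the slider reports a number between a minimum and a maximum and that the time line covers a run of consecutive months; the function name sliderToMonth and all of its parameters are hypothetical.

```javascript
// Map a slider value to a 1-based month index plus the calendar year and month.
// Assumptions: the slider reports a number in [sliderMin, sliderMax], and the
// time line covers `count` consecutive months starting at startYear/startMonth.
function sliderToMonth(value, sliderMin, sliderMax, startYear, startMonth, count) {
  var t = (value - sliderMin) / (sliderMax - sliderMin);   // normalise to 0..1
  t = Math.min(1, Math.max(0, t));                         // clamp out-of-range input
  var index = Math.min(count, Math.floor(t * count) + 1);  // 1..count
  var absolute = (startMonth - 1) + (index - 1);           // months since January of startYear
  return {
    index: index,
    year: startYear + Math.floor(absolute / 12),
    month: (absolute % 12) + 1
  };
}
```

For a time line of 14 months starting in January 2009, the leftmost slider position gives index 1 (January 2009) and the rightmost gives index 14 (February 2010); the index is 1-based because render() treats 0 as a special case.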

The following gradient tag was used to produce the gradient-fill of each region.

  	<linearGradient id="orange_red" x1="0%" y1="100%" x2="0%" y2="0%" visibility="hidden">
		<stop offset="0%" style="stop-color:rgb(255,0,0)"/>
		<stop offset="0%" style="stop-color:rgb(255,255,0)"/>		
	</linearGradient>
		
			

The following JavaScript function re-renders the map as the slider moves:

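function render(month){
	if (month != 0){
		for (var i = 0; i < states.length; i++){
			var temp = document.getElementById(states[i].getName());	// the SVG element of the state
			var p = states[i].getUER(month);	// the value for the selected month, e.g. "9.7%"
			// move the first <stop> element in the document to the data value
			document.getElementsByTagName("stop")[0].setAttribute("offset", p);
			temp.setAttribute('style','fill:url(#orange_red)');
		}
	}
	else {
		...
	}
}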
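
In render() every state refers to the same orange_red gradient, so changing its stop offset inside the loop affects all states that use it. One way to let each state show its own fill level would be to define one gradient per state. The sketch below only builds the markup and the numbers; the id scheme "fill_<state>", the function names and the [min, max] scaling are assumptions, not part of GVG.

```javascript
// Parse a GDML month value such as "9.7%" into a number (9.7).
// Returns NaN if the text does not start with a number.
function parsePercent(text) {
  return parseFloat(String(text).replace("%", ""));
}

// Scale a value into a 0..100 fill level relative to an assumed [min, max]
// range, so that small differences between states remain visible.
function fillLevel(value, min, max) {
  if (max <= min) { return 0; }
  var level = (value - min) / (max - min) * 100;
  return Math.min(100, Math.max(0, level));
}

// Build one gradient definition per state (hypothetical id "fill_<state>").
// Both stops share the same offset, which gives a hard edge: red from the
// bottom of the shape up to the fill level, yellow above it.
function gradientMarkup(stateId, level) {
  var offset = level.toFixed(1) + "%";
  return '<linearGradient id="fill_' + stateId +
         '" x1="0%" y1="100%" x2="0%" y2="0%">' +
         '<stop offset="' + offset + '" style="stop-color:rgb(255,0,0)"/>' +
         '<stop offset="' + offset + '" style="stop-color:rgb(255,255,0)"/>' +
         '</linearGradient>';
}
```

A state element would then reference its own gradient with a style such as fill:url(#fill_CA) instead of the shared fill:url(#orange_red).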
		
			
./ca_out.png

Figure 3: An example of the output

In addition, clicking on a state activates a view with detailed information for that location. To the right of the map there is a rectangular box holding text data; JavaScript updates this detailed information in response to the onclick mouse event.

function detail(evt){
	var ele = evt.target;		// the state element that was clicked
	gdml = loadXMLDoc("./GDML.xml");
	path="/GDML/country[@value='US']/state[@value='"+ele.getAttribute('id')+"']/description/text()";
	
	// code for IE
	if (window.ActiveXObject)
	{
		...
	}
	// code for Mozilla, Firefox, Opera, etc.
	else if (document.implementation && document.implementation.createDocument)
	{
		var nodes=gdml.evaluate(path, gdml, null, XPathResult.ANY_TYPE,null);	// get the descriptive information from GDML file
		var result=nodes.iterateNext();
		var r = ca.getUER(sliderValue);		// value at the current slider position
		document.getElementById('21').firstChild.nodeValue = r;		
		document.getElementById('23').firstChild.nodeValue = ele.getAttribute("id");
		if (result)
		{
			var str = result.nodeValue.toString();
			var splits = Math.ceil(str.length / 40);	// number of 40-character lines needed
			emptyCity();		//empty the detail information box
			for (var i = 0; i < splits; i++){		//slice the long string to fit in
				var s = str.slice(i*40,(i+1)*40);
				showCity(s,(i+1).toString());  		//show details
			}
		}
	}	
}
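
The loop in detail() cuts the description every 40 characters, which can split a word across two lines. A possible refinement, given here only as a sketch with a hypothetical function name wrapText, breaks at spaces where it can:

```javascript
// Split text into lines of at most `width` characters, breaking at spaces
// where possible. A single word longer than `width` is cut into pieces.
function wrapText(text, width) {
  var words = String(text).split(/\s+/).filter(function (w) {
    return w.length > 0;
  });
  var lines = [];
  var current = "";
  words.forEach(function (word) {
    while (word.length > width) {          // hard-split overlong words
      if (current) { lines.push(current); current = ""; }
      lines.push(word.slice(0, width));
      word = word.slice(width);
    }
    if (!current) {
      current = word;
    } else if (current.length + 1 + word.length <= width) {
      current += " " + word;               // the word still fits on this line
    } else {
      lines.push(current);                 // start a new line
      current = word;
    }
  });
  if (current) { lines.push(current); }
  return lines;
}
```

Each returned line could then be passed to showCity() together with its 1-based line number, in place of the fixed 40-character slices.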
		
		
./detail_info.png

Figure 4: Detail information

5. Future Enhancement

6. Summary

GVG can deal with any statistical government data that we would like to understand better. Such data may be as straightforward as a sales spreadsheet or a fantasy football stats chart, or as vague as a cluttered email inbox, but a remarkable amount of it has social meaning beyond ourselves. When we share it in GDML format and visualise it via GVG, we understand it in new ways, which would certainly help us improve our society more easily.

Bibliography

Publications

[rathert-knowledge] Nikolas Rathert. Vladimir Geroimenko. Chaomei Chen. Visualising Information Using SVG and X3D. In Knowledge Visualisation Using Dynamic SVG Charts. Springer. 2005. P.245-255

[Probets01] Steve Probets. Julius Mong. David Evans. David Brailsford. Vector Graphics: From PostScript and Flash to SVG. Proceedings of the 2001 ACM Symposium on Document engineering. 2001. P.135-143

[eick94] Steve Eick. Data visualization sliders. Proceedings of the 7th annual ACM symposium on User interface software and technology. 1994. P.119-120

[robinson08] Robinson D. Yu H. Zeller P. Felten W. Government data and the invisible hand. Yale JL & Tech. 2008. P.160

[aulenback-svg] Shane Aulenback. Vladimir Geroimenko. Chaomei Chen. SVG as the visual interface to Web services. In Visualising Information Using SVG and X3D. Springer. 2005. P.85-98

Web Resources

[Google-Public] Google Public Data Explorer. http://www.google.com/publicdata/directory. Google Inc. 2010.

[IBM-Many] IBM Many Eyes. http://manyeyes.alphaworks.ibm.com/manyeyes/. IBM.

[DOL] United States Department of Labor. http://www.dol.gov. U.S. Department of Labor.

[carto:net-Slider] SVG slider. http://www.carto.net/svg/gui/slider. carto:net.

[Wikipedia-Map] Map of USA from Wikipedia. http://en.wikipedia.org/wiki/File:Map_of_USA_with_state_names.svg. Wikipedia.

[Refsnes-DTD] DTD Tutorial. http://www.w3schools.com/dtd/default.asp. Refsnes Data.