Master Data Management, Data Governance, Data Quality

Master Data Management (MDM) is the set of technology, tools, and processes an organization needs to create and maintain a consistent and accurate inventory of its data.

Master data usually refers to non-transactional data within the organization and can include customers, suppliers, employees, products, partners, accounts, etc.

Business Intelligence

One of the biggest problems organisations have nowadays is that they are rich in data but poor in insights. They may be collecting data from different sources (channels, processes, customers, etc.) but they don’t really know how to transform it into valuable information and then into strategies. To improve their ability to analyse data and gain insights, companies should apply Business Intelligence.

Trying R

Try R

R is a tool for statistics and data modeling. It can be used for descriptive statistics, inferential statistics and plotting charts.

The Try R course that I have completed contains 7 sections. Every section is broken down into several parts that give you some context on how R can be used, and you can practise on the provided examples; you need to complete each part before moving on to the next one.

Irish Population Heatmap

Irish Population Heatmap presenting population per county based on 2011 census data.

Ireland population heatmap

How the heatmap was achieved

To create a heatmap you need to log in to Google Drive and add new files into “Google Fusion Tables”. You need two sets of data in order to create the heatmap:

  • the population of Ireland per county (obtained from the Central Statistics Office),
  • the KML map showing the counties of Ireland.

The data showing the population of Irish counties was copied into Excel and needed some tidying up before it could be used:

  • the data listed population by province and then by county, and some counties were further broken down into the main city and the rest of the county. I saved the province data in a separate tab in Excel and, since I only needed the population of each whole county, removed the unnecessary rows;
  • Tipperary was split into South and North, so I combined the two;
  • County Laois was misspelled and had to be corrected to match the KML file data;
  • the male population of Waterford contained an error (54K instead of 56K), which had to be corrected.
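The same tidy-up could also be scripted rather than done by hand. The following is a minimal Python sketch, not the method used for this post: the row values are placeholders rather than census figures, and the misspelling "Laios" is a hypothetical stand-in, since the actual misspelling is not recorded here.

```python
# Illustrative rows only: the populations are placeholders, not census figures.
raw_rows = [
    {"county": "North Tipperary", "population": 70},
    {"county": "South Tipperary", "population": 90},
    {"county": "Laios", "population": 80},  # hypothetical misspelling of Laois
    {"county": "Carlow", "population": 50},
]

# Assumed name corrections so that every name matches the KML file.
NAME_FIXES = {
    "North Tipperary": "Tipperary",
    "South Tipperary": "Tipperary",
    "Laios": "Laois",
}


def tidy(rows):
    """Normalise county names and sum rows that map to the same county."""
    totals = {}
    for row in rows:
        name = NAME_FIXES.get(row["county"], row["county"])
        totals[name] = totals.get(name, 0) + row["population"]
    return totals


county_totals = tidy(raw_rows)
print(county_totals)
```

Keeping the corrections in one lookup table means the North/South Tipperary merge and the spelling fix are handled by the same step.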

After tidying up the file I uploaded it to Google Fusion Tables along with the KML file, and then merged the two using File > Merge and selecting the relevant columns.
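To illustrate why the county names have to match exactly before merging, here is a small Python sketch of a join on the county name. It does not reproduce what Fusion Tables does internally; the function name, the placeholder populations, the stand-in geometry strings and the "Laios" misspelling are all assumptions made for the example.

```python
def merge_on_county(population, shapes):
    """Join two dicts keyed by county name and report names found in only one."""
    merged = {
        name: {"population": population[name], "geometry": shapes[name]}
        for name in population
        if name in shapes
    }
    # Names present in only one of the two tables cannot be matched.
    unmatched = sorted(set(population) ^ set(shapes))
    return merged, unmatched


# Placeholder data: populations are not census figures and the geometry
# strings merely stand in for the KML polygons.
population = {"Laios": 80, "Carlow": 50}  # "Laios" is a hypothetical misspelling
shapes = {"Laois": "polygon-1", "Carlow": "polygon-2"}

merged, unmatched = merge_on_county(population, shapes)
print(sorted(merged))
print(unmatched)
```

Listing the unmatched names before merging is a quick way to spot spelling differences between the two files.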

Once the data from the two files was merged, I could edit the map in the Map of Geometry tab, using Tools > Change map and then Change Feature Styles to set up the colours and buckets, add the legend, etc. At first the automatically set buckets did not produce a good heatmap, with too many counties in one bucket and too few in others; the buckets needed to be adjusted so that the heatmap would be easier to read.
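I do not know how Fusion Tables chooses its default buckets, but one common cause of lopsided buckets is cutting the value range into equal-width slices when one value is much larger than the rest. The Python sketch below compares equal-width buckets with rank-based (quantile) buckets; the values are placeholders, not county populations, and both helper functions are illustrative.

```python
def equal_width_bucket(value, low, high, n_buckets):
    """Bucket index when the range low..high is cut into equal-width slices."""
    if value >= high:
        return n_buckets - 1
    width = (high - low) / n_buckets
    return int((value - low) // width)


def quantile_buckets(values, n_buckets):
    """Bucket index per value so that each bucket holds a similar count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, i in enumerate(order):
        buckets[i] = rank * n_buckets // len(values)
    return buckets


# Placeholder values with one large outlier.
values = [10, 12, 15, 18, 20, 25, 30, 200]
low, high = min(values), max(values)

by_width = [equal_width_bucket(v, low, high, 4) for v in values]
by_rank = quantile_buckets(values, 4)

print(by_width)  # [0, 0, 0, 0, 0, 0, 0, 3]
print(by_rank)   # [0, 0, 1, 1, 2, 2, 3, 3]
```

With equal-width slices, seven of the eight values share the lowest bucket, whereas rank-based buckets spread them evenly, which is closer to the manual adjustment described above.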

What could be gleaned from the heatmap

From the heatmap you can see which parts of Ireland are the most and the least populated. Clearly the most populated counties are Dublin, Cork and Galway, and the map also shows that their neighbouring counties tend to have higher populations, while the counties in the midlands generally have lower populations. Interestingly, Donegal is one of the counties with a higher population.

These details could give us an idea of where roads are needed.

Cork, which is the most populated county after Dublin, has a motorway to Dublin, but it might be a good idea to build a motorway between Cork and Galway, the third biggest county by population. Also, if you are going from Cork to Belfast, you need to head to Dublin, take the M50 and from there the M1; a better connection between the south and north of the country would be more convenient and would let you avoid the heavy traffic around Dublin. Donegal, a county with a relatively high population, does not have a good road connection with other parts of the country, so a better connection with Galway and Dublin could also be a good idea.

What other ideas/concepts could be represented in the heatmap

The above data and analysis are based on the population per county and do not include actual density (number of people per square kilometre), which could give us a different picture: the midland counties have smaller populations, but their areas are also smaller than those of Cork or Donegal. Depending on what information you want to obtain, you might want to look at things differently.
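As a small illustration of how density can change the ranking, here is a Python sketch. The two counties and their figures are hypothetical placeholders chosen only to show the effect.

```python
def density(population, area_km2):
    """People per square kilometre."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return population / area_km2


# Hypothetical counties: (population, area in square kilometres).
counties = {
    "County A": (1000, 50.0),
    "County B": (600, 10.0),
}

ranked_by_population = sorted(
    counties, key=lambda name: counties[name][0], reverse=True
)
ranked_by_density = sorted(
    counties, key=lambda name: density(*counties[name]), reverse=True
)

print(ranked_by_population)  # ['County A', 'County B']
print(ranked_by_density)     # ['County B', 'County A']
```

County A has more people, but County B is three times as dense, so a density heatmap would colour the two the other way round.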

Heatmaps can also be used to create intensity maps from other data, such as people’s earnings or investments, density (as mentioned above), or specific infrastructure and facilities in various areas, which could be useful, for example, when making decisions on investments in new locations.

Further analysis using Excel Pivot Tables and Charts

Pivot Chart which breaks down the population by gender

Males and females


Pivot Table which shows the total population by province

Population by province


Pivot Table which shows what percentage of the population each county accounted for in 2011
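For readers who prefer code to Excel, the two pivot-table calculations above (totals by province and each county's share of the total) can be sketched in Python as follows. The populations are made-up round numbers, not 2011 census figures, and the function names are only illustrative.

```python
from collections import defaultdict

# Placeholder rows: (county, province, population). The populations are
# made-up round numbers for illustration, not 2011 census figures.
rows = [
    ("Dublin", "Leinster", 400),
    ("Carlow", "Leinster", 100),
    ("Cork", "Munster", 300),
    ("Galway", "Connacht", 200),
]


def total_by_province(rows):
    """Sum population per province, like a pivot table grouped by province."""
    totals = defaultdict(int)
    for _county, province, population in rows:
        totals[province] += population
    return dict(totals)


def share_by_county(rows):
    """Percentage of the overall population that each county accounts for."""
    grand_total = sum(population for _county, _province, population in rows)
    return {
        county: round(100 * population / grand_total, 1)
        for county, _province, population in rows
    }


province_totals = total_by_province(rows)
county_shares = share_by_county(rows)

print(province_totals)
print(county_shares)
```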