SZÉKELYDATA: Economy of Székelyland Infographics

Dear friends! Today we jointly launched a data blog. You can find it on today’s front page of Transindex: SZÉKELYDATA – Transylvania, Székelyland and the wider world in the age of Big Data. The first post, about the largest companies of Székelyland, can be read here.

Hey all! This post has moved to my brand-new Hungarian-language data blog about Transylvania and Székelyland. Occasionally I will translate hot posts into English and cross-post them here too!

Map of Székelyland companies by industry and income

How Social Media Won the 2014 Romanian Presidential Elections for Klaus Iohannis?

A Hungarian version of this post also exists.

The 2014 Romanian presidential elections have been (most certainly) won by Klaus Iohannis. Compared to the other candidate, Victor Ponta, he has generally been regarded as a quiet and pragmatic leader, and he didn’t even have a Facebook page until recently. However, when after the first round of the elections it seemed that Ponta would almost certainly win, the Romanian online communities – driven mainly by the social-media generation and intellectuals in the large cities of Transylvania and the capital, Bucharest – started a massive pro-Iohannis campaign. I experienced this first hand, as more than 50% of my Facebook friends are from Romania – every Romanian who opened Facebook or Twitter in the past two weeks knows what I’m talking about. I believe that this (among a gazillion other things of course 🙂) eventually led to the victory of Iohannis. Let us look at the social media statistics!

Ponta Iohannis share

Percentage of votes by poll date. November 2 and 16 (stretching into the 17th) are actual voting percentages. (sources 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)



Converting binstar Linux packages to Windows for Anaconda and Python

If you are one of those (growing number of) weirdos who use the Anaconda distribution of Python on Windows instead of Linux for data science, you might find this post useful. Countless times I have needed to install an extra Python library for something I wanted to do. The usual process involves using conda, Anaconda’s default package manager, and typing

C:\> conda install mybeautifullibrary

and that’s about it. Unfortunately, in reality something always breaks…

Ice Stupa Dynamics – Part 1

A good friend of mine has been pioneering an innovative solution to the water scarcity problem of the villages in the high deserts of Ladakh in the Indian Himalayas. The idea is to grow artificial glaciers – ice stupas, named after the Buddhist sanctuaries called stupas – that store the water which would otherwise flow away during the winter, keeping it for the spring, when water is scarce for the villagers.

The basic principle is remarkably simple, yet the true physics of glacier-growing represents a multifaceted challenge. I have decided to create a series of descriptive simulation models to illustrate how the ice stupas will grow and melt. The models are written in the AnyLogic design language, which is built around Java, and provided that you have the Java plugin installed in your browser, you can try playing with the current model here.

This model demonstrates how temperature affects the growth and shrinkage of the artificial glacier, which is essentially a conical ice mountain. Artificial glacier growth is controlled by two parameters: the glacial feed pipe diameter and the glacial flow speed, which is proportional to the altitude difference between the feed pipe head at the real glacier and the location of the artificial glacier.
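To make the two control parameters concrete, here is a minimal toy sketch in JavaScript. It is not the AnyLogic model: every function name, constant and simplification (a cone with a fixed height-to-radius ratio, all feed water freezing below 0 °C, melting linear in temperature above it) is an assumption made purely for illustration.

```javascript
// Toy model of a conical ice stupa. All names and constants are hypothetical.

// Volumetric feed through a circular pipe: cross-section area times flow speed (m^3/s).
function feedRate(pipeDiameterM, flowSpeedMps) {
  const radius = pipeDiameterM / 2;
  return Math.PI * radius * radius * flowSpeedMps;
}

// Height of a cone of the given volume whose height-to-radius ratio is fixed.
// From V = (1/3) * pi * r^2 * h with r = h / aspect: h = cbrt(3 * V * aspect^2 / pi).
function coneHeight(volumeM3, aspect) {
  return Math.cbrt((3 * volumeM3 * aspect * aspect) / Math.PI);
}

// Advance the ice volume by one time step of dtS seconds.
function stepVolume(volumeM3, tempC, dtS, params) {
  const { pipeDiameterM, flowSpeedMps, meltRateM3PerDegS } = params;
  const change = tempC < 0
    ? feedRate(pipeDiameterM, flowSpeedMps) * dtS // assumption: all feed water freezes
    : -meltRateM3PerDegS * tempC * dtS; // assumption: melt is linear in temperature
  return Math.max(0, volumeM3 + change); // the volume cannot go negative
}

// Example run with made-up values and one-hour time steps.
const params = { pipeDiameterM: 0.05, flowSpeedMps: 1.5, meltRateM3PerDegS: 1e-4 };
let volume = 0;
for (const tempC of [-10, -5, -2, 3, 8]) {
  volume = stepVolume(volume, tempC, 3600, params);
  console.log(tempC, volume.toFixed(2), coneHeight(volume, 1).toFixed(2));
}
```

Doubling the pipe diameter quadruples the feed rate, while doubling the flow speed only doubles it, which is one reason to treat the two parameters separately.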

Sankey Diagram Generator

Sankey Demo

Check out the Sankey Diagram Generator I have just made. It supports self-loops, moving nodes around in both horizontal and vertical directions, and loading and saving diagrams! You can also change the opacity and the density of the links.

Use the Load/Save button to edit or create complex Sankeys.

The source code for the Sankey displayed above is:

{"nodes":[{"name":"Oil"},{"name":"Natural Gas"},{"name":"Coal"},{"name":"Fossil Fuels"},{"name":"Electricity"},{"name":"Energy"}],"links":[{"source":0,"target":3,"value":15},{"source":1,"target":3,"value":20},{"source":2,"target":3,"value":25},{"source":2,"target":4,"value":25},{"source":3,"target":5,"value":60},{"source":4,"target":5,"value":25},{"source":4,"target":4,"value":5}]} 
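If you want to sanity-check such a string before loading it, a small script can add up what flows into and out of each node. The sketch below assumes only the nodes/links shape shown above; the function name nodeFlows is hypothetical and is not part of the generator.

```javascript
// Sketch: sum the inflow and outflow of every node in a Sankey structure string.
// Only the {"nodes": [...], "links": [...]} shape comes from the example above.
function nodeFlows(sankeyJson) {
  const { nodes, links } = JSON.parse(sankeyJson);
  const flows = nodes.map((node) => ({ name: node.name, inflow: 0, outflow: 0 }));
  for (const { source, target, value } of links) {
    // source and target are indices into the nodes array
    if (!flows[source] || !flows[target]) {
      throw new RangeError(`link ${source} -> ${target} points at a missing node`);
    }
    flows[source].outflow += value;
    flows[target].inflow += value; // a self-loop counts on both sides
  }
  return flows;
}

const example = '{"nodes":[{"name":"Oil"},{"name":"Natural Gas"},{"name":"Coal"},{"name":"Fossil Fuels"},{"name":"Electricity"},{"name":"Energy"}],"links":[{"source":0,"target":3,"value":15},{"source":1,"target":3,"value":20},{"source":2,"target":3,"value":25},{"source":2,"target":4,"value":25},{"source":3,"target":5,"value":60},{"source":4,"target":5,"value":25},{"source":4,"target":4,"value":5}]}';
console.log(nodeFlows(example));
```

For the diagram above this reports, for instance, that Fossil Fuels receives 60 and passes on 60, so the node is balanced.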

You can also use the keyword layer to fix a node’s position along the x-axis. In the above example, if you use "layer": 3 for the node Fossil Fuels and "layer": 4 for Electricity, the two will no longer be aligned: Electricity is placed to the right of Fossil Fuels.

{"nodes":[{"name":"Oil"},{"name":"Natural Gas"},{"name":"Coal"},{"name":"Fossil Fuels","layer":3},{"name":"Electricity","layer":4},{"name":"Energy"}],"links":[{"source":0,"target":3,"value":15},{"source":1,"target":3,"value":20},{"source":2,"target":3,"value":25},{"source":2,"target":4,"value":25},{"source":3,"target":5,"value":60},{"source":4,"target":5,"value":25},{"source":4,"target":4,"value":5}]} 


You can also fix the size of a node using the keyword value.

{"nodes":[{"name":"Oil"},{"name":"Natural Gas"},{"name":"Coal"},{"name":"Fossil Fuels","layer":3,"value":10},{"name":"Electricity","layer":4},{"name":"Energy"}],"links":[{"source":0,"target":3,"value":15},{"source":1,"target":3,"value":20},{"source":2,"target":3,"value":25},{"source":2,"target":4,"value":25},{"source":3,"target":5,"value":60},{"source":4,"target":5,"value":25},{"source":4,"target":4,"value":5}]} 


UPDATE: Using the keyword fill, you can also color the nodes (and automatically their links).

{"nodes":[{"name":"Oil"},{"name":"Natural Gas"},{"name":"Coal","fill":"black"},{"name":"Fossil Fuels","layer":3,"value":10},{"name":"Electricity","layer":4},{"name":"Energy"}],"links":[{"source":0,"target":3,"value":15},{"source":1,"target":3,"value":20},{"source":2,"target":3,"value":25},{"source":2,"target":4,"value":25},{"source":3,"target":5,"value":60},{"source":4,"target":5,"value":25},{"source":4,"target":4,"value":5}]} 


UPDATE 2: Today the Sankey Diagram Generator got a major update: I have been working on the load and save functions to include the layout. Another minor update is that you can now toggle the node labels (both the text and values) on and off. On top of giving you the option to save the Sankey structure and layout, you can now also save the diagram as a PNG image.

Now, when trying to save the Sankey code, a checkbox shows up next to the Done button, giving you the option to save the Sankey layout for loading later. This includes the node and link positions, as well as the settings for opacity and density. This is a major milestone, as it has been a headache to redesign Sankeys: previously only the structure was saved but not the layout. On the save screen you now also have the option to download the diagram as an image.

As a result of these modifications, the structure of the Sankey save string changed a little. In order to preserve backward compatibility, the Sankey structure code – which made up the entirety of the save string up until now – was put under the key "sankey", the parameters on whether to display labels, density and opacity under the key "params" and finally, the layout, if selected, under "fixedlayout". Subsequently, when loading a Sankey save string back, you are given the option to try to read the layout from the string. If you do not provide the layout, or you choose to ignore it (via a checkbox next to the Done button), the algorithm computes the layout for you automatically, as before.
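If you process save strings in your own scripts, one way to handle both formats is sketched below. Only the three top-level keys ("sankey", "params", "fixedlayout") come from the description above; the function name, the returned shape and the use of null for missing parts are assumptions, and this is not the generator's own loading code.

```javascript
// Sketch: read either a legacy save string (structure only) or a newer one.
// Top-level keys are taken from the text; everything else is hypothetical.
function readSaveString(saveString) {
  const parsed = JSON.parse(saveString);
  const isNewFormat =
    parsed !== null && typeof parsed === "object" && "sankey" in parsed;
  if (isNewFormat) {
    return {
      sankey: parsed.sankey,
      params: parsed.params ?? null,
      fixedlayout: parsed.fixedlayout ?? null, // absent when no layout was saved
    };
  }
  // Legacy string: the structure made up the entire save string.
  return { sankey: parsed, params: null, fixedlayout: null };
}

const legacy = '{"nodes":[{"name":"Oil"},{"name":"Energy"}],"links":[{"source":0,"target":1,"value":15}]}';
console.log(readSaveString(legacy).sankey.nodes.length);
```

A caller can then fall back to the automatic layout whenever fixedlayout is null.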

UPDATE 3: Some users have asked to be able to create Sankeys with multiple flows between the same two nodes, which can be interpreted as parallel links. A small number of parallel links is not a problem, but with many of them the relaxation algorithm fails to lay out the links correctly. This algorithm is at the core of the rendering and is therefore hard to change, as it is designed to optimize the layout in general (it minimizes the total link path length in the connected component). However, I have included an experimental feature to correctly display Sankeys with many parallel links, although it will not necessarily offer the best layout for regular Sankeys. In sankey.js there is a function called computeLinkDepths, which sorts the links in ascending order at the sources and subsequently at the targets. This leads to some ordering conflicts (two source nodes going to the same target, with links of different value, will not both have their top link at the top at the target), which are then resolved so that as few conflicts as possible remain. In the case of parallel links, this creates a messy layout. To solve this, I have included a toggle for parallel rendering. This is a bit of an advanced feature, and I encourage you to try to understand the sankey.js code structure before you turn it on.

How to turn it on? First, make sure that in your input string all parallel links are sorted by value and grouped by source node. An example of this would be:


Then open up a console in your browser and turn on parallel rendering by typing the following:


Then hit the Draw Sankey button to redraw the diagram, and you should see your Sankey with many parallel links laid out correctly. You can turn it off either by refreshing the page or by setting it back to false in the console. Have fun!
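The preparation step mentioned above (parallel links grouped by source node and sorted by value) can be done by hand, or with a small helper like the sketch below. The function name is hypothetical, and both the ascending order and the additional grouping by target are assumptions, since the text only says "sorted by value".

```javascript
// Sketch: reorder the links of a Sankey structure so that links sharing a
// source sit together and parallel links (same source and target) are sorted
// by value. Ascending order is an assumption.
function sortParallelLinks(sankey) {
  const links = [...sankey.links].sort(
    (a, b) => a.source - b.source || a.target - b.target || a.value - b.value
  );
  return { ...sankey, links }; // the input object is left untouched
}

const demo = {
  nodes: [{ name: "Coal" }, { name: "Electricity" }, { name: "Energy" }],
  links: [
    { source: 1, target: 2, value: 5 },
    { source: 0, target: 2, value: 9 },
    { source: 0, target: 2, value: 3 },
    { source: 0, target: 1, value: 4 },
  ],
};
console.log(JSON.stringify(sortParallelLinks(demo)));
```

The printed string can then be pasted into the Load/Save dialog as usual.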

UPDATE 4: Added an option for adjusting decimals for nodes and links, as well as a counter for the node editor on the right, making it easier to create larger diagrams.

Made with D3.js & Dragdealer. If you would like to show your support for my work, please consider a small donation. For a more advanced, applied implementation of this tool, see the Food Energy Flows Exploratorium.
