Extract data from an image or a PDF


You may come across a PDF file on the internet that contains a graph but no data in numeric format.

This is an example from the European Parliamentary Research Service Blog:

Annual public expenditure on tertiary education as a % of GDP, 2008


The problem is that the depicted data cannot be used directly: you would have to print the image, use a ruler, do the math and settle for some approximate values... unless...

Mr. Ankit Rohatgi has developed a fantastic tool to capture data out of an image: http://arohatgi.info/WebPlotDigitizer/app/?

It works quite well and is easy to use.

Here are some snapshots of the process of acquiring data from the graph above:


You have to define the type of graph and calibrate the axes so that the tool can do the math on its own...
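For a linear axis, the math behind this calibration step is a linear interpolation between two reference points whose values you know. The following is only a minimal sketch of that idea, not WebPlotDigitizer's actual code: the function name `make_axis_mapper` and all pixel coordinates are hypothetical, and a linear (not logarithmic) axis is assumed.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Build a pixel -> data value converter for one linear axis.

    (pixel_a, value_a) and (pixel_b, value_b) are two calibration points:
    a pixel position on the image and the axis value read at that position.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration points must be at different pixels")
    # Data units per pixel; negative when the axis grows towards smaller
    # pixel numbers, as it does for the vertical axis of most images.
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


# Hypothetical calibration: the 0.0 tick sits at pixel row 400
# and the 2.5 tick at pixel row 100.
to_value = make_axis_mapper(400, 0.0, 100, 2.5)

# A bar whose top edge is at pixel row 220 (also a made-up number).
bar_height = to_value(220)
print(round(bar_height, 3))
```

The accuracy of every captured value depends on how precisely the two calibration points are placed, so it pays to pick ticks that are far apart.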


You highlight the points you want to capture...

and press the button to obtain the data...


and you can export the data. It is now simple to assign the correct labels to the bars; you can even do it directly in the tool.

The data is now available for use in your models, documents, etc.

Bar0, 2.406811582426196
Bar1, 1.8897990839305203
Bar2, 1.8656743437801189
Bar3, 1.820263794272149
Bar4, 1.5101476346872085
Bar5, 1.4981932674139093
Bar6, 1.4831993927156741
Bar7, 1.3678184565473874
Bar8, 1.322410992833757
Bar9, 1.2526627833659236
Bar10, 1.2255077931736058
Bar11, 1.2196447839275366
Bar12, 1.2168243679007436
Bar13, 1.1227323268843656
Bar14, 1.1047112879385552
Bar15, 1.0714556823359878
Bar16, 1.0351636551028254
Bar17, 1.0323679254307496
Bar18, 1.026495658801663
Bar19, 1.0206326495555942
Bar20, 0.9995566742131481
Bar21, 0.951112788868922
Bar22, 0.9452497796228547
Bar23, 0.927216397499683
Bar24, 0.8635502886760615
Bar25, 0.8394378917030194
Bar26, 0.839650811512483
Bar27, 0.7820853180991101
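If you prefer to assign the labels outside the tool, a few lines of Python are enough. This is a minimal sketch that assumes the export is plain "name, value" text as listed above; the function `parse_export` and the labels "Country A", "Country B" and "Country C" are placeholders, since the real labels have to be read off the original graph. The values are rounded to two decimals because a digitized reading is not reliable beyond that.

```python
import csv
import io

# First three rows of the export shown above.
exported = """Bar0, 2.406811582426196
Bar1, 1.8897990839305203
Bar2, 1.8656743437801189
"""


def parse_export(text, labels=None, digits=2):
    """Parse "name, value" lines into a list of (label, rounded value)."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    # Skip empty lines, then number the remaining rows from 0.
    for index, row in enumerate(r for r in reader if r):
        name = row[0].strip()
        value = float(row[1])
        # Replace the generic BarN name when a label is provided for it.
        if labels is not None and index < len(labels):
            name = labels[index]
        rows.append((name, round(value, digits)))
    return rows


# Placeholder labels: substitute the names printed on the original graph.
data = parse_export(exported, labels=["Country A", "Country B", "Country C"])
for label, value in data:
    print(f"{label}: {value}")
```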


Andrea Terzaghi

Geneva (CH)
Engineer, pilgrim of the world and of knowledge, curious about everything
