Access to Eurostat Datasource

Create a DataStore for Eurostat by passing 'eurostat' at initialization.

In [1]: import pyopendata as pyod

In [2]: store = pyod.DataStore('eurostat')

In [3]: store
Out[3]: EurostatStore (http://www.ec.europa.eu/eurostat/SDMX/diss-web/rest)

Get the "Employed doctorate holders in non managerial and non professional occupations by fields of science" data. The result is a DataFrame with a DatetimeIndex as its index and a MultiIndex of attributes and countries as its columns. The target URL is:

http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=cdh_e_fos&lang=en

The URL above can be read as:

Resource ID: cdh_e_fos
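
The way the resource ID is read from the URL can be sketched with the standard library. The helper name resource_id_from_url is hypothetical and not part of pyopendata; it only assumes that the ID is carried in the dataset query parameter, as in the URL above.

```python
from urllib.parse import urlparse, parse_qs


def resource_id_from_url(url):
    """Return the value of the ``dataset`` query parameter.

    Hypothetical helper for illustration; not part of pyopendata.
    """
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each parameter name to a list of values
    values = query.get('dataset')
    if not values:
        raise ValueError('no dataset parameter in URL: %r' % url)
    return values[0]


url = 'http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=cdh_e_fos&lang=en'
resource_id = resource_id_from_url(url)
print(resource_id)
```

The returned string is the value to pass to store.get.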

In [4]: resource = store.get('cdh_e_fos')

In [5]: resource
Out[5]: EurostatResource (http://www.ec.europa.eu/eurostat/SDMX/diss-web/rest/data/cdh_e_fos/?)

In [6]: df = resource.read();

In [7]: df
Out[7]: 
UNIT               Percentage                             \
Y_GRAD                  Total                              
FOS07        Natural sciences                              
GEO                   Austria  Belgium  Bulgaria  Cyprus   
FREQ                   Annual   Annual    Annual  Annual   
TIME_PERIOD                                                
2006-01-01                NaN      NaN         0     100   
2009-01-01                NaN      NaN       NaN     NaN   

UNIT                                                           \
Y_GRAD                                                          
FOS07                                                           
GEO          Germany (until 1990 former territory of the FRG)   
FREQ                                                   Annual   
TIME_PERIOD                                                     
2006-01-01                                              36.65   
2009-01-01                                                NaN   

UNIT                               ...                                         \
Y_GRAD                             ...                         1990 and after   
FOS07                              ...                          Not specified   
GEO                                ...                                 Russia   
FREQ                               ...                                 Annual   
TIME_PERIOD                        ...                                          
2006-01-01                         ...                                    NaN   
2009-01-01                         ...                                    NaN   

UNIT                                                  
Y_GRAD                                                
FOS07                                                 
GEO          Sweden  Slovenia  Turkey  United States  
FREQ         Annual    Annual  Annual         Annual  
TIME_PERIOD                                           
2006-01-01      NaN       NaN     NaN            NaN  
2009-01-01      NaN       NaN     NaN            NaN  

[2 rows x 336 columns]

You can access specific data by slicing the columns.

In [8]: usa = df['Percentage']['Total']['Natural sciences']['United States']

In [9]: usa
Out[9]: 
FREQ         Annual
TIME_PERIOD        
2006-01-01    46.31
2009-01-01    39.65
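
The chained selection above can also be written as a single lookup. The following is a minimal sketch using plain pandas on a small stand-in frame, so it runs without network access; the stand-in keeps only three of the countries, with values copied from the output above, and is an assumption about the column layout rather than the full dataset.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame returned by resource.read(); level names and
# values are copied from the output above, three countries only.
columns = pd.MultiIndex.from_tuples(
    [
        ('Percentage', 'Total', 'Natural sciences', 'Bulgaria', 'Annual'),
        ('Percentage', 'Total', 'Natural sciences', 'Cyprus', 'Annual'),
        ('Percentage', 'Total', 'Natural sciences', 'United States', 'Annual'),
    ],
    names=['UNIT', 'Y_GRAD', 'FOS07', 'GEO', 'FREQ'],
)
index = pd.DatetimeIndex(['2006-01-01', '2009-01-01'], name='TIME_PERIOD')
df = pd.DataFrame(
    [[0.0, 100.0, 46.31], [np.nan, np.nan, 39.65]],
    index=index,
    columns=columns,
)

# A tuple selects several leading column levels in one lookup
usa = df[('Percentage', 'Total', 'Natural sciences', 'United States')]

# xs() selects by level name, regardless of the position of that level
usa_by_name = df.xs('United States', axis=1, level='GEO')
```

The tuple form avoids building an intermediate frame for each level, while xs is convenient when only the level name (here GEO) is known and the outer levels should be kept.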