Why this application?
At Antidot, we wanted to contribute to this trend by showing that there is strong interest in developing new ways to mesh data from different sources through semantic web standards, and that tools like our Antidot Information Factory solution can perform that task easily, in an industrial approach.
And because France remains, year after year, the world's leading tourist destination, and because our regions are full of architectural treasures and heritage, we chose to build a search application that lets you explore nearly 44,000 French historical monuments.
You can try it here; the user interface is in French.
Some technical explanations
Our « Monuments historiques » application was built by exploiting seven open data sources:
- the list of protected buildings available on data.gouv.fr. This data source describes 43,720 monuments in a CSV file.
- the list of passenger stations of the national rail network, with geocoded information, as provided by data.gouv.fr. This data source describes 3,065 stations in an XLS file. It is used to locate buildings near a train station.
- the list of Paris metro stations, with geocoded information, provided by OpenStreetMap. This data source describes 301 stations and is used to locate buildings near a metro station.
- data from the Official Geographic Code (COG) from INSEE. This data source describes 22 regions, 99 departments, more than 4,000 townships and administrative centers in an RDF graph.
- photos of historical monuments from Wikipedia, offered by Wikimedia Commons. This data source, fed in particular by the Wiki Loves Monuments competition, provides 122,828 pictures for 12,586 monuments identified by their PA code: a unique code for each monument, which is also present in the list of protected buildings (the first source above).
- the description of the historical monuments from Wikipedia, provided by DBpedia. This RDF data source describes 3.64 million objects, including 413,000 places. It is directly reachable from the information in Wikimedia Commons.
- Yahoo! location information via Yahoo! PlaceFinder. This source is used to geotag monuments from their address when they are not already geotagged in Wikimedia Commons or DBpedia.
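To illustrate how geocoded stations can be used to locate buildings near a train or metro station, here is a minimal Python sketch. It is not the method used by Antidot Information Factory, which is not described here: the haversine formula, the function names, the distance threshold and the coordinates are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(monument, stations, max_km=1.0):
    """Return (name, distance_km) of the closest station within max_km, else None."""
    best = None
    for name, lat, lon in stations:
        distance = haversine_km(monument[0], monument[1], lat, lon)
        if distance <= max_km and (best is None or distance < best[1]):
            best = (name, distance)
    return best


# Made-up coordinates, not taken from the actual data sources.
stations = [("Station A", 48.8440, 2.3740), ("Station B", 48.8800, 2.3550)]
monument = (48.8530, 2.3690)
result = nearest_station(monument, stations, max_km=2.0)
```

With 43,720 monuments and about 3,400 stations, this brute-force comparison remains tractable; a spatial index would only become necessary for much larger sets.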
The data processing workflow created for this application with Antidot Information Factory is as follows:
- Cleansing, normalizing and RDF transformation of CSV and XLS files from data.gouv.fr using Google Refine.
- Data collecting from Wikimedia Commons: an Antidot Information Factory processing workflow collects information through the Wikimedia API and transforms it into RDF. Antidot Information Factory made it possible to build this workflow without writing a single line of code, simply by assembling modules taken from a library of 50 ready-to-use building blocks.
- Data collecting from the OpenStreetMap API for metro stations.
- Collecting all needed geocoding information from Yahoo! PlaceFinder API, for non-natively geotagged places.
- Meshing all the data: the output is an RDF graph containing more than 4.5 million triples, about 450,000 of which are inferred from the sources.
- This triple store is then the sole input for the indexing module of our Antidot Finder Suite search engine.
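The actual workflow relies on Google Refine and on ready-made Antidot Information Factory modules, not on hand-written code. Purely to illustrate the idea behind the RDF transformation and meshing steps, the following Python sketch turns CSV rows into N-Triples lines and merges two sources on a shared PA code. The namespaces, column names and sample values are hypothetical.

```python
import csv
import io

BASE = "http://example.org/monument/"   # hypothetical namespace for subjects
PROP = "http://example.org/property/"   # hypothetical namespace for predicates


def literal(value):
    """Serialize a string as an N-Triples literal, escaping special characters."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def rows_to_triples(csv_text, key_column):
    """Yield one N-Triples line per non-empty cell; key_column identifies the subject."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[key_column]}>"
        for column, value in row.items():
            if column != key_column and value:
                yield f"{subject} <{PROP}{column}> {literal(value.strip())} ."


# Made-up sample data: both sources identify the monument by the same PA code.
monuments_csv = "pa_code,name,city\nPA00000001,Example Chapel,Exampleville\n"
photos_csv = "pa_code,photo\nPA00000001,Example_Chapel.jpg\n"

# Meshing: triples from both sources attach to the same subject IRI.
graph = set(rows_to_triples(monuments_csv, "pa_code"))
graph |= set(rows_to_triples(photos_csv, "pa_code"))
```

Because both sources share the PA code, their triples end up describing the same subject, which is what allows a photo from one source to be attached to a monument described in another.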
The application lets you search for monuments:
- with full text search
- in a given region, department or city
- by type of monument: church, castle, statue, industrial site
- by historical period: prehistoric, medieval, Renaissance etc.
- by type of owner: person or private corporation, municipality, state…
All these criteria can be freely combined through a very easy-to-use « faceted search ».
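As a rough illustration of what faceted search means, the Python sketch below filters a handful of records by any combination of criteria and counts the remaining values of a facet. It does not reflect how Antidot Finder Suite is implemented; the records, field names and functions are invented for the example.

```python
from collections import Counter

# Made-up records; the field names are assumptions, not the application's schema.
monuments = [
    {"name": "Example Church", "region": "Bretagne", "type": "church", "period": "medieval"},
    {"name": "Example Castle", "region": "Bretagne", "type": "castle", "period": "Renaissance"},
    {"name": "Another Church", "region": "Alsace", "type": "church", "period": "medieval"},
    {"name": "Example Chapel", "region": "Bretagne", "type": "church", "period": "Renaissance"},
]


def search(records, **facets):
    """Keep the records that match every requested facet value (logical AND)."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records if facet in r)


# Combine two criteria, then show the counts a facet menu would display.
hits = search(monuments, region="Bretagne", type="church")
counts = facet_counts(search(monuments, region="Bretagne"), "type")
```

The counts per value are what let a user narrow a result set step by step without ever reaching an empty result.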
This application was built in four days by one person, without involving developers, through simple configuration of our Antidot Information Factory solution. This shows, if proof were still needed, the power and accuracy of the Semantic Web approach and technologies promoted by the W3C.
English translation based on Àlex Hinojo’s work. Àlex is a GLAMwiki Partnership Ambassador in Barcelona, Spain. Thank you Àlex!