Data
It’s hard to find detailed data about the real estate market in Spain. Price data usually covers only housing (not garages, offices or plots of land), and the municipality is the smallest territorial unit for which property data is available.
The city of Madrid, for example, is a single municipality. I found no data for garages, and no real estate price data for the districts or neighbourhoods of Madrid.
idealista.com is the leading online real estate marketplace in Spain. They offer some data and a limited API, but the information is poor.
I decided to take a snapshot of the Idealista garage market on a random day in February. I used two Python frameworks: Scrapy (for scraping) and Django (for handy queries). I wrote a small note about how I integrated Django and Scrapy, and I will try to explore this topic in more depth later. I also used the pandas library to clean the data in a DataFrame, grouping by municipality, district and neighbourhood code to calculate the median. Finally, I added the official geocode for each neighbourhood.
You can download the CSV from:
rent-garage-madrid-feb.csv
where price is the median price for each neighbourhood.
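The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the column names (municipality, district, neighbourhood, price) and the sample rows are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical scraped listings; column names and values are assumptions,
# not the schema actually used for the CSV.
listings = pd.DataFrame(
    {
        "municipality": ["28079"] * 6,
        "district": ["01", "01", "01", "04", "04", "04"],
        "neighbourhood": ["1", "1", "1", "6", "6", "6"],
        "price": [90.0, 110.0, 150.0, 70.0, 80.0, None],
    }
)

# Basic cleaning: drop listings without a price or with a non-positive price.
clean = listings.dropna(subset=["price"])
clean = clean[clean["price"] > 0]

# One row per neighbourhood with the median offer price.
medians = clean.groupby(
    ["municipality", "district", "neighbourhood"], as_index=False
)["price"].median()

print(medians)
```

The median is less sensitive than the mean to a few very expensive listings, which matters with small samples per neighbourhood.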
Visualization
Once you have the data, it’s useful to plot it on a map to understand it more easily. This is the output:
How?
The geocode field is the string concatenation of the district code and the neighbourhood code. That’s how the Spanish National Statistical Institute and the Madrid City Hall Open Data portal represent it.
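As a small sketch of that join: the function name and the example code widths below are assumptions, since the exact padding used by the official sources is not stated here. The main point is to keep the codes as strings so leading zeros survive.

```python
def make_geocode(district_code: str, neighbourhood_code: str) -> str:
    """Concatenate district and neighbourhood codes into one geocode string."""
    # Codes stay as strings: converting "01" to an integer would drop the zero.
    return f"{district_code}{neighbourhood_code}"


# Hypothetical codes, only to show the shape of the result.
geocode = make_geocode("01", "1")
print(geocode)
```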
I also found a really nice, well-maintained TopoJSON of Madrid with the neighbourhood and district borders, made by martgnz.
I used d3.js to visualize the median price of each neighbourhood on a gray scale (lighter for lower prices, darker for higher ones).
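The map itself was drawn with d3.js, but the idea behind the gray scale can be sketched in Python. This is an assumption about the mapping (a simple linear interpolation between the lowest and highest median price), not the code used for the map, and gray_for_price is a hypothetical helper.

```python
def gray_for_price(price: float, min_price: float, max_price: float) -> str:
    """Map a price to a hex gray: lowest price is white, highest is black."""
    if max_price == min_price:
        t = 0.0  # a single price level: fall back to the lightest shade
    else:
        t = (price - min_price) / (max_price - min_price)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    level = round(255 * (1 - t))  # 255 = white, 0 = black
    return f"#{level:02x}{level:02x}{level:02x}"


# Hypothetical median prices keyed by geocode.
medians = {"011": 150.0, "046": 75.0, "103": 50.0}
low, high = min(medians.values()), max(medians.values())
colors = {code: gray_for_price(p, low, high) for code, p in medians.items()}
print(colors)
```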
As expected, prices in the city center are the highest, for several reasons:
- high purchasing power
- high population density
- old buildings without parking
- more pedestrian streets.
Conclusions
Offer prices & sample size
This experiment does not represent the real garage rental prices in Madrid. The figures are only a snapshot of the offer prices on a single day, so the data is incomplete and a larger sample is needed.
We could mitigate this problem by taking a snapshot of the Idealista.com data every day for a month and discarding the garages that have been online for long periods of time. This would filter out the most expensive garages, which are not going to be rented, and would also increase the sample size.
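A rough sketch of that filtering, assuming the daily snapshots are stacked in one DataFrame with one row per listing per day. The column names, the sample rows and the MAX_DAYS_ONLINE threshold are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical stacked snapshots: one row per listing per day it was online.
snapshots = pd.DataFrame(
    {
        "listing_id": ["a", "a", "a", "b", "c", "c"],
        "day": [1, 2, 3, 1, 2, 3],  # day of the month of the snapshot
        "geocode": ["011"] * 6,
        "price": [300.0, 300.0, 300.0, 100.0, 120.0, 120.0],
    }
)

MAX_DAYS_ONLINE = 2  # assumed cut-off; would need tuning on real data

# Count on how many distinct days each listing was seen.
days_online = snapshots.groupby("listing_id")["day"].nunique()
keep_ids = days_online[days_online <= MAX_DAYS_ONLINE].index

# Keep short-lived listings only, one row each (the latest snapshot).
fresh = snapshots[snapshots["listing_id"].isin(keep_ids)]
latest = fresh.sort_values("day").drop_duplicates("listing_id", keep="last")

medians = latest.groupby("geocode")["price"].median()
print(medians)
```

Deduplicating by listing before taking the median keeps a garage that stays online for several days from being counted several times.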
Interactive map
It’s a nice map, but it doesn’t tell us much about the exact median price of each neighbourhood, and it shows neither the district borders (which group the neighbourhoods) nor the neighbourhood names (people usually don’t recognize a neighbourhood by its shape alone…).
I’ll try to address this in the next post, Median rental room prices in Spanish municipalities.