Techtalk Thursday: Data collection and GDPR
11 June 2020 | 5 min read
Artificial intelligence and data science are very interesting topics to learn about. However, they can be quite difficult to understand. That’s why we’ve decided to have a chat with our local data wizard Sam to ask him some technical questions. This time we decided to focus on data collection and how it’s affected by GDPR.
Here are the questions Sam will be answering for us today:
- What is data collection?
- What is the difference between data controllers and data processors?
- What are our responsibilities as data processors?
- What data do we use at Co-libry?
- What is Co-libry doing to keep this data safe/anonymous?
- How is this data stored?
- What is open data? And how is it used by Co-libry?
What is data collection?
Data collection is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes.
Take real estate portals, for example: they could capture every click or session a user generates on the portal, which listings the user views, which ads they look at, how long they linger on them, how often they visit the website, and more.
The data is all anonymous and cannot be traced back to the user.
It is also important to note that here at Co-libry, we are data processors and not data controllers.
What is the difference between data controllers and data processors?
The data controller determines the purposes for which, and the means by which, personal data is processed.
Employees of a data processor process personal data within that organization to fulfill the tasks imposed by data controllers. This means that the data processor uses personal data only on behalf of the controller.
A processor is not allowed to use the data for its own purposes; it can only act on the controller's instructions.
The data processor is usually a third party that is external to the company. However, in the case of groups of undertakings, one undertaking may act as a processor for another undertaking.
For us, this means that we carry out everything on behalf of the customer, who is the data controller.
What are our responsibilities as data processors?
We must guarantee that we are capable of processing personal data under GDPR legislation. Specifically, this means that we:
- Are transparent about who we are, why we have to process the data, and who is going to have access to it, and have appointed a data protection officer (DPO)
- Guarantee the right of access and the right to data portability: we tell individuals whether their data is being processed, inform them about the processing, and give them a copy if they ask for it
- Guarantee the right to be forgotten
- Guarantee the right to rectification and the right to object
- Guarantee data protection by design and by default
- Guarantee notification in case of a data breach
What data do we use at Co-libry?
We require data that enables us to personalize the user experience on a portal.
An absolute minimum would be the listings a user viewed. Based on this information we can already create a profile that we use for certain services. Of course, the more data we get, the more we can personalize.
Typically, we deduce from their clicks and search behavior which phase of the customer journey they are in. For real estate, we have defined four phases: exploration, specific search, the moment of truth, and the aftermath.
A user will have different preferences in each phase, which is why we need their clicks and search behavior as well.
Furthermore, if we can see how long they dwell on certain pictures, we can also deduce that a certain person is more intrigued by pictures of, for example, the living room while another user is more interested in the garden.
This helps us to always show them the pictures they are interested in first.
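To make the idea concrete, here is a minimal sketch of how dwell times could be turned into a picture ordering. It is an illustration only, not Co-libry's actual implementation: the function names, the (category, seconds) event format, and the "category" field on pictures are all assumptions made for this example.

```python
from collections import defaultdict


def rank_categories(dwell_events):
    """Sum dwell time per picture category; return categories, longest first.

    dwell_events is a list of (category, seconds) pairs (hypothetical format).
    """
    totals = defaultdict(float)
    for category, seconds in dwell_events:
        totals[category] += seconds
    return sorted(totals, key=lambda c: totals[c], reverse=True)


def order_pictures(pictures, ranking):
    """Put pictures from preferred categories first.

    Categories the user never dwelt on go last, in their original order
    (sorted() is stable).
    """
    position = {category: i for i, category in enumerate(ranking)}
    return sorted(pictures, key=lambda p: position.get(p["category"], len(ranking)))


# Hypothetical dwell events for one anonymous user.
events = [("living_room", 12.0), ("garden", 3.5), ("living_room", 8.0), ("kitchen", 5.0)]
ranking = rank_categories(events)

pictures = [
    {"id": 1, "category": "garden"},
    {"id": 2, "category": "bathroom"},
    {"id": 3, "category": "living_room"},
]
ordered_ids = [p["id"] for p in order_pictures(pictures, ranking)]
print(ranking)      # ['living_room', 'kitchen', 'garden']
print(ordered_ids)  # [3, 1, 2]
```

A real system would likely weigh recent behavior more heavily and take the journey phase into account; the sketch only shows the basic "most-viewed category first" idea.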
What is Co-libry doing to keep this data safe/anonymous?
As strange as it may sound, the data we work with is already practically anonymous. It contains no names, demographics, phone numbers, etc.
What we do make anonymous are IP addresses. As you probably know, an IP address can be used to track someone. An IP address works a bit like decimal coordinates: the more decimal digits you leave out, the less accurate the location becomes. An IP address is built like xxx.xxx.xxx.xxx, for example 22.214.171.124. When you blur the last digits, the address becomes less precise, so 218.58.216.xxx will not allow you to find out which exact location this is.
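As a rough illustration of this kind of blurring, the sketch below zeroes out the last block of an IPv4 address using Python's standard ipaddress module. The function name mask_ipv4, the keep_bits parameter, and the sample address (taken from a range reserved for documentation) are assumptions for the example, not a description of Co-libry's actual pipeline.

```python
import ipaddress


def mask_ipv4(address: str, keep_bits: int = 24) -> str:
    """Keep only the first keep_bits bits of an IPv4 address; zero the rest.

    With keep_bits=24 the last of the four blocks is replaced by 0.
    """
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


print(mask_ipv4("203.0.113.77"))      # 203.0.113.0
print(mask_ipv4("203.0.113.77", 16))  # 203.0.0.0
```

Keeping fewer bits blurs the address further, at the cost of a less precise location.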
How is this data stored?
Most data is stored in the cloud, typically behind tight security and data encryption. We also keep some data locally, but this is on a hard drive with data encryption and password protection, to which only the data protection officer has access.
In terms of access, the DPO is extremely careful about who gets access to the data. For example, our marketing manager does not need this data, so he doesn’t get access to it.
When we have an intern working with us, they will not have access to the database, only to certain dumps that Sam carefully picks for them.
What is open data? And how is it used by Co-libry?
Open data is data from sources that are publicly available. An example is the population statistics per district in Belgium.
At the moment we only use open data to enrich real estate listings, for example with schools, transport, and restaurants in the neighborhood. We also used open data to get our computer vision started: we trained the base model on a dataset of pictures that was already publicly available. At a later stage, we supplemented this open data with our own data sources.
Here at Co-libry, we only process the data that our clients collect. This data is always anonymous and cannot be traced back to a specific user. The collected data is used to enhance the user experience and to create personalized content for the users.