885,086 customer records exposed in search engine

Guest blog post by Hallvard Nygård (@hallny)

Published: Mon, September 24, 2018, 21:20
Category: Security
Tag: Information leak
Written by: Hallvard Nygård

tl;dr

PostNord put all customers (name, phone and address) into a search database that was publicly available (known API key and referer). The database was used for easy selection of name and address when returning items. It was possible to opt out (mark address as secret). Customers had to register on the PostNord site, but were not told that their address would be publicly available. Previously, PostNord had a dark UX pattern that tricked users into registering when they really only wanted to track their package.

Who: PostNord. Also reported to NetOnNet.
Severity level: Low
Reported: September 2018
Reception and handling: Good
Status: Fixed
Reward: No reward. Thanks for the feedback.
Issue: The "return package to web shop" page had an inline search field where you could search by name, address and phone number for registered PostNord customers. This made it easy to download the whole database.

Background

A friend of mine, Thomas Kalve, sent me a tip about an interesting fuzzy search he noticed when trying to return a package to the online shop NetOnNet. When he entered a phone number, the form searched for a match and also showed close matches. Since he was not aware of having signed up for any such search service, he passed it on to me.

The "return to web shop" page was backed by a search database containing 885,086 customer records. It used Algolia, a "search as a service" (SaaS) product that is mainly marketed for product search. PostNord had added all customers to this search engine and made them searchable. It was possible to opt out of this use of personal data.

PostNord told me they had 1.1 million users. I found 885,086 customers in the database. So around 200,000 had opted out.
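
The estimate above is plain subtraction. A minimal Python sketch of it follows, assuming that PostNord's "1.1 million" is a rounded figure (so the result is approximate) and that every registered user missing from the index had opted out:

```python
# Figures taken from the text; 1.1 million is PostNord's own (rounded) number.
registered_users = 1_100_000
records_in_index = 885_086

# Users that are registered but not present in the search index.
opted_out = registered_users - records_in_index
opt_out_share = opted_out / registered_users

print(f"{opted_out:,} users (about {opt_out_share:.1%}) were not in the search index")
```

This gives roughly 215,000 users, which matches the "around 200,000" above.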

The main issue with this database was that it was open and that PostNord did not declare this use and exposure of personal data. According to GDPR one should declare all uses of personal data. One should also make sure that the customer understands what he or she is accepting.

PostNord's form let you mark an address as secret ("hemmelig adresse"), which meant opting out of the search database.

Approach (technical stuff)

The easiest way to use this leakage was to search for people in the form. One would need a return code (e.g. "netonnet") and knowledge of where to find this page. The search could then be done in Chrome or another browser. If you had a partial name or partial phone number, the search engine would help with the rest.

If you wanted a copy of the database, you could download the data over time. According to PostNord the service was protected by the following Algolia security features: HTTP referer restriction, rate limit and a limit on the number of records retrieved. The first is, of course, super easy to spoof. The settings they used for the IP rate limit and the number of records retrieved per result are not known to me, so how long this would take is also unknown. I did not test any limits on this system. Using my browser and Curl, I would guess I saw around 100-200 customer records in total.
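
To illustrate why a referer restriction is weak protection: the Referer header is just a string that the client chooses. The sketch below builds (but never sends) a request using Python's standard library. The endpoint and page URL are hypothetical placeholders, not PostNord's or Algolia's real addresses, and the function name is made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and "allowed" page, used only for illustration.
SEARCH_URL = "https://search.example.invalid/1/indexes/demo/query"
ALLOWED_PAGE = "https://shop.example.invalid/return/show"


def build_search_request(query: str, referer: str) -> Request:
    """Build (but do not send) a POST request with a caller-chosen Referer."""
    params = urlencode({"query": query, "hitsPerPage": 5})
    body = json.dumps({"params": params}).encode("utf-8")
    return Request(SEARCH_URL, data=body, headers={"Referer": referer}, method="POST")


# Any client can claim to come from the allowed page.
request = build_search_request("john", ALLOWED_PAGE)
print(request.get_header("Referer"))
```

Since the server only sees what the client sends, a check on this header keeps other web pages from embedding the search, but not a script or a terminal.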

After doing a "copy to curl" in Chrome, I got a working and reproducible Curl command that I could run from a terminal. Stripping away unneeded headers I saw that the Referer header was required. I did not strip the query or change the JSON data.

curl 'https://swstrzr7ig-3.algolianet.com/1/indexes/address_books/query?x-algolia-agent=Algolia%20for%20vanilla%20JavaScript%203.28.0&x-algolia-application-id=SWSTRZR7IG&x-algolia-api-key=35b1d443b661ed9e65aa4e6c439030f1' \
    -H 'Referer: https://my.postnord.no/return/show' \
    --data '{"params":"query=45442095&hitsPerPage=5"}'

The request returned the following data (personal information has been changed):

{
    "hits":[
        {
            "id":775007,
            "name":"John Johnsen",
            "mobile":"45445095",
            "street_name":"John street 3",
            "additional_street_name":null,
            "city_name":"John town",
            "postal_zone":"4000",
            "country":"NO",
            "external_id":"808007",
            "source":"contacts",
            "user_id":1378007,
            "created_at":"2018-07-20 10:03:39",
            "updated_at":"2018-07-20 10:03:39",
            "objectID":"775007",
            "_highlightResult":{
                "name":{
                    "value":"John Johnsen",
                    "matchLevel":"none",
                    "matchedWords":[]
                },
                "mobile":{
                    "value":"45445095",
                    "matchLevel":"full",
                    "fullyHighlighted":true,
                    "matchedWords":[
                        "45442095"
                    ]
                },
                "street_name":{
                    "value":"John street 3",
                    "matchLevel":"none",
                    "matchedWords":[]
                },
                // (.... More highlights without match ....)
            }
        },
        {...},
        {...},
        {...},
        {...}
    ],
    "nbHits":270,
    "page":0,
    "nbPages":54,
    "hitsPerPage":5,
    "processingTimeMS":8,
    "exhaustiveNbHits":true,
    "query":"45442095",
    "params":"query=45442095&hitsPerPage=5"
}
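
The response above can be read programmatically. The following is a small sketch, not part of the original investigation: it checks that nbPages follows from nbHits and hitsPerPage, and lists which fields the search matched on. The JSON is a trimmed copy of the anonymized response, and the helper name matched_fields is made up for this example. It assumes every highlight entry is a flat object, as in the sample.

```python
import json
import math

# Trimmed copy of the anonymized response shown above.
response_text = """
{
  "hits": [
    {
      "objectID": "775007",
      "_highlightResult": {
        "name": {"value": "John Johnsen", "matchLevel": "none", "matchedWords": []},
        "mobile": {"value": "45445095", "matchLevel": "full", "matchedWords": ["45442095"]}
      }
    }
  ],
  "nbHits": 270,
  "hitsPerPage": 5,
  "nbPages": 54
}
"""


def matched_fields(hit):
    """Return the names of fields whose matchLevel is anything but 'none'."""
    return [
        field
        for field, info in hit["_highlightResult"].items()
        if info.get("matchLevel") != "none"
    ]


response = json.loads(response_text)

# 270 hits at 5 per page need ceil(270 / 5) pages.
expected_pages = math.ceil(response["nbHits"] / response["hitsPerPage"])
print(expected_pages, response["nbPages"])
print(matched_fields(response["hits"][0]))
```

In this sample only the mobile field matched, even though the stored number differs from the query by one digit. That is the fuzzy matching my friend noticed.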

The documentation for the Algolia API is available online. By using the API I could see the other indices, and I tried a query without parameters. The query returned 30 results. The first results were created 2018-04-13 14:35:54. The nbHits was 885086, meaning there were 885,086 customer records in the database. Querying for indices I got roughly the same number of records (885,093 entries) but a different created date (2017-11-16). I'm guessing this system was set up between November 2017 and April 2018.

{
    "items":[
        {
            "name":"address_books",
            "createdAt":"2017-11-16T08:23:33.157Z",
            "updatedAt":"2018-09-19T19:23:17.094Z",
            "entries":885093,
            "dataSize":225565146,
            "fileSize":691313817,
            "lastBuildTimeS":87,
            "numberOfPendingTasks":1,
            "pendingTask":true
        },
        {"name":"biggest_decline",   "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)},
        {"name":"biggest_incline",   "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)},
        {"name":"popular",           "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)},
        {"name":"price_amount_asc",  "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:25:15.172Z", (...)},
        {"name":"price_amount_desc", "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:25:15.172Z", (...)},
        {"name":"products",          "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)},
        {"name":"rating_desc",       "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)},
        {"name":"review_score_desc", "createdAt":"2017-12-15T10:32:32.897Z", "updatedAt":"2018-09-19T19:23:35.868Z", (...)}
    ],
    "nbPages":1
}
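
The two dates behind the "between November 2017 and April 2018" guess, and the two record counts, can be compared with a short sketch. All values are copied from the responses above. Treating both timestamps as being in the same time zone is an assumption, since the record timestamp carries no zone information.

```python
from datetime import datetime

# createdAt of the address_books index (from the index listing).
index_created = datetime.strptime("2017-11-16T08:23:33.157Z", "%Y-%m-%dT%H:%M:%S.%fZ")
# created_at of the first records returned by the empty query.
first_record_created = datetime.strptime("2018-04-13 14:35:54", "%Y-%m-%d %H:%M:%S")

# Whole days between index creation and the first records seen.
window_days = (first_record_created - index_created).days

# Record counts reported by the index listing and by the empty query.
entries_in_index_listing = 885_093
nb_hits_from_query = 885_086
count_difference = entries_in_index_listing - nb_hits_from_query

print(window_days, count_difference)
```

The small difference between the two counts is not something I investigated; the index was being updated at the time (pendingTask was true).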

Dark UX pattern

I do not have screenshot proof of this, but a previous version of the "track package" page (which was leaking personal data back in May 2018) applied a dark UX pattern where the user was shown a registration form before the tracking information. A link to this page was sent in an SMS to the recipient of a package. There was a small link or button to skip registration (e.g. "No thanks, I just want to see where my package is").

A typical user would do the following:

  1. Click on "track package link" in SMS.
  2. Get PostNord registration form.
  3. Register user profile.
  4. See package status / pick up code.

When I saw this last, I did the following:

  1. Click on "track package link" in SMS.
  2. Get PostNord registration form.
  3. Skip registration by clicking a link.
  4. See package status / pick up code.

Security issues

This was not a security issue of the kind Roy and I normally find and write about. The issue here was mainly that the customers had not agreed to this. PostNord did not have consent to make the customer records available for download. Customers did accept some terms and conditions, and PostNord did have a privacy policy, but neither said anything about putting the records online.

The use of the search API was an intentional part of their solution. It did not have any security issues in the normal sense, but using it this way with personal data constitutes a leak of personal information.

Reception and handling

As last time, the issue was quickly resolved.

Day 0

I reported the issue to NetOnNet and PostNord. A copy was also sent to The Norwegian Data Protection Authority (DPA, Datatilsynet). It was reported to NetOnNet since the return code used to access the page was from them. The report e-mail was titled "Whole NetOnNet customer database available", as that was my first assumption and a title that would get attention. It later turned out that it was not their customer database, but a separate PostNord database.

Day 1

I talked to PostNord on the phone and got an e-mail summarizing the handling during the following day. According to PostNord, the service was temporarily taken down at 08:50. Taking the service down was the right response. My friend confirmed that the service was offline at 10:13.

PostNord will give the Data Protection Authority their version of the leak as part of the mandatory deviation notification.
