Research by: Feixiang He, Andrey Polkovnichenko
Check Point Research has recently discovered a group of Android applications massively harvesting contact information on mobile phones without the user’s consent. The data stealing logic hides inside a data analytics Software Development Kit (SDK) seen in up to 12 different mobile applications and has so far been downloaded over 111 million times.
Dubbed ‘Operation Sheep’, this massive data stealing campaign is the first known campaign seen in the wild to exploit the Man-in-the-Disk vulnerability revealed by Check Point Research earlier last year.
The Attack Vector
The SDK, named SWAnalytics, is integrated into seemingly innocent Android applications published on major 3rd party Chinese app stores such as Tencent MyApp, Wandoujia, Huawei App Store, and Xiaomi App Store. After app installation, whenever SWAnalytics senses victims opening up infected applications or rebooting their phones, it silently uploads their entire contacts list to Hangzhou Shun Wang Technologies controlled servers.
From our first malicious sample encounter back in mid-September until now, we have observed 12 infected applications, the majority of which are in the system utility category. In the Tencent MyApp store alone, 8 of 12 infected applications have a staggering total number of over 111 million downloads. In theory, Shun Wang Technologies could have collected a third of China’s population names and contact numbers if not more.
With no clear declaration of usage from Shun Wang, nor proper regulatory supervision, such data could circulate into underground markets for further exploit, ranging from rogue marketing, targeted telephone scams or even friend referral program abuse during November’s Single’s Day and December’s Asian online shopping fest.
This paper will cover the discovery of this campaign, dubbed ‘Operation Sheep’, and an analysis of SWAnalytics. We will also discuss the value of contact data from a cyber criminal’s perspective, and assess the fallout effect of Operation Sheep on the global mobile data security landscape.
Operation Sheep’s Attack Flow
Emergence of the Attack
In mid-September, an app named ‘Network Speed Master’ stood out on our radar with its rather unusual behavior patterns. Upon initial analysis, we observed frequent access to the user’s contacts data together with a sudden network transmission surge when the app is opened or the mobile phone is rebooted. As a system utility application that simply tests network connection speed, it reaches far beyond reasonable actions. Deeply intrigued, we took a closer look.
Figure 1: ‘Network Speed Master’ on Tencent MyApp
We began with the app’s long Android permissions requirement, which usually signals excessive data collection.
Figure 2: The Android permissions Network Speed Master asks for
Although it is worrying to see that most of the permissions were not related to the app’s main features, it was still too early to conclude that Network Speed Master is a malware in and of itself.
Although, many legitimate advertisement SDKs rely on bespoke permissions to profile users in order to study user preferences and promote personalized advertisement content, we became much more skeptical when the “CoreReceiver” module appeared in the app’s manifest. This module monitors a wide range of device activities including application installation / remove / update, phone restart and battery charge.
Figure 3: CoreService and CoreReceiver event triggers
Following the user contacts data flow, we quickly landed on the actual contact data stealing module.
It indeed dumps the user’s entire contact list in a name-number pair format (e.g. “John Oliver: 1234567890”), concatenates pairs into a giant string, and then finally sends it out to a remote server.
Examining user contacts to understand a user’s social network is a popular marketing research method found in many advertisement SDKs. In order to protect user privacy, however, a legitimate SDK usually only collects contact names or numbers. Some even adopt a cryptographic approach to generate unique hash of name-number combinations so that the actual data is not collected at all.
Massively uploading identifiable contact details, though, is unacceptable and in many countries constitutes a violation of data privacy regulations.
The Exception to the Rule
Interestingly, the SDK developer seems to ignore old Android devices below Marshmallow on purpose. According to Google, Android Marshmallow and above represent 70% of Android devices. From an operational perspective, it seems to be a time and money saving trade-off that SWAnalytics made by avoiding to prolong compatibility support in order to keep its malicious code simple.
Furthermore, we also observed a mysterious data stealing exemption. If it is a Meitu Phone, SWAnalytics will not collect the user’s contacts data at all.
Network Speed Master’s Code
Upon establishing its malicious nature, we expanded our analysis scope to Network Speed Master’s overall code base.
Figure 4: The user contacts stealing code which interestingly only targets newer Androids.
It turns out that contacts data isn’t the only unusual data SWAnalytics is interested in. SWAnalytics also has a “smart” approach to collect the user’s QQ login data. (QQ is one of most popular instant messaging application in China. Both QQ and WeChat belong to Tencent.)
Figure 5: Entry of data stealing code with Meitu phone exemption
The Man-in-the-Disk Connection
With default settings, SWAnalytics will scan through an Android device’s external storage, looking for directory “tencent/MobileQQ/WebViewCheck”. This folder hosts QQ login data chache. By listing sub-folders, SWAnalytics is able to infer QQ accounts which have never been used on the device.
In Slava Makkaveev’s Man-in-the-Disk research, he discussed the risk of hosting sensitive application data in publicly accessible external storage. Operation Sheep is the first campaign we have observed in the wild that abuses similar concept since our MitD publication.
Figure 6: Code to scan QQ login details
To make this data harvesting operation flexible, SWAnalytics equips the ability to receive and process configuration files from a remote Command-and-Control server. Whenever users reboot their device or open up Network Speed Master, SWAnalytics will fetch the latest configuration file from “http[:]//mbl[.]shunwang[.]com/cfg/config[.]json”.
Figure 7: The latest configuration file from the C&C server
“gps_timeout” defines the time interval (in milliseconds) between the device’s GPS coordinates tracking attempts. Currently it leaks user geolocation every five seconds, which is a concerning behavior. “system_app” is a Boolean value which can be either 0 (false, do not include system applications in the device application list and send to C&C server) or 1 (true, include system applications). “info_address” is the directory SWAnalytics uses to search for QQ login details. “bs_time” defines time interval in seconds to leak all captured device data. It is also the check interval to make sure the data stealing process is alive. Currently, the data stealing process wakes up every 15 minutes.
Figure 8: the code to process the configuration file fetched from C&C server
The final part of Operation Sheep is to send data to Shun Wang’s C&C servers. Given the vast amount of data SWAnalytics leaks, it breaks down messages to C&C servers into five categories denoted by four digit numbers:
|Device information including geo-location, MAC address, installed application list, phone brand and model
|User contacts and QQ login list
|Current running applications
|UMENG_KEY (popular Chinese advertisement SDK Umeng) heartbeat
|Running process list and PID (process ID)
Each data message sent to the C&C server follows the format of:
To make sure the data is only readable to Shun Wang servers, SWAnalytics applies DES encryption twice. It uses the master encryption key “#dc?*-y1” to encrypt data before it’s sent to C&C server. In order to hide the master key, SWAnalytics uses another hard-coded passcode “123456” to encrypt its master key.
Figure 9: master key encryption
Figure 10: data transmission encryption
Upon concluding Network Speed Master’s malicious behaviors, it was time to look for the big picture. During several months of investigation and continuous monitoring, we discovered 12 applications infested with SWAnalytics published in various channels.
Because of the high segmentation of Android application publishing channels in China, this table is not an exhaustive list. Our research looked into 360 Qihoo Appstore, Baidu Appstore, Huawei Appstore, Tencent MyApp, Wandoujia, and Xiaomi Appstore. By the time of this publication, there’s no sign indicating SWAnalytics has penetrated Google Play yet.
|Major publish channels
|Confirmed infested version
|Incoming Call Flashlight
|Hangzhou Hui Mao
|All 6 stores
|2.3.1 to 2.3.4
|Network Speed Master
|All 6 stores
|2.7.7 to latest
|All 6 stores
|1.3.7 to 1.3.9.
|Wi-Fi Password Key
|All except Huawei and Xiaomi
|1.2.8 to 1.3.1
|Wi-Fi Signal Amplifier
|All 6 stores
|3.7.0 to 3.7.5
|Hangzhou Shun Wang
|All except Baidu and Xiaomi
|2.6.5 to latest
|Super Battery Charge
|All except 360 and Xiaomi
|2.3.6 to latest
(and 2 variants)
|Only published on Shun Wang website
|6.4.6 to latest
|All 6 stores
|182 to latest
|Currently only available on Shun Wang website
In order to understand SWAnalytics’ impact, we turned to public download volume data available on Chandashi, one of the app store optimization vendors specialized in Chinese mobile application markets.
We derived a monthly new installation volume of six popular applications of interest. Data points span from September 2018 to January 2019 where we observed over 17 million downloads in just five months. Most of the infected applications maintained a solid month-on-month growth rate with already huge user bases, and although the new download volume does not necessarily result in an exact number of unique and freshly infected user numbers, we believe it does signal large scale compromise and a growing threat to users.
Figure 11: data derived from chandashi.com, excluding Xiaomi Appstore.
The Value of Personal Contact Information
Compared to financial data and government issued ID document information, personal contact information is often treated as less sensitive data. According to popular belief, it requires extra effort to exploit such data while potential profits do not match a hacker’s effort. Hence it is unlikely to be targeted. However, the landscape is changing with deep specialization in underground markets and new “business models” available to profit from such personal contact data.
In 2018, intense competition among internet companies created a rush of new user acquisitions. It resulted in waves of cellphone number-based friend referral programs. In China alone, we have seen underground market “sheep shavers” ported SMS rogue marketing strategy to spread Alipay Red Packet referral URL links. When receivers click on these referral links, both the sender and receiver receives a small amount of credit in their Alipay accounts.
In 2019 new year trend predictions, multiple media outlets such as the BBC’s ‘Business Matters’ program and Forbes mentioned the online retail sector could expect great competition. In South East Asia alone, for example, Amazon, Alibaba backed Lazada and Shopee will each directly face-off against each other in regional markets.
Indeed, during 2018’s year-end December sales promotion, Lazada already adopted the Alipay style referral program. In the same way, similar marketing campaigns are expected from competing vendors in 2019. As a result, this malicious trend in the mobile arena will fuel sheep shavers’ interest in carrying out similar cyber operations in regional markets outside of China.
Figure 12: Lazada referral reward during December 2018’s end of year sales promotion
In a heuristic view, Operation Sheep presents a rising security threat from 3rd party SDKs. In Operation Sheep’s case, Shun Wang likely harvests end user contact lists without application developer acknowledgement. This behavior does not benefit end users nor application developers. Rigorous code review is needed to prevent such pilfering behavior in so-called data analytics SDKs.
During our investigation, BuzzFeed News reported alleged advertisement fraud from Cheetah Mobile SDK too. According to Cheetah Mobile’s follow-up investigation, fraudulent behaviors came from two 3rd party SDKs (Batmobi, Duapps) integrated inside Cheetah SDK. Both these cases should be loud reminders to application developers in 2019 that before integrating SDKs into their mobile applications developers need to be aware of potential risks of undocumented and malicious behaviors implemented in 3rd party SDKs.
Users are strongly advised to uninstall bespoke applications immediately. Stolen data is transmitted via HTTP protocol. It means not only that Shun Wang gets the victim’s data but that a malicious 3rd party can also intercept and acquire it.
Operation Sheep Naming Explained
“顺手牵羊” a Chinese idiom which can be directly translated in English as “taking the opportunity to pilfer a sheep”, originates from an ancient Chinese essay named Thirty-Six Stratagems. In modern days, it refers to stealing properties of small value when the owner is not paying attention or at disadvantage.
“薅羊毛” is a Chinese slang that translates in English as “shaving a sheep in a subtle way that it won’t be noticed”. It refers to the practice of acquiring small unlawful income in a quick manner and on a large scale, hoping to accumulate considerable gain. In the internet age, it describes a multi-million dollar worth underground market of profiting by abusing various reward systems or freebie marketing promotions provided by vendors to reward brand loyalty or expand a customer base.