The spring of 2022 saw a spike in activity of the Bumblebee loader, a recent threat that has garnered a lot of attention due to its links to several well-known malware families. In this piece we outline the conclusions of our research into this malware:
Bumblebee is in constant evolution, which is best demonstrated by the fact that the loader system underwent a radical change twice within a few days: first from ISO files to VHD files containing a PowerShell script, then back again.
Changes in the behavior of Bumblebee’s servers that occurred around June 2022 indicate that the attackers may have shifted their focus from extensive testing of their malware to reaching as many victims as possible.
Although the threat contains a field called group_name, it may not be a good indicator for clustering related activity: samples with different group_name values exhibit similar behavior, which may indicate a single actor operating many group_names. The same is not true for encryption keys: different encryption keys generally imply different behavior, as expected.
Bumblebee payloads vary greatly based on the type of victim. Infected standalone computers will likely be hit with banking trojans or infostealers, whereas organizational networks can expect to be hit with more advanced post-exploitation tools such as Cobalt Strike.
The Bumblebee loader usually comes in the form of a DLL-like binary packed with a custom packer. The method by which this DLL is delivered seems to be subject to change on the whims of the threat’s adventurous developers: while the prevailing method is to embed the packed DLL directly inside another file (usually an ISO), during a short stint in June the malware’s operators experimented with VHD files that ran a PowerShell script, which downloaded and decrypted the packed DLL (packed with a very different packer), as documented by Deep Instinct. This trend seems to have died out, and the DLL can now be found directly embedded in the 1st-stage file again, whether an ISO or a VHD.
Once unpacked, Bumblebee performs checks to avoid being executed in sandboxing or analyst environments; most of the code responsible for this is open source, lifted directly from the Al-Khaser project. If these checks pass, Bumblebee proceeds to load its configuration into memory. This is done by loading four pointers from its .data section, which point to four different buffers in a contiguous encrypted configuration struct. The first of these points to an 80-byte section that stores an ASCII RC4 key (the key itself is much shorter than 80 bytes in all cases we’ve observed). The other three pointers point to two 80-byte sections and a 1024-byte section, all of which contain data that is then decrypted using the above-mentioned RC4 key.
Once decrypted, the first 80-byte buffer in most of the samples to date has simply contained the number “444”; the malware makes no use of this number, so its significance is not clear. The second buffer contains an ASCII code which the malware calls group_name. Finally, the 1024-byte block contains a list of command and control servers (most of which are usually fake).
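The configuration layout described above can be sketched as a small extraction helper for analysts. This is a minimal sketch, not the malware's own routine: it assumes the four buffers have already been carved out of the unpacked sample, that each buffer is decrypted independently with a fresh RC4 state, that unused bytes are NUL padding, and that the server list is comma-separated. The function and key names (`decrypt_config`, `unknown`, `c2_list`) are hypothetical.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4 (key scheduling + keystream XOR); encrypts and decrypts alike."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def decrypt_config(key_buf: bytes, buf_a: bytes, buf_b: bytes, c2_buf: bytes) -> dict:
    """Decrypt the two 80-byte buffers and the 1024-byte buffer with the key buffer."""
    # Assumption: the key is shorter than its 80-byte slot and NUL-terminated.
    key = key_buf.split(b"\x00", 1)[0]

    def field(buf: bytes) -> str:
        # Assumption: each buffer is decrypted on its own and NUL-padded.
        plain = rc4(key, buf).split(b"\x00", 1)[0]
        return plain.decode("ascii", errors="replace")

    return {
        "unknown": field(buf_a),        # "444" in most samples per the text
        "group_name": field(buf_b),
        # Assumption: servers are separated by commas.
        "c2_list": [c for c in field(c2_buf).split(",") if c],
    }
```

If any of these assumptions does not hold for a given sample (for example, a different separator in the server list), only the `field` helper and the final split need adjusting.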
Bumblebee computes a machine-specific pseudorandom victim ID (internally named client_id) via the usual method of concatenating some immutable machine parameters (in this case, machine name and GUID) and then calculating a hash of the result (in this case, an MD5 digest).
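For defenders who want to predict the identifier a given host would report, the derivation can be approximated as follows. This is a sketch under stated assumptions: the text only says that the machine name and a GUID-style identifier are concatenated and hashed with MD5, so the concatenation order, the absence of a separator, and the UTF-8 encoding used here are all guesses.

```python
import hashlib


def client_id(machine_name: str, machine_guid: str) -> str:
    """Return the hex MD5 digest of the concatenated machine parameters."""
    # Assumption: name first, then GUID, no separator, UTF-8 encoded.
    material = (machine_name + machine_guid).encode("utf-8")
    return hashlib.md5(material).hexdigest()
```

Because the inputs are immutable machine parameters, the same host always yields the same value, which is what makes the identifier usable for tracking a victim across check-ins.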
Using this data and some other elements collected from the victim system, Bumblebee builds a C&C check-in in JSON format, such as the one below:
"sys_version":"Microsoft Windows 10 pro \\nUser name: LUCAS-PC\\nDomain name: WORKGROUP",
This string is encrypted using the same RC4 key used earlier for the configuration, and repeatedly sent to the C2 server with random delays between 25 seconds and 3 minutes, regardless of whether the server responds or is down. The response from the command and control server is also in JSON format and also encrypted with the same RC4 key (we appreciate this elegant design and encourage malware authors to aspire to this standard of legibility). The content of the response itself naturally varies, and can be, for example, an empty response.
When a payload is delivered, the response contains a list of elements in the tasks section of the JSON. Each element contains, among other fields, a task field with the name of the command to be executed, and a Base64-encoded payload inside a field called task_data.
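The response structure described above lends itself to a short parsing helper for captured traffic. This is a minimal sketch that assumes the response has already been RC4-decrypted into a JSON string; the field names `tasks`, `task`, and `task_data` come from the description above, while the function name `parse_tasks` and the handling of missing fields are our own choices.

```python
import base64
import json


def parse_tasks(decrypted_response: str) -> list:
    """Return (command, payload_bytes) pairs from a decrypted C2 response."""
    body = json.loads(decrypted_response)
    results = []
    # An empty response is assumed to have no "tasks" list, or an empty one.
    for entry in body.get("tasks") or []:
        command = entry.get("task")
        payload = base64.b64decode(entry.get("task_data") or "")
        results.append((command, payload))
    return results
```

Other fields in each element are ignored here; an analyst would likely want to log them as well.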
Botnet Behavior Analysis
Until early July we observed a very curious behavior of the command and control servers. Once a client_id was generated for an infected victim and sent to a command and control server, that server would stop accepting any other client_id codes from the same victim external IP. This means that if several computers in an organization accessing the internet through the same public IP were infected, the C2 server would only accept the first one infected. But several weeks ago this feature was abruptly turned off, drastically increasing the number of established connections to infected victims at the expense of… whatever this feature was supposed to achieve (possibly it was indicative of a testing phase for the malware, which has now ended).
This behavior motivated us to pay special attention to the behavior of Bumblebee in different execution environments. Notably, a field called group_name is hardcoded in every sample, and its value is sent in each request to the command and control server. Further, the above-described “one client_id per IP address” policy curiously seemed to apply across different group_names, but not across different RC4 encryption keys. This seems to imply the use of several group_names by what is effectively the same botnet, possibly to mark different campaigns or different sets of victims. As a result, grouping activity by encryption key seems to be a more coherent approach than grouping by group_name.
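The observed policy can be captured in a toy model, which is useful for reasoning about what an outside observer should expect to see. This is purely a model of behavior inferred from the outside, not server code: the class name `CheckInGate` and its interface are hypothetical, and the choice to scope the lock by RC4 key and external IP while ignoring group_name simply mirrors the observations above.

```python
class CheckInGate:
    """Toy model of the observed 'one client_id per external IP' policy."""

    def __init__(self):
        # (rc4_key, external_ip) -> first client_id seen for that pair
        self._first_seen = {}

    def accepts(self, rc4_key, external_ip, client_id, group_name=None):
        # group_name is deliberately ignored: the lock was seen to apply
        # across group_names but not across RC4 keys.
        slot = (rc4_key, external_ip)
        owner = self._first_seen.setdefault(slot, client_id)
        return owner == client_id
```

In this model, a second infected machine behind the same public IP is rejected only when its sample shares an RC4 key with the first, regardless of group_name.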
This hypothesis is further supported by the fact that we’ve observed several samples with the same RC4 key and different group_name acting identically and dropping the same threats within a very close time range, while samples that differ in their used RC4 key exhibit completely different behavior.
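Grouping by encryption key rather than by group_name is straightforward to apply to a sample set. The sketch below assumes each sample has already been reduced to a record with `sha256`, `rc4_key`, and `group_name` fields; these record fields and the function name are hypothetical and stand in for whatever an analyst's config extractor produces.

```python
from collections import defaultdict


def cluster_by_key(samples):
    """Group sample records by RC4 key, collecting the group_names seen per key."""
    clusters = defaultdict(lambda: {"group_names": set(), "sha256": []})
    for sample in samples:
        cluster = clusters[sample["rc4_key"]]
        cluster["group_names"].add(sample["group_name"])
        cluster["sha256"].append(sample["sha256"])
    return dict(clusters)
```

A cluster whose `group_names` set has more than one entry is an instance of the pattern described above: one key, several group_names, presumably one botnet.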
The fact that command and control servers with different IP addresses contacted by different samples using the same RC4 key are returning the same payloads and blocking the same client_id for their victims also suggests that these IP addresses actually only act as fronts for a main command and control server to which all Bumblebee connections are relayed.
Another interesting element of the behavior of these botnets is how the toolset dropped by Bumblebee into victim machines differs depending on the kind of target. Of the 5 commands supported by Bumblebee, 3 deploy a threat by downloading code from the C2 server and executing it:
DEX: deploys an executable to disk and runs it.
DIJ: injects a library into a process and executes it.
SHI: injects shellcode into a process and executes it.
As part of our ongoing monitoring of various Bumblebee botnets, we have been tracking differences in behavior based on factors such as type of network or geolocation. While the victim’s geographical location didn’t seem to have any effect on the malware behavior, we observed a very stark difference between the way Bumblebee behaves after infecting machines that are part of a domain (a logical group of networked machines that share the same Active Directory server), as opposed to machines isolated from a company network that are connected to a workgroup (a Microsoft term for a peer-to-peer local area network).
If the victim is connected to WORKGROUP, in most cases it receives the DEX command (Download and Execute), which causes it to drop a file to disk and run it. These payloads are usually common stealers like Vidar Stealer, or banking trojans.
On the other hand, if the victim is connected to an AD domain, it generally receives DIJ (Download and Inject) or SHI (Download shellcode and Inject) commands.
In these cases, the resulting threats have been payloads from more advanced post-exploitation frameworks, such as Cobalt Strike, Sliver or Meterpreter.
In these cases, it has also been observed that regardless of the IP of the command and control server and the group_name field, samples with the same RC4 key drop the same Cobalt Strike beacons with the same Team servers, which has proven to be a very useful means of relating different samples to each other as part of the same botnet.
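The workgroup versus domain split described above can be turned into a small triage helper that reads the domain out of a captured check-in and returns the commands most often observed for that kind of victim. This is a sketch, not a rule: the mapping reflects "in most cases" observations only, the function name is hypothetical, and the parser assumes the `sys_version` value looks like the check-in example earlier, with fields separated either by a real newline or by a literal backslash followed by `n`.

```python
import re

# Capture the text after "Domain name:" up to a backslash or a newline.
_DOMAIN_RE = re.compile(r"Domain name:\s*([^\\\n]+)")


def expected_commands(sys_version: str):
    """Return the commands typically observed for this victim, or None if unknown."""
    match = _DOMAIN_RE.search(sys_version)
    if match is None:
        return None
    domain = match.group(1).strip()
    if domain.upper() == "WORKGROUP":
        return {"DEX"}          # usually stealers or banking trojans
    return {"DIJ", "SHI"}       # usually post-exploitation frameworks
```

A mismatch between the expected and the actually received command for a given victim may be worth a closer look, since it departs from the pattern reported here.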
One last interesting feature of the payloads dropped by Bumblebee is that both the binaries downloaded using the DEX command and those downloaded with the DIJ command are in many cases packed using the same Bumblebee packer.
Analyzing the behavior of the command and control servers used by Bumblebee operators, we have observed how they have tweaked the way their infection chains behave, sometimes in ways that served to drastically expand the number of active victims and volume of C2 traffic.
For the moment, behavior up to the deployment of the 2nd-stage payload is very similar even across different Bumblebee botnets, but subsequent behavior, starting with the choice of 2nd-stage payload, diverges sharply based on the RC4 key used. This behavior can also serve to group activity into different clusters, on top of using the RC4 key itself.
Unlike other threats that use third-party packers and off-the-crimeware-shelf antivirus evasion tools, Bumblebee uses its own packer both for the threat itself and for some of the samples it deploys on victims’ computers, just like other advanced malware families such as Trickbot. While this allows Bumblebee operators greater flexibility in changing behavior and adding features, the use of unique custom tools also serves as a method to quickly identify Bumblebee activity in the wild.
Check Point’s security products are designed to prevent any cyber attack and protect against threats such as those described in this blog.