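Automated collection is a technique in which an attacker uses scripts, command-line tools, or API interfaces to automatically gather target data from a system already under their control. Traditional automated collection typically appears as centralized, high-frequency file traversal and data transfer. It can be detected by monitoring abnormal file-access patterns (such as a large number of file-open operations within a short time), spotting unusual combinations of command-line arguments, or identifying abnormal cloud API call rates. Corresponding defensive measures include fine-grained file-access auditing, process behavior baselines, and anomaly detection for cloud operations.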
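As an illustration of the "many file opens in a short time" signal described above, the following is a minimal sketch of a sliding-window counter. The event shape `(timestamp, process, path)`, the function name `find_file_access_bursts`, and the default window and threshold are assumptions made for this example, not part of any specific audit product.

```python
from collections import defaultdict, deque


def find_file_access_bursts(events, window_seconds=60, threshold=100):
    """Return {process: timestamp} for the first moment a process exceeded
    `threshold` file opens inside a sliding window of `window_seconds`.

    `events` is an iterable of (timestamp_seconds, process_name, path)
    tuples sorted by timestamp. This record shape is a hypothetical
    audit-log format chosen for the sketch.
    """
    recent = defaultdict(deque)  # process -> timestamps still inside the window
    flagged = {}
    for ts, proc, _path in events:
        window = recent[proc]
        window.append(ts)
        # Drop opens that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > threshold and proc not in flagged:
            flagged[proc] = ts
    return flagged
```

In practice the threshold would be tuned per host role, since backup agents and indexers legitimately open many files in bursts.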
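To evade traditional detection, attackers have developed deeply covert automated collection techniques. Through behavioral-feature camouflage, encrypted transmission channels, and obfuscated timing patterns, they blend data theft into the target system's normal operations and maintenance workflows, forming a new "low-signature, high-persistence" collection paradigm.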
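The core of current stealthy automated collection is building a multi-dimensional case that the behavior is legitimate. Attackers precisely imitate the operational characteristics of system administration tools, so that at the process-behavior level malicious collection programs are indistinguishable from legitimate operations tools. They transmit data over encrypted channels that the target system has already authorized, so that outbound traffic appears fully legitimate at the protocol level. They also use temporal dispersion to break large-scale data theft into micro-operations that follow the system's job-scheduling patterns, staying under detection thresholds based on operation frequency and volume. The three techniques share common traits: deep exploitation of system trust mechanisms (such as legitimate process identities and authorized encrypted communication), precise adaptation to the environment's behavioral baseline (such as file-operation patterns and job-scheduling rhythms), and dynamic evasion of single-dimension detection rules (such as traffic signatures and timing periodicity). This multi-dimensional stealth strategy defeats traditional detection methods that rely on a single feature (such as a specific command-line argument or a fixed IP connection) and forces defenders toward correlated behavioral analysis in complex environments.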
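The evolution of stealth techniques requires defenses to build multi-dimensional behavioral modeling capabilities, with emphasis on encrypted-traffic metadata analysis, fine-grained process behavior profiling, and job-pattern recognition across time windows. Defenders also need full-lifecycle monitoring of cloud API operations in order to detect and block covert data collection.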
| Effect type | Present |
|---|---|
| Feature camouflage | ✅ |
| Behavioral transparency | ❌ |
| Data masking | ✅ |
| Spatiotemporal trace dilution | ✅ |
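Attackers precisely imitate the file-operation patterns and process behavior of system administration tools, so that automated collection closely matches legitimate operations in dimensions such as command-line arguments and system-call sequences. For example, they embed data-theft commands in a log-compression script or masquerade as a database backup process, using the system's trust mechanisms to cover malicious behavior.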
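One way a defender might look past a familiar process name is to compare what a tool actually touches against the directories that tool is expected to work in. The sketch below assumes a hand-maintained profile mapping process names to allowed directory prefixes; the function name `out_of_profile_accesses`, the profile format, and the event shape `(process, path)` are all hypothetical, and paths are assumed to be already normalized POSIX paths.

```python
from pathlib import PurePosixPath


def out_of_profile_accesses(events, profiles):
    """Return (process, path) pairs where a profiled process touched a file
    outside its expected directories.

    `events`   : iterable of (process_name, absolute_posix_path) tuples.
    `profiles` : {process_name: [allowed directory, ...]} (hypothetical,
                 maintained by the defender per environment).
    Processes without a profile are skipped rather than flagged.
    """
    violations = []
    for proc, path in events:
        allowed = profiles.get(proc)
        if allowed is None:
            continue
        target = PurePosixPath(path)
        ancestors = (target, *target.parents)
        # A path is in profile if one of the allowed directories is the path
        # itself or one of its ancestors (component-wise, not string prefix).
        if not any(PurePosixPath(directory) in ancestors for directory in allowed):
            violations.append((proc, path))
    return violations
```

For instance, a process named like a log-rotation tool that reads documents under a user's home directory would fall outside a profile restricted to `/var/log`, even though its name and arguments look routine.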
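Stolen data is transmitted with end-to-end encryption over legitimate encrypted channels such as HTTPS and SSH, so defenders cannot identify the exfiltration of sensitive information by inspecting traffic content. Encryption not only protects the data content; more importantly, the ubiquity of encrypted traffic hides the malicious transfer itself.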
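Because the payload is opaque, analysis has to fall back on flow metadata such as who talked to whom and how much was sent. The following is a minimal sketch of that idea, assuming flow records of the form `(source_host, destination, bytes_out)` and a precomputed per-pair baseline. The function name `flag_unusual_uploads`, the `ratio` multiplier, and the `new_dest_floor` cutoff are illustrative assumptions, not recommended values.

```python
from collections import defaultdict


def flag_unusual_uploads(flows, baseline_bytes, ratio=5.0, new_dest_floor=10_000_000):
    """Flag (source, destination) pairs whose outbound volume looks unusual.

    `flows`          : iterable of (source_host, destination, bytes_out).
    `baseline_bytes` : {(source_host, destination): typical bytes per period}.
    A pair is flagged if it has no baseline and sent at least
    `new_dest_floor` bytes, or if it sent more than `ratio` times its baseline.
    """
    totals = defaultdict(int)
    for src, dest, bytes_out in flows:
        totals[(src, dest)] += bytes_out

    alerts = []
    for key, total in sorted(totals.items()):
        expected = baseline_bytes.get(key)
        if expected is None:
            if total >= new_dest_floor:
                alerts.append((key, total, "new destination"))
        elif total > ratio * expected:
            alerts.append((key, total, "volume above baseline"))
    return alerts
```

This only considers volume; a fuller treatment would also weigh timing, certificate details, and destination reputation.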
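Through temporal dispersion, centralized data collection is broken into long-running, low-frequency micro-operations, each of which stays below detection thresholds. Combined with dynamic task scheduling, the timing of data-theft jobs closely coincides with system maintenance windows, undermining anomaly detection based on operation frequency and temporal regularity.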
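A short-window rate threshold misses this kind of activity by design, so one possible counter is to aggregate over a much longer horizon and count how many distinct files each actor has touched in total. The sketch below assumes events of the form `(timestamp_seconds, actor, path)`; the function name `find_slow_collectors`, the seven-day horizon, and the threshold of 500 distinct files are assumptions for illustration only.

```python
from collections import defaultdict


def find_slow_collectors(events, horizon_seconds=7 * 24 * 3600, distinct_threshold=500):
    """Return {actor: distinct_file_count} for actors that touched at least
    `distinct_threshold` different files within the last `horizon_seconds`,
    measured back from the newest event.

    `events` is an iterable of (timestamp_seconds, actor, path) tuples
    (a hypothetical record shape); order does not matter.
    """
    events = list(events)
    if not events:
        return {}
    latest = max(ts for ts, _actor, _path in events)

    seen = defaultdict(set)  # actor -> distinct paths inside the horizon
    for ts, actor, path in events:
        if latest - ts <= horizon_seconds:
            seen[actor].add(path)

    return {
        actor: len(paths)
        for actor, paths in seen.items()
        if len(paths) >= distinct_threshold
    }
```

An actor reading one new file per hour never trips a per-minute counter, yet accumulates a large distinct-file count over a week, which is what this aggregation surfaces.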
| ID | Name | Description |
|---|---|---|
| G1030 | Agrius | Agrius used a custom tool, |
| S0622 | AppleSeed | AppleSeed has automatically collected data from USB drives, keystrokes, and screen images before exfiltration.[2] |
| G0006 | APT1 | APT1 used a batch script to perform a series of discovery techniques and saved the output to a text file.[3] |
| G0007 | APT28 | APT28 used a publicly available tool to gather and compress multiple documents on the DCCC and DNC networks.[4] |
| C0040 | APT41 DUST | APT41 DUST used tools such as SQLULDR2 and PINEGROVE to gather local system and database information.[5] |
| S0438 | Attor | Attor has automatically collected data about the compromised system.[6] |
| S0128 | BADNEWS | BADNEWS monitors USB devices and copies files with certain extensions to a predefined directory.[7] |
| S0239 | Bankshot | Bankshot recursively generates a list of files within a directory and sends them back to the control server.[8] |
| S1043 | ccf32 | ccf32 can be used to automatically collect files from a compromised host.[9] |
| G0114 | Chimera | Chimera has used custom DLLs for continuous retrieval of data from memory.[10] |
| S0244 | Comnie | Comnie executes a batch script to store discovery information in %TEMP%\info.dat and then uploads the temporary file to the remote C2 server.[11] |
| G0142 | Confucius | Confucius has used a file stealer to steal documents and images with the following extensions: txt, pdf, png, jpg, doc, xls, xlm, odp, ods, odt, rtf, ppt, xlsx, xlsm, docx, pptx, and jpeg.[12] |
| S0538 | Crutch | Crutch can automatically monitor removable drives in a loop and copy interesting files.[13] |
| S1111 | DarkGate | DarkGate searches for stored credentials associated with cryptocurrency wallets and notifies the command and control server when identified.[14] |
| G1003 | Ember Bear | Ember Bear engages in mass collection from compromised systems during intrusions.[15] |
| S0363 | Empire | Empire can automatically gather the username, domain name, machine name, and other information from a compromised system.[16] |
| G0053 | FIN5 | FIN5 scans processes on all victim systems in the environment and uses automated scripts to pull back the results.[17] |
| G0037 | FIN6 | FIN6 has used a script to iterate through a list of compromised PoS systems, copy and remove data to a log file, and to bind to events from the submit payment button.[18][19] |
| C0001 | Frankenstein | During Frankenstein, the threat actors used Empire to automatically gather the username, domain name, machine name, and other system information.[16] |
| S1044 | FunnyDream | FunnyDream can monitor files for changes and automatically collect them.[9] |
| G0047 | Gamaredon Group | Gamaredon Group has deployed scripts on compromised systems that automatically scan for interesting documents.[20] |
| S0597 | GoldFinder | GoldFinder logged and stored information related to the route or hops a packet took from a compromised machine to a hardcoded C2 server, including the target C2 URL, HTTP response/status code, HTTP response headers and values, and data received from the C2 node.[21] |
| S0170 | Helminth | A Helminth VBScript receives a batch script to execute a set of commands in a command prompt.[22] |
| S0260 | InvisiMole | InvisiMole can sort and collect specific documents as well as generate a list of all files on a newly inserted drive and store them in an encrypted file.[23][24] |
| G0004 | Ke3chang | Ke3chang has performed frequent and scheduled data collection from victim networks.[25] |
| S0395 | LightNeuron | LightNeuron can be configured to automatically collect files under a specified directory.[26] |
| S1101 | LoFiSe | LoFiSe can collect all the files from the working directory every three hours and place them into a password-protected archive for further exfiltration.[27] |
| G0045 | menuPass | menuPass has used the Csvde tool to collect Active Directory files and data.[28] |
| S0443 | MESSAGETAP | MESSAGETAP checks two files, keyword_parm.txt and parm.txt, for instructions on how to target and save data parsed and extracted from SMS message data from the network traffic. If an SMS message contained either a phone number, IMSI number, or keyword that matched the predefined list, it is saved to a CSV file for later theft by the threat actor.[29] |
| S0455 | Metamorfo | Metamorfo has automatically collected mouse clicks, continuous screenshots on the machine, and set timers to collect the contents of the clipboard and website browsing.[30] |
| S0339 | Micropsia | Micropsia executes an RAR tool to recursively archive files based on a predefined list of file extensions (.xls, .xlsx, .csv, .odt, .doc, .docx, .ppt, .pptx, .pdf, .mdb, .accdb, .accde, .txt).[31] |
| G0129 | Mustang Panda | Mustang Panda used custom batch scripts to collect files automatically from a targeted system.[32] |
| S0699 | Mythic | Mythic supports scripting of file downloads from agents.[33] |
| S0198 | NETWIRE | |
| S1131 | NPPSPY | NPPSPY collection is automatically recorded to a specified file on the victim machine.[35] |
| G0049 | OilRig | |
| C0014 | Operation Wocao | During Operation Wocao, threat actors used a script to collect information about the infected system.[37] |
| S1017 | OutSteel | OutSteel can automatically scan for and collect files with specific extensions.[38] |
| S1109 | PACEMAKER | PACEMAKER can enter a loop to read |
| S1091 | Pacu | Pacu can automatically collect data, such as CloudFormation templates, EC2 user data, AWS Inspector reports, and IAM credential reports.[40] |
| G0040 | Patchwork | Patchwork developed a file stealer to search C:\ and collect files with certain extensions. Patchwork also executed a script to enumerate all drives, store them as a list, and upload generated files to the C2 server.[7] |
| S0428 | PoetRAT | PoetRAT used file system monitoring to track modification and enable automatic exfiltration.[41] |
| S0378 | PoshC2 | PoshC2 contains a module for recursively parsing through files and directories to gather valid credit card numbers.[42] |
| S0238 | Proxysvc | Proxysvc automatically collects data about the victim and sends it to the control server.[43] |
| S1148 | Raccoon Stealer | Raccoon Stealer collects files and directories from victim systems based on configuration data downloaded from command and control servers.[44][45][46] |
| S0458 | Ramsay | Ramsay can conduct an initial scan for Microsoft Word documents on the local system, removable media, and connected network drives, before tagging and collecting them. It can continue tagging documents to collect with follow-up scans.[47] |
| G1039 | RedCurl | |
| S0684 | ROADTools | ROADTools automatically gathers data from Azure AD environments using the Azure Graph API.[50] |
| S1078 | RotaJakiro | Depending on the Linux distribution, RotaJakiro executes a set of commands to collect device information and sends the collected information to the C2 server.[51] |
| S0090 | Rover | Rover automatically collects files from the local system and removable drives based on a predefined list of file extensions on a regular timeframe.[52] |
| S0148 | RTM | RTM monitors browsing activity and automatically captures screenshots if a victim browses to a URL matching one of a list of strings.[53][54] |
| S0445 | ShimRatReporter | ShimRatReporter gathered information automatically, without instruction from a C2, related to the user and host machine that is compiled into a report and sent to the operators.[55] |
| G0121 | Sidewinder | Sidewinder has used tools to automatically collect system and network configuration information.[56] |
| S0491 | StrongPity | StrongPity has a file searcher component that can automatically collect and archive files based on a predefined list of file extensions.[57] |
| S0098 | T9000 | T9000 searches removable storage devices for files with a pre-defined list of file extensions (e.g., .doc, .ppt, .xls, .docx, .pptx, .xlsx). Any matching files are encrypted and written to a local user directory.[58] |
| S0467 | TajMahal | TajMahal has the ability to index and compress files into a send queue for exfiltration.[59] |
| G0027 | Threat Group-3390 | Threat Group-3390 ran a command to compile an archive of file types of interest from the victim user's directories.[60] |
| G0081 | Tropic Trooper | Tropic Trooper has collected information automatically using the adversary's USBferry attack.[61] |
| S0136 | USBStealer | For all non-removable drives on a victim, USBStealer executes automated collection of certain files for later exfiltration.[62] |
| S0476 | Valak | Valak can download a module to search for and build a report of harvested credential data.[63] |
| S0257 | VERMIN | VERMIN saves each collected file with the automatically generated format {0:dd-MM-yyyy}.txt.[64] |
| S0466 | WindTail | WindTail can identify and add files that possess specific file extensions to an array for archiving.[65] |
| G1035 | Winter Vivern | Winter Vivern delivered a PowerShell script capable of recursively scanning victim machines looking for various file types before exfiltrating identified files via HTTP.[66] |
| S0251 | Zebrocy | Zebrocy scans the system and automatically collects files with the following extensions: .doc, .docx, .xls, .xlsx, .pdf, .pptx, .rar, .zip, .jpg, .jpeg, .bmp, .tiff, .kum, .tlg, .sbx, .cr, .hse, .hsf, and .lhz.[67][68] |
| ID | Mitigation | Description |
|---|---|---|
| M1041 | Encrypt Sensitive Information | Encryption and off-system storage of sensitive information may be one way to mitigate collection of files, but may not stop an adversary from acquiring the information if an intrusion persists over a long period of time and the adversary is able to discover and access the data through other means. Strong passwords should be used on certain encrypted documents that use them to prevent offline cracking through Brute Force techniques. |
| M1029 | Remote Data Storage | Encryption and off-system storage of sensitive information may be one way to mitigate collection of files, but may not stop an adversary from acquiring the information if an intrusion persists over a long period of time and the adversary is able to discover and access the data through other means. |
| ID | Data Source | Data Component | Detects |
|---|---|---|---|
| DS0017 | Command | Command Execution | Monitor executed commands and arguments for actions that could be taken to collect internal data. |
| DS0022 | File | File Access | Monitor for unexpected files (e.g., .pdf, .docx, .jpg) being viewed in order to collect internal data. |
| DS0012 | Script | Script Execution | Any attempt to enable scripts on a system should be considered suspicious. If scripts are not commonly used on a system but are enabled, scripts running outside of patching cycles or other administrator functions are suspicious. Scripts should be captured from the file system when possible to determine their actions and intent. |
| DS0002 | User Account | User Account Authentication | Monitor Azure AD (Entra ID) Sign In logs for suspicious Applications authenticating to the Graph API or other sensitive Resources using User Agents attributed to scripting interpreters such as python or PowerShell. Analytic 1 - Suspicious applications, unusual user agents (e.g., python, PowerShell), anomalous IP addresses, and unmanaged devices |
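The sign-in analytic in the last row can be sketched as a simple filter over exported sign-in records. The dictionary keys used here (`user_agent`, `resource`, `app`) and the function name `suspicious_sign_ins` are hypothetical; they would need to be mapped to whatever fields the actual log export provides. The sketch covers only the user-agent and resource parts of the analytic, not IP or device checks.

```python
SCRIPTING_AGENT_MARKERS = ("python", "powershell")


def suspicious_sign_ins(records, sensitive_resources=("Microsoft Graph",)):
    """Return sign-in records that target a sensitive resource with a user
    agent that looks like a scripting interpreter.

    `records` is an iterable of dicts with the hypothetical keys
    'user_agent' and 'resource'; missing values are treated as empty.
    """
    hits = []
    for record in records:
        agent = (record.get("user_agent") or "").lower()
        resource = record.get("resource") or ""
        scripted = any(marker in agent for marker in SCRIPTING_AGENT_MARKERS)
        if scripted and resource in sensitive_resources:
            hits.append(record)
    return hits
```

User-agent strings are trivially spoofed, so a match is a lead for investigation rather than proof, and a clean result does not rule out scripted access.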