As part of our Understanding Cybersecurity Series (UCS) knowledge mobilization program, we design, develop, and release open-source cybersecurity analytics and data analyzers to support advanced research and practical Cyber Threat Intelligence (CTI) applications. Our contribution to open-source projects stems from our belief in the open-source culture, which we consider a driving force for accessible development and a better world. We are convinced that open-source projects promote innovation, collaboration, and healthy competition, making them valuable to the community.
Quick Links
- Message Queuing Telemetry Transport (MQTT) Network Traffic Analyzer
- IoT ZWave Network Traffic Analyzer
- DeFi Transaction Analyzer and Feature Extractor
- Volatility Memory Analyzer
18. UDP Network Traffic Analyzer
As part of the Understanding Cybersecurity Series (UCS), UDPFlowLyzer is a specialized UDP network traffic flow analyzer developed to extract comprehensive flow-level features from PCAP and PCAPNG network captures. The tool analyzes UDP communications and generates a rich set of statistical, temporal, volumetric, and behavioral characteristics, including UDP header attributes, packet timing metrics, traffic volume indicators, burst patterns, entropy measurements, percentile-based statistics, and advanced flow analytics. With configurable flow analysis capabilities and support for large-scale traffic processing, UDPFlowLyzer produces structured feature datasets that can be directly integrated into intrusion detection systems, traffic classification frameworks, anomaly detection models, and cybersecurity research pipelines.
Related published papers:
Jafari, S.; Shafi, M. and Lashkari, A. H., Unveiling Hierarchical Machine Learning UDP–QUIC Intrusion Detection: Protocol-Aware Flow Analysis and a New Generated DDoS Dataset, International Conference on Security and Cryptography (SECRYPT) 2026, Portugal
More Information & Download Source Code:
17. QUIC Network Traffic Flow AnaLyzer
As part of the Understanding Cybersecurity Series (UCS), QUICFlowLyzer is a lightweight, header-only QUIC network traffic analyzer that extracts packet- and flow-level features from PCAP and PCAPNG network captures without requiring payload decryption. The tool parses QUIC protocol header fields to generate structured flow records and feature tables suitable for network monitoring, traffic characterization, cybersecurity research, and machine learning applications. QUICFlowLyzer supports the analysis of individual PCAP files, large collections of traffic captures through batch processing, and VXLAN-decapsulated traffic environments. Extracted QUIC-specific features are exported in CSV format, enabling efficient integration with data analytics, intrusion detection, traffic classification, and behavioral modeling workflows.
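To illustrate what header-only QUIC parsing can look like, the following is a minimal Python sketch that reads the unencrypted fields of a QUIC long header (header form, version, and connection IDs) from a UDP payload. The function name and the returned field names are hypothetical and are not taken from QUICFlowLyzer; the packet type names assume QUIC version 1 (RFC 9000).

```python
import struct

# Long-header packet type names as defined for QUIC version 1 (RFC 9000).
LONG_PACKET_TYPES = {0: "Initial", 1: "0-RTT", 2: "Handshake", 3: "Retry"}


def parse_quic_long_header(payload: bytes):
    """Return the cleartext long-header fields, or None if not a long header.

    Hypothetical helper for illustration; no payload decryption is attempted.
    """
    # Smallest long header: 1 flag byte + 4 version bytes + 2 length bytes.
    if len(payload) < 7:
        return None
    first = payload[0]
    if not first & 0x80:  # most significant bit set means "long header"
        return None

    version = struct.unpack("!I", payload[1:5])[0]
    pos = 5

    dcid_len = payload[pos]
    pos += 1
    if len(payload) < pos + dcid_len + 1:
        return None
    dcid = payload[pos:pos + dcid_len]
    pos += dcid_len

    scid_len = payload[pos]
    pos += 1
    if len(payload) < pos + scid_len:
        return None
    scid = payload[pos:pos + scid_len]

    return {
        "version": version,
        "packet_type": LONG_PACKET_TYPES[(first & 0x30) >> 4],
        "dcid": dcid.hex(),
        "scid": scid.hex(),
    }


# Example: an Initial packet header with an 8-byte destination connection ID.
sample = bytes([0xC0]) + b"\x00\x00\x00\x01" + bytes([8]) + bytes(range(8)) + bytes([0])
header = parse_quic_long_header(sample)
```

Fields like these can be written out one row per packet and later grouped by connection ID to form flow records.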
Related published papers:
Jafari, S.; Shafi, M. and Lashkari, A. H., Unveiling Hierarchical Machine Learning UDP–QUIC Intrusion Detection: Protocol-Aware Flow Analysis and a New Generated DDoS Dataset, International Conference on Security and Cryptography (SECRYPT) 2026, Portugal
More Information & Download Source Code:
16. Message Queuing Telemetry Transport (MQTT) Network Traffic Analyzer
As part of the Understanding Cybersecurity Series (UCS), MQTTFlowLyzer is an open-source Python library that extracts features from MQTT-based network traffic. It analyzes all MQTT packets in a pcap file and generates a dataset suitable for developing an Intrusion Detection System (IDS). MQTTFlowLyzer is a session-based analyzer: it extracts packet-level features and aggregates them into sessions. All packets in a session must share the same source/destination IP addresses, source/destination port numbers, client ID, and pub-sub topic. A session starts with a CONNECT packet and ends either with a DISCONNECT packet or when a configured constraint, such as the maximum session duration or the inactivity timeout, is reached. Sessions are bidirectional, with the forward direction determined by the CONNECT packet. MQTTFlowLyzer also extracts relevant features from the Network and Transport layers of each MQTT packet, derives new features from the raw ones, and computes statistics such as mean, median, variance, and skewness for selected features in both directions. In total, it can extract 321 features from a pcap file.
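The session rules described above can be sketched in a few lines of Python. This is an illustrative simplification, not the library's implementation: the `Packet` and `Session` classes, the `build_sessions` function, and the timeout defaults are all hypothetical, packets are assumed to be already parsed and annotated with a client ID, and topic matching is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:  # hypothetical, already-parsed MQTT packet record
    ts: float
    src: tuple        # (ip, port)
    dst: tuple        # (ip, port)
    client_id: str
    mqtt_type: str    # "CONNECT", "PUBLISH", "DISCONNECT", ...


@dataclass
class Session:
    client: tuple     # endpoint that sent CONNECT; defines the forward direction
    start: float
    last_seen: float
    fwd: list = field(default_factory=list)
    bwd: list = field(default_factory=list)


def build_sessions(packets, max_duration=120.0, idle_timeout=30.0):
    """Group packets into bidirectional sessions (assumed timeout values)."""
    open_sessions, finished = {}, []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        # Same endpoints in either direction plus the same client ID.
        key = (frozenset((p.src, p.dst)), p.client_id)
        s = open_sessions.get(key)
        # Close a session that exceeded the idle or duration constraint.
        if s and (p.ts - s.last_seen > idle_timeout or p.ts - s.start > max_duration):
            finished.append(open_sessions.pop(key))
            s = None
        if s is None:
            if p.mqtt_type != "CONNECT":
                continue  # this sketch ignores packets outside any session
            s = open_sessions[key] = Session(client=p.src, start=p.ts, last_seen=p.ts)
        (s.fwd if p.src == s.client else s.bwd).append(p)
        s.last_seen = p.ts
        if p.mqtt_type == "DISCONNECT":
            finished.append(open_sessions.pop(key))
    finished.extend(open_sessions.values())
    return finished


c, b = ("10.0.0.5", 40000), ("10.0.0.1", 1883)
demo = [
    Packet(0.0, c, b, "sensor-1", "CONNECT"),
    Packet(0.1, b, c, "sensor-1", "CONNACK"),
    Packet(1.0, c, b, "sensor-1", "PUBLISH"),
    Packet(2.0, c, b, "sensor-1", "DISCONNECT"),
    Packet(100.0, c, b, "sensor-1", "CONNECT"),
    Packet(101.0, c, b, "sensor-1", "PUBLISH"),
    Packet(200.0, c, b, "sensor-1", "PUBLISH"),  # arrives after the idle timeout
]
sessions = build_sessions(demo)
```

In the demo, the first session is closed by DISCONNECT and the second by the inactivity timeout. Per-direction statistics (mean, variance, and so on) would then be computed over each session's `fwd` and `bwd` lists.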
Related published papers:
"", Arefeh Kouhi and Arash Habibi Lashkari, The Journal of Supercomputing, Vol. 82, article number 334, 2026
More Information & Download Source Code:
15. Vehicular Controller Area Network (CAN) Signal Analyzer
As part of the Understanding Cybersecurity Series (UCS), VehCANSigLyzer extracts both timing-based and signal-level features from raw CAN traffic to detect abnormal message injection behavior. It derives two timing features to capture disruptions in normal transmission patterns: time_interval, the time difference between consecutive CAN frames, and aid_time_interval, the time difference between consecutive frames sharing the same arbitration ID (AID). In addition, signal-level features are decoded from each frame's data field using the cantools library and an appropriate vehicle DBC file (e.g., hyundai_kia_generic.dbc from the open-source opendbc repository). The final feature set includes the arbitration ID (converted to decimal), the two timing-based features, and over 500 decoded signal features, providing a comprehensive representation of temporal and semantic CAN bus behavior for robust anomaly detection.
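As a rough illustration of the two timing features, the Python sketch below computes time_interval and aid_time_interval from a list of (timestamp, arbitration ID) pairs. The function name, the input layout, and the choice of 0.0 for the first occurrence are assumptions made for this example; signal decoding with cantools is not shown.

```python
def add_timing_features(frames):
    """Compute timing features for CAN frames sorted by timestamp.

    frames: iterable of (timestamp_seconds, arbitration_id_hex) pairs.
    The first frame overall, and the first frame of each AID, get 0.0
    (an assumption of this sketch).
    """
    prev_ts = None
    last_seen_by_aid = {}
    rows = []
    for ts, aid_hex in frames:
        aid = int(aid_hex, 16)  # arbitration ID converted to decimal
        time_interval = 0.0 if prev_ts is None else ts - prev_ts
        if aid in last_seen_by_aid:
            aid_time_interval = ts - last_seen_by_aid[aid]
        else:
            aid_time_interval = 0.0
        rows.append({
            "aid": aid,
            "time_interval": time_interval,
            "aid_time_interval": aid_time_interval,
        })
        prev_ts = ts
        last_seen_by_aid[aid] = ts
    return rows


frames = [(0.0, "0x100"), (0.5, "0x200"), (1.0, "0x100"), (1.25, "0x100")]
features = add_timing_features(frames)
```

An injected frame tends to show up as an unusually small aid_time_interval for its arbitration ID, because it arrives between two legitimate periodic transmissions.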
Related published papers:
More Information & Download Source Code:
14. IoT ZWave Network Traffic Analyzer
As part of the Understanding Cybersecurity Series (UCS), IoT-ZwaveNetLyzer is an open-source Python project developed for analyzing Z-Wave network traffic in IoT environments. It generates bidirectional traffic flows and extracts over 400 statistical and protocol-level features, such as signal strength (RSSI), packet speed, acknowledgment ratios, and channel usage patterns. By profiling both forward and backward communication streams, it enables a fine-grained understanding of device behavior and communication efficiency across smart home and industrial IoT networks. Designed for scalability and transparency, IoT-ZwaveNetLyzer supports both Linux and Windows environments and can be customized through simple configuration files, enabling flexible use in research and experimentation. The analyzer is a crucial tool for building and validating IoT intrusion detection systems, providing a structured framework to characterize device behavior, network stability, and traffic anomalies in both real and simulated environments.
Related published papers:
"", Mohammad Moein Shafi and Arash Habibi Lashkari, The Journal of Internet of Things, October 2025
More Information & Download Source Code:
13. Blockchain Security: DeFi Transaction Analyzer and Feature Extractor
As part of the Understanding Cybersecurity Series (UCS), DeFiTransLyzer is an open-source Python framework developed to extract and analyze features from Ethereum wallets and transactions for DeFi research. It includes a Wallet Analyzer to summarize wallet behavior (gas usage, balances, error rates, address interactions) and a Transaction Analyzer to parse transaction metrics (e.g., event logs and token transfers). Together, they provide a structured toolkit for profiling blockchain activity to support vulnerability research, behavioral modeling, and DeFi security studies, especially for AI-based solutions.
Related published papers:
"", Arash Habibi Lashkari, Sepideh Hajihosseinkhani, Joshua Duarte, Isabella Lopez, Ziba Habibi Lashkari, Sergio Rios-Aguilar, Blockchain: Research and Applications, Available online 3 September 2025
More Information & Download Source Code:
12. Smart Contracts Vulnerable Segment Analytics
As part of the Understanding Cybersecurity Series (UCS), SCsVulSegLytix is a learning-based analytics framework for detecting and extracting vulnerable segments in smart contracts (SCs). It leverages a Transformer model - namely, Bidirectional Encoder Representations from Transformers (BERT) - trained with contract-level labels to extract vulnerable and secure segments from contracts. Thanks to its novel use of a post hoc interpretability technique, it highlights vulnerable segments without requiring expensive line-level annotations during training. It also improves on graph-based methods by avoiding their costly pre-processing phase. Covering a broad range of SC vulnerabilities, SCsVulSegLytix outperforms prior methods in both accuracy and computational complexity. Its goal is to help developers and security auditors analyze SC security accurately by providing a fine-grained view of vulnerability locations without sacrificing efficiency or ease of use.
More Information & Download Source Code:
11. Smart Contracts Vulnerability Analyzer
As part of the Understanding Cybersecurity Series (UCS), SCsVulLyzer V2.0 is an open-source Python project that extracts more than 240 features to profile Smart Contracts (SCs) for vulnerability detection on the Ethereum blockchain platform.
This version stands out by classifying features into compiler-based and non-compiler-based categories, enabling a broader scope for feature extraction than SCsVulLyzer V1.0. Compiler-based features, such as the Abstract Syntax Tree (AST) and the Application Binary Interface (ABI), are derived post-compilation, while non-compiler-based features leverage natural language processing techniques tailored to identify critical keywords in the source code. Additionally, the tool introduces three new feature categories (Contract Information, Source Code Information, and Solidity Information) that quantify metrics such as function counts, statements, loops, and lines of code. These advancements allow for more granular and in-depth analysis of smart contracts, enhancing the overall utility of SCsVulLyzer V2.0. Another notable enhancement in this version is the introduction of 'bytecode entropy', a measure of randomness in the bytecode that indicates unpredictability and complexity. This metric is particularly valuable in fields like cryptography and anomaly detection.
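One common way to quantify the randomness of bytecode is Shannon entropy over its byte values. The Python sketch below shows that calculation under the assumption that the entropy is computed per byte; the exact definition used by SCsVulLyzer is not stated here, and the function name is hypothetical.

```python
import math
from collections import Counter


def bytecode_entropy(bytecode_hex: str) -> float:
    """Shannon entropy (bits per byte) of hex-encoded bytecode.

    Returns a value between 0.0 (one repeated byte) and 8.0 (uniform bytes).
    Byte-level granularity is an assumption of this sketch.
    """
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    data = bytes.fromhex(bytecode_hex)
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


low = bytecode_entropy("0x00000000")    # a single repeated byte value
two = bytecode_entropy("0x00ff")        # two equally likely byte values
four = bytecode_entropy("0x00010203")   # four equally likely byte values
```

The entropy grows with the number of distinct byte values and with how evenly they are distributed, which is why it can serve as a compact complexity indicator alongside the other features.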
Related published papers:
"", Sepideh HajiHosseinKhani, Arash Habibi Lashkari, Ali Mizani Oskui, Blockchain: Research and Applications, December 2024, 100253
More Information & Download Source Code:
10. Application Layer Flow Analyzer
As part of the Understanding Cybersecurity Series (UCS), ALFlowLyzer generates bidirectional flows from the Application Layer of network traffic, in which the first packet determines the forward (source to destination) and backward (destination to source) directions. Hence, the statistical time-related features can be calculated separately in the forward and backward directions. Additional functionalities include selecting features from the list of existing features, adding new features, and controlling the duration of flow timeout.
Related published papers:
Unveiling Malicious DNS Behavior Profiling and Generating Benchmark Dataset through Application Layer Traffic Analysis, MohammadMoein Shafi, Arash Habibi Lashkari, Hardhik Mohanty, Computers and Electrical Engineering, Volume 118, Part B, September 2024, 109436
More Information & Download Source Code:
9. Network and Transport Layers Flow Analyzer
As part of the Understanding Cybersecurity Series (UCS), NTLFlowLyzer generates bidirectional flows from the Network and Transport Layers of network traffic, where the first packet determines the forward (source-to-destination) and backward (destination-to-source) directions. Hence, the statistical time-related features can be calculated separately in the forward and backward directions. Additional functionalities include selecting features from the list of existing features, adding new features, and controlling the flow timeout duration.
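The idea of letting the first packet fix the forward direction can be sketched as follows. This minimal Python example groups packets into bidirectional flows by 5-tuple and computes one time-related feature (mean inter-arrival time) per direction. The tuple layout and the function names are assumptions for illustration, and flow timeouts are omitted to keep the sketch short.

```python
from statistics import mean


def build_flows(packets):
    """Group packets into bidirectional flows keyed by the first packet's 5-tuple.

    packets: iterable of (ts, src_ip, src_port, dst_ip, dst_port, proto, length),
    sorted by timestamp. Flow timeouts are not handled in this sketch.
    """
    flows = {}
    for ts, sip, sport, dip, dport, proto, length in packets:
        fwd_key = (sip, sport, dip, dport, proto)
        bwd_key = (dip, dport, sip, sport, proto)
        if fwd_key in flows:
            flows[fwd_key]["fwd"].append((ts, length))
        elif bwd_key in flows:
            # Same flow, opposite direction to its first packet.
            flows[bwd_key]["bwd"].append((ts, length))
        else:
            # First packet of a new flow defines the forward direction.
            flows[fwd_key] = {"fwd": [(ts, length)], "bwd": []}
    return flows


def mean_iat(direction_packets):
    """Mean inter-arrival time within one direction (0.0 if fewer than 2 packets)."""
    times = [ts for ts, _ in direction_packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return mean(gaps) if gaps else 0.0


pkts = [
    (0.0, "10.0.0.1", 5000, "10.0.0.2", 53, "UDP", 60),
    (0.5, "10.0.0.2", 53, "10.0.0.1", 5000, "UDP", 120),
    (1.0, "10.0.0.1", 5000, "10.0.0.2", 53, "UDP", 60),
    (2.0, "10.0.0.1", 5000, "10.0.0.2", 53, "UDP", 60),
]
flows = build_flows(pkts)
flow = flows[("10.0.0.1", 5000, "10.0.0.2", 53, "UDP")]
```

Keeping the two directions in separate lists makes it straightforward to compute any statistic once for `fwd` and once for `bwd`.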
Related published papers:
MohammadMoein Shafi, Arash Habibi Lashkari, Arousha Haghighian Roudsari, "NLFlowLyzer: Toward generating an intrusion detection dataset and intruders behavior profiling through network layer traffic analysis and pattern extraction", Computers & Security, 2024, 104160, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2024.104160.
More Information & Download Source Code:
8. Benign User Profiler
As part of the Understanding Cybersecurity Series (UCS), BUP is responsible for profiling the abstract behavior of human interactions and generating naturalistic, benign background traffic. Profiles can be applied to a diverse range of network protocols with different topologies because they represent the abstract properties of human and attack behavior. Once a benign profile is derived from users, an agent or human operator can generate realistic benign events on the network. Organizations and researchers can use this approach to easily generate realistic benign data; therefore, there is no need to anonymize data sets.
Related published papers:
MohammadMoein Shafi, Arash Habibi Lashkari, Vicente Rodriguez, and Ron Nevo, "Toward Generating a New Realistic Cloud_based Distributed Denial of Service (DDoS) Dataset and Intrusion Traffic Characterization", Information, Vol. 15, 2024
More Information & Download Source Code:
7. Smart Contracts Vulnerability Analyzer
As part of the Understanding Cybersecurity Series (UCS), SCsVulLyzer is a Python-based tool that analyzes and extracts key metrics from Ethereum smart contracts written in Solidity. It compiles the contract's source code to obtain its abstract syntax tree (AST), bytecode, and opcodes. The analyzer calculates the entropy of the bytecode to assess its randomness and security, determines the frequency of certain opcodes to understand the contract's complexity, and evaluates the usage of key Solidity keywords to gauge coding patterns. This modular and extensible tool provides a comprehensive snapshot of a smart contract's structure and behavior, helping developers and auditors optimize and secure Ethereum blockchain applications.
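To give a sense of how opcode frequencies can be obtained from compiled bytecode, here is a minimal Python sketch that walks EVM bytecode and counts opcodes, skipping the immediate data that follows PUSH1 to PUSH32. The function name is hypothetical, only a small subset of opcode names is included (unknown bytes are reported in hex), and this is not presented as the tool's own routine.

```python
from collections import Counter

# Small subset of EVM opcode names; anything else is reported as a hex byte.
OPCODE_NAMES = {
    0x00: "STOP", 0x01: "ADD", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0xF1: "CALL",
    0xF3: "RETURN", 0xFD: "REVERT", 0xFF: "SELFDESTRUCT",
}


def opcode_frequencies(bytecode_hex: str) -> Counter:
    """Count opcodes in hex-encoded EVM bytecode (illustrative sketch)."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    counts = Counter()
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of immediate data, not opcodes.
            width = op - 0x5F
            counts[f"PUSH{width}"] += 1
            i += 1 + width
        else:
            counts[OPCODE_NAMES.get(op, f"0x{op:02x}")] += 1
            i += 1
    return counts


# Typical prologue of compiled Solidity code: PUSH1 0x80, PUSH1 0x40, MSTORE.
prologue = opcode_frequencies("0x6080604052")
# PUSH2 0xffff followed by STOP: the 0xff data bytes are not opcodes.
pushed = opcode_frequencies("0x61ffff00")
```

Skipping push data matters: without it, data bytes such as 0xff would be miscounted as SELFDESTRUCT and distort the frequency profile.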
Related published papers:
Sepideh Hajihosseinkhani, Arash Habibi Lashkari, Ali Mizani Oskui, "Unveiling Vulnerable Smart Contracts: Toward Profiling Vulnerable Smart Contracts using Genetic Algorithm and Generating Benchmark Dataset", Blockchain: Research and Applications, Vol. 4, December 2023
More Information & Download Source Code:
6. Authorship Attribution Analyzer
The source code of a program often contains attributes and peculiarities that can be used to identify its author, as they reflect individual coding styles, much like a writer's specific, identifiable handwriting. These stylistic or pattern variations range from very basic artifacts in the code layout and comments to subtle habits in the program's control flow or syntax. The challenging task of identifying the author of the source code based on these attributes is called Source Code Authorship Attribution (SCAA). AuthAttLyzer is a source code analyzer that can extract several features, including N-grams, word-based embeddings, and Abstract Syntax Tree (AST) features.
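As a small illustration of two of these feature families, the Python sketch below counts character N-grams in a source string and counts AST node types using Python's built-in `ast` module. The function names are hypothetical, and Python source is used only because its parser ships with the standard library; the languages and exact feature definitions handled by AuthAttLyzer are not specified here.

```python
import ast
from collections import Counter


def char_ngrams(source: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in a source string."""
    return Counter(source[i:i + n] for i in range(len(source) - n + 1))


def ast_node_counts(python_source: str) -> Counter:
    """Count AST node types in Python source (illustrative AST feature)."""
    tree = ast.parse(python_source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


snippet = "def f():\n    return 1\n"
ngrams = char_ngrams("abab", n=2)
nodes = ast_node_counts(snippet)
```

N-gram counts capture layout and naming habits, while AST node counts capture structural preferences (for example, loops versus comprehensions); both can be turned into fixed-length vectors for a classifier.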
Related published papers:
Abhishek Chopra, Nikhill Vombatkere, Arash Habibi Lashkari, "AuthAttLyzer: A Robust defensive distillation-based Authorship Attribution framework", The 12th International Conference on Communication and Network Security (ICCNS), China, 2022
More Information & Download Source Code:
5. PDF Malware Analyzer
Over the years, PDF has been the most widely used document format due to its portability and reliability. Unfortunately, its popularity and advanced features have allowed attackers to exploit it in numerous ways, and several critical PDF features can be misused to deliver a malicious payload. This program extracts 31 different features from a user-specified set of PDF files and writes them to a CSV file. The resulting CSV file can be studied further for a variety of purposes, most importantly for detecting malicious PDF files.
Related published papers:
Maryam Issakhani, Princy Victor, Ali Tekeoglu, and Arash Habibi Lashkari, "PDF Malware Detection Based on Stacking Learning", The International Conference on Information Systems Security and Privacy, February 2022
More Information & Download Source Code:
4. IMAP Bot AnaLyzer
Credential stuffing is an attack that uses stolen account credentials, usually sourced from data breaches. It exploits the fact that many people use the same username and password across multiple accounts. Credential stuffing has become a major concern for the Internet Message Access Protocol (IMAP), a popular method for accessing electronic mail and news messages maintained on a remote server. A significant vulnerability in IMAP and other legacy email protocols is that they cannot support multi-factor authentication (MFA) and rely solely on a username and password, leaving them susceptible to credential stuffing. As bots generally carry out credential stuffing attacks, a promising countermeasure is to identify and block them before they can log in. Our objective is to use two types of behavioral biometrics - mouse dynamics and keystroke dynamics - to profile humans and bots and distinguish between them. In this project, we introduce a supervised learning bot detection system based on mouse and keystroke dynamics and compare the classification performance of the Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) machine learning algorithms to identify which model achieves the best overall result.
Related published papers:
"Detecting IMAP Credential Stuffing Bots Using Behavioural Biometrics", Ashley Barkworth, Rehnuma Tabassum and Arash Habibi Lashkari, 12th International Conference on Communication and Network Security (ICCNS2022), China
More Information & Download Source Code:
3. Volatility Memory Analyzer
Memory forensics is a fundamental step in inspecting malicious activities during a live malware infection. Memory analysis not only captures malware footprints but also collects several essential features that may be used to extract hidden original code from obfuscated malware. Significant effort has gone into analyzing volatile memory using a range of tools and approaches, which fetch relevant information from the kernel and user space of the operating system to investigate running malware. However, this process can be accelerated if the most dominant features required for malware classification are readily available. Volatility Memory Analyzer (VolMemLyzer) is a Python tool that uses the Volatility framework to extract more than 36 features from a memory snapshot for analyzing malicious activities.
Related published papers:
Arash Habibi Lashkari, Beiqi Li, Tristan Lucas Carrier, Gurdip Kaur, "", Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS), IEEE 978-1-7281-6937-8/20, Canada, ON, McMaster University, 2021
More Information & Download Source Code:
2. DNS over HTTPS (DoH) Analyzer
A set of tools to capture HTTPS traffic, extract statistical and time-series features from it, and analyze them with a focus on detecting and characterizing DoH (DNS-over-HTTPS) traffic.
Related published papers:
Mohammadreza MontazeriShatoori, Logan Davidson, Gurdip Kaur and Arash Habibi Lashkari, "", The 5th Cyber Science and Technology Congress (2020) (CyberSciTech 2020), Vancouver, Canada, August 2020
More Information & Download Source Code:
1. Static and Dynamic Android App Analyzer
This research focuses on classifying Android samples using static and dynamic analysis. The first version of this package covers data collection and static feature extraction. The second version focuses on developing an AI-based classification model for static features. The third version adds the dynamic analysis module and related features to improve the classifier.
Related published papers:
Abir Rahali, Arash Habibi Lashkari, Gurdip Kaur, Laya Taheri, Francois Gagnon, and Frédéric Massicotte, "", 10th International Conference on Communication and Network Security, Tokyo, Japan, November 2020
David Sean Keyes, Beiqi Li, Gurdip Kaur, Arash Habibi Lashkari, Francois Gagnon, Frédéric Massicotte, "", Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS), IEEE 978-1-7281-6937-8/20, Canada, ON, McMaster University, 2021
More Information & Download Source Code:
