The specific research questions to be addressed will include: (1) How did instructors perceive the content’s efficacy? (2) What factors influence an instructor’s participation in curricular change? (3) What are the obstacles or considerations for AI integration into cybersecurity curricula? (4) How do students perceive content efficacy? (5) What (if any) influence do the modules have on student interest and engagement? This project answers the call for advances in education research at the intersection of cybersecurity and AI through a fully interdependent and integrated approach that draws on the expertise of the team. It also leverages widely accepted theoretical frameworks and methods to evaluate and assess the effectiveness of the work to ensure high impact and potential for future scale-up.
This project is supported by a special initiative of the Secure and Trustworthy Cyberspace (SaTC) program to foster new, previously unexplored, collaborations between the fields of cybersecurity, artificial intelligence, and education. The SaTC program aligns with the Federal Cybersecurity Research and Development Strategic Plan and the National Privacy Research Strategy to protect and preserve the growing social and economic benefits of cyber systems while ensuring security and privacy.
Project Description
Enabling Artificial Intelligence into Cybersecurity Education: A Comprehensive Data-driven Approach
Motivations & Scope
Cybersecurity researchers and practitioners have observed that modern cybersecurity methods increasingly rely on AI techniques. However, the cybersecurity curriculum has not been updated to integrate these topics and techniques. We therefore consider it imperative to identify the AI topics most commonly used within cybersecurity in order to build an explicit AI module that integrates AI concepts into cybersecurity courses.
We ask the questions: What are the most common correlations between AI and cybersecurity topics? What are suitable AI topics to integrate in a module within the context of cybersecurity?
Our methodology considers over 2,000 research papers from top-tier cybersecurity conferences and journals (e.g., NDSS, USENIX, ACM CCS, IEEE S&P, and IEEE TIFS). We extracted AI-related keywords and used Natural Language Processing (NLP) techniques to identify AI concepts. We also extracted cybersecurity keywords from a pilot cybersecurity course and supplemented them with keywords from academic cybersecurity textbooks. We used the extracted keywords to build a co-occurrence matrix. Finally, we used the co-occurrence matrix to create a dedicated AI module for a cybersecurity course at an academic institution.
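The co-occurrence step described above can be illustrated with a minimal sketch. It assumes that a co-occurrence means an AI keyword and a cybersecurity keyword appearing in the same document, and it uses simple substring matching. The keyword lists, the sample documents, and the function name `cooccurrence_matrix` are hypothetical and are not taken from the project's scripts.

```python
from collections import Counter
from itertools import product

# Hypothetical keyword lists for illustration only.
AI_TERMS = ["model", "classifier", "neural network"]
SECURITY_TERMS = ["malware", "evaluation", "attack"]


def cooccurrence_matrix(documents, ai_terms, security_terms):
    """Count, for each (security term, AI term) pair, how many documents contain both."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        ai_present = [term for term in ai_terms if term in lowered]
        sec_present = [term for term in security_terms if term in lowered]
        for sec, ai in product(sec_present, ai_present):
            counts[(sec, ai)] += 1
    # Rows follow security_terms, columns follow ai_terms.
    return [[counts[(sec, ai)] for ai in ai_terms] for sec in security_terms]


# Made-up sentences standing in for cleaned paper text.
docs = [
    "We train a classifier to detect malware and report the evaluation.",
    "Our model resists the attack; the evaluation covers three datasets.",
    "A neural network model flags malware samples.",
]
matrix = cooccurrence_matrix(docs, AI_TERMS, SECURITY_TERMS)
for term, row in zip(SECURITY_TERMS, matrix):
    print(term, row)
```

Counting each pair once per document keeps a single long paper from dominating the matrix. Counting every mention instead would be an equally plausible design.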
Data Collection
For our data collection, we gathered over 2,000 research papers on computer security topics from top cybersecurity conferences and journals (e.g., NDSS, USENIX, ACM CCS, and IEEE S&P). After collecting the raw data, we adapted a pre-trained model to identify the papers with a higher probability of using AI topics. We then divided the papers into three distinct categories: AI-positive, AI-neutral, and AI-negative. For our analysis and results, we considered only the AI-positive papers.
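As a rough illustration of the three-way split, the sketch below assumes that the model outputs a probability between 0 and 1 that a paper uses AI topics, and that two cut-offs separate the categories. The thresholds (0.3 and 0.7), the file names, and the probabilities are hypothetical. The project's actual criteria are not stated here.

```python
def categorize_paper(ai_probability, low=0.3, high=0.7):
    """Map a model's AI-topic probability to one of three categories.

    The thresholds are illustrative assumptions, not the project's values.
    """
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if ai_probability >= high:
        return "AI-positive"
    if ai_probability <= low:
        return "AI-negative"
    return "AI-neutral"


# Made-up probabilities standing in for the model's output.
papers = {"paper_a.pdf": 0.92, "paper_b.pdf": 0.50, "paper_c.pdf": 0.08}
labels = {name: categorize_paper(p) for name, p in papers.items()}

# Only AI-positive papers move on to the analysis stage.
ai_positive = [name for name, label in labels.items() if label == "AI-positive"]
print(labels)
print(ai_positive)
```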
Data Processing Pipeline
We process our collection of raw data by passing it through the following pipeline. The scripts listed below are responsible for data cleaning, processing, and analysis. The pipeline can also be applied to a collection of research papers other than our own.
- AI Parser: This script processes the PDF files into plaintext RAW files.
- AI Cleaner: This script cleans the RAW files by removing stopwords and unnecessary data. It then exports CLEAN files.
- AI Helper: This code contains helper methods used by the main and processor scripts.
- AI Check: This code uses an API key to check if a text is related to AI.
- AI Terms: This code contains several lists consisting of AI and cybersecurity terms.
- Latent Dirichlet Allocation (LDA): This analysis reads the CLEAN files to check whether they contain AI terms. If a CLEAN file contains AI terms, the script counts the number of AI and cybersecurity keywords and exports them to an Excel file.
Once the analysis is complete, an Excel file named “Results.xlsx” is created. The file lists the documents that contain machine learning-related terms, along with the machine learning terms used in each paper.
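The cleaning and keyword-counting stages of the pipeline can be sketched roughly as follows. This is a simplified stand-in that uses only the standard library. The stopword set, the term lists, and the function names `clean` and `count_terms` are hypothetical, and the Excel export is omitted.

```python
import re
from collections import Counter

# Tiny illustrative sets; a real run would use much larger lists.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "we", "is", "in", "for"}
AI_TERMS = {"classifier", "model", "training"}
SECURITY_TERMS = {"malware", "phishing", "attack"}


def clean(raw_text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", raw_text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def count_terms(tokens, terms):
    """Count how often each term of interest appears among the tokens."""
    return dict(Counter(token for token in tokens if token in terms))


raw = "We train a classifier for malware; the classifier flags the phishing attack."
tokens = clean(raw)
ai_counts = count_terms(tokens, AI_TERMS)
security_counts = count_terms(tokens, SECURITY_TERMS)

# A document is kept for analysis only if it mentions at least one AI term.
contains_ai_terms = bool(ai_counts)
print(ai_counts, security_counts, contains_ai_terms)
```

This sketch matches single-word terms only. Multi-word terms such as "neural network" would need phrase matching on the cleaned text.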
These Python scripts implement our methodology for analyzing how cybersecurity research papers use Artificial Intelligence content and applications. The figure below illustrates our complete methodology.
Designing and Evaluating Curricular Modules for Integration of AI into Cybersecurity Education
Motivations & Scope
Artificial Intelligence (AI) has become a fundamental tool for cybersecurity researchers and practitioners. It is frequently used to address major security problems such as supply chain attacks, ransomware threats, and social engineering. Yet the current cybersecurity curriculum still lacks AI resources, particularly a detailed treatment of the appropriate AI mechanisms.
To address this, we design an AI lecture module that can be integrated into any cybersecurity course. We then present the module in several cybersecurity courses at our institution and assess student performance before and after the lecture. Our AI lecture is composed of a pre-lecture survey, the AI module, live AI examples, and a post-lecture survey.
Data Collection
For our data collection, we distributed two surveys within a lecture titled “A Lecture on Artificial Intelligence, Machine Learning, and Deep Learning: From Theory to Practice”. We collected data using the Qualtrics platform and distributed the surveys via anonymous links or anonymous QR codes. The first survey contains questions about demographics, cybersecurity, AI models, and their corresponding performance metrics. The second survey contains questions regarding demographics, AI metrics, AI models, Deep Learning, and AI training.
Analysis Pipeline
In our data analysis, we use several text preprocessing and Natural Language Processing techniques to keep the analysis accurate and to limit bias. To that end, we implement four steps to gather our data and convert the survey responses into quantifiable data. These steps include our data collection, lecture extraction, survey analysis, and feedback analysis.
- Lecture Extraction: This code extracts all the text in a PDF file using PyPDF2. Convert our lecture (or your own AI module) into a PDF and rename the file to Final_AI_ML_Lecture.pdf.
- Topic Extraction: This code extracts topics and displays them as a topic distribution. Rename all survey CSV files to match the ones used in the code.
- Survey Analysis: This code analyzes and scores the student performance before and after the lecture. Rename all survey CSV files as necessary (including the CSV with correct answers). Remove any extra columns provided by Qualtrics, stopping at the column with participant IDs. Remove any rows that contain unnecessary data provided by Qualtrics, stopping at the question number row. Make sure to download the extracted lecture text file.
- Feedback Analysis: This code analyzes student feedback using sentiment analysis models. Make sure to remove any unnecessary data provided by Qualtrics, as done in the previous step. Rename any CSV files as necessary.
Through the implementation of these four steps and scripts, we analyze the reliability and efficacy of our Artificial Intelligence lecture module in delivering knowledge to cybersecurity students. Below we visualize our implemented methodology.
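The survey-scoring step can be sketched as below, assuming each cleaned CSV has a participant ID column followed by one column per question, and that answers are graded by exact match against an answer key. The column names, questions, answers, and the function `score_responses` are hypothetical. Real Qualtrics exports need the extra rows and columns removed first, as described in the steps above.

```python
import csv
import io
from statistics import mean


def score_responses(responses_csv, answer_key):
    """Return {participant_id: fraction of questions answered correctly}."""
    reader = csv.DictReader(io.StringIO(responses_csv))
    scores = {}
    for row in reader:
        correct = sum(
            1
            for question, answer in answer_key.items()
            if row[question].strip().lower() == answer.lower()
        )
        scores[row["participant_id"]] = correct / len(answer_key)
    return scores


# Made-up responses; in-memory strings stand in for the cleaned CSV files.
pre_csv = "participant_id,Q1,Q2\np1,Accuracy,CNN\np2,Recall,SVM\n"
post_csv = "participant_id,Q1,Q2\np1,Precision,CNN\np2,Precision,SVM\n"
answer_key = {"Q1": "Precision", "Q2": "CNN"}  # hypothetical correct answers

pre_scores = score_responses(pre_csv, answer_key)
post_scores = score_responses(post_csv, answer_key)

# Compare average performance before and after the lecture.
print(mean(pre_scores.values()), mean(post_scores.values()))
```

Exact-match grading only suits multiple-choice questions. Free-text answers would need a different comparison, such as matching against the extracted lecture text.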

Figure: Diagram of our implemented methodology. Note that while not included here, we perform a feedback analysis separately.
AI and Cybersecurity Analysis
Co-occurrence Matrix Visualization
We include a way to visualize the results from the “Results.xlsx” Excel file. This method uses a Jupyter notebook.
Run Concurrence.ipynb to visualize results from the Excel file.
These results are organized as “Computer Security Terms” and “Machine Learning Terms”.
- Co-occurrence Matrix: View full matrix.
Insight: Our findings reveal the correlations between cybersecurity and AI keywords in recent research. We identify the most common correlations and organize them as a matrix of co-occurrences. From this matrix, we find that the most frequent co-occurrence is between the AI keyword “model” and the cybersecurity keyword “evaluation”. Using this matrix, we can create a dedicated lecture module that can be integrated into any cybersecurity course to enhance the quality of the course and strengthen students’ knowledge of AI.
Lecture Survey Analysis
Topic Distribution
These results are the topic distribution based on the pre-lecture and post-lecture surveys.
- Pre-lecture Topic Distribution: View Distribution
- Post-lecture Topic Distribution: View Distribution
Survey Grading
In our paper, we consider only the survey-to-lecture analysis of student performance. Here, however, we also show the scores that students achieved when their answers were graded for correctness.
- Survey Results:
Feedback Analysis
These results are based on the last questions present in the post-lecture survey regarding the lecture.
- Feedback Results:
Insight: Our findings show the efficacy of our AI lecture module. Students began using more AI-centric keywords after the lecture module, and they achieved high scores in the survey grading. Further analysis reveals that lecture-to-survey scores improved by up to 30% in the post-lecture results. Students also received the lecture module well: our feedback analysis shows high positive sentiment scores and low negative scores. The relatively high neutral scores appear to stem from student comments about lecture engagement. We will weigh this feedback carefully to improve the lecture in future iterations.
Project Team Members
Publications:
- Yassine Mekdad, Alejandro Perez Pestaña, Abbas Acar, Mark A. Finlayson, Monique Ross, and A. Selcuk Uluagac. “Integrating Artificial Intelligence into Cybersecurity Education: A Comprehensive Data-Driven Approach.” TBD (2025) [pdf] [bibtex]
- Fernando Brito, Yassine Mekdad, Monique Ross, Mark A. Finlayson, and A. Selcuk Uluagac. “Enhancing Cybersecurity Education with Artificial Intelligence Content.” Proceedings of the 56th ACM Technical Symposium on Computer Science Education V. 1. (2025) [pdf] [bibtex]
Presentations and Talks:
- TBD [poster]








