Constructing Flow Graphs from Procedural Cybersecurity Texts

Pal, Kuntal Kumar; Kashihara, Kazuaki; Banerjee, Pratyay; Mishra, Swaroop; Wang, Ruoyu; Baral, Chitta

arXiv:2105.14357 (cs)

[Submitted on 29 May 2021]

Abstract: Following procedural texts written in natural language is challenging. We must read the whole text to identify the relevant information or the instruction flows needed to complete a task, which is prone to failure. If such texts are structured, we can readily visualize instruction flows, reason about or infer a particular step, or even build automated systems to help novice agents achieve a goal. However, this structure recovery task is challenging because of the diverse nature of such texts. This paper proposes to identify relevant information from such texts and generate information flows between sentences. We built a large annotated procedural text dataset (CTFW) in the cybersecurity domain (3154 documents). This dataset contains valuable instructions regarding software vulnerability analysis experiences. We performed extensive experiments on CTFW with our LM-GNN model variants in multiple settings. To show the generalizability of both this task and our method, we also experimented with procedural texts from two other domains (Maintenance Manual and Cooking), which are substantially different from cybersecurity. Our experiments show that a Graph Convolution Network with BERT sentence embeddings outperforms BERT in all three domains.
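To make the target structure concrete, here is a minimal sketch of a sentence-level flow graph, in which sentences are nodes and directed edges mark information flow from one sentence to a later one. It is not the authors' LM-GNN model: the example sentences are invented, the `flow_pairs` list is a hand-written stand-in for whatever a trained model would predict, and the function names `build_flow_graph` and `instruction_order` are hypothetical.

```python
from collections import deque


def build_flow_graph(sentences, flow_pairs):
    """Return adjacency lists for a directed graph over sentence indices."""
    graph = {i: [] for i in range(len(sentences))}
    for src, dst in flow_pairs:
        if src not in graph or dst not in graph:
            raise ValueError(f"edge ({src}, {dst}) refers to an unknown sentence")
        graph[src].append(dst)
    return graph


def instruction_order(graph):
    """Order the nodes so every edge points forward (Kahn's algorithm)."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    queue = deque(sorted(node for node, deg in indegree.items() if deg == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    if len(order) != len(graph):
        raise ValueError("flow graph contains a cycle")
    return order


# Invented example text; sentence 1 carries no task-relevant information.
sentences = [
    "Download the challenge binary.",
    "The weather was nice that day.",
    "Run the binary under a debugger.",
    "Overflow the buffer to overwrite the return address.",
]
# Assumed stand-in for model output: (source, target) sentence pairs.
flow_pairs = [(0, 2), (2, 3)]

graph = build_flow_graph(sentences, flow_pairs)
# Keep only sentences that take part in at least one flow edge.
steps = [
    i for i in instruction_order(graph)
    if graph[i] or any(i in targets for targets in graph.values())
]
for i in steps:
    print(i, sentences[i])
```

Under these assumptions, isolated sentences drop out of the recovered procedure, which mirrors the idea of separating relevant information from the rest of the text before following the instruction flow.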

Comments: 13 pages, 5 pages, accepted in the Findings of ACL 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as: arXiv:2105.14357 [cs.CL] (or arXiv:2105.14357v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2105.14357

Submission history

From: Kuntal Kumar Pal
[v1] Sat, 29 May 2021 19:06:35 UTC (5,598 KB)
