Open-source Software

I believe in the power of open source to drive innovation, collaboration, and transparency. By releasing my research code and tools publicly and contributing to open-source projects, I hope to encourage the exchange of ideas and expertise within the research and development community, and in particular to help advance the state of the art in natural language processing and machine learning.

Botnet detection with graph neural networks (cybersecurity application)
[Library][Published Data]

  • Large-scale datasets of large graphs for training and benchmarking topological botnet detection for cybersecurity
  • Collection of varied botnet topologies (both simulated and real) on real network connections and traffic
  • Deep graph neural networks with varying configurations (aggregations, residuals, etc.) that can be applied beyond botnet detection

AMR parsing with neural transition-based approaches
[Library]

  • Collection of state-of-the-art parsers for text-to-AMR, with structured alignments
  • Hybrid approach combining neural networks with state machines and controlled decoding (neurosymbolic AI)
  • Fast and efficient small models with effective inductive bias, and accurate bigger models with adapted pre-trained language models

Dataflow graph (executable program) generation for task-oriented dialogue
[Library]

  • General model with copying mechanism for text-to-graph predictions without alignment
  • Highly efficient and accurate end-to-end models for Dataflow graphs based on Transformers
  • Applications in task-oriented dialogue and online semantic parsing with the graphs being executed as programs

Unsupervised text summarization using only language models
[Library]

  • Summarization as optimization to promote both content faithfulness and text fluency
  • Semantic matching of texts with intermediate representations of pre-trained language models, calibrating word likelihoods with neighborhood aggregation
  • Highly effective summarization on the fly with alignment information for interpretability