The Email-Sec project aims to develop a holistic analysis framework, based on machine learning methods, that evaluates the potential risk posed by an email in order to detect malicious messages. The framework analyzes all of the email's components: the header, the body, and the attachments.
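As a rough illustration of what "all of the email's components" could mean in practice, the sketch below splits a raw message into header, body, and attachments using Python's standard `email` package. The function name `split_components` and the returned dictionary layout are assumptions made for this example; the project's actual parsing stage is not described here.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def split_components(raw: str) -> dict:
    """Split a raw RFC 5322 message into the three parts named in the text.

    Hypothetical helper: the layout of the returned dict is an assumption.
    """
    msg = message_from_string(raw, policy=policy.default)
    # Repeated header names collapse to the last value here; a real
    # analysis would likely keep every occurrence (e.g. Received).
    headers = {name: str(value) for name, value in msg.items()}
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""
    attachments = [
        (part.get_filename(), part.get_content_type())
        for part in msg.iter_attachments()
    ]
    return {"headers": headers, "body": body, "attachments": attachments}


# Build a small demo message so the example is self-contained.
demo = EmailMessage()
demo["From"] = "alice@example.com"
demo["To"] = "bob@example.com"
demo["Subject"] = "Invoice"
demo.set_content("Please see the attached file.")
demo.add_attachment(
    b"dummy", maintype="application", subtype="octet-stream",
    filename="invoice.bin",
)

parts = split_components(demo.as_string())
print(parts["headers"]["Subject"], parts["attachments"])
```

Each of the three entries could then be handed to its own feature extractor before a risk score is computed.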
Machine learning algorithms can provide a more comprehensive evaluation of the potential risk posed by an email by discovering hidden patterns and traces of malicious activity. The potential risk score will be translated into a classification (benign/malicious) by comparing it to a predetermined threshold.
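The thresholding step can be sketched in a few lines. The score range of 0 to 1, the default threshold of 0.5, and the choice to treat a score equal to the threshold as malicious are all assumptions for illustration; the text does not specify them.

```python
def classify(risk_score: float, threshold: float = 0.5) -> str:
    """Map a risk score to a label by comparing it to a threshold.

    Assumptions: scores lie in [0, 1], and a score equal to the
    threshold counts as malicious.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score is assumed to lie in [0, 1]")
    return "malicious" if risk_score >= threshold else "benign"


for score in (0.12, 0.50, 0.93):
    print(score, classify(score))
```

Raising the threshold trades missed malicious emails for fewer false alarms, so in practice it would presumably be tuned on labeled data.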
The framework will also employ Natural Language Processing (NLP) and a Hidden Markov Model (HMM). The NLP module analyzes the email’s textual content in order to find linguistic patterns that might help discriminate between ham (legitimate) and malicious emails. The HMM module analyzes URLs and email addresses and may contribute to their classification as benign or malicious.
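One common way to use an HMM for this kind of string classification is to score a URL under two models, one representing benign strings and one representing malicious strings, and compare the log-likelihoods. The sketch below implements the scaled forward algorithm for that purpose. The coarse character classes, the two-state layout, and every probability value are hand-picked placeholders rather than trained parameters, and nothing here is confirmed as the project's actual design.

```python
import math


def to_symbols(text: str) -> list:
    """Map each character to a coarse class: 0 letter, 1 digit, 2 other."""
    return [0 if ch.isalpha() else 1 if ch.isdigit() else 2 for ch in text]


def log_likelihood(obs, start, trans, emit) -> float:
    """Scaled forward algorithm: returns log P(obs | model)."""
    if not obs:
        return 0.0
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# Hypothetical two-state models; all numbers are illustrative placeholders.
BENIGN = {
    "start": [0.9, 0.1],
    "trans": [[0.9, 0.1], [0.6, 0.4]],
    "emit": [[0.85, 0.05, 0.10], [0.30, 0.30, 0.40]],
}
MALICIOUS = {
    "start": [0.5, 0.5],
    "trans": [[0.5, 0.5], [0.4, 0.6]],
    "emit": [[0.50, 0.30, 0.20], [0.20, 0.50, 0.30]],
}


def hmm_label(text: str) -> str:
    """Label a string by whichever placeholder model scores it higher."""
    obs = to_symbols(text)
    benign = log_likelihood(obs, **BENIGN)
    malicious = log_likelihood(obs, **MALICIOUS)
    return "malicious" if malicious > benign else "benign"


print(hmm_label("example.com/login"))
```

In a trained system the two parameter sets would be estimated from labeled URLs and email addresses, and the log-likelihood difference could feed into the overall risk score instead of producing a hard label.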