Epstein-Barr virus (EBV), a double-stranded DNA virus, is one of the most common viruses in humans. It contributes to about 1.5% of all human cancer cases worldwide, and viral genes are expressed in the malignant cells. EBV also very efficiently drives the proliferation of infected human B lymphocytes.
Method:
We use MetaMap to extract biomedical semantic entities from the literature. Term frequency (TF) and term frequency-inverse document frequency (TF-IDF) are each used to evaluate the importance of entities. TF measures how often an entity occurs, while the IDF component of TF-IDF is the inverse document frequency, which down-weights high-frequency terms that are not characteristic of a particular field. For the TF-IDF calculation, the semantic entities of a single carcinogenic virus and those of the eight carcinogenic viruses are each assembled into a text for calculation.
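The scoring step can be illustrated with a minimal sketch. It assumes each virus is represented by one document, given as a list of already extracted entities, and it uses the plain unsmoothed variant IDF = log(N / df), which the text does not specify. The function names, virus labels, and entity strings below are made up for illustration and are not taken from the study or from MetaMap output.

```python
import math
from collections import Counter


def term_frequency(entities):
    """Relative frequency of each entity within one document."""
    counts = Counter(entities)
    total = len(entities)
    return {entity: count / total for entity, count in counts.items()}


def inverse_document_frequency(documents):
    """Unsmoothed IDF, log(N / df), over a dict of name -> entity list."""
    n_docs = len(documents)
    doc_freq = Counter()
    for entities in documents.values():
        # Count each entity once per document, however often it occurs.
        doc_freq.update(set(entities))
    return {entity: math.log(n_docs / df) for entity, df in doc_freq.items()}


def tf_idf(documents):
    """TF-IDF score of every entity in every document."""
    idf = inverse_document_frequency(documents)
    return {
        name: {entity: tf * idf[entity]
               for entity, tf in term_frequency(entities).items()}
        for name, entities in documents.items()
    }


# Hypothetical input: one entity list per virus (illustrative values only).
documents = {
    "virus_A": ["B lymphocyte", "lymphoma", "neoplasm", "neoplasm"],
    "virus_B": ["hepatocyte", "carcinoma", "neoplasm"],
}

scores = tf_idf(documents)
for name, entity_scores in scores.items():
    ranked = sorted(entity_scores.items(), key=lambda kv: kv[1], reverse=True)
    print(name, ranked)
```

In this toy input, "neoplasm" has the highest TF in the first document but occurs in every document, so its IDF and therefore its TF-IDF score is zero. This is the filtering effect described above: frequent but non-specific entities drop out of the ranking.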