Chinese Journal of Intelligent Science and Technology ›› 2023, Vol. 5 ›› Issue (3): 330-342. doi: 10.11959/j.issn.2096-6652.202327

• Special Column: Intelligent Technology and Social Computing •

Analysis and prediction of GitHub company influence based on machine learning

Mingyu WANG1, Qingyuan GONG2, Jingjing QU3, Xin WANG1   

  1. School of Computer Science, Fudan University, Shanghai 200438, China
  2. Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200438, China
  3. Shanghai Artificial Intelligent Laboratory, Shanghai 201210, China
  • Revised: 2023-08-02 Online: 2023-09-01 Published: 2023-09-26
  • Supported by:
    The National Natural Science Foundation of China (62102094)

Abstract:

The influence of a company is not only related to its industry competitiveness but also affects its public reputation and future development. However, there has been no unified standard for evaluating the influence of a company. GitHub is a representative open-source platform for hosting software development code repositories. Existing research has typically measured a company's influence by the total number of stars received by the projects it posts on GitHub, but this measure struggles to capture the potential of small, micro, and nascent companies. This paper predicted the future influence level of a company by introducing the h-index, a measure of scientists' influence, using GitHub as the information source, and modeling the company network. Features were extracted from this network to build a classifier that predicted the future influence level of the company. The SHAP model explanation technique was then applied to identify the important features that determined the influence of a company. The experimental results showed that the XGBoost model achieved an accuracy of 0.92 and an average AUC of 0.93 on a real-world GitHub dataset. In summary, the proposed method could accurately and reliably predict the influence of companies.
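
The contrast between total stars and the h-index can be illustrated with a minimal sketch. It assumes, as the abstract does not spell out, that a company's h-index is the largest h such that h of its repositories each have at least h stars; the function name `h_index` and the sample star counts are hypothetical and are not taken from the paper's dataset.

```python
def h_index(star_counts):
    """Return the largest h such that h repositories have at least h stars each.

    Assumption: the scientist's h-index is transferred to companies by
    treating repositories as papers and stars as citations.
    """
    ranked = sorted(star_counts, reverse=True)
    h = 0
    for rank, stars in enumerate(ranked, start=1):
        if stars >= rank:
            h = rank  # the top `rank` repositories all have >= rank stars
        else:
            break
    return h


# Hypothetical star counts, for illustration only.
one_hit_company = [1000]               # a single very popular repository
steady_company = [12, 9, 7, 6, 5, 5]   # several moderately starred repositories

print(sum(one_hit_company), h_index(one_hit_company))
print(sum(steady_company), h_index(steady_company))
```

Under this assumption, a company with one heavily starred repository ranks high by total stars but receives an h-index of 1, while a company with several moderately starred repositories receives a higher h-index despite far fewer total stars.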

Key words: online developer community, social network, machine learning, SHAP

