Approximate High Dimensional Graph Mining With Matrix Polar Factorization: A Twitter Application

Document Type

Conference Proceeding

Source of Publication

2021 IEEE International Conference on Big Data

Publication Date



At the dawn of the Internet era graph analytics play an important role in high- and low-level network policymaking across a wide array of fields so diverse as transportation network design, supply chain engineering and logistics, social media analysis, and computer communication networks, to name just a few. This can be attributed not only to the size of the original graph but also to the nature of the problem parameters. For instance, algorithmic solutions depend heavily on the approximation criterion selection. Moreover, iterative or heuristic solutions are often sought as it is a high dimensional problem given the high number of vertices and edges involved as well as their complex interaction. Replacing under constraints a directed graph with an undirected one having the same vertex set is often sought in applications such as data visualization, community structure discovery, and connection-based vertex centrality metrics. Polar decomposition is a key matrix factorization which represents a matrix as a product of a symmetric positive (semi)definite factor and an orthogonal one. The former can be an undirected approximation of the original adjacency matrix. The proposed graph approximation has been tested with three Twitter graphs with encouraging results with respect to density, Fiedler number, and certain vertex centrality metrics based on matrix power series. The dataset was hosted in an online MongoDB instance.




Institute of Electrical and Electronics Engineers (IEEE)


Computer Sciences


Measurement, Symmetric matrices, Social networking (online), Blogs, Supply chains, Directed graphs, Transportation

Indexed in Scopus


Open Access