INFS7450 Social Media Analytics Project 1: Fast Computation of User Centrality Measures

Semester 1, 2023

Goal: This project aims to implement efficient algorithms that compute centrality measures for user nodes, such as PageRank and betweenness centrality. Students are required to complete this project individually.

Dataset: In this project, you will be working with publicly available Facebook social network data. The Facebook data has been anonymized by replacing the Facebook-internal id of each user with a new value. The data contains 4039 nodes and 88234 edges in total. Each line of the data represents an undirected link between two nodes.

The dataset is available from UQ Blackboard. See /Assessment/INFS7450 Project One.
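
As a rough illustration of the data format described above, the sketch below turns edge-list lines into an undirected, unweighted adjacency map. It assumes each line holds two whitespace-separated integer node ids; the function names `build_adjacency` and `count_edges` are hypothetical and not part of the project materials.

```python
from collections import defaultdict


def build_adjacency(lines):
    """Build an undirected, unweighted adjacency map from edge-list lines."""
    adjacency = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        if u == v:
            continue  # assumption: self-loops carry no information here
        # Store the link in both directions because the graph is undirected.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def count_edges(adjacency):
    """Count undirected edges; each one appears twice in the adjacency map."""
    return sum(len(neighbours) for neighbours in adjacency.values()) // 2
```

An open file handle can be passed directly as `lines`. If the file matches the description above, the resulting map should contain 4039 nodes and `count_edges` should report 88234. NetworkX users could obtain an equivalent graph with `nx.read_edgelist(path, nodetype=int)`.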

Tasks:

  1. Calculate the Betweenness Centrality for nodes in the Facebook dataset. (8 marks)
     Overview: Write code to load the Facebook social network data and construct an undirected and unweighted graph. Based on the constructed graph, write a program to calculate the betweenness centralities of the graph vertices.
     Input: The provided Facebook social network data.
     Output: The top-10 nodes with the highest betweenness centralities.
  2. Calculate PageRank Centrality for nodes in the Facebook dataset. (7 marks)
     Overview: Write code to load the Facebook social network data and construct an undirected and unweighted graph. Based on the constructed graph, write a program to calculate the PageRank centralities (with α = 0.85, β = 0.15) of the graph vertices.
     Input: The provided Facebook social network data.
     Output: The top-10 nodes with the highest PageRank centralities.

Requirements:

  1. You may use third-party libraries, such as NetworkX, to read, load and manipulate the Facebook network dataset. However, you must write your own code to implement the node centrality calculations rather than using third-party or built-in functions. (You can use any functions in NetworkX other than the functions for centrality calculation.)
  2. You are not allowed to use any generative models such as ChatGPT.
  3. You can refer to the code provided in the tutorial, but you are not allowed to directly reuse or copy it.
  4. You are not allowed to look at the code of any other student. All submitted code and reports will be subject to electronic plagiarism detection.
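
For orientation only, the two centrality measures named in the tasks can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed method: the graph is given as a map from each node to its set of neighbours; betweenness uses Brandes' algorithm, a well-known approach that this specification does not mandate, and returns unnormalized scores; PageRank assumes the update C = α·AᵀD⁻¹·C + β·1, which is one common reading of the α and β parameters. All function names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness via Brandes-style accumulation (unweighted graph)."""
    scores = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []                                # nodes by non-decreasing distance
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}         # number of shortest paths
        distance = {v: -1 for v in adjacency}     # -1 marks "not reached yet"
        sigma[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # In an undirected graph every pair is visited from both endpoints.
    return {v: score / 2.0 for v, score in scores.items()}


def pagerank(adjacency, alpha=0.85, beta=0.15, tol=1e-10, max_iter=1000):
    """Iterate the assumed update C = alpha * A^T D^-1 C + beta until it settles."""
    rank = {v: 1.0 for v in adjacency}
    for _ in range(max_iter):
        updated = {}
        for v in adjacency:
            # Each neighbour u shares its score equally among its own neighbours.
            incoming = sum(rank[u] / len(adjacency[u]) for u in adjacency[v])
            updated[v] = alpha * incoming + beta
        change = sum(abs(updated[v] - rank[v]) for v in adjacency)
        rank = updated
        if change < tol:
            break
    return rank


def top_k(scores, k=10):
    """Return the k highest-scoring (node, score) pairs, ties broken by node id."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]
```

Normalizing either score by a constant leaves the top-10 ordering unchanged, so the choice of normalization mainly affects the values reported, not the ranking.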

Programming Languages: Python and NetworkX are recommended. However, you may use any programming language you prefer, including but not limited to Python, MATLAB, Java, C, and C++.

Deliverables (!!VERY IMPORTANT):

  1. A report (.pdf). See the given appendix for an example template. Submit your report in PDF format, not in Word format!

  2. A source code file. For Python users, organize your code as follows: