question about ABRW #2

Open
DreamerDW opened this issue Nov 21, 2019 · 2 comments

Comments

@DreamerDW

Brilliant idea!
When I try to apply ABRW to obtain embeddings for the DBLP dataset, it always fails with a MemoryError. What is the problem? I'm very interested in your project and look forward to your reply.

Thank you very much!
Best!

@DreamerDW
Author

obtaining biased transition matrix where each row sums up to 1.0...
Traceback (most recent call last):
File "src/main.py", line 217, in <module>
main(parse_args())
File "src/main.py", line 136, in main
walk_length=args.walk_length, window=args.window_size, workers=args.workers)
File "E:\paper\code\ABRW-master\src\libnrl\abrw.py", line 29, in __init__
self.T = self.get_biased_transition_mat(A=self.g.get_adj_mat(), X=self.g.get_attr_mat())  # compute the biased transition matrix that is ultimately used
File "E:\paper\code\ABRW-master\src\libnrl\abrw.py", line 68, in get_biased_transition_mat
T_A = row_as_probdist(A, preserve_zeros) # norm adj/struc info mat; for isolated node, return all-zeros row or all-1/m row
File "E:\paper\code\ABRW-master\src\libnrl\utils.py", line 42, in row_as_probdist
mat += sparse.csr_matrix(zero_rows.astype(int)).T.dot(sparse.csr_matrix(np.repeat(1 / mat.shape[1], mat.shape[1])))
File "F:\Aanconda3\envs\ABRW\lib\site-packages\scipy\sparse\base.py", line 361, in dot
return self * other
File "F:\Aanconda3\envs\ABRW\lib\site-packages\scipy\sparse\base.py", line 479, in __mul__
return self._mul_sparse_matrix(other)
File "F:\Aanconda3\envs\ABRW\lib\site-packages\scipy\sparse\compressed.py", line 502, in _mul_sparse_matrix
indices = np.empty(nnz, dtype=idx_dtype)
MemoryError

@houchengbin
Owner

Hi,

Thank you for your interest.
This very old implementation does not use sparse matrix techniques to reduce memory usage, so it needs a very large amount of memory on large-scale networks. As you have noticed, we need to compute a transition matrix, which is an n-by-n matrix where n is the number of nodes. For the DBLP dataset with about 60,000 nodes, storing this transition matrix therefore takes about (8*(60000)^2)/(1024^3) ~= 27 GB of memory, assuming each floating-point number occupies 8 bytes.
You got the MemoryError because your computer ran out of memory. To solve this problem, we highly recommend following the suggestion below.
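
The memory estimate can be checked with a short calculation. This is only a minimal sketch: the helper name `dense_matrix_gib` is made up for illustration and is not part of ABRW, and the 8 bytes per entry is the same assumption as above.

```python
def dense_matrix_gib(n_nodes: int, bytes_per_entry: int = 8) -> float:
    """Return the memory (in GiB) needed to store a dense n-by-n matrix."""
    return bytes_per_entry * n_nodes ** 2 / 1024 ** 3


# About 60,000 nodes, as quoted for DBLP above.
print(f"{dense_matrix_gib(60_000):.1f} GiB")
```

Because the cost grows with n squared, doubling the number of nodes roughly quadruples the memory needed.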

As mentioned on the main page of this repository, https://github.com/houchengbin/ABRW#notice, we have put all future updates at https://github.com/houchengbin/OpenANE. You can check and use the method called "ABRW" there.
In short, the main difference is that the new version improves both time efficiency (using Ball-Tree KNN) and space efficiency (using sparse matrix techniques); see https://github.com/houchengbin/OpenANE/blob/c7f06f54e5e6241edd93144347af5c29ea2a8d23/src/libnrl/abrw.py#L122
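
To illustrate what a sparse approach to row normalization can look like, here is a minimal sketch. It is not the code from OpenANE; the function name `row_normalize_sparse` is hypothetical, and it assumes the option of keeping an all-zeros row for isolated nodes rather than filling it with 1/m (the fill step is where the traceback above ran out of memory).

```python
import numpy as np
from scipy import sparse


def row_normalize_sparse(mat):
    """Scale each nonzero row to sum to 1; all-zero rows stay all-zero."""
    mat = sparse.csr_matrix(mat, dtype=np.float64)
    row_sums = np.asarray(mat.sum(axis=1)).ravel()
    # Invert only the nonzero sums so isolated nodes do not divide by zero.
    inv = np.zeros_like(row_sums)
    nonzero = row_sums != 0
    inv[nonzero] = 1.0 / row_sums[nonzero]
    # Left-multiplying by a diagonal matrix rescales rows without densifying.
    return sparse.diags(inv).dot(mat).tocsr()


# Tiny example: node 1 is isolated, so its row stays empty.
A = sparse.csr_matrix(np.array([[0, 2, 2], [0, 0, 0], [1, 0, 0]]))
T = row_normalize_sparse(A)
print(T.toarray())
```

The result has exactly as many stored entries as the input, so memory scales with the number of edges rather than with n squared.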
Let me know if you have any further problems or questions.

Best wishes,
Chengbin


2 participants