restore gloo (#39163)
kuizhiqing authored Jan 25, 2022
1 parent 55418d3 commit faf517b
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions python/paddle/distributed/fleet/base/role_maker.py
@@ -224,6 +224,14 @@ def init(rank, nodes, role):
self._worker_comm = gloo
# TODO (sandyhouse): initialize gloo for server and all

# Closing the KV server may cause gloo init to fail,
# since init depends on the full-mesh connection,
# e.g. 0 is connected with 1, 2, 3 while 2-3 is not connected yet
# TODO(kuizhiqing)
if start_http_server:
    http_server_d["running"] = False
    http_server.join()

def _get_rank_nodes(self, role):
    nodes = 0
    rank = -1
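
The ordering concern in the comment above can be illustrated with a small standalone sketch. It is not Paddle's implementation and uses no gloo or Paddle APIs: `serve_kv`, `rendezvous`, and `run` are hypothetical names, a plain dict stands in for the KV store, and a thread stands in for the HTTP server. The only element taken from the diff is the shutdown pattern, that is, a shared `{"running": ...}` flag that is cleared and then followed by `join()`. The sketch assumes every rank must see all peers' keys before the rendezvous server can be stopped safely.

```python
import threading
import time


def serve_kv(state):
    """Hypothetical stand-in for the KV HTTP server thread.

    It stays alive only while state["running"] is True, mirroring the
    http_server_d["running"] flag in the diff.
    """
    while state["running"]:
        time.sleep(0.01)


def rendezvous(rank, world_size, store, lock, state, results):
    """Publish this rank's key, then wait until every peer's key is visible."""
    with lock:
        store[rank] = f"addr-{rank}"
    deadline = time.monotonic() + 5.0
    while time.monotonic() < deadline:
        if not state["running"]:
            # The server went away before the mesh was complete.
            results[rank] = False
            return
        with lock:
            if len(store) == world_size:
                results[rank] = True
                return
        time.sleep(0.005)
    results[rank] = False


def run(world_size=4):
    state = {"running": True}
    store, lock, results = {}, threading.Lock(), {}

    server = threading.Thread(target=serve_kv, args=(state,))
    server.start()

    workers = [
        threading.Thread(
            target=rendezvous,
            args=(rank, world_size, store, lock, state, results),
        )
        for rank in range(world_size)
    ]
    for worker in workers:
        worker.start()
    # Wait for every rank to finish the rendezvous first ...
    for worker in workers:
        worker.join()

    # ... and only then stop the server, as the diff does after init.
    state["running"] = False
    server.join()
    return results


if __name__ == "__main__":
    print(run(4))
```

Clearing the flag before the workers are joined would let a late rank observe a stopped server and report failure, which is the kind of partial-mesh situation the comment describes (0 connected with 1, 2, 3 while 2-3 is not connected yet).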