
[➕ Feature]: workflow priority #2996

Open
00041275 opened this issue Jan 8, 2025 · 14 comments
Labels
Feature A new feature

Comments

@00041275
Contributor

00041275 commented Jan 8, 2025

Add a workflow priority rule.

@dosubot dosubot bot added the Feature A new feature label Jan 8, 2025
@shahargl
Member

shahargl commented Jan 8, 2025

hey @00041275 - can you elaborate on what's the expected outcome?

@00041275
Contributor Author

The thing is that the prioritization logic is embedded in mapping and extraction: the mandatory things run first, and then the rest. In my case, I see unnecessary events where I would like to move the alert to the dismissed state before enriching it in another workflow.

@00041275
Contributor Author

> hey @00041275 - can you elaborate on what's the expected outcome?

up

@Matvey-Kuk
Contributor

@00041275, do you mean that some steps should be executed in "guaranteed time" or "retried" if they fail, but others shouldn't?

@00041275
Contributor Author

> @00041275, do you mean that some steps should be executed in "guaranteed time" or "retried" if they fail, but others shouldn't?

No, I just don't want to call the next workflow (enrichment) if the alert was dismissed by the previous "dismiss" workflow.
For example, I have 3 workflows:

  1. check everything I need and maybe dismiss the alert
  2. enrich data (if not dismissed)
  3. check an external system to resolve the alert (if firing for more than 1 minute)
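A minimal sketch of the desired ordering semantics (hypothetical names; Keep's real workflows are not defined this way - this only illustrates "dismiss first, skip the rest"):

```python
def run_in_priority_order(alert, workflows):
    """Run workflows ordered by priority; stop once the alert is dismissed.

    `workflows` is a list of (priority, fn) pairs. Each fn may mutate the
    alert, e.g. set alert["dismissed"] = True, which gates later workflows.
    """
    for _, workflow in sorted(workflows, key=lambda pair: pair[0]):
        if alert.get("dismissed"):
            # a higher-priority workflow dismissed the alert; skip the rest
            break
        workflow(alert)
    return alert
```

Under this model, a "dismiss" workflow with priority 1 would prevent an "enrich" workflow with priority 2 from ever running on the same alert.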

@shahargl
Member

Got ya. Yeah, it's a valid use case. Let me think if I have some quick win for that. Will update here.

@00041275
Contributor Author

> Got ya. Yea it's valid use case. Let me think if I have some quick win for that. Will update here.

Can you give me feedback about workflow priority? Do you have a quick win?

@Matvey-Kuk
Contributor

@00041275 I could think of a workflow running with a delay and checking alert's enrichments in a condition.

Workflow A:

  • If smth, enrich alert with field=value.

Workflow B:

  • Sleep 5 sec.
  • If field!=value, proceed
  • Do smth.

Far from ideal, but may help.
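The gate in Workflow B can be sketched in plain Python (a standalone illustration with hypothetical names - Keep's actual workflows are not written this way):

```python
import time

def should_proceed(alert: dict, field: str, value: str, delay: float = 5.0) -> bool:
    """Sleep to give Workflow A time to run, then check the gate field.

    If Workflow A set alert[field] == value (e.g. marked the alert
    dismissed), Workflow B should skip its own steps.
    """
    time.sleep(delay)
    return alert.get(field) != value
```

The obvious weakness is the fixed sleep: if Workflow A takes longer than the delay, Workflow B proceeds on a stale alert, which is exactly the race discussed later in the thread.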

@shahargl
Member

@00041275 unfortunately I couldn't find a quick win; it requires some change in the way we trigger workflows now. I have some idea about a design for that, so if you want to collaborate on it I'll be more than happy to.

@00041275
Contributor Author

> @00041275 I could think of a workflow running with a delay and checking alert's enrichments in a condition.
>
> Workflow A:
>
>   • If smth, enrich alert with field=value.
>
> Workflow B:
>
>   • Sleep 5 sec.
>   • If field!=value, proceed
>   • Do smth.
>
> Far from ideal, but may help.

This is a good compromise, thanks

@00041275
Contributor Author

> @00041275 unfortunately couldn't find a quick win, it requires some change in the way we trigger workflows now. I have some idea about design for that so if you want to collaborate on that I'll be more than happy to

Please write up your idea.

@shahargl
Member

shahargl commented Jan 17, 2025

Ok so, as always, I was thinking about doing it very simply - just add some depends attribute to the workflow, and then somehow "wait" for it in _handle_event_workflows.

However, this naive approach falls down in two places:

  1. if workflow (2) depends on workflow (1), and workflow (1) enriches the alert, then we somehow need to update workflow (2)'s alert so it gets the enriched version
  2. it adds "state" to the workflow scheduler, which we try to avoid.

So, the next approach was to do smth like this:

  1. add a new WorkflowExecution status

today we have

class WorkflowStatus(enum.Enum):
    SUCCESS = "success"
    ERROR = "error"
    PROVIDERS_NOT_CONFIGURED = "providers_not_configured"

so add

class WorkflowStatus(enum.Enum):
    SUCCESS = "success"
    ERROR = "error"
    PROVIDERS_NOT_CONFIGURED = "providers_not_configured"
    WAITING = "waiting"

and then, adjust _handle_event_workflows, change

        with self.lock:
            workflows_to_run, self.workflows_to_run = self.workflows_to_run, []

to smth like:

        with self.lock:
            workflows_to_run, self.workflows_to_run = self.workflows_to_run, []
        # find dependencies
        workflows_that_should_wait = [... for workflow in workflows_to_run]
        # create workflow executions with "waiting" status
        for wf in workflows_that_should_wait:
            create_workflow_execution(..., status=waiting, dependencies=dependencies)

then, we need to add a column to workflowexecution to hold all dependencies, and also adjust the main loop from:

def _start(self):
        self.logger.info("Starting workflows scheduler")
        while not self._stop:
            # get all workflows that should run now
            self.logger.debug(
                "Starting workflow scheduler iteration",
                extra={"current_number_of_workflows": len(self.futures)},
            )
            try:
                self._handle_interval_workflows()
                self._handle_event_workflows()
            except Exception:
                # This is the "mainloop" of the scheduler, we don't want to crash it
                # But any exception here should be investigated
                self.logger.exception("Error getting workflows that should run")
                pass
            self.logger.debug("Sleeping until next iteration")
            time.sleep(1)
        self.logger.info("Workflows scheduler stopped")

to

def _start(self):
        self.logger.info("Starting workflows scheduler")
        while not self._stop:
            # get all workflows that should run now
            self.logger.debug(
                "Starting workflow scheduler iteration",
                extra={"current_number_of_workflows": len(self.futures)},
            )
            try:
                self._handle_interval_workflows()
                self._handle_event_workflows()
                # add handle waiting
                self._handle_waiting_workflows()
            except Exception:
                # This is the "mainloop" of the scheduler, we don't want to crash it
                # But any exception here should be investigated
                self.logger.exception("Error getting workflows that should run")
                pass
            self.logger.debug("Sleeping until next iteration")
            time.sleep(1)
        self.logger.info("Workflows scheduler stopped")
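A possible shape for the missing _handle_waiting_workflows, as a minimal standalone sketch under the assumptions above (the real implementation would query the WorkflowExecution table for rows in the new "waiting" status; the dependencies column and the "pending" hand-off status here are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class Execution:
    workflow_id: str
    status: str  # "waiting" | "success" | "error" | ...
    dependencies: list = field(default_factory=list)  # workflow ids this run waits on

def handle_waiting_workflows(executions):
    """Promote waiting executions whose dependencies have all finished.

    A dependency counts as finished once some execution of that workflow
    reached a terminal status (success or error). Returns the workflow ids
    that became runnable in this iteration.
    """
    finished = {e.workflow_id for e in executions if e.status in ("success", "error")}
    runnable = []
    for e in executions:
        if e.status == "waiting" and set(e.dependencies) <= finished:
            e.status = "pending"  # hand back to the normal run path
            runnable.append(e.workflow_id)
    return runnable
```

Calling this once per scheduler iteration (as in the adjusted _start loop above) drains waiting executions level by level: a workflow waiting on an already-promoted workflow becomes runnable only on a later iteration, after its dependency actually finishes.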

@00041275
Contributor Author

Ok, I'll try to test it. And I think we can do one more thing: add a sleep provider for @Matvey-Kuk's idea, so it doesn't require calling the Python provider.

@shahargl
Member

I'm not sure - maybe @Matvey-Kuk can answer - how will it update the alert?

If the first workflow enriches the alert with some attribute the second workflow needs - how will it work?

Development

No branches or pull requests

3 participants